Showing results for tags '.href'.

Found 1 result

  1. Hey guys,

     So we recently moved our company Knowledge Base (KB) to an in-house solution rather than paying a monthly subscription for someone else to host it and use their features. We have moved from one KB site to another, but there is an issue: no one restricted access to the original KB site, which meant anyone was able to edit the site as they pleased. This means that some of the old KB's features (picture/video/internal links) are still being used, and I need to find out which pages have links that will inevitably go dead once we stop sending them money.

     I feel like I'm 90% of the way to getting this working correctly. The steps are fairly simple:

       • Log into the KB
       • Load a list of URLs
       • Pull the HTML and search for "helpjuice"
       • Log that the URL has been checked
       • Check for links on this page and check them against the list of URLs
       • Log any new URLs that are missing from the list

     Now, I have no idea if this is the best way of doing this process but, again, I've made it 90% of the way so far and I would like to figure this specific problem out. If anyone has a better/more effective method of doing this, please point me in the right direction!

     Here is the code I have so far.

         #include <File.au3>
         #include <FileConstants.au3>
         #include <WinAPIFiles.au3>
         #include <IE.au3>
         #include <MsgBoxConstants.au3>
         #include <Array.au3>

         Global $aGlobalLinks[1][3] = [["Link", "Checked", "Hit"]] ;Create Array with Headers

         ;//Load KB and log in by submitting the login form.
         Local $oIE = _IECreate ("https://website.com/kb/")
         Local $oForm = _IEFormGetObjByName($oIE, "login-form")
         Local $oEmail = _IEFormElementGetObjByName($oForm, "email")
         Local $oPassword = _IEFormElementGetObjByName($oForm, "password")
         _IEFormElementSetValue($oEmail, "email@email.com")
         _IEFormElementSetValue($oPassword, "password")
         Sleep(500)
         _IEFormSubmit($oForm)
         ;_IEQuit($oIE)

         ;//Load a second window to confirm it shows we are logged in.
         ;//Grab a list of links as a jumping off point if the KB_URLs.txt is empty
         ;//Mainly used for Debugging
         Local $oIE = _IECreate ("https://website.com/kb/")
         $oLinks = _IELinkGetCollection($oIE)
         For $oLink In $oLinks
             _ArraySearch($aGlobalLinks, $oLink.href)
             If @error = 6 Then
                 Local $aFill[1][3] = [[$oLink.href, "No", "No"]]
                 _ArrayAdd($aGlobalLinks, $aFill)
             EndIf
         Next

         ;//Close second window of IE
         _IEQuit($oIE)

         ;//Attempt to load the KB_URLs.txt
         LoadFile()

         ;//Setting $r to 1 since the txt file and the $aGlobalLinks will have a header in index 0
         Global $r = 1
         Do
             If $aGlobalLinks[$r][1] = "No" Then ;Skip Entries that have already been checked
                 Local $oIE = _IECreate($aGlobalLinks[$r][0], 0, 1, 0) ;Create new window with the first 'unchecked' link entry
                 _IELoadWait($oIE, 100, 2500) ;Wait 2.5 seconds for page to load
                 If @error = 6 Then ;@error = 6 means the Timeout was met..Send {ESC} to stop the loading
                     ConsoleWrite("Webpage timed out...Sending ESC..." & @CRLF)
                     Send("{ESC}")
                     Sleep(500)
                 EndIf
                 Local $sHTML = _IEDocReadHTML($oIE) ;Read the HTML and look for any trace of 'helpjuice'
                 Local $result = StringRegExp($sHTML, ".*?helpjuice.*?", 0)
                 ConsoleWrite("RegExt result: " & $result & @CRLF)
                 If $result = 1 Then ;A result of 1 means there is a match
                     $aGlobalLinks[$r][1] = "Yes" ;Set 'Checked' to 'Yes'
                     $aGlobalLinks[$r][2] = "Yes" ;Set 'Hit' to 'Yes'
                     ConsoleWrite("~~~~~~~~~~~~~ H I T ~~~~~~~~~~~~~ " & $aGlobalLinks[$r][0] & @CRLF)
                 Else
                     $aGlobalLinks[$r][1] = "Yes" ;Set 'Checked' to 'Yes'
                 EndIf
                 $oLinks = _IELinkGetCollection($oIE) ;Grab a list of links from the existing site
                 ;_ArrayDisplay($oLinks)
                 For $oLink In $oLinks ;Loop through all the links found and add any new links to the end of $aGlobalLinks
                     ConsoleWrite("String: " & String($oLink) & @CRLF)
                     ConsoleWrite("Looking for: " & $oLink.href & @CRLF)
                     _ArraySearch($aGlobalLinks, $oLink.href)
                     If @error = 6 Then
                         Local $aFill[1][3] = [[$oLink.href, "No", "No"]] ;Set 'Checked' and 'Hit' to 'No' because this link has not been checked yet
                         _ArrayAdd($aGlobalLinks, $aFill)
                     EndIf
                 Next
                 _IEQuit($oIE)
                 ;This next part is to reduce the number of times we write to the file
                 ;Changing the '10' to '100' means the script will save the changes every 100 entries
                 If IsInt($r/10) = 1 Then
                     SaveFile()
                     ConsoleWrite("File Saved: " & $r & "/" & UBound($aGlobalLinks) & @CRLF)
                 EndIf
                 ConsoleWrite("Completed " & $r & "/" & UBound($aGlobalLinks) & @CRLF)
             EndIf
             $r += 1
         Until $r > UBound($aGlobalLinks)
         ;_ArrayDisplay($aGlobalLinks)

         Func SaveFile()
             $oFile = FileOpen(@ScriptDir & "\KB_URLs.txt", 2)
             FileClose($oFile)
             _FileWriteFromArray(@ScriptDir & "\KB_URLs.txt", $aGlobalLinks, 0, UBound($aGlobalLinks), ",")
         EndFunc

         Func LoadFile()
             _FileReadToArray(@ScriptDir & "\KB_URLs.txt", $aGlobalLinks, $FRTA_NOCOUNT, ",")
             _ArrayDisplay($aGlobalLinks)
         EndFunc

     The error I'm getting has to do with the function _IELinkGetCollection. I used the examples in the AutoIt Help section, and there are multiple uses of $oLink.href. I haven't been able to find much on when/how to use .href.
     Here is the Console Output of the error:

         RegExt result: 0
         Looking for: https://website.com/kb/
         Looking for: https://website.com/kb/207557-abc-bank-homepage#
         Looking for: https://website.com/kb/
         Looking for: https://website.com/kb/19011-partners-and-isos
         Looking for: https://website.com/kb/46470-onsite-install-partners
         Looking for: https://website.com/en/small-business/payments-and-processing/abc-merchant-services.html
         Looking for: https://website.com/Clover
         Looking for: https://website.com/screens/signup/?integrations_id=12345
         Looking for: https://website.com/instruction/import-an-inventory-menu-spreadsheet/?userDevice=web
         Looking for: https://website.com/instruction/import-an-inventory-menu-spreadsheet/?userDevice=web
         Looking for: https://website.com/instruction/import-an-inventory-menu-spreadsheet/?userDevice=web
         Looking for: https://website.com/appmarket/apps/Z6GMBJ5HCBEQA?clientCountry=US
         Looking for: https://website.com/kb/207557-abc-bank-homepage#panel3a
         Looking for: https://website.com/kb/207557-abc-bank-homepage#panel4a
         Looking for: https://website.com/kb/207557-abc-bank-homepage#panel5a
         "C:\Users\Jon\Desktop\KB Scrub\HTMLDOC_Test.au3" (65) : ==> The requested action with this object has failed.:
         ConsoleWrite("Looking for: " & $oLink.href & @CRLF)
         ConsoleWrite("Looking for: " & $oLink^ ERROR

     Any insight is appreciated!
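
     The "requested action with this object has failed" message is what AutoIt reports when a COM property read such as `$oLink.href` fails and no COM error handler is registered. Below is a minimal sketch of one way to keep the link loop running past such an element. It assumes the failure comes from an individual link object rather than from the page as a whole, which the output above does not confirm. `_OnComError`, `_SafeHref`, `_CollectNewLinks`, and `$g_bComError` are hypothetical names introduced for illustration; they are not part of IE.au3.

```autoit
#include <IE.au3>
#include <Array.au3>

; Flag set by the COM error handler; reset it before each risky call.
Global $g_bComError = False
; Keep the returned object in a variable for as long as the handler is needed.
Global $g_oComErrorHandler = ObjEvent("AutoIt.Error", "_OnComError")

; Hypothetical handler: log the COM error instead of letting the script abort.
Func _OnComError($oError)
    $g_bComError = True
    ConsoleWrite("COM error 0x" & Hex($oError.number, 8) & " at script line " & $oError.scriptline & @CRLF)
EndFunc

; Hypothetical helper: return the link's href, or "" with @error set if the read fails.
Func _SafeHref($oLink)
    If Not IsObj($oLink) Then Return SetError(1, 0, "")
    $g_bComError = False
    Local $sHref = $oLink.href
    If $g_bComError Then Return SetError(2, 0, "")
    Return $sHref
EndFunc

; Hypothetical helper: append every link not yet listed to the 3-column array.
; Returns the number of rows added.
Func _CollectNewLinks($oIE, ByRef $aLinks)
    Local $oLinks = _IELinkGetCollection($oIE)
    If @error Then Return SetError(1, 0, 0)
    Local $iAdded = 0
    For $oLink In $oLinks
        Local $sHref = _SafeHref($oLink)
        If @error Or $sHref = "" Then ContinueLoop ; skip links that cannot be read
        _ArraySearch($aLinks, $sHref)
        If @error = 6 Then ; 6 = value not found, same check as in the script above
            Local $aFill[1][3] = [[$sHref, "No", "No"]]
            _ArrayAdd($aLinks, $aFill)
            $iAdded += 1
        EndIf
    Next
    Return $iAdded
EndFunc
```

     With these in place, the inner `For $oLink In $oLinks` loop of the script could be replaced by a single call to `_CollectNewLinks($oIE, $aGlobalLinks)`. Reading `.href` once into a variable also avoids touching the object three times per link. Separately, `Until $r > UBound($aGlobalLinks)` appears to allow one pass with `$r` equal to `UBound`, which is past the last valid index, so `>=` may be what is intended.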