Jump to content

InetGet was working previously, but not extracting full html


Recommended Posts

I have an AutoIT script It monitors 2 websites for content that applys to me and the services that I provide. One site is : www.Freelancer.com The other: www.PeoplePerHour.com Both sites publish new jobs on their site hourly or so. My AutoIT app, will view those sites and present new jobs to me in a grid that pops up on my screen. Lately, the app has stopped showing me any jobs from PeoplePerHour.

 

For freelancer.com,  Inetget is giving full html but for peopleperhour, now its not coming.

Func _CheckPPH()
    Local Static $hTimer = 0
    Local Static $hDownload = 0
    Local $aTitlesandUrls = 0
    Local Static $sTempFile = ""
    If $hTimer = 0 Then $hTimer = TimerInit()
    If $hDownload = 0 Then
        $sTempFile = _WinAPI_GetTempFileName(@TempDir)
        ConsoleWrite("Checking PPH..." & @CRLF)
        ConsoleWrite(">Downloading..." & @CRLF)
;~         $hDownload = InetGet("http://www.peopleperhour.com/freelance-jobs", $sTempFile, $INET_FORCERELOAD, $INET_DOWNLOADBACKGROUND)
        $hDownload = InetGet("http://www.peopleperhour.com/freelance-jobs", $sTempFile, $INET_FORCERELOAD)
;~         Return 0
    EndIf
;~     Sleep(30)
;~     Local $isCompleted = InetGetInfo($hDownload, $INET_DOWNLOADCOMPLETE)
;~     Local $isError = InetGetInfo($hDownload, $INET_DOWNLOADERROR)
;~     Sleep(30)
;~     If TimerDiff($hTimer) > 3000 And $isError Then
;~         ConsoleWrite("!PPH Fail" & @CRLF)
;~         InetClose($hDownload)
;~         $hDownload = 0
;~         Return 0
;~     EndIf
;~     Sleep(30)
    Local $Show = 0
;~     If TimerDiff($hTimer) > 3000 And $isCompleted Then
    If $hDownload > 0 Then
        ConsoleWrite("+Downloaded..." & @CRLF)
        Local $sPPHHtml = FileRead($sTempFile)
        $aTitlesandUrls = _StringBetween($sPPHHtml, '"title">' & @LF, 'time>')
;~         _ArrayDisplay($aTitlesandUrls)
        Local $aPPH[0][4]
        Local $sTitle = ""
        Local $sUrl = ""
        Local $sID = ""
        Local $sDate = ""
        Local $iRet=0
        Sleep(30)
        For $i = 0 To UBound($aTitlesandUrls) - 1
            $sTitle = _StringBetween($aTitlesandUrls[$i], '<a title="', '" class')
            $sUrl = _StringBetween($aTitlesandUrls[$i], 'href="', '">')
            $sDate = _GetDate($aTitlesandUrls[$i])
            If IsArray($sTitle) And IsArray($sUrl) Then
                $sID = _GetID($sUrl[0])
;~                 _ArrayAdd($aPPH, $sDate & "|" & $sTitle[0] & "|" & $sUrl[0] & "|" & $sID)
                $iRet = _BuildPopupsPPH($sID, $sDate, "PPH: " & $sTitle[0], $sUrl[0])
                If $iRet Then $Show+=1
            EndIf
        Next

        Sleep(30)
;~         If $Show > 0 Then ShowLatestJobs()
;~         _ArrayDisplay($aPPH)
        FileDelete($sTempFile)
        InetClose($hDownload)
        $hDownload = 0
        $hTimer = 0
        Return $Show
    EndIf
    Sleep(30)
EndFunc   ;==>_CheckPPH

Link to post
Share on other sites

Is this topic related to your previous topic?  If so, why did you start another topic?  Also, why didn't you answer my question in the previous topic?  Is it because you knew that harvesting data from the sites that you referred to above is prohibited by their terms of use which would also mean that helping you to do so here would be prohibited?

 

Edited by TheXman
Typo
Link to post
Share on other sites
9 minutes ago, Jahar said:

For previous one, you have asked me to go thru scripts given as examples.

No, I asked you why were asking for help to access a non-existent domain.

 

 

Link to post
Share on other sites
  • Moderators

@Jahar As stated above (and in the other thread) both sites you specify have verbiage in their TOS that states scraping or crawling of their site pages is not permitted. Case closed, please do not open another thread on this topic.

Edited by JLogan3o13

"Profanity is the last vestige of the feeble mind. For the man who cannot express himself forcibly through intellect must do so through shock and awe" - Spencer W. Kimball

How to get your question answered on this forum!

Link to post
Share on other sites
Guest
This topic is now closed to further replies.
  • Recently Browsing   0 members

    No registered users viewing this page.

  • Similar Content

    • By PeterVerbeek
      This topic give you access to an AutoIt functions library I maintain which is called PAL, Peter's AutoIt Library. The latest version 1.26 contains 214 functions divided into these topics:
      window, desktop and monitor GUI, mouse and color GUI controls including graphical buttons (jpg, png) logics and mathematics include constants string, xml string and file string dialogues and progress bars data lists: lists, stacks, shift registers and key maps (a.ka. dictionaries) miscellaneous: logging/debugging, process and system info Change log and files section  on the PAL website (SourceForge).
      A lot of these functions were created in the development of Peace, Peter's Equalizer APO Configuration Extension, which is a user interface for the system-wide audio driver called Equalizer APO.
    • By Hermes
      Hi, I am struggling in setting the value of a textarea based on the value of clipboard (that contains a long web page source codes). If I use _WD_SetElementValue, it freezes after some time, or appears to be pressing tab and goes out of focus. I can also use send keys but i need the script to run in the background.
      Here is the full script:
      #Include "Chrome.au3" #Include "wd_core.au3" #Include "wd_helper.au3" #Include "WinHttp.au3" #include <MsgBoxConstants.au3> #include <WinAPIFiles.au3> #include <Array.au3> #include <AutoItConstants.au3> #include <WinAPIFiles.au3> #include <GDIPlus.au3> #include <Excel.au3> Local $sDesiredCapabilities, $sSession SetupChrome() _WD_Startup() $sSession = _WD_CreateSession($sDesiredCapabilities) _WD_LoadWait($sSession) _WD_Navigate($sSession, "http://demo.borland.com/testsite/stadyn_largepagewithimages.html") _WD_LoadWait($sSession) Global $sSource = _WD_GetSource($sSession) Local $Paste = ClipPut($sSource) Local $sData = ClipGet() Local $aArray = 0, _ $iOffset = 1 While 1 $aArray = StringRegExp($sData, '(?s)<p>.*</p>', $STR_REGEXPARRAYMATCH, $iOffset) If @error Then ExitLoop $iOffset = @extended For $i = 0 To UBound($aArray) - 1 Local $Paste = ClipPut($aArray[$i]) Local $sRegExData = ClipGet() ;MsgBox(0, "", "$sRegExData = " & $sRegExData) Next WEnd _WD_Navigate($sSession, "https://www.w3schools.com/tags/tryit.asp?filename=tryhtml5_textarea_placeholder") _WD_WaitElement($sSession, $_WD_LOCATOR_ByCSSSelector, "iframe#iframeResult") Local $sElement1 = _WD_FindElement($sSession, $_WD_LOCATOR_ByCSSSelector, "iframe#iframeResult") _WD_FrameEnter($sSession, $sElement1) _WD_WaitElement($sSession, $_WD_LOCATOR_ByXPath, "//html/body/textarea") $textarea = _WD_FindElement($sSession, $_WD_LOCATOR_ByXPath, "//html/body/textarea") _WD_ElementAction($sSession, $textarea, 'click') ;WD SetElementValue(SsSession, Stextarea, $sRegExData) <-- I can do this but the focus goes out, or the browser freezes _WD_FrameLeave($sSession) sleep(2000) Send("^v") _WD_LoadWait($sSession) _WD_Shutdown() Func SetupChrome() _WD_Option('Driver', 'chromedriver.exe') _WD_Option('Port', 9515) _WD_Option('DriverParams', '--log-path="' & @ScriptDir & '\chrome.log"') $sDesiredCapabilities = '{"capabilities": {"alwaysMatch": {"goog:chromeOptions": {"w3c": true, "args":["start-maximized","disable-infobars"]}}}}' EndFunc ;==>SetupChrome Can someone help me please, or re-direct me to the right path? TIA!
    • By DJ143
      I have a autoit exe file which is used in upload/browse file functionality.  This has been integrated with selenium framework and I am invoking the autoit exe using Java process and runtime. 
      Now the issue is when I run the scripts and invoke the autoit exe in local it works perfectly.  But when I use selenium grid or jenkins to run the scripts in another windows server it is not working.
      Can anyone please suggest any solution for this?
    • By Hermes
      Hello, the script below will read column A from an excel file - and if a value matches in the browser, it will click the corresponding link and click on a specific button to paste the data, then writes "Completed" in Column B. It will continue to read from the excel file and do the same thing for all the remaining rows.
      #Include "Chrome.au3" #Include "wd_core.au3" #Include "wd_helper.au3" #Include "WinHttp.au3" #include <MsgBoxConstants.au3> #include <File.au3> #include <IE.au3> #include <Array.au3> #include <INet.au3> #include <AutoItConstants.au3> #include <WinAPIFiles.au3> #include <GDIPlus.au3> #include <Excel.au3> #Include "WinHttp.au3" #Include "_HtmlTable2Array.au3" Local $sDesiredCapabilities, $sSession SetupChrome() _WD_Startup() $sSession = _WD_CreateSession($sDesiredCapabilities) _WD_LoadWait($sSession) _WD_Navigate($sSession, "table1.html") _WD_LoadWait($sSession) _WD_WaitElement($sSession, $_WD_LOCATOR_ByXPath, "//table[@class='main']") Local $sElement = _WD_FindElement($sSession, $_WD_LOCATOR_ByXPath, "//table[@class='main']") ;ConsoleWrite ("mat-table " & $sElement & @CRLF) Local $aArray1 = _WD_FindElement($sSession, $_WD_LOCATOR_ByXPath, ".//td[contains(@class,'data')]", $sElement, True) sleep(1000) For $i = 0 to UBound($aArray1) - 1 $aArray1[$i] = _WD_ElementAction($sSession, $aArray1[$i], 'text') Next ;_ArrayDisplay($aArray1) ;Email variables $SmtpServer = "" ; address for the smtp-server to use - REQUIRED $FromName = "Hermes" ; name from who the email was sent $FromAddress = "sender@gmail.com" ; address from where the mail should come $ToAddress = "recipient@gmail.com" ; destination address of the email - REQUIRED, use commas (,) to add more email addresses $Subject = "File not found" ; subject from the email - can be anything you want it to be $Body = "File not found!" ; the messagebody from the mail - can be left blank but then you get a blank mail $AttachFiles = "" ; the file(s) you want to attach seperated with a ; (Semicolon) - leave blank if not needed $CcAddress = "" ; address for cc - leave blank if not needed $BccAddress = "" ; address for bcc - leave blank if not needed $Importance = "High" ; Send message priority: "High", "Normal", "Low" $Username = "" ; username for the account used from where the mail gets sent - REQUIRED $Password = "" ; password for the account used from where the mail gets sent - REQUIRED $IPPort = 25 ; port used for sending the mail $ssl = 0 ; enables/disables secure socket layer sending - put to 1 if using httpS $tls = 0 ; enables/disables TLS when required Local $oAppl = _Excel_Open() Local $sWorkbook = "c:\test.xlsx" Local $oWorkbook = _Excel_BookOpen($oAppl, $sWorkbook) ;open excel and pass both parameters If FileExists($sWorkbook) Then ;Check if the file exist. Local $oAppl = _Excel_Open() Local $sWorkbook = "c:\test.xlsx" Local $oWorkbook = _Excel_BookOpen($oAppl, $sWorkbook) ;open excel and pass both parameters Local $aArray2 = _Excel_RangeRead($oWorkbook,Default,$oWorkbook.ActiveSheet.Usedrange.Columns("A:A")) Local $iIdx Local $Skipline = 0 ;0==> first line Do Local $temprf For $i = 0 To UBound($aArray2) - 1 $temprf &= $aArray2[$i] _WD_WaitElement($sSession, $_WD_LOCATOR_ByXPath, ".//a[contains(@class,'edit') and contains(text(),'Edit')]") Local $aElement = _WD_FindElement($sSession, $_WD_LOCATOR_ByXPath, ".//a[contains(@class,'edit') and contains(text(),'Edit')]", $sElement, True) $iIdx = _ArraySearch($aArray1, $aArray2[$i]) If @error Then ContinueLoop _WD_ElementAction($sSession, $aElement[$iIdx], 'click') If $i < $Skipline Then ContinueLoop $oRange = $oWorkbook.ActiveSheet.Range("B" & $i + 1 & ":XFD" & $i + 1) _Excel_RangeCopyPaste($oWorkbook.Activesheet, $oRange) ;Paste Local $oTest4 = _WD_FindElement($sSession, $_WD_LOCATOR_ByCSSSelector, "pastebutton") _WD_ElementAction($sSession, $oTest4, 'click') Sleep(1000) ;Save Button Local $save3 = _WD_FindElement($sSession, $_WD_LOCATOR_ByCSSSelector, "button.button") _WD_ElementAction($sSession, $save3, 'click') _Excel_RangeWrite($oWorkbook, $oWorkbook.Activesheet, "Completed", "B" & $i+1) sleep(1000) Next Until (Not @error) _Excel_Close($oWorkbook) Else _INetSmtpMailCom($SmtpServer, $FromName, $FromAddress, $ToAddress, $Subject, $Body, $CcAddress, $BccAddress, $Importance, $Username, $Password, $IPPort, $ssl, $tls) Exit EndIf _WD_LoadWait($sSession) ;Attaching files to emails Func _INetSmtpMailCom($s_SmtpServer, $s_FromName, $s_FromAddress, $s_ToAddress, $s_Subject = "", $as_Body = "", $s_CcAddress = "", $s_BccAddress = "", $s_Importance="Normal", $s_Username = "", $s_Password = "", $IPPort = 25, $ssl = 0, $tls = 0) Local $objEmail = ObjCreate("CDO.Message") $objEmail.From = '"' & $s_FromName & '" <' & $s_FromAddress & '>' $objEmail.To = $s_ToAddress Local $i_Error = 0 Local $i_Error_desciption = "" If $s_CcAddress <> "" Then $objEmail.Cc = $s_CcAddress If $s_BccAddress <> "" Then $objEmail.Bcc = $s_BccAddress $objEmail.Subject = $s_Subject If StringInStr($as_Body, "<") And StringInStr($as_Body, ">") Then $objEmail.HTMLBody = $as_Body Else $objEmail.Textbody = $as_Body & @CRLF EndIf $objEmail.Configuration.Fields.Item ("http://schemas.microsoft.com/cdo/configuration/sendusing") = 2 $objEmail.Configuration.Fields.Item ("http://schemas.microsoft.com/cdo/configuration/smtpserver") = $s_SmtpServer If Number($IPPort) = 0 then $IPPort = 25 $objEmail.Configuration.Fields.Item ("http://schemas.microsoft.com/cdo/configuration/smtpserverport") = $IPPort ;Authenticated SMTP If $s_Username <> "" Then $objEmail.Configuration.Fields.Item ("http://schemas.microsoft.com/cdo/configuration/smtpauthenticate") = 1 $objEmail.Configuration.Fields.Item ("http://schemas.microsoft.com/cdo/configuration/sendusername") = $s_Username $objEmail.Configuration.Fields.Item ("http://schemas.microsoft.com/cdo/configuration/sendpassword") = $s_Password EndIf ; Set security params If $ssl Then $objEmail.Configuration.Fields.Item ("http://schemas.microsoft.com/cdo/configuration/smtpusessl") = True If $tls Then $objEmail.Configuration.Fields.Item ("http://schemas.microsoft.com/cdo/configuration/sendtls") = True ;Update settings $objEmail.Configuration.Fields.Update ; Set Email Importance Switch $s_Importance Case "High" $objEmail.Fields.Item ("urn:schemas:mailheader:Importance") = "High" Case "Normal" $objEmail.Fields.Item ("urn:schemas:mailheader:Importance") = "Normal" Case "Low" $objEmail.Fields.Item ("urn:schemas:mailheader:Importance") = "Low" EndSwitch $objEmail.Fields.Update ; Sent the Message $objEmail.Send $objEmail="" EndFunc ;==>_INetSmtpMailCom Local $aDir = _FileListToArrayRec(@TempDir, "scoped_dir*;chrome_*", $FLTAR_FOLDERS, $FLTAR_NORECUR, $FLTAR_NOSORT, $FLTAR_FULLPATH) Sleep(2000) For $i = 1 To $aDir[0] DirRemove($aDir[$i], $DIR_REMOVE) Next _WD_LoadWait($sSession) _WD_Shutdown() Func SetupChrome() _WD_Option('Driver', 'chromedriver.exe') _WD_Option('Port', 9515) _WD_Option('DriverParams', '--log-path="' & @ScriptDir & '\chrome.log"') $sDesiredCapabilities = '{"capabilities": {"alwaysMatch": {"goog:chromeOptions": {"w3c": true, "args":["start-maximized","disable-infobars"]}}}}' EndFunc ;==>SetupChrome If the excel file doesn't exists in the folder, it will send an email to a specific recipient.
      What i am trying figure out now is if the excel crashes while the script/loop is running, I want to relaunch the excel file continue to the last row before the excel crashed. So if the value of column B is not marked as "completed", it should continue from that row
      Appreciate any help that I can get to achieve this.
      table1.html test.xlsx
    • By adityaparakh
      Hello ,
      I am trying to use Websockets in AutoIt.
      It is to fetch live stock market prices , API is provided and documentation available for python language.
      The link for the code snippet is :
      https://symphonyfintech.com/xts-market-data-front-end-api-v2/#tag/Introduction
      https://symphonyfintech.com/xts-market-data-front-end-api-v2/#tag/Instruments/paths/~1instruments~1subscription/post
       
      https://github.com/symphonyfintech/xts-pythonclient-api-sdk
       
      Second Link is to subscribe to a list of ExchangeInstruments.
      Now I would like to get live stock ltp (LastTradedPrice) for a few stocks whose "ExchangeInstrumentID" I know.
      I am able to use the WinHttp object to perform actions using simple codes like below :
      I have the secretKey and appkey and can generate the needed token. And get the unique ExchangeInstrumentID.

      Below code is just for example of how I am using WinHttp. Unrelated to socket part.
      Global $InteractiveAPItoken = IniRead(@ScriptDir & "\Config.ini", "token", "InteractiveAPItoken", "NA") $baseurl = "https://brokerlink.com/interactive/" $functionurl = "orders" $oHTTP = ObjCreate("winhttp.winhttprequest.5.1") $oHTTP.Open("POST", $baseurl & $functionurl, False) $oHTTP.SetRequestHeader("Content-Type", "application/json;charset=UTF-8") $oHTTP.SetRequestHeader("authorization", $InteractiveAPItoken) $pD = '{ "exchangeSegment": "NSEFO", "exchangeInstrumentID": ' & $exchangeInstrumentID & ', "productType": "' & $producttype & '", "orderType": "MARKET", "orderSide": "' & $orderside & '", "timeInForce": "DAY", "disclosedQuantity": 0, "orderQuantity": ' & $qty & ', "limitPrice": 0, "stopPrice": 0, "orderUniqueIdentifier": "' & $orderidentifier & '"}' $oHTTP.Send($pD) $oReceived = $oHTTP.ResponseText $oStatusCode = $oHTTP.Status
          
          
      But am struggling to understand and use socket.
      Would be of great help if you can have a look at the link mentioned above and help with the code sample for AutoIt.
      To connect and listen to a socket.
      Thanks a lot
       
×
×
  • Create New...