
InetGet was working previously, but not extracting full html



I have an AutoIt script that monitors two websites for content that applies to me and the services I provide. One site is www.Freelancer.com; the other is www.PeoplePerHour.com. Both sites publish new jobs roughly every hour. My AutoIt app checks those sites and presents new jobs to me in a grid that pops up on my screen. Lately, the app has stopped showing me any jobs from PeoplePerHour.

For freelancer.com, InetGet returns the full HTML, but for peopleperhour.com it no longer does.

Func _CheckPPH()
    Local Static $hTimer = 0
    Local Static $hDownload = 0
    Local $aTitlesandUrls = 0
    Local Static $sTempFile = ""
    If $hTimer = 0 Then $hTimer = TimerInit()
    If $hDownload = 0 Then
        $sTempFile = _WinAPI_GetTempFileName(@TempDir)
        ConsoleWrite("Checking PPH..." & @CRLF)
        ConsoleWrite(">Downloading..." & @CRLF)
;~         $hDownload = InetGet("http://www.peopleperhour.com/freelance-jobs", $sTempFile, $INET_FORCERELOAD, $INET_DOWNLOADBACKGROUND)
        $hDownload = InetGet("http://www.peopleperhour.com/freelance-jobs", $sTempFile, $INET_FORCERELOAD)
;~         Return 0
    EndIf
;~     Sleep(30)
;~     Local $isCompleted = InetGetInfo($hDownload, $INET_DOWNLOADCOMPLETE)
;~     Local $isError = InetGetInfo($hDownload, $INET_DOWNLOADERROR)
;~     Sleep(30)
;~     If TimerDiff($hTimer) > 3000 And $isError Then
;~         ConsoleWrite("!PPH Fail" & @CRLF)
;~         InetClose($hDownload)
;~         $hDownload = 0
;~         Return 0
;~     EndIf
;~     Sleep(30)
    Local $Show = 0
;~     If TimerDiff($hTimer) > 3000 And $isCompleted Then
    If $hDownload > 0 Then
        ConsoleWrite("+Downloaded..." & @CRLF)
        Local $sPPHHtml = FileRead($sTempFile)
        $aTitlesandUrls = _StringBetween($sPPHHtml, '"title">' & @LF, 'time>')
;~         _ArrayDisplay($aTitlesandUrls)
        Local $aPPH[0][4]
        Local $sTitle = ""
        Local $sUrl = ""
        Local $sID = ""
        Local $sDate = ""
        Local $iRet=0
        Sleep(30)
        For $i = 0 To UBound($aTitlesandUrls) - 1
            $sTitle = _StringBetween($aTitlesandUrls[$i], '<a title="', '" class')
            $sUrl = _StringBetween($aTitlesandUrls[$i], 'href="', '">')
            $sDate = _GetDate($aTitlesandUrls[$i])
            If IsArray($sTitle) And IsArray($sUrl) Then
                $sID = _GetID($sUrl[0])
;~                 _ArrayAdd($aPPH, $sDate & "|" & $sTitle[0] & "|" & $sUrl[0] & "|" & $sID)
                $iRet = _BuildPopupsPPH($sID, $sDate, "PPH: " & $sTitle[0], $sUrl[0])
                If $iRet Then $Show+=1
            EndIf
        Next

        Sleep(30)
;~         If $Show > 0 Then ShowLatestJobs()
;~         _ArrayDisplay($aPPH)
        FileDelete($sTempFile)
        InetClose($hDownload)
        $hDownload = 0
        $hTimer = 0
        Return $Show
    EndIf
    Sleep(30)
EndFunc   ;==>_CheckPPH


Is this topic related to your previous topic?  If so, why did you start another topic?  Also, why didn't you answer my question in the previous topic?  Is it because you knew that harvesting data from the sites that you referred to above is prohibited by their terms of use which would also mean that helping you to do so here would be prohibited?

 

Edited by TheXman
Typo
9 minutes ago, Jahar said:

For the previous one, you asked me to go through the scripts given as examples.

No, I asked you why you were asking for help to access a non-existent domain.

 

 

  • Moderators

@Jahar As stated above (and in the other thread) both sites you specify have verbiage in their TOS that states scraping or crawling of their site pages is not permitted. Case closed, please do not open another thread on this topic.

Edited by JLogan3o13

"Profanity is the last vestige of the feeble mind. For the man who cannot express himself forcibly through intellect must do so through shock and awe" - Spencer W. Kimball

This topic is now closed to further replies.