pedmacedo Posted January 23, 2011
I know this sounds odd, but I couldn't think of anything else. I made a script that visits a site and gets some info from each page. There are 92000 pages in total. After getting the data from exactly 300 pages, the script stops getting the data. Here is the code:

#include <Excel.au3>
#include <String.au3>
#include <Array.au3>
#include <INet.au3>

$excel_read = _ExcelBookOpen("C:\Users\Macedo\Desktop\correct_links.xlsx")
$excel_write = _ExcelBookOpen("C:\Users\Macedo\Desktop\info.xlsx")
$line = 1
$name = ""
While $line < 92480
    $HTML = _INetGetSource(_ExcelReadCell($excel_read,$line,1))
    $name = _StringBetween($HTML,'<title>',',')
    If @error == 1 Then
        $name = "Unavailable"
        SetError(0)
    Else
        $name = StringStripWS($name[0],4)
    EndIf
    _ExcelWriteCell($excel_write,$name,$line+1,1)
    $line = $line + 1
WEnd

After 300 pages, $name is always set to "Unavailable", which means @error == 1. At first I thought there was something special about page number 300, but that's not it. If I start getting info from, let's say, page 250, the script will work until page 550. Any ideas, guys? Thanks in advance!
Jos (Developers) Posted January 23, 2011
The @error test is for _StringBetween(), not _INetGetSource(). So have you checked what _INetGetSource() is returning when it goes wrong and what its error is?
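To make that check concrete, here is a minimal sketch that tests the download and the parsing step separately. The helper name _GetTitle and the messages are hypothetical, not part of the original script; it only assumes that _INetGetSource() sets @error to a non-zero value and returns an empty string on failure, as the UDF documentation describes.

```autoit
#include <INet.au3>
#include <String.au3>

; Hypothetical helper (not from the original script): returns the page title,
; and reports which step failed through @error (1 = download, 2 = parsing).
Func _GetTitle($sURL)
    Local $sHTML = _INetGetSource($sURL)
    Local $iErr = @error ; save it at once, the next function call overwrites @error
    If $iErr Or $sHTML = "" Then
        ConsoleWrite("Download failed for " & $sURL & " (@error = " & $iErr & ")" & @CRLF)
        Return SetError(1, $iErr, "Download failed")
    EndIf

    Local $aName = _StringBetween($sHTML, '<title>', ',')
    If @error Then
        ; The page was downloaded but has no matching title. Show the start of
        ; what came back; it may be a block or error page from the server.
        ConsoleWrite("No title in " & $sURL & ": " & StringLeft($sHTML, 200) & @CRLF)
        Return SetError(2, 0, "Unavailable")
    EndIf

    Return SetError(0, 0, StringStripWS($aName[0], 4))
EndFunc   ;==>_GetTitle
```

With something like this in the loop, the output after page 300 should show whether the download itself fails or whether the server starts returning a different page.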
JohnOne Posted January 23, 2011
I'd hazard a guess that the website is putting the blockers on you after that many downloads. After all, you begin to look like a bot.
pedmacedo (Author) Posted January 23, 2011
Jos wrote: "The @error test is for _StringBetween(), not _INetGetSource(). So have you checked what _INetGetSource() is returning when it goes wrong and what its error is?"
I will try to do that, thanks for the suggestion.
JohnOne wrote: "I'd hazard a guess that the website is putting the blockers on you after that many downloads. After all, you begin to look like a bot."
I hope that's not the case. I will try using the IE UDF just to confirm this hypothesis.
guinness Posted January 23, 2011
pedmacedo wrote: "I hope that's not the case. I will try using the IE UDF just to confirm this hypothesis."
That won't fix the problem. If you "ping" a website 300 times in 10 seconds, of course it will look suspicious!
jchd Posted January 23, 2011
It could be that AutoIt or IE is running out of file descriptors or some other kind of handle. To rule out a possible issue with the _Excel stuff, could you first load an array with the URLs from Excel, then change your _INetGetSource() line to take the URL from the array? Then, if it still coughs after 300 calls, we're sure the issue is with _INet* and not with _Excel* supplying wrong data (unlikely, but not 100% excluded).
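The isolation test described above could be sketched roughly as follows. It assumes the same older Excel UDF (_ExcelBookOpen, _ExcelReadCell, _ExcelBookClose) and the same workbook path as the original post; the test size of 400 is an arbitrary value chosen only to get past the 300 mark.

```autoit
#include <Excel.au3>
#include <INet.au3>

; Step 1: read the URLs into an array, then close Excel so it is out of the picture.
Local $oExcel = _ExcelBookOpen("C:\Users\Macedo\Desktop\correct_links.xlsx")
Local Const $iCount = 400 ; assumed test size, just past the failing point
Local $aURL[$iCount]
For $i = 0 To $iCount - 1
    $aURL[$i] = _ExcelReadCell($oExcel, $i + 1, 1) ; rows are 1-based
Next
_ExcelBookClose($oExcel, 0) ; 0 = do not save changes

; Step 2: download using only the array and log every failure.
Local $sHTML, $iErr
For $i = 0 To $iCount - 1
    $sHTML = _INetGetSource($aURL[$i])
    $iErr = @error
    If $iErr Or $sHTML = "" Then
        ConsoleWrite("Row " & ($i + 1) & " failed, @error = " & $iErr & ", URL = " & $aURL[$i] & @CRLF)
    EndIf
Next
```

If failures still begin around row 300 with Excel closed, the Excel side is ruled out and the problem lies with the downloads or the server.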
kylomas Posted January 23, 2011
pedmacedo, _INetGetSource() is just running InetRead(). I've used InetRead() to iterate over 1000+ pages on occasion. I don't know what the Excel function is doing, but I would hazard a guess that the server is throttling traffic.
kylomas
jchd Posted January 23, 2011
That has been suggested before and is most likely what's happening. I use InetRead() or friends routinely, but I'm unsure if I reach 300 downloads in a single run. That's why I was attempting to simplify things a bit more.
kylomas Posted January 23, 2011
jchd, yes, I had an invalid premise that led to bad conclusions a couple of hours ago. Guinness corrected the code and reminded me that "assumptions make an ass out of you and me".
kylomas
JohnOne Posted January 23, 2011
Something similar happened to me a while back while trying to automate a free web service. I tried to run a task too many times and it failed after a while. When I physically navigated to the site, I was met with a blank page, so I figured it was IP related. It may well be something similar here. I think they call it hammering.
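If the server really is throttling, slowing the script down is the usual remedy. A minimal sketch, assuming a hypothetical helper named _PoliteGet and guessed delay values (the thread does not say what limits the site applies):

```autoit
#include <INet.au3>

; Hypothetical helper: retries a download a few times, waiting longer
; after each failure. The try count and delays are guesses, not known limits.
Func _PoliteGet($sURL, $iMaxTries = 3, $iDelayMs = 2000)
    Local $sHTML
    For $iTry = 1 To $iMaxTries
        $sHTML = _INetGetSource($sURL)
        If Not @error And $sHTML <> "" Then Return SetError(0, 0, $sHTML)
        Sleep($iDelayMs * $iTry) ; 2 s, then 4 s, then 6 s with the defaults
    Next
    Return SetError(1, 0, "") ; every attempt failed
EndFunc   ;==>_PoliteGet
```

In the main loop, a short Sleep() between pages (for example Sleep(500)) in addition to the retries would keep the request rate well below 300 pages in a few seconds. If the site blocks by IP regardless of rate, this will not help.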