goldenix Posted April 14, 2010 (edited)

Most of the files (90%) I download are fine; the rest are corrupted. I checked the files on the server and they are intact. I also tried downloading a single file with a one-line call like:

```autoit
InetGet($url, '01.jpg')
```

and that download was fine, so I'm confused about what on earth is going on. I'm on a cable connection. Is there anything I can do about it?

Edited April 16, 2010 by goldenix

My Projects:
- Guide - ytube step by step tut for reading memory with autoitscript + samples
- WinHide - tool to show hide windows, Skinned With GDI+
- Virtualdub batch job list maker - Batch Process all files with same settings
- Exp calc - Exp calculator for online games
- Automated Microsoft SQL Server 2000 installer
- Image sorter helper for IrfanView - 1 click opens img & move ur mouse to close opened img
jchd Posted April 15, 2010

Does this happen with one particular server? Can you post short code so we can try to replicate the issue?

This wonderful site allows debugging and testing regular expressions (many flavors available). An absolute must-have in your bookmarks. Another excellent RegExp tutorial. Don't forget to download your copy of up-to-date pcretest.exe and pcregrep.exe here. RegExp tutorial: enough to get started. PCRE v8.33 regexp documentation: latest available release and currently implemented in AutoIt beta. SQLitespeed is another feature-rich premier SQLite manager (includes import/export). Well worth a try. SQLite Expert (freeware Personal Edition or payware Pro version) is a very useful SQLite database manager. An excellent eBook covering almost every aspect of SQLite3: a must-read for anyone doing serious work. SQL tutorial (covers "generic" SQL, but most of it applies to SQLite as well). A work-in-progress SQLite3 tutorial. Don't miss other LxyzTHW pages! SQLite official website with full documentation (may be newer than the SQLite library that comes standard with AutoIt).
goldenix Posted April 15, 2010 (edited)

Does this happen with one particular server? Can you post short code so we can try to replicate the issue?

It tends to happen from time to time with random servers, but one thing they have in common: it never happens when I use the one-line code. It always happens when I download several images one after another. Basically the script downloads all thumbnails from the page and puts them into a folder.

```autoit
; #FUNCTION# ;=====================================================================
; AutoIt Version: 3.3.6.0
; Description ...: download 1 gallery, Link must point to thumbnail view
; Author ........: goldenix
;==================================================================================
#include <Array.au3>
#include <INet.au3>

Global $Final_page = True
Global $gallery_name = ''
Global $ln = 1

$url = 'http://lu.scio.us/hentai/albums/chobitswallpaper/page/1'
$o = 1
$url = StringTrimRight($url, 1)

While 1
    _Go($url & $o)
    $o = $o + 1
    If $Final_page = True Then ExitLoop
WEnd

Func _Go($url)
    $Final_page = True
    $HTMLSource = _INetGetSource($url)

    ;## Put the source into an array $_Arrayline
    $_Arrayline = StringSplit($HTMLSource, @LF)

    For $i = 1 To $_Arrayline[0]
        ;## Get gallery title
        If StringInStr($_Arrayline[$i], 'meta name="title') Then _Get_gallery_name($_Arrayline[$i])
        ;## Search array line for thumb_100_
        If StringInStr($_Arrayline[$i], 'thumb_100_') Then _Extract_img_url_from_line($_Arrayline[$i])
        ;## Check if last page
        If StringInStr($_Arrayline[$i], 'last.png') Then $Final_page = False
        ;## Thumbs end here
        If StringInStr($_Arrayline[$i], 'Back to the List') Then ExitLoop
    Next
EndFunc

Func _Get_gallery_name($line)
    $Split = StringSplit($line, 'Album:', 1)
    $Split = StringSplit($Split[2], ' - Page', 1)
    $gallery_name = $Split[1]
    ConsoleWrite($Split[1] & @CRLF)
EndFunc

Func _Extract_img_url_from_line($_arr_line)
    $Split = StringSplit($_arr_line, 'src="', 1)
    $Split = StringSplit($Split[2], '"', 1)
    $new_src = StringReplace($Split[1], 'thumb_100_', '')
    _Download($new_src)
EndFunc

Func _Download($new_src)
    $ext = StringSplit($new_src, '.', 1) ; get file extension
    $ext = '.' & $ext[$ext[0]]
    $split = StringSplit($new_src, '/', 1) ; get filename
    $filetosave_as = $split[$split[0]]

    If $ln < 10 Then $filetosave_as = '00' & $ln & $ext
    If $ln >= 10 Then $filetosave_as = '0' & $ln & $ext
    If $ln >= 100 Then $filetosave_as = '' & $ln & $ext

    $dir = 'lu.scio.us - Downloads'
    DirCreate($dir)
    DirCreate($dir & '\' & $gallery_name)

    If Not FileExists($dir & '\' & $gallery_name & '\' & $filetosave_as) Then
        InetGet($new_src, $dir & '\' & $gallery_name & '\' & $filetosave_as, 1, 0)
        Sleep(500)
    EndIf
    ConsoleWrite($filetosave_as & @CRLF)
    $ln = $ln + 1
EndFunc
```

Edited April 15, 2010 by goldenix
ripdad Posted April 15, 2010

1. If you have a router, sometimes a router will blink (reset).
2. You probably need to close your InetGet connection after each download and put a little sleep in between to slow it down some.

Example code:

```autoit
Local $FileName = InetGet($url_item, $sName, 1, 1)
Sleep(250)
Do
    Sleep(250)
Until InetGetInfo($FileName, 2)
Sleep(250)
InetClose($FileName)
```

"The mediocre teacher tells. The good teacher explains. The superior teacher demonstrates. The great teacher inspires." -William Arthur Ward
jchd Posted April 15, 2010 Share Posted April 15, 2010 I've tried some variants: InetGet with options 1 then 17, InetRead. The first time with InetGet about 1/3 of files are incomplete. Subsequent downloads (with reload forced) were complete except one file, got with Inetget. If Not FileExists($dir & '\' & $gallery_name & '\' & $filetosave_as) Then Local $bin $bin = InetRead($new_src) FileWrite($dir & '\' & $gallery_name & '\' & $filetosave_as, $bin) EndIf No Sleep is needed: your browser doesn't Sleep(500) between GETs I hope! It's possible that the incomplete downloads are due to a problem with server cacheing. You could try to double download and throw away the first copy possibly incomplete. Make sure to force reload on second time. It's ugly but could work more reliably. This wonderful site allows debugging and testing regular expressions (many flavors available). An absolute must have in your bookmarks.Another excellent RegExp tutorial. Don't forget downloading your copy of up-to-date pcretest.exe and pcregrep.exe hereRegExp tutorial: enough to get startedPCRE v8.33 regexp documentation latest available release and currently implemented in AutoIt beta. SQLitespeed is another feature-rich premier SQLite manager (includes import/export). Well worth a try.SQLite Expert (freeware Personal Edition or payware Pro version) is a very useful SQLite database manager.An excellent eBook covering almost every aspect of SQLite3: a must-read for anyone doing serious work.SQL tutorial (covers "generic" SQL, but most of it applies to SQLite as well)A work-in-progress SQLite3 tutorial. Don't miss other LxyzTHW pages!SQLite official website with full documentation (may be newer than the SQLite library that comes standard with AutoIt) Link to comment Share on other sites More sharing options...
ripdad Posted April 15, 2010 Share Posted April 15, 2010 Line 14 is missing ' at end $url = 'http://lu.scio.us/hentai/albums/chobitswallpaper/page/1 Yes, it might be server problems "The mediocre teacher tells. The Good teacher explains. The superior teacher demonstrates. The great teacher inspires." -William Arthur Ward Link to comment Share on other sites More sharing options...
Tvern Posted April 15, 2010

Did you try InetGetInfo() to check for errors? If the corrupted files give an error, it will be easy to just retry those.
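A retry wrapper along those lines might look like this (a minimal sketch, assuming the InetGetInfo() status array layout from the AutoIt help file; the function name, retry count, and parameters are illustrative, not from the thread):

```autoit
; Sketch: retry a download whenever InetGetInfo() reports it as unsuccessful
; or errored. $iMaxRetry and the function name are illustrative.
Func _DownloadWithRetry($sUrl, $sDest, $iMaxRetry = 3)
    For $i = 1 To $iMaxRetry
        Local $hDownload = InetGet($sUrl, $sDest, 1, 1) ; force reload, background
        Do
            Sleep(250)
        Until InetGetInfo($hDownload, 2) ; index 2 = download complete
        Local $aInfo = InetGetInfo($hDownload) ; full status array
        InetClose($hDownload) ; release the handle before retrying
        If $aInfo[3] And $aInfo[4] = 0 Then Return True ; successful, no @error
    Next
    Return False ; still failing after $iMaxRetry attempts
EndFunc
```

The catch, as the next post shows, is that this only helps if the server actually reports an error for the truncated files.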
goldenix Posted April 15, 2010 (edited)

Did you try InetGetInfo() to check for errors? If the corrupted files give an error, it will be easy to just retry those.

Okay, I rewrote the script with InetGetInfo and added a new function at the end that downloads and checks the file. Result: 95% of the files are corrupt. As you can see, there are no errors. At the end I check the file size, so I could rewrite it to re-download each corrupt file until I get a good one, but that would be stupid. What if I need to re-download each file 100 times? Imagine the wasted time and bandwidth. Obviously the server is not going to fix itself. Also, if I use a download manager or a browser, none of the images is displayed or downloaded corrupt. So InetGet does not check packet integrity? Why can't it download files intact? Since InetGet seems buggy, are there any alternatives for downloading files?

A piece of the log:

```
011.jpg Size: 759078 Complete->: False Successful->: False @error: 0 @extended: 0 Filesize dont Match: 741.287109375 65.783203125
012.jpg Size: 855656 Complete->: False Successful->: False @error: 0 @extended: 0 Filesize dont Match: 835.6015625 65.7822265625
013.jpg Size: 992410 Complete->: False Successful->: False @error: 0 @extended: 0 Filesize dont Match: 969.150390625 702.8125
014.jpg Size: 816747 Complete->: False Successful->: False @error: 0 @extended: 0
015.jpg Size: 880380 Complete->: False Successful->: False @error: 0 @extended: 0
```

Rewritten script:

```autoit
#include <Array.au3>
#include <INet.au3>

Global $Final_page = True
Global $gallery_name = ''
Global $ln = 1

$url = 'http://lu.scio.us/hentai/albums/chobitswallpaper/page/1'
$o = 1
$url = StringTrimRight($url, 1)

While 1
    _Go($url & $o)
    $o = $o + 1
    If $Final_page = True Then ExitLoop
WEnd

Func _Go($url)
    $Final_page = True
    $HTMLSource = _INetGetSource($url)

    ;## Put the source into an array $_Arrayline
    $_Arrayline = StringSplit($HTMLSource, @LF)

    For $i = 1 To $_Arrayline[0]
        ;## Get gallery title
        If StringInStr($_Arrayline[$i], 'meta name="title') Then _Get_gallery_name($_Arrayline[$i])
        ;## Search array line for thumb_100_
        If StringInStr($_Arrayline[$i], 'thumb_100_') Then _Extract_img_url_from_line($_Arrayline[$i])
        ;## Check if last page
        If StringInStr($_Arrayline[$i], 'last.png') Then $Final_page = False
        ;## Thumbs end here
        If StringInStr($_Arrayline[$i], 'Back to the List') Then ExitLoop
    Next
EndFunc

Func _Get_gallery_name($line)
    $Split = StringSplit($line, 'Album:', 1)
    $Split = StringSplit($Split[2], ' - Page', 1)
    $gallery_name = $Split[1]
EndFunc

Func _Extract_img_url_from_line($_arr_line)
    $Split = StringSplit($_arr_line, 'src="', 1)
    $Split = StringSplit($Split[2], '"', 1)
    $new_src = StringReplace($Split[1], 'thumb_100_', '')
    _Download_prepare($new_src)
EndFunc

Func _Download_prepare($new_src)
    $ext = StringSplit($new_src, '.', 1) ; get file extension
    $ext = '.' & $ext[$ext[0]]
    $split = StringSplit($new_src, '/', 1) ; get filename
    $filetosave_as = $split[$split[0]]

    If $ln < 10 Then $filetosave_as = '00' & $ln & $ext
    If $ln >= 10 Then $filetosave_as = '0' & $ln & $ext
    If $ln >= 100 Then $filetosave_as = '' & $ln & $ext

    $dir = 'lu.scio.us - Downloads'
    DirCreate($dir)
    DirCreate($dir & '\' & $gallery_name)

    If Not FileExists($dir & '\' & $gallery_name & '\' & $filetosave_as) Then
        _Download($new_src, $dir & '\' & $gallery_name & '\' & $filetosave_as, $filetosave_as)
        Sleep(200)
    EndIf
    $ln = $ln + 1
EndFunc

Func _Download($new_src, $dest, $filetosave_as)
    $hDownload = InetGet($new_src, $dest, 1, 1)
    Do
        $aData = InetGetInfo($hDownload) ; get all information
        $trim_adata = StringLeft($aData[0] / 1024 / 1024, 5)
        ToolTip($trim_adata & ' Mb', 0, 0)
        Sleep(250)
    Until InetGetInfo($hDownload, 2) ; check if the download is complete
    InetClose($hDownload) ; close the handle to release resources

    ;## Check if file size from server & after download match
    $size = FileGetSize($dest)
    If $size == $aData[1] Then
        ConsoleWrite($filetosave_as & _
                " Size: " & $aData[1] & _
                " Complete->: " & $aData[2] & _
                " Successful->: " & $aData[3] & _
                " @error: " & $aData[4] & _
                " @extended: " & $aData[5] & @CRLF)
    Else
        ConsoleWrite('!' & $filetosave_as & _
                " Size: " & $aData[1] & _
                " Complete->: " & $aData[2] & _
                " Successful->: " & $aData[3] & _
                " @error: " & $aData[4] & _
                " @extended: " & $aData[5] & _
                " Filesize dont Match: " & $aData[1] / 1024 & ' ' & $size / 1024 & @CRLF) ; convert to kilobytes
    EndIf
EndFunc
```

Edited April 15, 2010 by goldenix
jchd Posted April 15, 2010 Share Posted April 15, 2010 Not obvious that the way you check is correct. I've had correct and repettive results with the following, aimed at a server that did have random latency issues. You''l have to adapt it, but you get the idea. There's no risk trying, after all. Local $TmpFile = _TempFile(@TempDir, '~basename~', '.html', 20) Local $timer, $hInet, $status For $retry = 1 to 10 $timer = TimerInit() $hInet = InetGet($url & $NumColis, $TmpFile, 17, 1) ; background direct download no cache Do Sleep(50) $status = InetGetInfo($hInet, -1) If $status[2] Then ExitLoop ; download is complete Until TimerDiff($timer) >= 30000 ; allow 30s for download InetClose($hInet) ; free handle If $status[3] Then ; download is said successful $text = FileRead($TmpFile) ;~ ConsoleWrite($text & @LF) ; $str = StringRight($text, 10) ; If StringRegExp($str, "(\}\s*){4}", 0) Then ; was my own check that html page got received in full. That won't work for you ; ExitLoop ; EndIf Else ConsoleWrite("Website didn't answer within allowed time." & @LF) EndIf ;~ Sleep(500) Next FileDelete($TmpFile) If $retry > 10 Then _WarnBox("Server is stoned") ; a warning MsgBox Return 1 EndIf This wonderful site allows debugging and testing regular expressions (many flavors available). An absolute must have in your bookmarks.Another excellent RegExp tutorial. Don't forget downloading your copy of up-to-date pcretest.exe and pcregrep.exe hereRegExp tutorial: enough to get startedPCRE v8.33 regexp documentation latest available release and currently implemented in AutoIt beta. SQLitespeed is another feature-rich premier SQLite manager (includes import/export). 
Well worth a try.SQLite Expert (freeware Personal Edition or payware Pro version) is a very useful SQLite database manager.An excellent eBook covering almost every aspect of SQLite3: a must-read for anyone doing serious work.SQL tutorial (covers "generic" SQL, but most of it applies to SQLite as well)A work-in-progress SQLite3 tutorial. Don't miss other LxyzTHW pages!SQLite official website with full documentation (may be newer than the SQLite library that comes standard with AutoIt) Link to comment Share on other sites More sharing options...
goldenix Posted April 15, 2010

It's not obvious that the way you check is correct. I've had correct and repeatable results with the following, aimed at a server that did have random latency issues. You'll have to adapt it, but you get the idea. There's no risk in trying, after all.

So a latency of 30 seconds? I tried it, and every file was still corrupt, and the download took insanely long: 20 minutes to download about 10 files of roughly 1 MB each.
jchd Posted April 15, 2010 Share Posted April 15, 2010 No, that seems to mean that for some reason, the server doesn't cause the "download is complete" flag aka $Status[2] and that's probably why it times out. IMHO, the right way to understand what happens is to capture the full download session with Wireshark and dissect protocol to see what goes wrong. I don't believe InetGet, InetRead are that broken, just to annoy you. Moreover, these must be wrappers around Windows functions and it's unlikely that everything is dead buggy at this level. OTOH, broken or poorly setup servers are legion, so... This wonderful site allows debugging and testing regular expressions (many flavors available). An absolute must have in your bookmarks.Another excellent RegExp tutorial. Don't forget downloading your copy of up-to-date pcretest.exe and pcregrep.exe hereRegExp tutorial: enough to get startedPCRE v8.33 regexp documentation latest available release and currently implemented in AutoIt beta. SQLitespeed is another feature-rich premier SQLite manager (includes import/export). Well worth a try.SQLite Expert (freeware Personal Edition or payware Pro version) is a very useful SQLite database manager.An excellent eBook covering almost every aspect of SQLite3: a must-read for anyone doing serious work.SQL tutorial (covers "generic" SQL, but most of it applies to SQLite as well)A work-in-progress SQLite3 tutorial. Don't miss other LxyzTHW pages!SQLite official website with full documentation (may be newer than the SQLite library that comes standard with AutoIt) Link to comment Share on other sites More sharing options...
goldenix Posted April 16, 2010 (edited)

OK, re-downloading the same file 61 times seems to have solved the problem. But that's still almost 60 MB of extra traffic per 30 files.

Edited April 16, 2010 by goldenix
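Some of that extra traffic could be avoided by stopping the retry loop as soon as the saved file matches the size the server announced, instead of always re-downloading a fixed number of times. A sketch built from the InetGetInfo() size check already used earlier in this thread (the function name and retry limit are illustrative, not from the thread):

```autoit
; Sketch: re-download only until FileGetSize() matches the total size
; InetGetInfo() reports (index 1), instead of a fixed 61 attempts.
; Function name and $iMaxRetry are illustrative.
Func _DownloadUntilSizeMatches($sUrl, $sDest, $iMaxRetry = 10)
    For $i = 1 To $iMaxRetry
        Local $hDownload = InetGet($sUrl, $sDest, 1, 1) ; force reload, background
        Do
            Sleep(250)
        Until InetGetInfo($hDownload, 2) ; wait for completion
        Local $aInfo = InetGetInfo($hDownload)
        InetClose($hDownload)
        If FileGetSize($sDest) = $aInfo[1] Then Return True ; size matches: good copy
    Next
    Return False ; give up after $iMaxRetry mismatched downloads
EndFunc
```

On a good attempt this costs one download per file; the 61-in-a-row approach only pays off when the server truncates almost every response.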
jchd Posted April 16, 2010

That's strange. It must be an overloaded server.