pillbug Posted February 1, 2009 (edited February 1, 2009)

Hi,

I've been working on optimizing _INetGetSource() and I am not seeing much improvement. If I have a 1 MB download connection and each HTML file is about 10K, I should be able to download 100 search results per second. Unfortunately, I am not even close to this.

Eventually I plan to use this for a list of websites rather than Google searches, but as an example I've created a loop of Google searches for the word "hello".

I suspect the main problem is having to connect to wininet.dll. If anyone sees ways to improve the program, please let me know how to speed this process up.

    $begin = TimerInit()
    $h_DLL = DllOpen('wininet.dll')
    $ai_IO = DllCall($h_DLL, 'int', 'InternetOpen', 'str', 'AutoIt v3', 'int', 0, 'int', 0, 'int', 0, 'int', 0)
    If @error Or $ai_IO[0] = 0 Then
        Exit 1 ; top-level script code, so exit rather than Return
    EndIf
    $v_Struct = DllStructCreate('udword') ; udword = unsigned 32-bit integer
    For $i = 0 To 399
        $s_URL = 'http://www.google.com/search?hl=en&q=hello&start=' & $i * 10 & '&sa=N'
        $ai_IOU = DllCall($h_DLL, 'int', 'InternetOpenUrl', 'int', $ai_IO[0], 'str', $s_URL, 'str', '', 'int', 0, 'int', 0x80000000, 'int', 0)
        If @error Or $ai_IOU[0] = 0 Then
            $source = ''
            ContinueLoop ; nothing to read if the URL could not be opened
        EndIf
        $s_Buf = ''
        DllStructSetData($v_Struct, 1, 1)
        For $z = 1 To 3 ; reads at most three 4096-byte blocks per page
            $ai_IRF = DllCall($h_DLL, 'int', 'InternetReadFile', 'int', $ai_IOU[0], 'str', '', 'int', 4096, 'ptr', DllStructGetPtr($v_Struct))
            $s_Buf &= StringLeft($ai_IRF[2], DllStructGetData($v_Struct, 1))
        Next
        DllCall($h_DLL, 'int', 'InternetCloseHandle', 'int', $ai_IOU[0])
        $source = $s_Buf
        TrayTip('', 'Source: ' & $i, 3)
        If $i = 5 Then TrayTip('', 'Source: ' & StringLeft($source, 5), 3)
    Next
    DllCall($h_DLL, 'int', 'InternetCloseHandle', 'int', $ai_IO[0])
    DllClose($h_DLL)
    $dif = TimerDiff($begin)
    MsgBox(0, "Time Difference", $dif)
Inverted Posted February 1, 2009

pillbug wrote: "If I have a 1 MB download connection, and each html file is about 10K, I should be able to download 100 searched results per second."

Way to oversimplify! I don't have the knowledge to help you; maybe someone else will provide more useful info. I just wanted to say that your reasoning is off: there are more connection latencies involved. And the fraction of a second needed to interface with a DLL is negligible.
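As a rough illustration of the latency point, assume the requests run one after another and each pays a fixed latency t_lat (name lookup, connection setup, server time) before a page of size S = 10 KB arrives at B = 1 MB/s, taken here as 1000 KB/s. The 200 ms latency is a made-up value for illustration, not a measurement from this thread.

```latex
\begin{align*}
t_{\text{req}} &= t_{\text{lat}} + \frac{S}{B}
  = 200\,\text{ms} + \frac{10\,\text{KB}}{1000\,\text{KB/s}}
  = 200\,\text{ms} + 10\,\text{ms} = 210\,\text{ms} \\
\text{rate} &= \frac{1}{t_{\text{req}}} = \frac{1}{0.21\,\text{s}} \approx 4.8 \text{ requests per second}
\end{align*}
```

Under these assumptions the transfer itself is only 10 ms of each request, so the rate only approaches the 100 per second that bandwidth alone suggests when t_lat is close to zero.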
BrettF Posted February 1, 2009

Maybe something like this, but it was still way slower than what you posted. Maybe explore the forums and see if there are any other examples of getting the source of a page...

    $begin = TimerInit()
    $oHTTP = ObjCreate("winhttp.winhttprequest.5.1")
    $term = "hello"
    For $i = 0 To 10
        $s_URL = 'http://www.google.com/search?hl=en&q=' & $term & '&start=' & $i * 10 & '&sa=N'
        ClipPut($s_URL)
        $oHTTP.Open("GET", $s_URL, False) ; False = synchronous request
        $oHTTP.Send()
        $source = $oHTTP.ResponseText
        ConsoleWrite("Source for document " & $i & @CRLF)
    Next
    $dif = TimerDiff($begin)
    MsgBox(0, "Time Difference", $dif & @CRLF & "Average (ms) = " & Int($dif / $i))
AzKay Posted February 1, 2009

I already tried that one, Brett. For me, it ended up being 17 seconds for all 10, when his code was like 4 seconds for 10. So I didn't bother posting.

My only suggestion would be maybe making your own function using the TCP* stuff. Something like:

    TCPConnect()
    For $i = 0 To 10
        TCPSend()
        While 1
            ; some other thing to log the source
        WEnd
    Next

That way you're not having to reopen a connection every time. I don't know if that works, as I've never tried, but I think it should.
TerarinK Posted February 1, 2009

Remember that this is single-threaded code you're working with, and it takes time to execute, so right there your download rate is not being fully used. First off, how fast is your computer? Every ms adds up: 100 requests at about 10 ms each is 1000 ms, a second more or less. You're also forgetting the handshake you must pass through and then the data you're sending, each taking up those precious ms. In the end you're likely to get 5-25 per second but no more; the more you request, the more time it takes to retrieve them, unless you go thread-wise. Go to search and look up "Thread"; that should point you in the direction you need to maximize your download rate.
TerarinK Posted February 1, 2009

AzKay wrote: "My only suggestion would be maybe making your own function using the TCP* stuff. Something like:

    TCPConnect()
    For $i = 0 To 10
        TCPSend()
        While 1
            ; some other thing to log the source
        WEnd
    Next

That way you're not having to reopen a connection every time. I don't know if that works, as I've never tried, but I think it should."

Check out this topic concerning that; it is for HTTP, though, and you want it to act like FTP: http://www.autoitscript.com/forum/index.ph...p;hl=tcpCONNECT
pillbug Posted February 4, 2009 (edited February 6, 2009)

So I looked into TCP. I modified some of the code from http://www.autoitscript.com/forum/index.ph...0617&hl=tcp

However, TCP is 2x slower:

TCP: 6035.04243275544
DLL: 3014.04479785984

Do any TCP experts see any ways to make TCP faster or the DLL faster?

    $begin_tcp = TimerInit()
    TCPFUNCTION()
    $dif_tcp = TimerDiff($begin_tcp)
    $begin_dll = TimerInit()
    DLLFUNCTION()
    $dif_dll = TimerDiff($begin_dll)
    ConsoleWrite("TCP:" & $dif_tcp & " DLL:" & $dif_dll)

    Func TCPFunction()
        TCPStartup() ; initializing service
        For $i = 0 To 4
            $URL = 'http://www.google.com/search?hl=en&q=hello&start=' & $i * 10 & '&sa=N'
            ; drop http:// or https:// if present, e.g. returns www.autoitscript.com/forum/index.php?showforum=9
            $URL = StringRegExpReplace($URL, '\A(http://|https://)(.*?)/?\Z', '$2')
            Local $dom = StringRegExpReplace($URL, '\A(.*?)/.*', '$1') ; domain name (www.autoitscript.com)
            Local $ip = TCPNameToIP($dom) ; needed to connect to the server
            If $ip = "" Then Return -1
            Local $get = StringRegExpReplace($URL, '\A(.*?)/(.*)', '$2') ; path and query (forum/index.php?showforum=9)
            If $get = $dom Then $get = '' ; in case the main page is requested
            ; request headers; Host must name the server and the block ends with a blank line
            Local $header = 'GET /' & $get & ' HTTP/1.1' & @CRLF _
                    & 'User-Agent: Test' & @CRLF _
                    & 'Host: ' & $dom & @CRLF & @CRLF
            Local $socket = TCPConnect($ip, 80) ; connecting to server
            If $socket = -1 Then Return -2 ; no further error checking from here on (you do it)
            TCPSend($socket, $header) ; sending request
            Local $rcv = '', $out = '', $x = 0, $sw = 0, $length = 0 ; reset for every request
            While 1
                If $rcv <> '' Then
                    If $x <> 1 Then
                        $length = Number(StringRegExpReplace($rcv, '(?s)(.*?)Content-Length: (\d+)(.*)', '$2') + StringLen(StringLeft($rcv, StringInStr($rcv, @CRLF & @CRLF))) + 3)
                    EndIf
                    $x = 1
                EndIf
                $rcv = TCPRecv($socket, 1024) ; receiving data from server
                $out &= $rcv ; adding to what we already have
                If $x = 1 Then
                    If $rcv = '' Then
                        $sw += 1
                        If $sw = 10000 Then ExitLoop ; sometimes there is no end, so we'll have to end it
                    Else
                        $sw = 0
                    EndIf
                EndIf
                If $length <> 0 Then
                    If StringLen($out) = $length Then ; some servers are done once they have sent the amount of data they declared
                        ExitLoop
                    EndIf
                EndIf
                If StringRight($rcv, 5) = 0 & @CRLF & @CRLF Then ; some servers will end with this
                    ExitLoop
                EndIf
            WEnd
            TCPCloseSocket($socket) ; closing socket
            ;Return $out
            TrayTip('', 'Source: ' & $i & $out, 3)
        Next
        TCPShutdown() ; stopping service
    EndFunc

    Func DLLFUNCTION()
        $h_DLL = DllOpen('wininet.dll')
        $ai_IO = DllCall($h_DLL, 'int', 'InternetOpen', 'str', 'AutoIt v3', 'int', 0, 'int', 0, 'int', 0, 'int', 0)
        If @error Or $ai_IO[0] = 0 Then
            SetError(1)
            Return ''
        EndIf
        $v_Struct = DllStructCreate('udword') ; udword = unsigned 32-bit integer
        For $i = 0 To 4
            $s_URL = 'http://www.google.com/search?hl=en&q=hello&start=' & $i * 10 & '&sa=N'
            $ai_IOU = DllCall($h_DLL, 'int', 'InternetOpenUrl', 'int', $ai_IO[0], 'str', $s_URL, 'str', '', 'int', 0, 'int', 0x80000000, 'int', 0)
            If @error Or $ai_IOU[0] = 0 Then
                $source = ''
                ContinueLoop ; nothing to read if the URL could not be opened
            EndIf
            $s_Buf = ''
            DllStructSetData($v_Struct, 1, 1)
            For $z = 1 To 3 ; reads at most three 4096-byte blocks per page
                $ai_IRF = DllCall($h_DLL, 'int', 'InternetReadFile', 'int', $ai_IOU[0], 'str', '', 'int', 4096, 'ptr', DllStructGetPtr($v_Struct))
                $s_Buf &= StringLeft($ai_IRF[2], DllStructGetData($v_Struct, 1))
            Next
            DllCall($h_DLL, 'int', 'InternetCloseHandle', 'int', $ai_IOU[0])
            $source = $s_Buf
            TrayTip('', 'Source: ' & $i, 3)
            If $i = 5 Then TrayTip('', 'Source: ' & StringLeft($source, 5), 3)
        Next
        DllCall($h_DLL, 'int', 'InternetCloseHandle', 'int', $ai_IO[0])
        DllClose($h_DLL)
    EndFunc
ProgAndy Posted February 5, 2009

Try to split the DLL function into 3 parts:
1) Open the DLL and call InternetOpen.
2) Open the URL, download, and close the URL handle.
2a) Repeat step 2 as often as you want.
3) At the end of the script, close the InternetOpen handle and the DLL.
pillbug Posted February 6, 2009

ProgAndy wrote: "Try to split the DLL-function into 3 parts: 1): open DLL and call InternetOpen 2): Open URL, download and Close URL-Handle 2a): repeat step 2 as often as you want. 3): at the end of the Script, close InternetOpen and DLL"

I am not sure how my code is different from what you said... Could you give me some code to show what you mean?
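One possible reading of ProgAndy's three parts, as a minimal sketch rather than a tested answer: the helper names _Net_Startup, _Net_GetSource and _Net_Shutdown are made up for illustration, and the calls use the wide-character WinINet entry points (InternetOpenW, InternetOpenUrlW) with a byte buffer, which is an assumption about how to call them safely, not what _INetGetSource itself does. pillbug's loop already opens the session once, so the main differences here are the packaging into functions and reading until InternetReadFile reports zero bytes instead of a fixed three reads. No speed-up is claimed.

```autoit
; Hypothetical helpers sketching the three-part split. Not benchmarked.
Global $g_hDll = -1, $g_hInet = 0

; Part 1: open the DLL and call InternetOpen once.
Func _Net_Startup()
    $g_hDll = DllOpen('wininet.dll')
    If $g_hDll = -1 Then Return SetError(1, 0, False)
    Local $aOpen = DllCall($g_hDll, 'ptr', 'InternetOpenW', 'wstr', 'AutoIt v3', _
            'dword', 0, 'ptr', 0, 'ptr', 0, 'dword', 0)
    If @error Or $aOpen[0] = 0 Then
        DllClose($g_hDll)
        Return SetError(2, 0, False)
    EndIf
    $g_hInet = $aOpen[0]
    Return True
EndFunc

; Part 2 (repeat as often as needed): open the URL, download, close the URL handle.
Func _Net_GetSource($sUrl)
    ; 0x80000000 = INTERNET_FLAG_RELOAD, the same flag the thread's code passes
    Local $aUrl = DllCall($g_hDll, 'ptr', 'InternetOpenUrlW', 'ptr', $g_hInet, 'wstr', $sUrl, _
            'ptr', 0, 'dword', 0, 'dword', 0x80000000, 'dword_ptr', 0)
    If @error Or $aUrl[0] = 0 Then Return SetError(1, 0, '')
    Local $tBuf = DllStructCreate('byte[4096]')
    Local $sOut = '', $aRead
    While 1
        $aRead = DllCall($g_hDll, 'int', 'InternetReadFile', 'ptr', $aUrl[0], _
                'ptr', DllStructGetPtr($tBuf), 'dword', 4096, 'dword*', 0)
        ; stop on a failed call or when zero bytes were read (end of data)
        If @error Or $aRead[0] = 0 Or $aRead[4] = 0 Then ExitLoop
        ; keep only the bytes actually read, converted as ANSI text
        $sOut &= BinaryToString(BinaryMid(DllStructGetData($tBuf, 1), 1, $aRead[4]))
    WEnd
    DllCall($g_hDll, 'int', 'InternetCloseHandle', 'ptr', $aUrl[0])
    Return $sOut
EndFunc

; Part 3: at the end of the script, close the InternetOpen handle and the DLL.
Func _Net_Shutdown()
    DllCall($g_hDll, 'int', 'InternetCloseHandle', 'ptr', $g_hInet)
    DllClose($g_hDll)
EndFunc

; Usage: the same five searches as in the thread, timed with TimerInit/TimerDiff.
If _Net_Startup() Then
    Local $sSource, $hTimer = TimerInit()
    For $i = 0 To 4
        $sSource = _Net_GetSource('http://www.google.com/search?hl=en&q=hello&start=' & $i * 10 & '&sa=N')
        ConsoleWrite('Page ' & $i & ': ' & StringLen($sSource) & ' characters' & @CRLF)
    Next
    ConsoleWrite('Total ms: ' & TimerDiff($hTimer) & @CRLF)
    _Net_Shutdown()
EndIf
```

Because each page is still fetched one after another, the per-request latency discussed earlier in the thread applies to this version as well.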