rootx Posted November 28, 2016 (edited)

I would like to download the first 5 images into a folder. Thanks.

    #include <INet.au3>
    #include <String.au3>
    #include <Array.au3>

    Global $sSource, $aImgURL, $sKeyWord

    $sKeyWord = "pug"
    $sSource = _INetGetSource("http://www.google.com/search?q=" & $sKeyWord & "&tbm=isch")
    $aImgURL = _StringBetween($sSource, 'src="', '"')

    For $x = 1 to UBound($aImgURL)-1
        ConsoleWrite($aImgURL[$x]&@CRLF)
    Next

Edited November 30, 2016 by rootx

Danyfirex Posted November 28, 2016

Hello. And what is your issue?

Regards

rootx (Author) Posted November 28, 2016

28 minutes ago, Danyfirex said: "Hello. And what is your issue?"

I want to get the name of each image and save it with the correct type and name.

j0kky Posted November 28, 2016

You can download them using InetGet. They don't have a standard name, but to determine the extension you should check their magic number.

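To illustrate the magic-number suggestion, here is a minimal sketch that reads the first bytes of an already downloaded file and guesses the image type. The function name `_GuessImageExt` is hypothetical and not part of any UDF from this thread; the signatures checked are the well-known ones for JPEG, PNG, GIF and BMP.

```autoit
#include <FileConstants.au3>

; Hypothetical helper (not from the thread): guess an image extension
; from the leading bytes ("magic number") of a file on disk.
Func _GuessImageExt($sPath)
    Local $hFile = FileOpen($sPath, $FO_READ + $FO_BINARY)
    If $hFile = -1 Then Return SetError(1, 0, "")

    ; Read 8 bytes and convert them to an uppercase hex string, e.g. "FFD8FFE0..."
    Local $sHex = Hex(FileRead($hFile, 8))
    FileClose($hFile)

    Select
        Case StringLeft($sHex, 6) = "FFD8FF"            ; JPEG
            Return "jpg"
        Case StringLeft($sHex, 16) = "89504E470D0A1A0A" ; PNG
            Return "png"
        Case StringLeft($sHex, 8) = "47494638"          ; "GIF8"
            Return "gif"
        Case StringLeft($sHex, 4) = "424D"              ; "BM"
            Return "bmp"
    EndSelect

    ; Unknown signature
    Return SetError(2, 0, "")
EndFunc   ;==>_GuessImageExt
```

One possible use is to download each image to a temporary name with InetGet, call `_GuessImageExt` on the result, and then rename the file with FileMove once the type is known.
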
rootx (Author) Posted November 28, 2016

1 hour ago, j0kky said: "You can download 'em using InetGet, they don't have a standard name, but to know the extension you should search for their magic number."

Thanks, but the question is how to capture the URL of the source image rather than the thumbnail. Does anyone have any idea? Thanks.

    #include <INet.au3>
    #include <String.au3>
    #include <Array.au3>

    Global $sSource, $aImgURL, $sKeyWord

    $sKeyWord = "pug"
    $type = "jpg"
    $width = "800"
    $height = "600"

    $sSource = _INetGetSource("http://www.google.ch/search?q="& $sKeyWord &"&as_st=y&hl=it&tbs=ift:"&$type&",isz:ex,iszw:"&$width&",iszh:"&$height&"&tbm=isch&source=lnt")
    $aImgURL = _StringBetween($sSource, 'src="', '"')

    For $x = 1 to UBound($aImgURL)-1
        ConsoleWrite($aImgURL[$x]&@CRLF)
        InetGet($aImgURL[$x],@ScriptDir&"\"&$x&".jpg")
    Next

j0kky Posted November 28, 2016 (edited)

Try saving $sSource to an .html file and opening it. You will see that it differs from the page you get when visiting the same URL in a browser:

https://www.google.ch/search?q=pug&as_st=y&hl=it&tbs=ift:jpg,isz:ex,iszw:800,iszh:600&tbm=isch&source=lnt&gws_rd=ssl

In my opinion you should experiment with _IEDocReadHTML.

Edited November 28, 2016 by j0kky

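The check suggested above can be sketched as follows. The query URL and the dump file name `source_dump.html` are assumptions chosen for illustration. The file is opened in overwrite mode so that repeated runs do not append new markup to an old dump.

```autoit
#include <INet.au3>

; Assumed query URL and output file name, for illustration only.
Local $sUrl = "http://www.google.com/search?q=pug&tbm=isch"
Local $sDumpFile = @ScriptDir & "\source_dump.html"

Local $sSource = _INetGetSource($sUrl)
If @error Then
    ConsoleWrite("Could not fetch the page, @error = " & @error & @CRLF)
    Exit 1
EndIf

; Mode 2 ($FO_OVERWRITE) erases any previous content before writing.
Local $hFile = FileOpen($sDumpFile, 2)
FileWrite($hFile, $sSource)
FileClose($hFile)

; Open the dump in the default browser to compare it with the live page.
ShellExecute($sDumpFile)
```
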
rootx (Author) Posted November 29, 2016

22 hours ago, j0kky said: "Try to save $sSource to an .html file and open it [...] In my opinion you should play with: _IEDocReadHTML"

_IEDocReadHTML doesn't work, but...

    #include <IE.au3>
    #include <MsgBoxConstants.au3>
    #include <Inet.au3>
    #include <Array.au3>
    #include <File.au3>
    #include <String.au3>

    $x = _INetGetSource("http://www.google.ch/search?as_st=y&tbm=isch&hl=it&as_q=pug&as_epq=&as_oq=&as_eq=&cr=&as_sitesearch=&safe=images&tbs=ift:jpg")
    FileWrite(@ScriptDir&"\9.html",$x)

    Local $aRetArray
    _FileReadToArray(@ScriptDir&"\9.html", $aRetArray)
    ;_ArrayDisplay($aRetArray, "Default Search")

    Local $aArray = _StringBetween($x, 'href="', '"')
    ; _ArrayDisplay($aArray, "Default Search")

    For $xs = 1 to UBound($aArray)-1
        ConsoleWrite($aArray[$xs]&@CRLF)
    Next

The source that AutoIt receives isn't the same as what the browser shows. If you view the source in the browser, you can easily find this:

/imgres?imgurl=http%3A%2F%2Fcdn3-www.dogtime.com%2Fassets%2Fuploads%2F2011%2F01%2Ffile_23124_pug-460x290.jpg&imgrefurl=http%3A%2F%2Fdogtime.com%2Fdog-breeds%2Fpug&docid=BTPG4yF8_O0fQM&tbnid=8FbyFFzHno3BCM%3A&vet=1&w=460&h=290&hl=it&safe=images&bih=715&biw=1156&ved=0ahUKEwif1eWAys7QAhUDzxQKHc39AREQMwgdKAAwAA&iact=mrc&uact=8

But AutoIt extracts this:

http://dogtime.com/dog-breeds/pug&sa=U&ved=0ahUKEwiU-sLNzc7QAhUBfhoKHYuWAP4QwW4IGDAA&usg=AFQjCNFtqNOflzABBIVCR79FpfulvDD6Pw

Why? Any ideas? I need to read the raw HTML source. Thanks.

j0kky Posted November 30, 2016 (edited)

15 hours ago, rootx said: "_IEDocReadHTML doesn't work."

What does that mean, exactly?

    #include <String.au3>
    #include <ie.au3>

    Global $sSource, $aImgURL, $sKeyWord

    $sKeyWord = "pug"
    $type = "jpg"
    $width = "800"
    $height = "600"

    $obj = _IECreate("http://www.google.ch/search?q="& $sKeyWord &"&as_st=y&hl=it&tbs=ift:"&$type&",isz:ex,iszw:"&$width&",iszh:"&$height&"&tbm=isch&source=lnt")
    $sSource = _IEDocReadHTML($obj)
    FileWrite("log.html", $sSource)
    $aImgURL = _StringBetween($sSource, '"ou":"', '"')

    For $x = 1 to UBound($aImgURL)-1
        ConsoleWrite($aImgURL[$x]&@CRLF)
        ;InetGet($aImgURL[$x],@ScriptDir&"\"&$x&".jpg")
    Next

Edited November 30, 2016 by j0kky

rootx (Author) Posted November 30, 2016

1 hour ago, j0kky said: "What does it mean, exactly?" (quoted code omitted; see the previous post)

OK, but is there a way to use a regular expression that matches from [http://] to [.jpg]? I ask because some URLs have a strange path, for example:

"http://vignette1.wikia.nocookie.net/dogs/images/4/47/Gadget_the_pug_expressive_eyes.jpg/revision/latest?cb\u003d20110813111020"

I added a regex to save the file with the original name.

    #include <String.au3>
    #include <ie.au3>

    Global $sSource, $aImgURL, $sKeyWord

    DirCreate(@ScriptDir&"\img")
    $folder = (@ScriptDir&"\img\")
    $sKeyWord = "pug"
    $type = "jpg"
    $width = "800"
    $height = "600"

    $obj = _IECreate("http://www.google.ch/search?q="& $sKeyWord &"&as_st=y&hl=it&tbs=ift:"&$type&",isz:ex,iszw:"&$width&",iszh:"&$height&"&tbm=isch&source=lnt")
    $sSource = _IEDocReadHTML($obj)
    FileWrite("log.html", $sSource)
    $aImgURL = _StringBetween($sSource, '"ou":"', '"')

    For $x = 1 to UBound($aImgURL)-1
        ConsoleWrite($aImgURL[$x]&@CRLF)
        InetGet($aImgURL[$x],$folder&StringRegExpReplace($aImgURL[$x], '.*/([^-]+).*', "$1"))
    Next

    _IEQuit($obj)

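The `\u003d` in the example URL above is a JSON escape for `=`, since the "ou" values are read out of JSON embedded in the page. As a minimal sketch, assuming only simple `\uXXXX` escapes need handling (escaped backslashes and surrogate pairs are ignored), a hypothetical helper `_JsonUnescapeUnicode` could restore the literal characters before the URL is used:

```autoit
; Hypothetical helper (not from the thread): replace \uXXXX escapes
; with the characters they stand for.
Func _JsonUnescapeUnicode($sText)
    ; Flag 3 returns every captured group, i.e. each 4-digit hex code.
    Local $aCodes = StringRegExp($sText, "\\u([0-9A-Fa-f]{4})", 3)
    If @error Then Return $sText ; nothing to unescape

    For $i = 0 To UBound($aCodes) - 1
        ; StringReplace is case-insensitive by default, so \u003D and \u003d both match.
        $sText = StringReplace($sText, "\u" & $aCodes[$i], ChrW(Dec($aCodes[$i])))
    Next
    Return $sText
EndFunc   ;==>_JsonUnescapeUnicode

; The query part of the sample URL from the post above.
ConsoleWrite(_JsonUnescapeUnicode("latest?cb\u003d20110813111020") & @CRLF)
```
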
j0kky Posted November 30, 2016 (edited)

    StringRegExp($aImgURL[$x], '(?i)(http.?://.*\.(jpg|bmp|cms|jpeg))', 1)

The limitation is that you have to list every known image extension between the parentheses. In any case, adding an error check is a good idea, because the script will fail if it meets an extension you didn't expect.

Edited November 30, 2016 by j0kky (now it catches https too)

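A small sketch of the error check mentioned above, run against the sample URL from earlier in the thread so that it needs no network access. With flag 1, StringRegExp returns the captured groups as an array and sets @error when nothing matches, so the array must not be indexed before @error is tested.

```autoit
; Sample URL from the thread; the path continues after ".jpg".
Local $sUrl = "http://vignette1.wikia.nocookie.net/dogs/images/4/47/Gadget_the_pug_expressive_eyes.jpg/revision/latest?cb\u003d20110813111020"
Local $sPattern = '(?i)(http.?://.*\.(jpg|bmp|cms|jpeg))'

Local $aMatch = StringRegExp($sUrl, $sPattern, 1)
If @error Then
    ; No listed extension was found, so skip this URL instead of crashing.
    ConsoleWrite("No known image extension in: " & $sUrl & @CRLF)
Else
    ; $aMatch[0] is the outer group (URL up to the extension),
    ; $aMatch[1] is the inner group (the extension itself).
    ConsoleWrite("Image URL: " & $aMatch[0] & @CRLF)
    ConsoleWrite("Extension: " & $aMatch[1] & @CRLF)
EndIf
```

For this sample the match should stop right after `.jpg`, dropping the `/revision/latest?...` tail.
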
Danyfirex Posted November 30, 2016

An alternative way without using IE.

    #include <Array.au3>
    #include <String.au3>

    Global Const $HTTP_STATUS_OK = 200

    Local $sKeyWord = "house"
    Local $sURL = "http://www.google.com/search?q=" & $sKeyWord & "&tbm=isch"
    Local $sData = HttpGet($sURL)
    ;~ ConsoleWrite($sData & @CRLF)

    Local $aMetas = _StringBetween($sData, '"rg_meta">', '</div>')
    ;~ _ArrayDisplay($aMetas)

    Local $sUrlImage = ""
    Local $sImageName = ""
    Local $sExtension = ""

    If IsArray($aMetas) Then
        If UBound($aMetas) >= 5 Then
            For $i = 0 To 4
                ConsoleWrite(">Image Number: " & $i + 1 & @CRLF)
                $sUrlImage = _GetImageUrl($aMetas[$i])
                $sImageName = _GetImageName($aMetas[$i]) ;maybe you want to get the name from image url instead of metadata
                $sExtension = _GetImageExtension($aMetas[$i])
                ConsoleWrite($sUrlImage & @CRLF)
                ConsoleWrite($sImageName & @CRLF)
                ConsoleWrite($sExtension & @CRLF)
                ConsoleWrite(@CRLF)
            Next
        EndIf
    EndIf

    Func _GetImageName($sData)
        Local $aData = _StringBetween($sData, '"s":"', '"')
        If IsArray($aData) Then Return $aData[0]
    EndFunc   ;==>_GetImageName

    Func _GetImageUrl($sData)
        Local $aData = _StringBetween($sData, '"ou":"', '"')
        If IsArray($aData) Then Return $aData[0]
    EndFunc   ;==>_GetImageUrl

    Func _GetImageExtension($sData)
        Local $aData = _StringBetween($sData, '"ity":"', '"')
        If IsArray($aData) Then Return $aData[0]
    EndFunc   ;==>_GetImageExtension

    Func HttpGet($sURL)
        Local $oHTTP = ObjCreate("WinHttp.WinHttpRequest.5.1")
        $oHTTP.Open("GET", $sURL, False)
        $oHTTP.SetRequestHeader("User-Agent", "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:48.0) Gecko/20100101 Firefox/48.0")
        $oHTTP.SetRequestHeader("Content-Type", "text/plain; charset=utf-8")
        If (@error) Then Return SetError(1, 0, 0)
        $oHTTP.Send()
        If (@error) Then Return SetError(2, 0, 0)
        If ($oHTTP.Status <> $HTTP_STATUS_OK) Then Return SetError(3, 0, 0)
        Return SetError(0, 0, $oHTTP.ResponseText)
    EndFunc   ;==>HttpGet

Make sure to clean up the file name.
Regards

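On cleaning up the file name: text taken from the page metadata can contain characters that Windows does not accept in file names. The following is a minimal sketch; `_SanitizeFileName` and the fallback name "unnamed" are hypothetical choices, not something from the script above.

```autoit
; Hypothetical helper (not from the thread): make a string safe to use
; as a Windows file name.
Func _SanitizeFileName($sName, $sReplacement = "_")
    ; Replace the characters Windows forbids in file names, plus control characters.
    Local $sClean = StringRegExpReplace($sName, '[\\/:*?"<>|\x00-\x1F]', $sReplacement)
    ; Trailing dots and spaces also cause trouble on Windows, so strip them.
    $sClean = StringRegExpReplace($sClean, '[. ]+$', '')
    If $sClean = "" Then $sClean = "unnamed"
    Return $sClean
EndFunc   ;==>_SanitizeFileName

ConsoleWrite(_SanitizeFileName('pug: "cute" <dog>?.jpg') & @CRLF)
```

Long descriptions may also need truncating before use, which this sketch does not attempt.
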
j0kky Posted November 30, 2016

Be careful about relying on the "ity" field: when the URL has the form link [dot] extension [forward slash] something else, the field is not set.

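One way to cope with an empty "ity" value is to fall back to the URL itself. This is only a sketch: `_PickExtension`, its list of extensions and the default of "jpg" are assumptions made for illustration.

```autoit
; Hypothetical helper (not from the thread): prefer the "ity" value,
; then an extension found in the URL, then a default.
Func _PickExtension($sIty, $sUrl, $sDefault = "jpg")
    If $sIty <> "" Then Return StringLower($sIty)

    ; Look for ".ext" followed by "/", "?" or the end of the URL,
    ; which covers the "name.jpg/something else" case.
    Local $aExt = StringRegExp($sUrl, '(?i)\.(jpe?g|png|gif|bmp)(?:[/?]|$)', 1)
    If Not @error Then Return StringLower($aExt[0])

    Return $sDefault
EndFunc   ;==>_PickExtension

; Sample URL from earlier in the thread, with an empty "ity" value.
ConsoleWrite(_PickExtension("", "http://vignette1.wikia.nocookie.net/dogs/images/4/47/Gadget_the_pug_expressive_eyes.jpg/revision/latest?cb\u003d20110813111020") & @CRLF)
```

When neither source gives an answer, checking the magic number of the downloaded file is the more reliable option.
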
rootx (Author) Posted November 30, 2016

    #include <String.au3>
    #include <ie.au3>
    #include <WinAPIFiles.au3>
    #include <InetConstants.au3>
    #include <Array.au3>

    Global $sSource, $aImgURL, $sKeyWord

    $sKeyWord = "pug"
    $type = "jpg"
    $width = "800"
    $height = "600"

    $obj = _IECreate("http://www.google.ch/search?q="& $sKeyWord &"&as_st=y&hl=it&tbs=ift:"&$type&",isz:ex,iszw:"&$width&",iszh:"&$height&"&tbm=isch&source=lnt")
    $sSource = _IEDocReadHTML($obj)
    FileWrite("log.html", $sSource)
    $aImgURL = _StringBetween($sSource,'imgurl=', '&')
    ;_ArrayDisplay($aImgURL)

    For $x = 1 to UBound($aImgURL)-1
        FileWrite(@ScriptDir&"\1.txt",StringReplace(StringReplace($aImgURL[$x],"%3A",":"),"%2F","/")&@CRLF)
        $url = StringReplace(StringReplace($aImgURL[$x],"%3A",":"),"%2F","/")
    Next

    $file = FileReadToArray(@ScriptDir&"\1.txt")
    For $s = 1 to UBound($file)-1
        $last = StringSplit($file[$s], '/')
        $ls = UBound($last)-1
        ConsoleWrite(StringSplit($file[$s], '/', $STR_ENTIRESPLIT)[$ls]&@CRLF)
        If StringLeft($file[$s],5) = "https" Then
            ConsoleWrite(StringRegExp($file[$s],'(?i)(https://.*\.(jpg|bmp|cms|jpeg))', 1)[0]&@CRLF)
            InetGet($file[$s],@ScriptDir&"\x\"&StringSplit($file[$s], '/', $STR_ENTIRESPLIT)[$ls])
        Else
            ConsoleWrite(StringRegExp($file[$s],'(?i)(http://.*\.(jpg|bmp|cms|jpeg))', 1)[0]&@CRLF)
            InetGet($file[$s],@ScriptDir&"\x\"&StringSplit($file[$s], '/', $STR_ENTIRESPLIT)[$ls])
        EndIf
    Next

    _IEQuit($obj)

Only one error: ueRSGNo.jpg%3F1

I changed the save path and handled the https case, and now 88 files download correctly. Any suggestions to improve it? Thanks.

PS: How can I run IE hidden? I only need to grab the images. Thanks.

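The failing name `ueRSGNo.jpg%3F1` contains `%3F`, which is a percent-encoded `?`; the script above only decodes `%3A` and `%2F`. As a sketch, a generic decoder could replace the chained StringReplace calls. `_UrlDecode` is a hypothetical helper, it handles single-byte escapes only (no multi-byte UTF-8 sequences), and the example.com URL is a placeholder.

```autoit
; Hypothetical helper (not from the thread): decode %XX escapes in a URL.
Func _UrlDecode($sEncoded)
    ; Every piece after the first starts with the two hex digits of an escape.
    Local $aParts = StringSplit($sEncoded, "%")
    Local $sOut = $aParts[1]
    Local $sHex

    For $i = 2 To $aParts[0]
        $sHex = StringLeft($aParts[$i], 2)
        If StringRegExp($sHex, "^[0-9A-Fa-f]{2}$") Then
            ; Valid escape: emit the decoded character, then the rest of the piece.
            $sOut &= Chr(Dec($sHex)) & StringTrimLeft($aParts[$i], 2)
        Else
            ; Not an escape: put the literal "%" back.
            $sOut &= "%" & $aParts[$i]
        EndIf
    Next
    Return $sOut
EndFunc   ;==>_UrlDecode

; Placeholder URL ending in the file name that failed above.
ConsoleWrite(_UrlDecode("http%3A%2F%2Fexample.com%2FueRSGNo.jpg%3F1") & @CRLF)
```

After decoding, the file name still carries the `?1` query suffix, so it needs to be cut off at the `?` before the name is used for saving.
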
j0kky Posted November 30, 2016 (edited)

This is my version without all those StringReplace calls:

    #include <String.au3>
    #include <ie.au3>

    Global $sSource, $aImgURL, $sKeyWord

    $sKeyWord = "pug"
    $type = "jpg"
    $width = "800"
    $height = "600"

    ; _IECreate(url, try-attach, visible): passing 0 as the third argument keeps the IE window hidden
    $obj = _IECreate("http://www.google.ch/search?q="& $sKeyWord &"&as_st=y&hl=it&tbs=ift:"&$type&",isz:ex,iszw:"&$width&",iszh:"&$height&"&tbm=isch&source=lnt", 0, 0)
    $sSource = _IEDocReadHTML($obj)
    $aImgURL = _StringBetween($sSource, '"ou":"', '"')

    For $x = 1 to UBound($aImgURL) - 1
        ;$sPattern = '(?i)(http.?://.*\.(jpg|bmp|cms|jpeg))' ; http?://.../name.ext
        $sPattern = '(?i).*/(.*\.(jpg|bmp|cms|jpeg))' ; name.ext
        $aRegEx = StringRegExp($aImgURL[$x], $sPattern, 1)
        If @error Then ContinueLoop ; skip URLs without one of the listed extensions
        ConsoleWrite($aRegEx[0] & @CRLF)
        InetGet($aImgURL[$x], @ScriptDir & "\" & $aRegEx[0])
    Next

    _IEQuit($obj)

Edited November 30, 2016 by j0kky

rootx (Author) Posted November 30, 2016

2 hours ago, j0kky said: "This is my version without all those StringReplace:" (quoted code omitted; see the previous post)

Nice, it downloaded 94 JPGs. You are the winner. Thanks.

j0kky Posted November 30, 2016

Glad it helped!