Azevedo Posted February 22, 2013

Hey, I'm building a script to download images from a Google Images search. Basically, I have this part:

    $url = "https://www.google.com/search?hl=en&tbm=isch&q=flinstones"
    $sData = InetRead($url)
    $stream = BinaryToString($sData)

This stores the HTML content in the variable $stream. The images to download appear in the pattern: "imgurl=http://........jpg"

My question is: how do I use a RegEx to parse those image URLs out of $stream into an array, like

    image[1]="http://...."
    image[2]="http://...."

Thanks
jdelaney Posted February 22, 2013 (edited)

I'd use an XML DOM object, or load the page into a hidden IE and use _IEImgGetCollection, then loop through the collection and get the .src of each image.

You will have to wait for the RegExp sample... here it is:

    #include <Array.au3>
    $html = '<img class="ipsUserPhoto ipsUserPhoto_mini" alt="Are my AutoIt EXEs really infected? - last post by JLogan3o13" src="http://src1g?_r=1342793095"/>' & @CRLF & _
            '<img class="ipsUserPhoto ipsUserPhoto_mini" alt="Are my AutoIt EXEs really infected? - last post by JLogan3o13" src="http://src1g?_r=1342793095"/>'
    ; Flag 3 returns an array of all global matches
    $array = StringRegExp($html, '\<img.*src\=\"(.*)\"', 3)

Edited February 22, 2013 by jdelaney
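The hidden-IE route mentioned in this post could look roughly like the sketch below. It assumes Internet Explorer is available and that the results page renders its images as ordinary <img> elements; the URL is the one from the original question, and on a search results page the .src values are likely to be thumbnails.

```autoit
#include <IE.au3>

; Assumption: Internet Explorer is installed and the page exposes its
; results as ordinary <img> elements.
Local $sURL = "https://www.google.com/search?hl=en&tbm=isch&q=flinstones"

; _IECreate(url, tryAttach, visible, wait): visible = 0 keeps the window hidden.
Local $oIE = _IECreate($sURL, 0, 0, 1)
If Not IsObj($oIE) Then Exit 1

; Fetch the collection of all <img> elements; @extended holds the count.
Local $oImgs = _IEImgGetCollection($oIE)
Local $iCount = @extended
ConsoleWrite("Found " & $iCount & " images" & @CRLF)

If IsObj($oImgs) Then
    For $oImg In $oImgs
        ; .src is the address the browser loaded for this image
        ConsoleWrite($oImg.src & @CRLF)
    Next
EndIf

_IEQuit($oIE)
```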
GMK Posted February 22, 2013 (edited)

That may only get the thumbnails. To get the original images, this may serve you well:

    #include <Inet.au3>
    #include <Array.au3>
    $sURL = "https://www.google.com/search?hl=en&tbm=isch&q=flinstones"
    $sData = _InetGetSource($sURL)
    $aImages = StringRegExp($sData, 'imgurl=(.*?)&', 3)
    _ArrayDisplay($aImages)

Edited February 22, 2013 by GMK
Azevedo (Author) Posted February 22, 2013

Thanks Delaney, GMK.

Wonderful, that's what I was looking for!
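For the original goal of actually downloading the images, the imgurl pattern from GMK's reply could be combined with InetGet along these lines. This is only a sketch: it assumes the results page still embeds full-size links as "imgurl=...&", and the output folder and file naming scheme are hypothetical choices, not something from the thread. The captured URLs may also need decoding before they can be fetched.

```autoit
#include <Inet.au3>

; Assumption: the search page still embeds full-size links as "imgurl=...&".
Local $sURL = "https://www.google.com/search?hl=en&tbm=isch&q=flinstones"
Local $sData = _InetGetSource($sURL)

; Flag 3 returns every capture-group match as a 0-based array.
Local $aImages = StringRegExp($sData, 'imgurl=(.*?)&', 3)
If @error Then
    ConsoleWrite("No imgurl matches found." & @CRLF)
    Exit 1
EndIf

; Hypothetical output folder next to the script.
Local $sDir = @ScriptDir & "\images"
DirCreate($sDir)

For $i = 0 To UBound($aImages) - 1
    ; Hypothetical naming scheme: image_1.jpg, image_2.jpg, ...
    Local $sFile = $sDir & "\image_" & ($i + 1) & ".jpg"
    ; Option 1 forces a reload from the remote site; 0 waits for completion.
    Local $iBytes = InetGet($aImages[$i], $sFile, 1, 0)
    If @error Then
        ConsoleWrite("Failed: " & $aImages[$i] & @CRLF)
    Else
        ConsoleWrite("Saved " & $iBytes & " bytes to " & $sFile & @CRLF)
    EndIf
Next
```

Note that StringRegExp with flag 3 produces a 0-based array, so the first URL is $aImages[0] rather than image[1] as in the question.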