rootx

Best way to use Google image search? [SOLVED]

I would like to download the first 5 images into a folder. Thanks.

#include <INet.au3>
#include <String.au3>
#include <Array.au3>


Global $sSource, $aImgURL, $sKeyWord

$sKeyWord = "pug"

$sSource = _INetGetSource("http://www.google.com/search?q=" & $sKeyWord & "&tbm=isch")

$aImgURL = _StringBetween($sSource, 'src="', '"')


For $x = 1 to UBound($aImgURL)-1
    ConsoleWrite($aImgURL[$x]&@CRLF)
Next

 

Edited by rootx

28 minutes ago, Danyfirex said:

Hello. and your issue is...?

 

Saludos

I want to get the name of each image and save it with the correct type and name.

1 hour ago, j0kky said:

You can download them using InetGet. They don't have a standard name, but to determine the extension you should check their magic number.
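The magic-number check suggested here can be sketched as follows. This is a minimal sketch, not code from the thread: `_GuessImageExt` is a hypothetical helper name, and the signature list is an assumption that covers only four common formats.

```autoit
#include <FileConstants.au3>

; Hypothetical helper (not from the thread): guess an image extension
; from the first bytes of a file that has already been downloaded.
Func _GuessImageExt($sPath)
    Local $hFile = FileOpen($sPath, $FO_READ + $FO_BINARY)
    If $hFile = -1 Then Return SetError(1, 0, "")

    ; Read the first 8 bytes as binary data and convert them to hex digits
    Local $sHex = Hex(FileRead($hFile, 8))
    FileClose($hFile)

    If StringLeft($sHex, 6) = "FFD8FF" Then Return "jpg"            ; JPEG
    If StringLeft($sHex, 16) = "89504E470D0A1A0A" Then Return "png" ; PNG
    If StringLeft($sHex, 8) = "47494638" Then Return "gif"          ; "GIF8"
    If StringLeft($sHex, 4) = "424D" Then Return "bmp"              ; "BM"

    ; Unknown signature
    Return SetError(2, 0, "")
EndFunc   ;==>_GuessImageExt
```

One possible use is to download each image under a temporary name with InetGet, call the helper on the saved file, and then rename it with FileMove once the extension is known.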

Thanks, but the question is how to get the URL of the source image rather than the thumbnail. Does anyone have any idea? Thanks.

#include <INet.au3>
#include <String.au3>
#include <Array.au3>


Global $sSource, $aImgURL, $sKeyWord

$sKeyWord = "pug"
$type = "jpg"
$width = "800"
$height = "600"

$sSource = _INetGetSource("http://www.google.ch/search?q="& $sKeyWord &"&as_st=y&hl=it&tbs=ift:"&$type&",isz:ex,iszw:"&$width&",iszh:"&$height&"&tbm=isch&source=lnt")

$aImgURL = _StringBetween($sSource, 'src="', '"')


For $x = 1 to UBound($aImgURL)-1
    ConsoleWrite($aImgURL[$x]&@CRLF)
    InetGet($aImgURL[$x],@ScriptDir&"\"&$x&".jpg")
Next

 


Try saving $sSource to an .html file and opening it; you will see that it differs from the page you get when visiting the same URL in a browser:

https://www.google.ch/search?q=pug&as_st=y&hl=it&tbs=ift:jpg,isz:ex,iszw:800,iszh:600&tbm=isch&source=lnt&gws_rd=ssl

In my opinion you should play with:

_IEDocReadHTML

 

Edited by j0kky

22 hours ago, j0kky said:

In my opinion you should play with:

_IEDocReadHTML

 

_IEDocReadHTML doesn't work, but...

#include <IE.au3>
#include <MsgBoxConstants.au3>
#include <Inet.au3>
#include <Array.au3>
#include <File.au3>
#include <String.au3>



$x = _INetGetSource("http://www.google.ch/search?as_st=y&tbm=isch&hl=it&as_q=pug&as_epq=&as_oq=&as_eq=&cr=&as_sitesearch=&safe=images&tbs=ift:jpg")

FileWrite(@ScriptDir&"\9.html",$x)
Local $aRetArray
_FileReadToArray(@ScriptDir&"\9.html", $aRetArray)

;_ArrayDisplay($aRetArray, "Default Search")
 Local $aArray = _StringBetween($x, 'href="', '"')

 ; _ArrayDisplay($aArray, "Default Search")

    For $xs = 1 to UBound($aArray)-1
        ConsoleWrite($aArray[$xs]&@CRLF)
    Next

The source isn't right, because if you view the source in the browser you can easily find this:

/imgres?imgurl=http%3A%2F%2Fcdn3-www.dogtime.com%2Fassets%2Fuploads%2F2011%2F01%2Ffile_23124_pug-460x290.jpg&imgrefurl=http%3A%2F%2Fdogtime.com%2Fdog-breeds%2Fpug&docid=BTPG4yF8_O0fQM&tbnid=8FbyFFzHno3BCM%3A&vet=1&w=460&h=290&hl=it&safe=images&bih=715&biw=1156&ved=0ahUKEwif1eWAys7QAhUDzxQKHc39AREQMwgdKAAwAA&iact=mrc&uact=8

But AutoIt extracts this:

http://dogtime.com/dog-breeds/pug&amp;sa=U&amp;ved=0ahUKEwiU-sLNzc7QAhUBfhoKHYuWAP4QwW4IGDAA&amp;usg=AFQjCNFtqNOflzABBIVCR79FpfulvDD6Pw

Why? Any idea? I need to read the raw HTML source. Thanks.


 

15 hours ago, rootx said:

_IEDocReadHTML doesn't work.

What does that mean, exactly?

#include <String.au3>
#include <ie.au3>


Global $sSource, $aImgURL, $sKeyWord

$sKeyWord = "pug"
$type = "jpg"
$width = "800"
$height = "600"

$obj = _IECreate("http://www.google.ch/search?q="& $sKeyWord &"&as_st=y&hl=it&tbs=ift:"&$type&",isz:ex,iszw:"&$width&",iszh:"&$height&"&tbm=isch&source=lnt")
$sSource = _IEDocReadHTML($obj)
FileWrite("log.html", $sSource)

$aImgURL = _StringBetween($sSource, '"ou":"', '"')


For $x = 1 to UBound($aImgURL)-1
    ConsoleWrite($aImgURL[$x]&@CRLF)
    ;InetGet($aImgURL[$x],@ScriptDir&"\"&$x&".jpg")
Next

 

Edited by j0kky

1 hour ago, j0kky said:

What does that mean, exactly?


OK, but is there a way to write a regexp that matches a URL starting with [http://] and ending with [.jpg]? I ask because some URLs have a strange path, for example:

"http://vignette1.wikia.nocookie.net/dogs/images/4/47/Gadget_the_pug_expressive_eyes.jpg/revision/latest?cb\u003d20110813111020"

I added a regex to save the file with the original name.

#include <String.au3>
#include <ie.au3>


Global $sSource, $aImgURL, $sKeyWord

DirCreate(@ScriptDir&"\img")

$folder = (@ScriptDir&"\img\")

$sKeyWord = "pug"
$type = "jpg"
$width = "800"
$height = "600"

$obj = _IECreate("http://www.google.ch/search?q="& $sKeyWord &"&as_st=y&hl=it&tbs=ift:"&$type&",isz:ex,iszw:"&$width&",iszh:"&$height&"&tbm=isch&source=lnt")
$sSource = _IEDocReadHTML($obj)
FileWrite("log.html", $sSource)

$aImgURL = _StringBetween($sSource, '"ou":"', '"')


    For $x = 1 to UBound($aImgURL)-1
        ConsoleWrite($aImgURL[$x]&@CRLF)
        InetGet($aImgURL[$x],$folder&StringRegExpReplace($aImgURL[$x], '.*/([^-]+).*', "$1"))
    Next

_IEQuit($obj)

 

StringRegExp($aImgURL[$x], '(?i)(https?://.*\.(jpg|bmp|cms|jpeg))', 1)

The limitation is that you have to list every known image extension between the parentheses. In any case, adding an error-checking line is a good idea, because if an extension you haven't expected shows up, your script will fail.
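A minimal sketch of that error check, under a few assumptions: `_ExtractImageUrl` is a hypothetical helper name, the extension list (including png and gif) is illustrative rather than complete, and a non-greedy `.*?` is swapped in so the match stops at the first known extension instead of the last.

```autoit
; Hypothetical helper (not from the thread): return the image URL found in
; $sText, or "" with @error set when no known extension matches.
Func _ExtractImageUrl($sText)
    ; Flag 1 returns an array of the captured groups; @error is set on no match
    Local $aMatch = StringRegExp($sText, '(?i)(https?://.*?\.(?:jpe?g|png|gif|bmp))', 1)
    If @error Then Return SetError(1, 0, "")
    Return $aMatch[0]
EndFunc   ;==>_ExtractImageUrl

; Usage: skip entries that do not match instead of indexing a non-array
Local $sUrl = _ExtractImageUrl("http://vignette1.wikia.nocookie.net/dogs/images/4/47/Gadget_the_pug_expressive_eyes.jpg/revision/latest?cb\u003d20110813111020")
If Not @error Then ConsoleWrite($sUrl & @CRLF)
```

Checking @error before touching the result is what keeps the script running when an unexpected extension shows up.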

Edited by j0kky
now it catches https too


An alternative way without using IE.

 

#include <Array.au3>
#include <String.au3>
Global Const $HTTP_STATUS_OK = 200

Local $sKeyWord = "house"
Local $sURL = "http://www.google.com/search?q=" & $sKeyWord & "&tbm=isch"
Local $sData = HttpGet($sURL)
;~ ConsoleWrite($sData & @CRLF)

Local $aMetas = _StringBetween($sData, '"rg_meta">', '</div>')
;~ _ArrayDisplay($aMetas)

Local $sUrlImage = ""
Local $sImageName = ""
Local $sExtension = ""

If IsArray($aMetas) Then
    If UBound($aMetas) >= 5 Then
        For $i = 0 To 4
            ConsoleWrite(">Image Number: " & $i + 1 & @CRLF)
            $sUrlImage = _GetImageUrl($aMetas[$i])
            $sImageName = _GetImageName($aMetas[$i]) ;maybe you want to get the name from image url instead of metadata
            $sExtension = _GetImageExtension($aMetas[$i])
            ConsoleWrite($sUrlImage & @CRLF)
            ConsoleWrite($sImageName & @CRLF)
            ConsoleWrite($sExtension & @CRLF)
            ConsoleWrite(@CRLF)
        Next
    EndIf
EndIf

Func _GetImageName($sData)
    Local $aData = _StringBetween($sData, '"s":"', '"')
    If IsArray($aData) Then Return $aData[0]
EndFunc   ;==>_GetImageName

Func _GetImageUrl($sData)
    Local $aData = _StringBetween($sData, '"ou":"', '"')
    If IsArray($aData) Then Return $aData[0]
EndFunc   ;==>_GetImageUrl

Func _GetImageExtension($sData)
    Local $aData = _StringBetween($sData, '"ity":"', '"')
    If IsArray($aData) Then Return $aData[0]
EndFunc   ;==>_GetImageExtension


Func HttpGet($sURL)
    Local $oHTTP = ObjCreate("WinHttp.WinHttpRequest.5.1")
    $oHTTP.Open("GET", $sURL, False)
    $oHTTP.SetRequestHeader("User-Agent", "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:48.0) Gecko/20100101 Firefox/48.0")
    $oHTTP.SetRequestHeader("Content-Type", "text/plain; charset=utf-8")
    If (@error) Then Return SetError(1, 0, 0)
    $oHTTP.Send()
    If (@error) Then Return SetError(2, 0, 0)
    If ($oHTTP.Status <> $HTTP_STATUS_OK) Then Return SetError(3, 0, 0)
    Return SetError(0, 0, $oHTTP.ResponseText)
EndFunc   ;==>HttpGet

Make sure to clean up the file name.

Saludos 
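Cleaning up the file name could look something like the sketch below. `_CleanFileName` is a hypothetical helper, not part of the script above, and the fallback name "image" is an arbitrary assumption.

```autoit
; Hypothetical helper (not from the thread): turn the last part of a URL
; into a name that is safe to use as a Windows file name.
Func _CleanFileName($sName)
    ; Drop a query string or fragment, raw or percent-encoded (%3F is "?", %23 is "#")
    $sName = StringRegExpReplace($sName, '(?i)(\?|#|%3F|%23).*$', '')
    ; Replace the characters Windows does not allow in file names
    $sName = StringRegExpReplace($sName, '[\\/:*?"<>|]', '_')
    ; Fall back to a placeholder if nothing is left
    If $sName = "" Then $sName = "image"
    Return $sName
EndFunc   ;==>_CleanFileName

ConsoleWrite(_CleanFileName("latest?cb=20110813111020") & @CRLF)
```

The value returned by _GetImageName or taken from the image URL would be passed through this helper before it is appended to the download folder path.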

#include <String.au3>
#include <ie.au3>
#include <WinAPIFiles.au3>
#include <InetConstants.au3>
#include <Array.au3>
Global $sSource, $aImgURL, $sKeyWord

$sKeyWord = "pug"
$type = "jpg"
$width = "800"
$height = "600"


$obj = _IECreate("http://www.google.ch/search?q="& $sKeyWord &"&as_st=y&hl=it&tbs=ift:"&$type&",isz:ex,iszw:"&$width&",iszh:"&$height&"&tbm=isch&source=lnt")
$sSource = _IEDocReadHTML($obj)
FileWrite("log.html", $sSource)

$aImgURL = _StringBetween($sSource,'imgurl=', '&amp;')

;_ArrayDisplay($aImgURL)

For $x = 1 to UBound($aImgURL)-1
    FileWrite(@ScriptDir&"\1.txt",StringReplace(StringReplace($aImgURL[$x],"%3A",":"),"%2F","/")&@CRLF)
    $url = StringReplace(StringReplace($aImgURL[$x],"%3A",":"),"%2F","/")
Next

$file = FileReadToArray(@ScriptDir&"\1.txt")


For $s = 1 to UBound($file)-1

    $last = StringSplit($file[$s], '/')
    $ls = UBound($last)-1
    ConsoleWrite(StringSplit($file[$s], '/', $STR_ENTIRESPLIT)[$ls]&@CRLF)

    If StringLeft($file[$s],5) = "https" Then
        ConsoleWrite(StringRegExp($file[$s],'(?i)(https://.*\.(jpg|bmp|cms|jpeg))', 1)[0]&@CRLF)
        InetGet($file[$s],@ScriptDir&"\x\"&StringSplit($file[$s], '/', $STR_ENTIRESPLIT)[$ls])
    Else
        ConsoleWrite(StringRegExp($file[$s],'(?i)(http://.*\.(jpg|bmp|cms|jpeg))', 1)[0]&@CRLF)
        InetGet($file[$s],@ScriptDir&"\x\"&StringSplit($file[$s], '/', $STR_ENTIRESPLIT)[$ls])
    EndIf
Next
_IEQuit($obj)

Only one error: ueRSGNo.jpg%3F1. I changed the save path and handled the https case, and now 88 files download correctly. Any suggestions to improve it? Thanks.

PS: How can I run IE hidden? I only need to grab the images. Thanks.
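The chained StringReplace calls in the script above decode only %3A and %2F and leave other escapes, such as %3F (an encoded "?"), untouched. A more general percent-decoder might look like the sketch below. `_UrlDecode` is a hypothetical helper, and it assumes single-byte (ASCII) escapes only, so multi-byte UTF-8 sequences would not be decoded correctly.

```autoit
; Hypothetical helper (not from the thread): decode %XX escapes in a URL.
; Assumption: every escape encodes a single-byte (ASCII) character.
Func _UrlDecode($sText)
    Local $sOut = "", $sChar, $sHex
    Local $i = 1
    While $i <= StringLen($sText)
        $sChar = StringMid($sText, $i, 1)
        $sHex = StringMid($sText, $i + 1, 2)
        If $sChar = "%" And StringRegExp($sHex, '^[0-9A-Fa-f]{2}$') Then
            ; A valid escape: append the decoded character and skip 3 characters
            $sOut &= Chr(Dec($sHex))
            $i += 3
        Else
            ; Anything else, including a stray "%", is copied unchanged
            $sOut &= $sChar
            $i += 1
        EndIf
    WEnd
    Return $sOut
EndFunc   ;==>_UrlDecode

ConsoleWrite(_UrlDecode("http%3A%2F%2Fdogtime.com%2Fdog-breeds%2Fpug") & @CRLF)
```

The decoded URL is what gets passed to InetGet. The decoded name can still contain a "?", which is not allowed in a Windows file name, so the part from "?" onward has to be stripped before saving.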


This is my version without all those StringReplace:

#include <String.au3>
#include <ie.au3>

Global $sSource, $aImgURL, $sKeyWord

$sKeyWord = "pug"
$type = "jpg"
$width = "800"
$height = "600"

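; The trailing 0, 0 are $iTryAttach = 0 (open a new window) and $iVisible = 0 (keep the IE window hidden)
$obj = _IECreate("http://www.google.ch/search?q="& $sKeyWord &"&as_st=y&hl=it&tbs=ift:"&$type&",isz:ex,iszw:"&$width&",iszh:"&$height&"&tbm=isch&source=lnt", 0, 0)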
$sSource = _IEDocReadHTML($obj)

$aImgURL = _StringBetween($sSource, '"ou":"', '"')

For $x = 0 To UBound($aImgURL) - 1 ; _StringBetween returns a 0-based array, so start at 0
    ;$sPattern = '(?i)(http.?://.*\.(jpg|bmp|cms|jpeg))' ; http?://.../name.ext
    $sPattern = '(?i).*/(.*\.(jpg|bmp|cms|jpeg))' ; name.ext
    $aRegEx = StringRegExp($aImgURL[$x], $sPattern, 1)
    If @error Then ContinueLoop
    ConsoleWrite($aRegEx[0] & @CRLF)
    InetGet($aImgURL[$x], @ScriptDir & "\" & $aRegEx[0])
Next

_IEQuit($obj)

 

Edited by j0kky

2 hours ago, j0kky said:

This is my version without all those StringReplace:


Nice, it downloaded 94 JPGs. You're the winner. Thanks.
