The Kandie Man

_Google - A Completely Updated Version

14 posts in this topic

I was bored like w0uter, so I decided to make a completely updated version of his _Google function.

You can find w0uter's original _google() function here:

http://www.autoitscript.com/forum/index.ph...&hl=_Google

It was quite a nice idea, but it used Internet Explorer objects and an elaborate set of loops to function (no offense, w0uter; I don't know if the _INetGetSource function was around yet).

I also noticed that his was showing a # symbol as the first search result. I suspect a change in the way Google lists its results is the reason for this.

This new _Google() function doesn't depend on Internet Explorer, since it uses the _INetGetSource() function. It also has a parameter that lets you change the number of results the function returns. For example, maybe you only want the first 5 Google search result URLs, or maybe you want the first 100. My new _Google function also has only one loop, so it should be faster (don't quote me on that).

Here is my attempt at writing official AutoIt-style documentation for this function:

http://www.thekandieman.com/documentation/autoit/_Google.htm

Here is the function:

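#include <inet.au3>
; Returns a 1-based array of result URLs; element [0] holds the number of URLs.
; On failure returns 0 and sets @error:
;   1 = download failed, 2 = no links matched, 3 = $i_resultnum is not an integer
Func _Google($s_q, $i_resultnum = 10)
    If Not IsInt($i_resultnum) Then
        SetError(3)
        Return 0
    EndIf
    Local $a_URLS, $s_Source
    ; Encode literal "+" as %2B first, then turn spaces into "+" for the query string
    $s_Source = _INetGetSource('http://www.google.com/search?q=' & StringReplace(StringReplace($s_q, "+", "%2B"), " ", "+") & '&num=' & $i_resultnum)
    If @error Then
        SetError(1)
        Return 0
    EndIf
    ; Flag 3 returns every match of the capture group as a 0-based array
    $a_URLS = StringRegExp($s_Source, '<(?i)a class=l href="(.*?)">', 3)
    If @error Then
        SetError(2)
        Return 0
    EndIf
    ; Copy into a 1-based array so that [0] can hold the count
    Local $a_URLSReturn[UBound($a_URLS) + 1]
    $a_URLSReturn[0] = UBound($a_URLS)
    Local $i_counter
    For $i_counter = 1 To UBound($a_URLS)
        $a_URLSReturn[$i_counter] = $a_URLS[$i_counter - 1]
    Next
    Return $a_URLSReturn
EndFunc   ;==>_Google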
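
To see what the regular expression in the function picks up without going online, here is a minimal sketch that runs the same pattern against a hard-coded string. The HTML fragment and the example.com/example.org addresses are made up for illustration; the markup of a real results page may differ.

```autoit
; Hypothetical fragment standing in for a downloaded results page (not real Google output)
Local $s_Sample = '<a class=l href="http://example.com/one">One</a>' & @CRLF & _
        '<A CLASS=l HREF="http://example.org/two">Two</A>'

; Same pattern as in _Google(); flag 3 returns all captured URLs as a 0-based array
Local $a_Found = StringRegExp($s_Sample, '<(?i)a class=l href="(.*?)">', 3)
If @error Then
    ConsoleWrite("No links matched" & @CRLF)
Else
    For $i = 0 To UBound($a_Found) - 1
        ; Print a 1-based position next to each URL, like the array _Google() returns
        ConsoleWrite(($i + 1) & ": " & $a_Found[$i] & @CRLF)
    Next
EndIf
```

Because (?i) comes right after the opening bracket, the upper-case anchor tag in the second line is matched as well.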

I hope you guys like it.

-The Kandie Man


"So man has sown the wind and reaped the world. Perhaps in the next few hours there will no remembrance of the past and no hope for the future that might have been." & _"All the works of man will be consumed in the great fire after which he was created." & _"And if there is a future for man, insensitive as he is, proud and defiant in his pursuit of power, let him resolve to live it lovingly, for he knows well how to do so." & _"Then he may say once more, 'Truly the light is sweet, and what a pleasant thing it is for the eyes to see the sun.'" - The Day the Earth Caught Fire

10X! This is very useful for me, and works fine!

Thanks, glad you guys like it. w0uter's version was quite a hit, but it was showing its age. Glad I could whip up a new one. :P

I asked for 100 results, but the search "Evrobul" returns only 70. Why?

This returns 100 results for me:

#include <inet.au3>
#include <Array.au3>

Local $a_Urls
$a_Urls =_Google("Evrobul", 100)
Msgbox(0,"Results",$a_Urls[0])
_ArrayDisplay( $a_Urls,"Search Result URLs:")

Func _Google($s_query, $i_resultnum = 10)
    If NOT IsInt($i_resultnum) Then
        SetError(3)
        Return 0
    Endif
    Local $a_URLS, $s_Source
    $s_Source = _INetGetSource('http://www.google.com/search?q=' & StringReplace(StringReplace($s_query, "+", "%2B"), " ", "+") & '&num=' & $i_resultnum)
    If @ERROR Then
        SetError(1)
        Return 0
    EndIf
    $a_URLS = StringRegExp($s_Source, '<(?i)a class=l href="(.*?)">', 3)
    IF @Error Then
        SetError(2)
        Return 0
    Endif
    Local $a_URLSReturn[UBound($a_URLS) + 1]
    $a_URLSReturn[0] = UBound($a_URLS)
    Local $i_counter
    For $i_counter = 1 To UBound($a_URLS)
        $a_URLSReturn[$i_counter] = $a_URLS[$i_counter - 1]
    Next
    Return $a_URLSReturn
EndFunc   ;==>_Google

#7 ·  Posted (edited)

This is all code:

#include <inet.au3>
#include <Array.au3>



dim $matches[1]
$MatchPages=""
$MatchPagesForWord = ""
$keywords=IniRead("settings.ini","default","keywords","XX")
$keywords=StringSplit($keywords,",")
$url=IniRead("settings.ini","default","url","XX")
Dim $result[$url+10]
$pages=IniRead("settings.ini","default","pages","XX")


$i=1
Do
    $result=_Google($keywords[$i],100)
    _ArrayDisplay($result,"as")
    $o=1
    
    $matchline = $keywords[$i] & ": " & $MatchPagesForWord
    $MatchPages = $MatchPages & @CRLF & $matchline
    Do
        $match=StringInStr($result[$o],$url)
        if $match<>0 Then
            $MatchPagesForWord=$MatchPagesForWord & " " & $o
            $MatchPages = $MatchPages & $MatchPagesForWord
        EndIf
        $MatchPagesForWord=""
        $o=$o+1
    Until $o=$pages

    
    $i=$i+1
Until $keywords[0]=$i-1
MsgBox(0,0,$MatchPages)




Func _Google($s_q, $i_resultnum = 10)
    If NOT IsInt($i_resultnum) Then
        SetError(3)
        Return 0
    Endif
    Local $a_URLS, $s_Source
    $s_Source = _INetGetSource('http://www.google.com/search?q=' & StringReplace(StringReplace($s_q, "+", "%2B"), " ", "+") & '&num=' & $i_resultnum)
    If @ERROR Then
        SetError(1)
        Return 0
    EndIf
    $a_URLS = StringRegExp($s_Source, '<(?i)a class=l href="(.*?)">', 3)
    IF @Error Then
        SetError(2)
        Return 0
    Endif
    Local $a_URLSReturn[UBound($a_URLS) + 1]
    $a_URLSReturn[0] = UBound($a_URLS)
    Local $i_counter
    For $i_counter = 1 To UBound($a_URLS)
        $a_URLSReturn[$i_counter] = $a_URLS[$i_counter - 1]
    Next
    Return $a_URLSReturn
EndFunc  ;==>_Google

the ini file:

[default]

keywords=evrobul

url=evrobul.org

pages=100

Edited by littleclown

#8 ·  Posted (edited)

Gives all 100 results for me. What exactly are you trying to do? Find a string within the search results and then display which elements contain a matching string?

Edited by The Kandie Man

Yes. This is useful if you have a site and are optimizing it for search engines (SEO, Search Engine Optimization). This script shows the positions of a site for a couple of searches (I don't know why, but StringInStr doesn't seem to work reliably and misses some results).

When I run this script, _ArrayDisplay shows just 70 lines, and that's why the script crashes:

Array variable has incorrect number of subscripts or subscript dimension range exceeded.:

$match=StringInStr($result[$o],$url)

I can use $result[0] to find out how many results are in the array, but the cause of getting only 70 results is somewhere in there.

The screenshot is from _ArrayDisplay:

post-15972-1166338873_thumb.png
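
The subscript error comes from looping up to a fixed page count when the array holds fewer entries. A minimal sketch of bounding the loop by element [0] instead, using a hand-built array in place of a live _Google() call (the array contents and variable names here are made up for illustration):

```autoit
; Hypothetical stand-in for the array _Google() returns: [0] is the count, URLs start at [1]
Local $a_Result[4] = [3, "http://example.com/", "http://evrobul.org/page", "http://example.net/"]
Local $s_Url = "evrobul.org"
Local $s_Positions = ""

; Loop only as far as the stored count, so a short result list cannot overrun the array
For $i = 1 To $a_Result[0]
    If StringInStr($a_Result[$i], $s_Url) Then $s_Positions &= " " & $i
Next
ConsoleWrite("Positions:" & $s_Positions & @CRLF)
```

With this bound the loop stops at whatever number of results actually came back, whether that is 70 or 100.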

#10 ·  Posted (edited)

Some search strings return 100 results, but "evrobul" returns exactly 70. There aren't many results for "evrobul" in total, about 100 (maybe you could add an option to show all pages and not skip anything, perhaps with pwst=1&filter=0 in the URL).

EDIT:

Yes, with this line in your function, I get 100 results:

$s_Source = _INetGetSource('http://www.google.com/search?pwst=1&filter=0&q=' & StringReplace(StringReplace($s_q, "+", "%2B"), " ", "+") & '&num=' & $i_resultnum)

but the problem is that this result is not realistic: users don't search this way :P.

Maybe Google uses different databases for you and me, and my Google gives me just 70 results (but when I search with a browser I see 100).

Edited by littleclown

It is quite possible. Ultimately though, if you search for something and specify a certain number of results, it will return that number of results. The exceptions are when no results are found or when there aren't that many search results to begin with.

So no, this isn't a bug, just the way google works.

Yes. You are right.

#13 ·  Posted (edited)

When you use this function with a non-English language such as Bulgarian, the script uses the Google language preferences stored in the cookie. But when you search for Bulgarian words and the Google interface is in English, you get only a few results. This is because Google doesn't encode the non-English words properly; when Google knows your language, it makes the words look like "%F2%F3%F5%EB%E8". Google loves this kind of symbols and works perfectly with them! :P That's why I suggest this modification, where the language can be set through a function parameter:

Func _Google($s_q, $i_resultnum = 10, $lng="en")
    If NOT IsInt($i_resultnum) Then
        SetError(3)
        Return 0
    Endif
    Local $a_URLS, $s_Source
    $s_Source = _INetGetSource('http://www.google.com/search?hl='&$lng&'&q=' & StringReplace(StringReplace($s_q, "+", "%2B"), " ", "+") & '&num=' & $i_resultnum)
    If @ERROR Then
        SetError(1)
        Return 0
    EndIf
    $a_URLS = StringRegExp($s_Source, '<(?i)a class=l href="(.*?)">', 3)
    IF @Error Then
        SetError(2)
        Return 0
    Endif
    Local $a_URLSReturn[UBound($a_URLS) + 1]
    $a_URLSReturn[0] = UBound($a_URLS)
    Local $i_counter
    For $i_counter = 1 To UBound($a_URLS)
        $a_URLSReturn[$i_counter] = $a_URLS[$i_counter - 1]
    Next
    Return $a_URLSReturn
EndFunc ;==>_Google
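
The percent-encoded form mentioned in this post can also be produced by the script itself, rather than relying on Google to do it. The following is only a sketch under stated assumptions: _EncodeQuery is a hypothetical helper that is not part of the original function, it needs an AutoIt release that provides StringToBinary, and it encodes the text as UTF-8. The "%F2%F3%F5%EB%E8" example above looks like a single-byte Cyrillic encoding, so the encoding Google expects for a given hl value may be different.

```autoit
; Hypothetical helper (not part of the original _Google UDF): percent-encodes a query as UTF-8 bytes
Func _EncodeQuery($s_Text)
    ; String() on binary data yields "0x" followed by two hex digits per byte; flag 4 = UTF-8
    Local $s_Hex = StringTrimLeft(String(StringToBinary($s_Text, 4)), 2)
    Local $s_Out = "", $i_Byte
    For $i = 1 To StringLen($s_Hex) Step 2
        $i_Byte = Dec(StringMid($s_Hex, $i, 2))
        Switch $i_Byte
            Case 48 To 57, 65 To 90, 97 To 122, 45, 46, 95, 126 ; 0-9 A-Z a-z - . _ ~ stay as they are
                $s_Out &= Chr($i_Byte)
            Case 32 ; space becomes "+", as in the original function
                $s_Out &= "+"
            Case Else ; everything else, including "+" itself, becomes %XX
                $s_Out &= "%" & StringMid($s_Hex, $i, 2)
        EndSwitch
    Next
    Return $s_Out
EndFunc   ;==>_EncodeQuery

ConsoleWrite(_EncodeQuery("c++ tips") & @CRLF)
```

A helper like this could stand in for the nested StringReplace calls when building the URL, since it covers "+" and spaces as well as non-ASCII characters.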
Edited by littleclown

Very cool!

I've done it in PHP before, but it's not working anymore! :D

To get the title of the page as well, I suggest this regexp:

<(?i)a class=l href="(.*?)">(.*?)</a>

Strip the <b> tags from the titles and your mod works :P
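
A minimal sketch of that suggestion, run against a made-up fragment rather than a live page (the sample HTML and variable names are assumptions). With two capture groups and flag 3, StringRegExp returns a flat array in which URLs and titles alternate.

```autoit
; Hypothetical fragment standing in for one search result (not real Google output)
Local $s_Sample = '<a class=l href="http://example.com/">An <b>example</b> page</a>'

; Two capture groups: [0] = URL, [1] = title, [2] = next URL, and so on
Local $a_Pairs = StringRegExp($s_Sample, '<(?i)a class=l href="(.*?)">(.*?)</a>', 3)
If Not @error Then
    For $i = 0 To UBound($a_Pairs) - 2 Step 2
        ; Remove the <b> and </b> tags that wrap the highlighted search terms
        Local $s_Title = StringRegExpReplace($a_Pairs[$i + 1], '(?i)</?b>', '')
        ConsoleWrite($a_Pairs[$i] & " -> " & $s_Title & @CRLF)
    Next
EndIf
```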
