picea892

How to find the favicon [Solved]

10 posts in this topic

#1 ·  Posted (edited)

As the title says: how do you find where the favicon is located? It isn't always in the root directory. Is there a way to search the source code to find it? An example is below:

; www.theglobeandmail.com
InetGet('http://beta.images.theglobeandmail.com/images/gam/favicon.ico', @DesktopDir & '\Icon1.jpg')
; www.filehippo.com
InetGet('http://www.filehippo.com/favicon.ico', @DesktopDir & '\Icon2.jpg')
; No InetClose() here: these are blocking downloads, and InetClose() only applies to background-download handles.
Edited by picea892

Search through the source code for a line like this: <link rel="shortcut icon" href="favicon.ico">
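
A minimal sketch of that search in AutoIt, assuming the page source is already in a string. The helper name _FindIconHref is hypothetical (it is not part of any UDF), and the regular expressions only cover the common case of a quoted rel and href inside a single <link> tag:

```autoit
; Hypothetical helper: returns the href of the first <link> tag whose rel
; attribute mentions "icon", or "" if no such tag is found.
Func _FindIconHref($sHtml)
    ; Collect every <link ...> tag, case-insensitively (flag 3 = all matches).
    Local $aTags = StringRegExp($sHtml, '(?i)<link\b[^>]*>', 3)
    If @error Then Return ""
    For $i = 0 To UBound($aTags) - 1
        ; Skip tags whose rel value does not contain "icon".
        If Not StringRegExp($aTags[$i], '(?i)\brel\s*=\s*["''][^"'']*icon') Then ContinueLoop
        ; Pull the quoted href value out of the matching tag (flag 1 = capture groups).
        Local $aHref = StringRegExp($aTags[$i], '(?i)\bhref\s*=\s*["'']([^"'']+)', 1)
        If Not @error Then Return $aHref[0]
    Next
    Return ""
EndFunc   ;==>_FindIconHref

; Usage with the example line from this post:
ConsoleWrite(_FindIconHref('<link rel="shortcut icon" href="favicon.ico">') & @CRLF)
```

Matching on the rel attribute rather than on the file name also catches icons that are not called favicon.ico. The returned href may still be relative to the site.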


- Bruce /*somdcomputerguy */  If you change the way you look at things, the things you look at change.

This is the best I can come up with. It's pretty slow but works for any website I have tested. Anyone have a better idea?

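#include <INet.au3>
#include <String.au3>

get_favicon('http://www.theglobeandmail.com')

Func get_favicon($page)
    Local $source = _INetGetSource($page)
    ; Collect every href="..." value in the page source.
    Local $resultarray = _StringBetween($source, 'href="', '"')
    If Not IsArray($resultarray) Then Return ; no href found at all
    ; UBound() is the element count, so the last valid index is UBound() - 1.
    For $i = 0 To UBound($resultarray) - 1
        If StringInStr($resultarray[$i], "favicon.ico") Then
            InetGet($resultarray[$i], @DesktopDir & '\Icon1.jpg')
            ExitLoop
        EndIf
    Next
EndFunc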

#4 ·  Posted (edited)

First check the default location before parsing the source, and save the icon to a URL-specific filename (.ico is different from .jpg). And a return value is always appreciated :(...

#include <INet.au3>
#include <String.au3>

MsgBox(0, "", get_favicon('http://www.filehippo.com'))
MsgBox(0, "", get_favicon('http://www.theglobeandmail.com'))

Func get_favicon($page, $targetdir = @DesktopDir)
    ; URL-specific target file, e.g. favicon_filehippo.com.ico
    Local $target = $targetdir & "\favicon_" & StringReplace(StringReplace(StringReplace($page, ":", ""), "//", "_"), "http_www.", "") & ".ico"

    ; Try the default location first (note the "/" between host and file name).
    InetGet($page & "/favicon.ico", $target)

    ; Then look through the page source for an explicit favicon link.
    Local $source = _INetGetSource($page)
    Local $resultarray = _StringBetween($source, 'href="', '"')
    If IsArray($resultarray) Then
        ; The last valid index is UBound() - 1.
        For $i = 0 To UBound($resultarray) - 1
            If StringInStr($resultarray[$i], "favicon.ico") Then
                InetGet($resultarray[$i], $target)
                ExitLoop
            EndIf
        Next
    EndIf

    ; True if an icon file was written, False otherwise.
    Return FileExists($target) = 1
EndFunc
Edited by KaFu

Thanks KaFu, those are some great improvements! It seems difficult to create one script that works for every webpage it comes across. Any guesses where the Facebook.com icon is saved? Delicious.com has a favicon, but it uses a relative path and the script doesn't get the icon... not sure why not.

Maybe this isn't possible on a large scale due to the lack of consistency of webpages.

#7 ·  Posted (edited)

#include <INet.au3>
#include <String.au3>

$favicon_url = 'http://www.filehippo.com'
MsgBox(0, $favicon_url, $favicon_url & @CRLF & @CRLF & "Favicon extracted: " & get_favicon($favicon_url))
$favicon_url = 'http://www.theglobeandmail.com'
MsgBox(0, $favicon_url, $favicon_url & @CRLF & @CRLF & "Favicon extracted: " & get_favicon($favicon_url))
$favicon_url = 'http://www.facebook.com'
MsgBox(0, $favicon_url, $favicon_url & @CRLF & @CRLF & "Favicon extracted: " & get_favicon($favicon_url))
$favicon_url = 'http://delicious.com'
MsgBox(0, $favicon_url, $favicon_url & @CRLF & @CRLF & "Favicon extracted: " & get_favicon($favicon_url))

Func get_favicon($favicon_page, $favicon_targetdir = @ScriptDir)

    ; URL-specific target file, e.g. favicon_filehippo.com.ico
    Local $target = $favicon_targetdir & "\favicon_" & StringReplace(StringReplace(StringReplace($favicon_page, ":", ""), "//", "_"), "http_www.", "") & ".ico"

    ; Try the default location first.
    InetGet($favicon_page & "/favicon.ico", $target)

    ; Only parse the page source if the default location yielded nothing.
    If Not FileExists($target) Then

        Local $source = _INetGetSource($favicon_page)
        Local $resultarray = _StringBetween($source, 'href="', '"')

        If IsArray($resultarray) Then
            For $i = 0 To UBound($resultarray) - 1
                ConsoleWrite($resultarray[$i] & @CRLF)
                If StringInStr($resultarray[$i], ".ico") Then
                    ; Root-relative link ("/path/icon.ico"): prefix it with the site URL.
                    If StringLeft($resultarray[$i], 1) = "/" Then $resultarray[$i] = $favicon_page & $resultarray[$i]
                    InetGet($resultarray[$i], $target)
                    ExitLoop
                EndIf
            Next
        EndIf

    EndIf

    ; True if an icon file was written, False otherwise.
    Return FileExists($target) = 1

EndFunc   ;==>get_favicon

Guess there's a better way to extract relative locations...
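
One possible way to handle relative locations more carefully is sketched below. The helper name _ResolveHref is hypothetical, and two simplifications are assumptions rather than anything from this thread: protocol-relative links are given an "http:" scheme, and plain relative links are resolved against the site root instead of the page's own directory.

```autoit
; Hypothetical helper: turns an href found in the page into an absolute URL.
; $sBase is the site root without a trailing slash, e.g. 'http://delicious.com'.
Func _ResolveHref($sBase, $sHref)
    ; Already absolute: http://... or https://...
    If StringRegExp($sHref, '(?i)^https?://') Then Return $sHref
    ; Protocol-relative: //cdn.example.com/favicon.ico (assumes plain http).
    If StringLeft($sHref, 2) = "//" Then Return "http:" & $sHref
    ; Root-relative: /static/favicon.ico
    If StringLeft($sHref, 1) = "/" Then Return $sBase & $sHref
    ; Anything else is treated as relative to the site root (an assumption).
    Return $sBase & "/" & $sHref
EndFunc   ;==>_ResolveHref

; Usage: a root-relative link resolved against the site URL.
ConsoleWrite(_ResolveHref('http://delicious.com', '/favicon.ico') & @CRLF)
```

Inside get_favicon, the result of such a helper could be passed to InetGet in place of the raw array entry.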

Edit:

Corrected

For $i = 0 To UBound($resultarray)

to

For $i = 0 To UBound($resultarray) - 1

otherwise script will crash if no .ico was found in array

Edited by KaFu

Thanks for all this help. It is pretty fast and works on any site I can think of trying... guess you proved me wrong. I appreciate the help.

You're welcome :). There will be sites where this won't work :(, especially ones that use relative paths. I edited the post above slightly.

#10 ·  Posted (edited)

One final update. If a user submits a subpage such as http://www.autoitscript.com/forum/index.php?showtopic=112857, the script now searches for the favicon at http://www.autoitscript.com, which is more likely to succeed and is faster. I also added a check for whether the .ico already exists before any other work is done.

Picea892

#include <INet.au3>
#include <String.au3>

$favicon_url = 'http://www.filehippo.com'
MsgBox(0, $favicon_url, $favicon_url & @CRLF & @CRLF & "Favicon extracted: " & get_favicon($favicon_url))
$favicon_url = 'http://www.theglobeandmail.com'
MsgBox(0, $favicon_url, $favicon_url & @CRLF & @CRLF & "Favicon extracted: " & get_favicon($favicon_url))
$favicon_url = 'http://www.facebook.com'
MsgBox(0, $favicon_url, $favicon_url & @CRLF & @CRLF & "Favicon extracted: " & get_favicon($favicon_url))
$favicon_url = 'http://delicious.com'
MsgBox(0, $favicon_url, $favicon_url & @CRLF & @CRLF & "Favicon extracted: " & get_favicon($favicon_url))

Func get_favicon($favicon_page, $favicon_targetdir = @ScriptDir)
    ; Reduce a subpage URL to the site root (scheme + host). A regular expression
    ; replaces the fixed TLD list, which cut hosts such as www.canada.com short at ".ca".
    Local $root = StringRegExp($favicon_page, '(?i)^(https?://[^/?#]+)', 1)
    If IsArray($root) Then $favicon_page = $root[0]

    ; URL-specific target file, built from the site root so the name is stable.
    Local $target = $favicon_targetdir & "\favicon_" & StringReplace(StringReplace(StringReplace($favicon_page, ":", ""), "//", "_"), "http_www.", "") & ".ico"

    ; Nothing to do if the icon was downloaded earlier.
    If FileExists($target) Then Return "Already Exists"

    ; Try the default location first.
    InetGet($favicon_page & "/favicon.ico", $target)

    ; Only parse the page source if the default location yielded nothing.
    If Not FileExists($target) Then
        Local $source = _INetGetSource($favicon_page)
        Local $resultarray = _StringBetween($source, 'href="', '"')

        If IsArray($resultarray) Then
            For $i = 0 To UBound($resultarray) - 1
                ConsoleWrite($resultarray[$i] & @CRLF)
                If StringInStr($resultarray[$i], ".ico") Then
                    ; Root-relative link ("/path/icon.ico"): prefix it with the site root.
                    If StringLeft($resultarray[$i], 1) = "/" Then $resultarray[$i] = $favicon_page & $resultarray[$i]
                    InetGet($resultarray[$i], $target)
                    ExitLoop
                EndIf
            Next
        EndIf
    EndIf

    ; True if an icon file was written, False otherwise.
    Return FileExists($target) = 1
EndFunc   ;==>get_favicon
Edited by picea892
