How to find the favicon [Solved]


As the title says: how do you find where a site's favicon is located? It isn't always in the root directory. Is there a way to search the page source to find it? An example is below:

;www.theglobeandmail.com
$hDownload1=InetGet('http://beta.images.theglobeandmail.com/images/gam/favicon.ico', @DesktopDir&'\Icon1.jpg')
;www.filehippo.com
$hDownload2=InetGet('http://www.filehippo.com//favicon.ico', @DesktopDir&'\Icon2.jpg')
InetClose($hDownload1)
InetClose($hDownload2)
Edited by picea892

This is the best I can come up with. It's pretty slow but works for any website I have tested. Anyone have a better idea?

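#include <INet.au3>
#include <String.au3>
get_favicon('http://www.theglobeandmail.com')
Func get_favicon($page)
    Local $source = _INetGetSource($page)
    ; Collect every href="..." value found in the page source
    Local $resultarray = _StringBetween($source, 'href="', '"')
    If Not IsArray($resultarray) Then Return ; nothing found
    For $i = 0 To UBound($resultarray) - 1 ; UBound is the element count, so the last index is UBound - 1
        If StringInStr($resultarray[$i], "favicon.ico") Then
            InetGet($resultarray[$i], @DesktopDir & '\Icon1.jpg')
            ExitLoop
        EndIf
    Next
EndFunc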

First check the default location before parsing the source, and save the icon under a URL-specific filename with the right extension (.ico is different from .jpg). And a return value is always appreciated :(...

#include <INet.au3>
#include <String.au3>

MsgBox(0, "", get_favicon('http://www.filehippo.com'))
MsgBox(0, "", get_favicon('http://www.theglobeandmail.com'))

Func get_favicon($page, $targetdir = @DesktopDir)
    ; Build the URL-specific file name once, e.g. favicon_filehippo.com.ico
    Local $file = $targetdir & "\favicon_" & StringReplace(StringReplace(StringReplace($page, ":", ""), "//", "_"), "http_www.", "") & ".ico"

    ; Try the default location first (note the "/" between host and file name)
    InetGet($page & "/favicon.ico", $file)

    ; Then look for a favicon.ico link in the page source
    Local $source = _INetGetSource($page)
    Local $resultarray = _StringBetween($source, 'href="', '"')
    If IsArray($resultarray) Then
        For $i = 0 To UBound($resultarray) - 1 ; last valid index is UBound - 1
            If StringInStr($resultarray[$i], "favicon.ico") Then
                InetGet($resultarray[$i], $file)
                ExitLoop
            EndIf
        Next
    EndIf

    ; Report whether an icon file ended up on disk
    If FileExists($file) Then
        Return True
    Else
        Return False
    EndIf

EndFunc
Edited by KaFu

Thanks KaFu, those are some great improvements! It seems difficult to create one script that will work for any webpage it comes across. Any guesses where the Facebook.com icon is saved? Delicious.com has a favicon, but it is referenced with a relative path and the script doesn't get the icon... not sure why not.

Maybe this isn't possible on a large scale because webpages are so inconsistent.
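
One way to cope with that inconsistency might be to look for the page's icon link tag (rel="icon" or rel="shortcut icon") instead of scanning every href for "favicon.ico". Below is a minimal sketch of that idea, not a tested solution. _FindIconHref is a hypothetical helper name, and the code assumes the attribute values are quoted; it only parses a string, so the page source still has to be fetched separately (for example with _INetGetSource).

```autoit
; Hypothetical helper: return the href of the first <link> tag whose rel
; attribute mentions "icon", or "" with @error set if none is found.
Func _FindIconHref($sHtml)
    ; Flag 3 returns an array with every match of the pattern
    Local $aTags = StringRegExp($sHtml, "(?is)<link\b[^>]*>", 3)
    If @error Then Return SetError(1, 0, "") ; no <link> tags at all

    For $i = 0 To UBound($aTags) - 1
        ; Covers rel="icon" and rel="shortcut icon" (and also e.g. "apple-touch-icon")
        If Not StringRegExp($aTags[$i], "(?i)\brel\s*=\s*[""'][^""']*icon") Then ContinueLoop

        ; Flag 1 returns the captured group, here the quoted href value
        Local $aHref = StringRegExp($aTags[$i], "(?i)\bhref\s*=\s*[""']([^""']+)", 1)
        If Not @error Then Return $aHref[0]
    Next

    Return SetError(2, 0, "") ; no icon link found
EndFunc
```

The returned value can still be a relative path, so it may need to be combined with the page URL before it is passed to InetGet.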


#include <INet.au3>
#include <String.au3>

$favicon_url = 'http://www.filehippo.com'
MsgBox(0, $favicon_url, $favicon_url & @crlf & @crlf & "Favicon extracted: " & get_favicon($favicon_url))
$favicon_url = 'http://www.theglobeandmail.com'
MsgBox(0, $favicon_url, $favicon_url & @crlf & @crlf & "Favicon extracted: " & get_favicon($favicon_url))
$favicon_url = 'http://www.facebook.com'
MsgBox(0, $favicon_url, $favicon_url & @crlf & @crlf & "Favicon extracted: " & get_favicon($favicon_url))
$favicon_url = 'http://delicious.com'
MsgBox(0, $favicon_url, $favicon_url & @crlf & @crlf & "Favicon extracted: " & get_favicon($favicon_url))

Func get_favicon($favicon_page, $favicon_targetdir = @ScriptDir)

    $hDownload1 = InetGet($favicon_page & "/favicon.ico", $favicon_targetdir & "\favicon_" & StringReplace(StringReplace(StringReplace($favicon_page, ":", ""), "//", "_"), "http_www.", "") & ".ico")

    If Not FileExists($favicon_targetdir & "\favicon_" & StringReplace(StringReplace(StringReplace($favicon_page, ":", ""), "//", "_"), "http_www.", "") & ".ico") Then

        $source = _INetGetSource($favicon_page)
        $resultarray = _StringBetween($source, 'href="', '"')

        If IsArray($resultarray) Then
            For $i = 0 To UBound($resultarray) - 1
                ConsoleWrite($resultarray[$i] & @CRLF)
                If StringInStr($resultarray[$i], ".ico") Then
                    ; Root-relative href (starts with "/"): prefix the site URL
                    If StringLeft($resultarray[$i], 1) = "/" Then $resultarray[$i] = $favicon_page & $resultarray[$i]
                    $hDownload1 = InetGet($resultarray[$i], $favicon_targetdir & "\favicon_" & StringReplace(StringReplace(StringReplace($favicon_page, ":", ""), "//", "_"), "http_www.", "") & ".ico")
                    ExitLoop
                EndIf
            Next
        EndIf

    EndIf

    If FileExists($favicon_targetdir & "\favicon_" & StringReplace(StringReplace(StringReplace($favicon_page, ":", ""), "//", "_"), "http_www.", "") & ".ico") Then
        Return True
    Else
        Return False
    EndIf

EndFunc   ;==>get_favicon

Guess there's a better way to extract relative locations...
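
A slightly more general way to handle relative locations could look like the sketch below. _ResolveHref is a hypothetical helper, not part of any UDF. It assumes http/https page URLs and covers absolute, protocol-relative ("//host/..."), root-relative ("/...") and plain document-relative hrefs; it does not collapse "../" segments.

```autoit
; Hypothetical helper: turn an href found in a page into an absolute URL.
; $sPageUrl is the URL the page was fetched from, $sHref the raw attribute value.
Func _ResolveHref($sPageUrl, $sHref)
    ; Already absolute: use as-is
    If StringRegExp($sHref, "(?i)^https?://") Then Return $sHref

    ; Capture scheme and host, e.g. "http" and "delicious.com"
    Local $aParts = StringRegExp($sPageUrl, "(?i)^(https?)://([^/?#]+)", 1)
    If @error Then Return SetError(1, 0, "")
    Local $sRoot = $aParts[0] & "://" & $aParts[1]

    ; Protocol-relative: reuse the page's scheme
    If StringLeft($sHref, 2) = "//" Then Return $aParts[0] & ":" & $sHref

    ; Root-relative: append to the site root
    If StringLeft($sHref, 1) = "/" Then Return $sRoot & $sHref

    ; Document-relative: append to the directory of the page
    Local $sNoQuery = StringRegExpReplace($sPageUrl, "[?#].*$", "") ; drop query and fragment
    Local $sPath = StringTrimLeft($sNoQuery, StringLen($sRoot))     ; "" or "/a/b.html"
    Local $iSlash = StringInStr($sPath, "/", 0, -1)                 ; last "/" in the path
    Local $sDir = "/"
    If $iSlash > 0 Then $sDir = StringLeft($sPath, $iSlash)
    Return $sRoot & $sDir & $sHref
EndFunc
```

For example, _ResolveHref("http://delicious.com", "/favicon.ico") yields "http://delicious.com/favicon.ico", which can then be handed to InetGet.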

Edit:

Corrected

For $i = 0 To UBound($resultarray)

to

For $i = 0 To UBound($resultarray) - 1

otherwise script will crash if no .ico was found in array

Edited by KaFu

Thanks for all this help. It is pretty fast and works on every site I can think of trying... guess you proved me wrong. I appreciate the help.

You're welcome :). There will be sites where this won't work :(, especially ones where the relative-path handling breaks down. Edited the above post slightly.

One final update. If a user submits a subpage such as http://www.autoitscript.com/forum/index.php?showtopic=112857, the script will search for the favicon at http://www.autoitscript.com, which is faster and more likely to succeed. Also added a check to see if the .ico already exists before anything is downloaded.

Picea892

#include <INet.au3>
#include <String.au3>

$favicon_url = 'http://www.filehippo.com'
MsgBox(0, $favicon_url, $favicon_url & @crlf & @crlf & "Favicon extracted: " & get_favicon($favicon_url))
$favicon_url = 'http://www.theglobeandmail.com'
MsgBox(0, $favicon_url, $favicon_url & @crlf & @crlf & "Favicon extracted: " & get_favicon($favicon_url))
$favicon_url = 'http://www.facebook.com'
MsgBox(0, $favicon_url, $favicon_url & @crlf & @crlf & "Favicon extracted: " & get_favicon($favicon_url))
$favicon_url = 'http://delicious.com'
MsgBox(0, $favicon_url, $favicon_url & @crlf & @crlf & "Favicon extracted: " & get_favicon($favicon_url))

Func get_favicon($favicon_page, $favicon_targetdir = @ScriptDir)
    ; Trim a subpage URL back to the site root (heuristic: cut right after a known TLD)
    Local $searcharray[5] = [".com", ".ca", ".org", ".net", ".edu"]
    Local $stringpos
    For $i = 0 To UBound($searcharray) - 1
        $stringpos = StringInStr($favicon_page, $searcharray[$i])
        If $stringpos <> 0 Then
            $favicon_page = StringLeft($favicon_page, $stringpos + StringLen($searcharray[$i]) - 1) ;get first page
        EndIf
    Next

    ; Build the target file name once, from the trimmed URL, so the
    ; "already exists" check also matches icons saved for subpage URLs
    Local $favicon_file = $favicon_targetdir & "\favicon_" & StringReplace(StringReplace(StringReplace($favicon_page, ":", ""), "//", "_"), "http_www.", "") & ".ico"
    If FileExists($favicon_file) Then Return "Already Exists"

    ; Try the default location first
    InetGet($favicon_page & "/favicon.ico", $favicon_file)
    If Not FileExists($favicon_file) Then
        Local $source = _INetGetSource($favicon_page)
        Local $resultarray = _StringBetween($source, 'href="', '"')

        If IsArray($resultarray) Then
            For $i = 0 To UBound($resultarray) - 1
                ConsoleWrite($resultarray[$i] & @CRLF)
                If StringInStr($resultarray[$i], ".ico") Then
                    ; Root-relative href (starts with "/"): prefix the site URL
                    If StringLeft($resultarray[$i], 1) = "/" Then $resultarray[$i] = $favicon_page & $resultarray[$i]
                    InetGet($resultarray[$i], $favicon_file)
                    ExitLoop
                EndIf
            Next
        EndIf
    EndIf

    If FileExists($favicon_file) Then
        Return True
    Else
        Return False
    EndIf
EndFunc   ;==>get_favicon
Edited by picea892
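
The TLD search above is a heuristic and can cut too early: a host such as www.canada.com also contains ".ca", so it would be trimmed to www.ca. A possible alternative, sketched here under the assumption that the input is a plain http or https URL, is to cut at the first "/", "?" or "#" after the host. _SiteRoot is a hypothetical helper name.

```autoit
; Hypothetical helper: return scheme plus host of a URL, e.g.
; "http://www.autoitscript.com" for any page on that site.
; Returns "" and sets @error if the input does not look like an http(s) URL.
Func _SiteRoot($sUrl)
    ; Flag 1 returns the captured group: everything up to the first "/", "?" or "#" after "://"
    Local $aMatch = StringRegExp($sUrl, "(?i)^(https?://[^/?#]+)", 1)
    If @error Then Return SetError(1, 0, "")
    Return $aMatch[0]
EndFunc
```

With this, _SiteRoot("http://www.autoitscript.com/forum/index.php?showtopic=112857") gives "http://www.autoitscript.com" without needing a list of known TLDs.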