Getting the correct folder names from a URL


AlmarM

Hiya!

What's the easiest way to get the correct folder names from a URL?

Example:

http://www.example-site.com/images/background.gif -> DesktopDir/site/images/background.gif

http://www.example-site.com/images/buttons/hover.gif -> DesktopDir/site/images/buttons/hover.gif

http://www.example-site.com/img/navbar/shine.gif -> DesktopDir/site/img/navbar/shine.gif

http://www.example-site.com/spacer.gif -> DesktopDir/site/spacer.gif

What I'm trying to do is:

  • Grab the whole HTML source from a URL
  • Save it to a .html file
  • Get all images used on the site
  • Save those images into the correct folders

Hope it's clear. :)

AlmarM

Edited by AlmarM

Minesweeper

A minesweeper game created in autoit, source available.

_Mouse_UDF

An UDF for registering functions to mouse events, made in pure autoit.

2D Hitbox Editor

A 2D hitbox editor for quick creation of 2D sphere and rectangle hitboxes.


$url = "DesktopDir/site/" & StringReplace($url, "http://www.example-site.com/", "")

For a generic URL, find the third / with StringInStr and take everything after it with StringMid.
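To make the third-slash idea concrete, here is a minimal sketch. The function name _UrlToLocalPath and the "site" folder are only assumptions for illustration, and it does not handle query strings or characters that are invalid in Windows file names.

```autoit
; Hypothetical helper: map any URL to a local path under $sBaseDir.
Func _UrlToLocalPath($sUrl, $sBaseDir)
    ; Position of the third "/" (the one that ends "scheme://host").
    Local $iPos = StringInStr($sUrl, "/", 0, 3)
    If $iPos = 0 Then Return SetError(1, 0, "") ; no path part, e.g. "http://host"
    ; Everything after the third slash, with "/" turned into "\" for Windows.
    Local $sRelPath = StringReplace(StringMid($sUrl, $iPos + 1), "/", "\")
    Return $sBaseDir & "\" & $sRelPath
EndFunc

; Prints a path ending in \site\images\buttons\hover.gif under the desktop folder.
ConsoleWrite(_UrlToLocalPath("http://www.example-site.com/images/buttons/hover.gif", @DesktopDir & "\site") & @CRLF)
```

Because it only counts slashes, this should not care which host the image lives on.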

Thanks for that!

But what if the images are saved on a different URL?

Example:

http://www.spele.nl/
http://proxy.spele.nl/img/1/9/7/1/9/s.jpg

I can't do a StringReplace on the base URL here, any fast solutions to get the img base URL?


Provided they are in <img> tags on the source page, you could grab them all this way.

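#include <IE.au3>

; Load the page in a hidden IE instance (no attach, not visible)
$oIE = _IECreate("http://www.foxnews.com/", 0, 0)
; Collection of every <img> element; @extended holds the count
$oImgs = _IEImgGetCollection($oIE)
$iNumImg = @extended

For $oImg In $oImgs
    MsgBox(0, '', $oImg.src) ; full URL of each image
Next

_IEQuit($oIE) ; close the hidden browser when done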
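
If the goal is to save each image rather than just display its URL, the loop could be extended roughly like this. It is only a sketch: the target folder and the example URL are assumptions, and a src with a query string or other characters not allowed in file names would need extra cleaning first.

```autoit
#include <IE.au3>

Local $sBaseDir = @DesktopDir & "\site" ; assumed target folder
Local $oIE = _IECreate("http://www.example-site.com/", 0, 0)
Local $oImgs = _IEImgGetCollection($oIE)

For $oImg In $oImgs
    Local $sSrc = $oImg.src
    ; Third "/" ends the "scheme://host" part; skip anything without a path.
    Local $iPos = StringInStr($sSrc, "/", 0, 3)
    If $iPos = 0 Then ContinueLoop
    Local $sLocal = $sBaseDir & "\" & StringReplace(StringMid($sSrc, $iPos + 1), "/", "\")
    ; Create the folder part (everything before the last "\"), then download.
    DirCreate(StringLeft($sLocal, StringInStr($sLocal, "\", 0, -1) - 1))
    InetGet($sSrc, $sLocal)
Next

_IEQuit($oIE)
```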
Edited by iamtheky


This works against your example, but I don't know how it will fare against the actual page.

$sExample = "http://www.example-site.com/images/background.gif" & @CRLF
$sExample &= "http://www.example-site.com/images/buttons/hover.gif" & @CRLF
$sExample &= "http://www.example-site.com/img/navbar/shine.gif" & @CRLF
$sExample &= "http://www.example-site.com/spacer.gif"

$aImages = StringRegExp($sExample, "(?i)http://.+?/(.+?\.[gjbp][a-z2]{2,3})", 3)
If NOT @Error Then
    For $i = 0 To Ubound($aImages) -1
        ;; Do something here
    Next
EndIf

EDIT: Modified the expression to also catch the very rare .jp2 files.

EDIT 2: If I missed any file extensions just add the first character of the extension into the [gjbp] group.

EDIT 3: Fixed the example string by adding @CRLF
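
As an illustration of EDIT 2, this is roughly what the pattern looks like with two more first letters added. The extra formats (.tif/.tiff and .webp) and the test URL are just assumptions to show the idea.

```autoit
; "t" covers .tif/.tiff and "w" covers .webp (assumed extra formats).
Local $sPattern = "(?i)http://.+?/(.+?\.[gjbptw][a-z2]{2,3})"
Local $aImages = StringRegExp("http://www.example-site.com/img/scan.tif", $sPattern, 3)
If Not @error Then ConsoleWrite($aImages[0] & @CRLF) ; img/scan.tif
```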

Edited by GEOSoft

George

Question about decompiling code? Read the decompiling FAQ and don't bother posting the question in the forums.

Be sure to read and follow the forum rules. -AKA the AutoIt Reading and Comprehension Skills test.***

The PCRE (Regular Expression) ToolKit for AutoIT - (Updated Oct 20, 2011 ver:3.0.1.13) - Please update your current version before filing any bug reports. The installer now includes both 32 and 64 bit versions. No change in version number.

Visit my Blog .. currently not active but it will soon be resplendent with news and views. Also please remove any links you may have to my website. it is soon to be closed and replaced with something else.

"Old age and treachery will always overcome youth and skill!"


In the example GEOSoft provided I keep getting >> :)

images/background.gifh

img/navbar/shine.gifh


I'll fix that. It's because I left out the @CRLFs


Thanks for that GEO!

But when I visit this site: http://www.spele.nl/ and use _IEImgGetCollection it returns image links with

http://proxy.spele.nl/

How do I get the "http://xxx.xxx.xx/" part from each _IEImgGetCollection link?


Luckily I already had this saved and I think it should be what you want. It will handle http:, https: and ftp: with or without the www and it stops at the third slash if it exists. That is the first one after the // if it exists.

$sSRE="(?i)href\s*=[\x22\x27]?([fh]t+ps?://[\w]*\.?.+\.[a-z]{2,3}/?).*"

This will work fine on:

href="http://www.autoitscript.com/forum/topic/125066"
href="https://www.autoitscript.com"
href="http://dundats.mvps.org/"
href="ftp://microsoft.com"
href="http://proxy.spele.nl/"

If you need to get something like href="../" then try this (untested)

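$sURL="http://proxy.spele.nl/"
$sSRE="(?i)href\s*=[\x22\x27]?([fh]t+ps?://[\w]*\.?.+\.[a-z]{2,3}/?).*"
; Swap the leading ../ for $sURL, keeping the href=" prefix via ${1}; $sSource is the page source
$aHref = StringRegExp(StringRegExpReplace($sSource, "(href\s*=[\x22\x27]?)[./]+", "${1}" & $sURL), $sSRE, 3)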

EDIT: Since you are using _IEImgGetCollection() it probably doesn't return the href part, so change the expression to

$sSRE = "(?i)(?m:^)([fh]t+ps?://[\w]*\.?.+\.[a-z]{2,3}/?).*"
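
One caveat, as far as I can tell: the .+ in that pattern is greedy, so on a link whose path contains a dot (such as .../s.jpg) it may run on to the last dot and capture the whole URL instead of stopping at the host. A stricter alternative is sketched below. The helper name _UrlBase is an assumption, and it only covers http, https and ftp.

```autoit
; Hypothetical helper: return "scheme://host/" from a full URL, or "" on failure.
Func _UrlBase($sUrl)
    ; [^/]+ cannot cross a slash, so the match ends at the third "/".
    Local $aMatch = StringRegExp($sUrl, "(?i)^((?:https?|ftp)://[^/]+/?)", 1)
    If @error Then Return SetError(1, 0, "")
    Return $aMatch[0]
EndFunc

ConsoleWrite(_UrlBase("http://proxy.spele.nl/img/1/9/7/1/9/s.jpg") & @CRLF) ; http://proxy.spele.nl/
```

Calling it on each $oImg.src from _IEImgGetCollection should give the base URL per image, whichever host serves it.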
Edited by GEOSoft
