get url from html source

Tamas

Hi All!

I have a simple question. How can I get a URL out of HTML source code? I would like to pick it by the class name used in the code.

For example:

HTML code:

<H1 class=vezeto><A href="/x.php?id=inxcl&amp;url=http%3A%2F%2Fsportgeza.hu%2Fforma1%2F2010%2F05%2F09%2Fwebber_megallithatatlan_volt_barcelonaban%2F">Webber megállíthatatlan volt Barcelonában</A></H1>

What I need:

The link from the 'vezeto' class, this one:

/x.php?id=inxcl&amp;url=http%3A%2F%2Fsportgeza.hu%2Fforma1%2F2010%2F05%2F09%2Fwebber_megallithatatlan_volt_barcelonaban%2F

That is, everything between the two quotation marks of the href in the code.

This link changes all the time, so I need a script that always reads the current one.

Thanks for the help!

hawky358

I am assuming it's a PHP-generated page, so the layout of the markup around the link will always remain the same.

You can work this section into your code.

If there are multiple vezeto elements, you'll have to do some filtering.

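$file = "1.html"
$source = FileRead($file)

; Position of the marker text that comes right before the link
$start = StringInStr($source,'<H1 class=vezeto><A href="')
; Second quotation mark counted from $start, i.e. the one that closes the href
$end = StringInStr($source,'"',0,2,$start)

; Cut out marker plus link, then strip the marker
$link = StringMid($source,$start,$end-$start)
$link = StringReplace($link,'<H1 class=vezeto><A href="',"")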
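For the case with several vezeto headings, one possible way to do the filtering is a regular expression instead of StringInStr. This is only a rough sketch: it assumes the markup always looks like the example above (an H1 with class vezeto directly followed by the A tag), and the file name is the same placeholder as before.

```autoit
; Sketch only: collect every href that directly follows an H1 with class "vezeto".
; Assumes the tag layout from the example; "1.html" is a placeholder file name.
$source = FileRead("1.html")

; (?i) ignores case, "? allows the class value to be quoted or unquoted.
; Flag 3 returns an array with the captured group of every match.
$links = StringRegExp($source, '(?i)<h1\s+class="?vezeto"?\s*>\s*<a\s+href="([^"]*)"', 3)

If @error Then
    ConsoleWrite("No vezeto link found" & @CRLF)
Else
    For $i = 0 To UBound($links) - 1
        ConsoleWrite($links[$i] & @CRLF)
    Next
EndIf
```

Each array element is one link, so you can pick the first one or filter further from there.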

Tamas

This is a good solution, but it would be best if it were possible without the file handling. Maybe with _IEBodyReadHTML or something similar (INet.au3)? I'm trying to find a solution too :)

hawky358

Tamas wrote: "maybe with the _IEBodyReadHTML"

The thing I don't like about the _IE... functions is that they open an IE window.

Do you want no files whatsoever?

This way you download the file, process it, then delete it.

I used InetGet() without the background option, but you can change it to download in the background if you want to.

InetGet("http://www.google.com", "temp.html",1)

;do some work here;

FileDelete("temp.html")
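If you do want the background download, it could look roughly like this. It is a sketch that assumes an AutoIt version where InetGet returns a handle in background mode and InetGetInfo/InetClose are available. The URL and file name are the same placeholders as above.

```autoit
; Sketch only: same download, but running in the background.
; The fourth parameter (1) makes InetGet return a handle immediately.
$download = InetGet("http://www.google.com", "temp.html", 1, 1)

; Index 2 of InetGetInfo reports whether the download has finished.
Do
    Sleep(100)
Until InetGetInfo($download, 2)

; Index 3 reports whether the download was successful.
$ok = InetGetInfo($download, 3)
InetClose($download)

If $ok Then
    $source = FileRead("temp.html")
    ; do some work here
EndIf

FileDelete("temp.html")
```

Instead of sleeping in the loop, the script could do other work while it waits.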

If you REALLY don't want ANY files to be written I guess you can use _IE....
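Another option that writes no file at all and opens no IE window might be InetRead(), which returns the page as binary data. The sketch below assumes an AutoIt version that has InetRead, uses a placeholder URL, and assumes the live page has exactly the same markup as the saved example.

```autoit
; Sketch only: read the page straight into a string, no temp file.
; The URL is a placeholder; option 1 forces a reload from the remote site.
$data = InetRead("http://www.example.com/", 1)
If @error Then
    ConsoleWrite("Download failed" & @CRLF)
    Exit
EndIf
$source = BinaryToString($data)

; Same string search as before, just on the in-memory source.
$marker = '<H1 class=vezeto><A href="'
$start = StringInStr($source, $marker)
If $start Then
    $start += StringLen($marker)                    ; first character of the link
    $end = StringInStr($source, '"', 0, 1, $start)  ; closing quotation mark
    $link = StringMid($source, $start, $end - $start)
    ConsoleWrite($link & @CRLF)
EndIf
```

BinaryToString() defaults to ANSI. The link itself is plain ASCII, so that is enough here, but for the Hungarian link text you would probably want flag 4 (UTF-8) if the page is served as UTF-8.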

Tamas

Thank you!!! It's resolved without file management, with YOUR HELP (string concatenation).

So thanks! :)
