pedmacedo

Getting the HTML of a page

4 posts in this topic

Hello guys, I am having a little problem with this code and hope someone can help me:

$HTML = _IEDocReadHTML ($ie)
$end = _StringBetween($HTML,"</h1><p>","</p><ul")

This is the relevant part of the script; the rest just creates the IE object and navigates to the page.

I want to get the string between the substrings "</h1><p>" and "</p><ul" from the HTML of a page. Each of these substrings occurs only once in the HTML, yet it does not work: the value of $end is 0.

The interesting part is that this line works (same HTML):

$title = stringBetween($HTML,"<title>","</title>")

Does anyone know what could be causing this? :x

#2 ·  Posted (edited)

Did you write your own stringBetween function, or is this a typo? I am referring to this line:

$title = stringBetween($HTML,"<title>","</title>")

kylomas

EDIT: Also, _StringBetween returns an array; see the documentation.
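
To illustrate the point about the return value, here is a minimal sketch of calling _StringBetween and checking the result. The sample string in $sHTML is an assumption made up for this example; it is not the page from the original post.

```autoit
#include <String.au3>

; Sample HTML standing in for the real page source (made up for illustration).
Local $sHTML = "<html><head><title>Example</title></head><body></body></html>"

; On success _StringBetween returns a 0-based array of matches.
; When nothing is found it sets @error to a non-zero value.
Local $aTitle = _StringBetween($sHTML, "<title>", "</title>")
If @error Then
    ConsoleWrite("No match found" & @CRLF)
Else
    ; The first match is in element [0].
    ConsoleWrite("Title: " & $aTitle[0] & @CRLF)
EndIf
```

Checking @error right after the call avoids indexing a non-array result when nothing matches.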

Edited by kylomas


"I like pigs.  Dogs look up to us.  Cats look down on us.  Pigs treat us as equals."

- Sir Winston Churchill


#4 ·  Posted (edited)

I made a bit of a mess here, so let me explain. I created my own stringBetween function because I wasn't able to make _StringBetween work. When I use my function to find the string between the title tags of a page, it works.

Now, thanks to you guys, I have learned how to use _StringBetween (I can get the string between the title tags, for example).

But when I try to use either of these two functions to get the string between "</h1><p>" and "</p><ul", it fails (nothing is stored in the variable).

I haven't been able to figure out what is causing this. Both substrings are there and it works in one case, so why not in the other? o.O

EDIT: OK guys, I think I got it. The HTML returned by _IEDocReadHTML is modified, and that's the problem: the substrings are present in the original HTML, but not in the modified one.
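
For anyone who finds this later, one possible workaround is to search the raw page source instead of the HTML that IE hands back. IE rebuilds that HTML from the parsed document, so details such as whitespace or attribute quoting may differ from what the server sent. Below is a rough sketch using _INetGetSource from Inet.au3. The URL is a placeholder, and whether the raw source contains the two delimiters exactly as written is an assumption to verify for the actual page.

```autoit
#include <Inet.au3>
#include <String.au3>

; Placeholder address: replace it with the page the script navigates to.
Local $sURL = "http://www.example.com/"

; Fetch the page source as the server sends it, without going through the IE DOM.
Local $sRaw = _INetGetSource($sURL)
If @error Then
    ConsoleWrite("Could not download the page" & @CRLF)
    Exit
EndIf

; Search the unmodified source for the text between the two delimiters.
Local $aEnd = _StringBetween($sRaw, "</h1><p>", "</p><ul")
If @error Then
    ConsoleWrite("Delimiters not found in the raw source" & @CRLF)
Else
    ConsoleWrite("Found: " & $aEnd[0] & @CRLF)
EndIf
```

This only helps when the text is present in the downloaded source. If the page builds that part with JavaScript, the raw source will not contain it, and the delimiters would need to be adapted to what _IEDocReadHTML actually returns.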

Edited by pedmacedo
