leegold

Is There a Way to Extract Specific Elements From an HTML Web Page


leegold

Hi,

I'm using FF.au3 and want to extract from a web page all HTML elements of the general form

<p class="row"...>...</p>

There are many elements like this on the page, and I want to get them all.

I was thinking of using _FFXPath, but the page is not well-formed XHTML/XML. I also see _FFReadHTML, but AFAIK it only copies all children of the html or body tags(?)

I could use _FFReadHTML and then use a regex to get all the elements I want. But is there an easier way?
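
For reference, the regex route mentioned above might look roughly like the sketch below. It assumes the page source is already in a string (here a made-up sample in $sHtml rather than a real FF.au3 call) and that the class attribute is written exactly as class="row"; elements with extra class names such as class="row odd" would need a looser pattern.

```autoit
#include <StringConstants.au3>

; Made-up sample standing in for the page source (assumption, not the real page).
Local $sHtml = '<div><p class="row" id="a">first</p>' & @CRLF & _
        '<p class="other">skip me</p>' & @CRLF & _
        '<P CLASS="row">second<br>line</P></div>'

; (?is): case-insensitive matching, and "." also matches line breaks.
; Group 1 captures the inner HTML of each <p ... class="row" ...> element.
Local $sPattern = '(?is)<p\b[^>]*\bclass="row"[^>]*>(.*?)</p>'

; Global-match mode returns a flat array holding the group 1 capture of every match.
Local $aRows = StringRegExp($sHtml, $sPattern, $STR_REGEXPARRAYGLOBALMATCH)

If @error Then
    ConsoleWrite('No <p class="row"> elements found' & @CRLF)
Else
    For $i = 0 To UBound($aRows) - 1
        ConsoleWrite($i & ': ' & $aRows[$i] & @CRLF)
    Next
EndIf
```

A regex like this does not handle nested <p> elements or unusual quoting, so it is only a reasonable fit when the markup on the page is fairly regular.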

Thanks,

Lee G.

Kidney

I have always used _FFReadHTML and then StringInStr along with StringMid.

That lets you pull everything between <p class="row"...> and </p> and store it in an array.
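
A minimal sketch of that StringInStr/StringMid loop, under some assumptions: the page source is already in a string (the $sHtml sample below is made up), class="row" is the first attribute of the tag, and the <p> elements are not nested. The variable names are illustrative only.

```autoit
; Made-up sample standing in for the page source (assumption).
Local $sHtml = '<p class="row" id="a">first</p><p class="other">skip</p><p class="row">second</p>'
Local $sOpen = '<p class="row"' ; start of the opening tag; other attributes may follow
Local $sClose = '</p>'
Local $aRows[0]
Local $iPos = 1, $iStart, $iTagEnd, $iEnd

While 1
    ; Find the next opening tag, searching from $iPos onward.
    $iStart = StringInStr($sHtml, $sOpen, 0, 1, $iPos)
    If $iStart = 0 Then ExitLoop

    ; Find the ">" that ends the opening tag.
    $iTagEnd = StringInStr($sHtml, '>', 0, 1, $iStart)
    If $iTagEnd = 0 Then ExitLoop

    ; Find the closing tag that follows it.
    $iEnd = StringInStr($sHtml, $sClose, 0, 1, $iTagEnd)
    If $iEnd = 0 Then ExitLoop

    ; Grow the array by one and store the text between the tags.
    ReDim $aRows[UBound($aRows) + 1]
    $aRows[UBound($aRows) - 1] = StringMid($sHtml, $iTagEnd + 1, $iEnd - $iTagEnd - 1)

    ; Continue searching after this closing tag.
    $iPos = $iEnd + StringLen($sClose)
WEnd

For $i = 0 To UBound($aRows) - 1
    ConsoleWrite($i & ': ' & $aRows[$i] & @CRLF)
Next
```

The zero-length array declaration needs a reasonably recent AutoIt version; on older versions a counter plus a pre-sized array would do the same job.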

The step that takes the longest is the _FFReadHTML call.

 

I would also suggest writing the HTML to a text file for debugging. Once you get the script working 100%, cut out that part of the code.
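
Something along these lines could serve as the debug dump. It is only a sketch: $sHtml is a made-up stand-in for the page source, and the file name page_dump.txt is a hypothetical choice.

```autoit
#include <FileConstants.au3>

; Made-up stand-in for the page source (assumption).
Local $sHtml = '<html><body><p class="row">sample</p></body></html>'

; Hypothetical dump location next to the script.
Local $sDumpPath = @ScriptDir & '\page_dump.txt'

; Open for writing, replacing any earlier dump, and store the text as UTF-8.
Local $hFile = FileOpen($sDumpPath, $FO_OVERWRITE + $FO_UTF8)
If $hFile = -1 Then
    ConsoleWrite('Could not open ' & $sDumpPath & @CRLF)
Else
    FileWrite($hFile, $sHtml)
    FileClose($hFile)
    ConsoleWrite('Wrote ' & StringLen($sHtml) & ' characters to ' & $sDumpPath & @CRLF)
EndIf
```

Opening the dump in a text editor makes it easy to check exactly how the <p class="row"...> tags are written before tuning the search strings.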

allSystemsGo

Can you provide the URL so that others can take a look at it?

Morthawt

I don't know anything about the third-party UDFs you may be using, but when I need to extract elements from web pages, I break the page down gradually into chunks using _StringBetween() until I reach the information I need. An example of how I do this is here: [embedded link not preserved]

I don't know if that is of any use to you, but I thought I would share it just in case.
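
As a rough illustration of that chunking idea (not the linked example, which is not available here), the sketch below narrows a made-up page string in two steps with _StringBetween() from String.au3. The container markup <div id="list"> and the sample HTML are assumptions. Note that _StringBetween() needs the start string to match exactly, so tags whose attributes vary from row to row would call for a different start string or a regex.

```autoit
#include <String.au3>

; Made-up sample standing in for the page source (assumption).
Local $sHtml = '<html><body><div id="nav"><p class="row">menu</p></div>' & _
        '<div id="list"><p class="row">first</p><p class="row">second</p></div></body></html>'

; Step 1: cut the page down to the chunk that holds the rows of interest.
Local $aChunk = _StringBetween($sHtml, '<div id="list">', '</div>')
If @error Then
    ConsoleWrite('Container not found' & @CRLF)
    Exit
EndIf

; Step 2: pull every row out of that chunk.
Local $aRows = _StringBetween($aChunk[0], '<p class="row">', '</p>')
If @error Then
    ConsoleWrite('No rows found in container' & @CRLF)
    Exit
EndIf

For $i = 0 To UBound($aRows) - 1
    ConsoleWrite($i & ': ' & $aRows[$i] & @CRLF)
Next
```

Narrowing to the container first keeps the similar-looking row in the navigation block out of the results.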

