
Extract table data


ait

Hi, I was wondering if anybody could help me with this problem.

I am trying to get the links for a bunch of reports on a web server. The page with the reports has a table similar to:

<tr>
<td>Report 1</td>
<td>
<a href="link to the report&type=PDF">PDF</a>
<br>
<a href="link to the report&type=CSV">CSV</a>
</td>
<td>Description of report 1</td>
</tr>

<tr>
<td>Report 2</td>
<td>
<a href="link to the report&type=CSV">CSV</a>
</td>
<td>Description of report 2</td>
</tr>

<tr>
<td>Report 3</td>
<td>
<a href="link to the report&type=PDF">PDF</a>
<br>
<a href="link to the report&type=CSV">CSV</a>
</td>
<td>Description of report 3</td>
</tr>

Using _IELinkGetCollection, I have managed to write to a file the links whose inner text is PDF or CSV. What would be helpful is to prefix each link with the report title from the first table cell of its row, so I know which link belongs to which report.

At the moment I just have two loops: the first gets each link whose text is PDF and prefixes it with "PDF LINK x", and the second does the same for the CSV links.

If someone could point me in the right direction, or even provide some code, that would be fantastic.


As far as I can see, every link contains the file type. If you have a collection of links, you can use string functions to extract the type and use it as a prefix.

For example, given this link, you can read the type from the end of the link and use it as the prefix:

link to the report&type=PDF
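
As a rough illustration of that idea, here is a minimal sketch in Python rather than AutoIt. It assumes the link's query string has the form `report=<id>&type=PDF`; the helper name `link_type` and the id `abc123` are made up for the example.

```python
from urllib.parse import parse_qs


def link_type(query):
    """Return the value of the 'type' parameter in a query string, or None."""
    values = parse_qs(query).get("type", [])
    return values[0] if values else None


# Hypothetical query string; a real report id would be much longer.
print(link_type("report=abc123&type=PDF"))
```

This only recovers the file type. It says nothing about which report the link belongs to.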



Hi Andreik,

Thank you for the reply, but unfortunately this wouldn't help me.

The "link to the report&type=PDF" part is really a long value, for example report=9tbf19c6789af5345874c45e20f674d9ac&type=PDF, so even if I strip the type off the end I still wouldn't know which report the link is for.

Edited by ait

Use _IETableGetCollection to get the tables.

Loop through them until you find the one you need.

Then get all of its .childNodes to get the <tr> elements.

Loop through those, and get the .childNodes of each one to get the <td> elements.

Then loop through those and call .getAttribute('yourattributename') on the element you are interested in. That will return the link, or string, or whatever the attribute holds.

Or use the XPath function in my signature (IEbyXPATH) to go straight to the <td> elements, which cuts out all but the last step.
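
The traversal above (table, then rows, then cells, then the links inside a cell) can be sketched outside the browser as well. The following is a minimal Python illustration using the standard library's `html.parser`, not the AutoIt IE functions; the class name `ReportLinkParser`, the helper `extract_report_links`, and the sample `href` values are assumptions made up for the example. It treats the first `<td>` of each row as the report title and pairs it with every `<a>` found in that row.

```python
from html.parser import HTMLParser


class ReportLinkParser(HTMLParser):
    """Collect (report title, link text, href) for every link in a table row."""

    def __init__(self):
        super().__init__()
        self.rows = []          # finished (title, link_text, href) tuples
        self._cell_index = -1   # index of the current <td> within its row
        self._in_cell = False
        self._title = ""        # text of the first cell in the current row
        self._href = None       # href of the <a> currently open, if any
        self._link_text = ""

    def handle_starttag(self, tag, attrs):
        if tag == "tr":         # new row: forget the previous title
            self._cell_index = -1
            self._title = ""
        elif tag == "td":
            self._cell_index += 1
            self._in_cell = True
        elif tag == "a" and self._in_cell:
            self._href = dict(attrs).get("href")
            self._link_text = ""

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False
        elif tag == "a" and self._href is not None:
            self.rows.append(
                (self._title.strip(), self._link_text.strip(), self._href)
            )
            self._href = None

    def handle_data(self, data):
        if self._href is not None:                      # text inside a link
            self._link_text += data
        elif self._in_cell and self._cell_index == 0:   # text of the title cell
            self._title += data


def extract_report_links(html):
    parser = ReportLinkParser()
    parser.feed(html)
    parser.close()
    return parser.rows


# Hypothetical sample modelled on the table in the first post.
SAMPLE = """
<table>
<tr>
<td>Report 1</td>
<td>
<a href="view?report=aaa&amp;type=PDF">PDF</a>
<br>
<a href="view?report=aaa&amp;type=CSV">CSV</a>
</td>
<td>Description of report 1</td>
</tr>
<tr>
<td>Report 2</td>
<td>
<a href="view?report=bbb&amp;type=CSV">CSV</a>
</td>
<td>Description of report 2</td>
</tr>
</table>
"""

for title, kind, href in extract_report_links(SAMPLE):
    print(f"{title} {kind}: {href}")
```

The key point is that the title is remembered per row, so each link can be written out with the report name in front of it instead of a running "PDF LINK x" counter.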

http://www.w3schools.com/htmldom/dom_methods.asp

Edited by jdelaney
IEbyXPATH - Grab IE DOM objects by XPATH
IEscriptRecord - Makings of an IE script recorder
ExcelFromXML - Create Excel docs without Excel installed
GetAllWindowControls - Output all control data on a given window
