oathy

Help writing a program (Newbie User)

7 posts in this topic

Hey, I need a little help putting together a program that reads a website's HTML source code and pulls a bunch of information out of it.

I have written an AutoIt script that logs into a website displaying the current dynamic IP addresses of all our branch offices, referenced by their DSL line number (called a DX number in our case). What I need to do next is take the text file saved from the page source and pull out all the DX numbers and IP addresses.

To break it down...

The source code contains information like this (about 600 times, but always in the same format):

<TD>dx0017094@as-adsl.dircon.co.uk</TD>
<TD>10.0.13.100<BR><SMALL><EM>NAT->195.17.3.38/39 - RG2</EM></SMALL><BR><FONT size=1></FONT></TD>
<TD width="30%">
<CENTER><A href="java script:vnc('10.0.13.100')"><IMG alt=VNC src="/images/vnc.gif"></A></CENTER></TD></TR>
<TR bgColor=#d3dae3>

What I need to be able to do is pull out these two bits of text:

- dx0017094 (the digits differ but always start with "dx00")

- 10.0.13.100 (again, the numbers differ but always start with "10.0."; I only need one instance, since the HTML contains the address twice: once as text, and once in the JavaScript link that starts a VNC connection)

Once it's found all that info, I would like to output them to a new text file without all the HTML code, just the DX Numbers and IP addresses.

I've tried fiddling around with strings and FileRead and FileReadLine but I haven't programmed in about five years, and even then I wasn't all that astute. Any help you guys could offer would be immensely appreciated.




Posted Image


Get Beta versions Here Get latest SciTE editor Here AutoIt 1-2-3 by Valuater - A great starting point.

Time you enjoyed wasting is not wasted time ......T.S. Elliot
Suspense is worse than disappointment................Robert Burns
God help the man who won't help himself, because no-one else will...........My Grandmother



Care to point me in the right direction then? I've been lurking on here for a week or so and saw a lot of other similar posts.

The support forum would be the correct place, but do not repost; a mod will probably move this one.




Thanks kindly.


Are the two items of interest always on the first two lines of each entry, and always between the same HTML tags? If so, it should be fairly easy to use string manipulation to find and extract what you want and write it out, formatted, to another text file. For example: open your input file, look for dx00 on the first line, and copy all characters up to the </TD> tag into your first variable. On the next line you read in, find 10.0 and copy until you hit the <BR> tag into the second variable. Then open your output file and write the two variables to it.

Show us what you have written and maybe we can offer more specific advice.
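
The line-by-line approach described above could look roughly like the sketch below. It is only a starting point under a few assumptions: the file names source.txt and dx_ip_list.txt are made up, each DX line starts with <TD>dx00 and each IP line with <TD>10.0. as in the sample, and the copy stops at the "@" rather than at </TD> so that only the DX number is kept.

```autoit
; Hypothetical file names; change them to match your own script.
Local $hIn = FileOpen(@ScriptDir & "\source.txt", 0)      ; 0 = read mode
Local $hOut = FileOpen(@ScriptDir & "\dx_ip_list.txt", 2) ; 2 = write, erase old contents
If $hIn = -1 Or $hOut = -1 Then Exit MsgBox(16, "Error", "Could not open the files")

Local $sLine, $sDX = "", $iStart, $iEnd
While 1
    $sLine = FileReadLine($hIn)
    If @error Then ExitLoop ; end of file (or read error)
    $sLine = StringStripWS($sLine, 3) ; trim leading and trailing whitespace

    $iStart = StringInStr($sLine, "<TD>dx00")
    If $iStart Then
        ; DX line: copy from "dx00" up to (not including) the "@"
        $iStart += 4 ; skip over "<TD>"
        $iEnd = StringInStr($sLine, "@", 0, 1, $iStart)
        If $iEnd Then $sDX = StringMid($sLine, $iStart, $iEnd - $iStart)
    ElseIf $sDX <> "" And StringLeft($sLine, 9) = "<TD>10.0." Then
        ; IP line: copy from after "<TD>" up to the first "<BR>"
        $iEnd = StringInStr($sLine, "<BR>")
        If $iEnd Then
            FileWriteLine($hOut, $sDX & @TAB & StringMid($sLine, 5, $iEnd - 5))
            $sDX = "" ; wait for the next DX line
        EndIf
    EndIf
WEnd

FileClose($hIn)
FileClose($hOut)
```

Because the IP is only taken from a line that begins with <TD>10.0., the second copy of the address inside the VNC link is never picked up.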


Something to play with.

Off the top of my head, so you might have to make some corrections.

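#include <Array.au3> ; _ArrayDisplay is defined in Array.au3, not Misc.au3
; Load the saved page source
Local $data = FileRead("<your file>")
; Capture the DX number (without the @domain part) and the IP in the next cell.
; Matching both in one pattern avoids the empty elements an alternation returns.
Local $regexp = "<TD>(dx\d+)@[\w.-]+</TD>\s*<TD>(10\.0\.\d+\.\d+)<BR>"
; Flag 3 returns every capture in one flat array: DX, IP, DX, IP, ...
Local $arr = StringRegExp($data, $regexp, 3)
If @error Then Exit MsgBox(16, "DATA", "No matches found")
_ArrayDisplay($arr, "DATA")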
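
To get from a match array to the plain text file asked for in the first post, something along these lines might work. It is a sketch, not tested against the real page: the file names are invented, and it uses its own pattern that captures the DX number and the IP from the following cell together, so the flat array returned by flag 3 alternates DX, IP, DX, IP.

```autoit
; Hypothetical file names; adjust to your own paths.
Local $sInFile = @ScriptDir & "\source.txt"
Local $sOutFile = @ScriptDir & "\dx_ip_list.txt"

Local $sData = FileRead($sInFile)
If @error Then Exit MsgBox(16, "Error", "Could not read " & $sInFile)

; One match = one table row: the DX number, then the 10.0.x.x address in the next cell.
Local $sPattern = "<TD>(dx\d+)@[^<]*</TD>\s*<TD>(10\.0\.\d+\.\d+)<BR>"
Local $aHits = StringRegExp($sData, $sPattern, 3)
If @error Then Exit MsgBox(16, "Error", "No DX/IP pairs found")

; Mode 2 = open for writing and erase any previous contents.
Local $hOut = FileOpen($sOutFile, 2)
If $hOut = -1 Then Exit MsgBox(16, "Error", "Could not open " & $sOutFile)

; The array is flat, so step through it two elements at a time.
For $i = 0 To UBound($aHits) - 2 Step 2
    FileWriteLine($hOut, $aHits[$i] & @TAB & $aHits[$i + 1])
Next
FileClose($hOut)
```

Each output line is then a DX number and its IP separated by a tab. The pattern only accepts an address that directly follows <TD> and is followed by <BR>, so the duplicate inside the VNC link is skipped.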

