oathy Posted November 9, 2006

Hey, I need a little help putting together a program that reads the HTML source code of a website for a bunch of information. I have written an AutoIt script that logs into a website that displays the current dynamic IP addresses of all our branch offices, referenced by their DSL line number (called a DX number in our case). What I need to do next is take the text file saved from the source code and pull out all the DX number and IP address information.

To break it down: the source code outputs information like this (about 600 times, but always in the same format):

<TD>dx0017094@as-adsl.dircon.co.uk</TD>
<TD>10.0.13.100<BR><SMALL><EM>NAT->195.17.3.38/39 - RG2</EM></SMALL><BR><FONT size=1></FONT></TD>
<TD width="30%"> <CENTER><A href="javascript:vnc('10.0.13.100')"><IMG alt=VNC src="/images/vnc.gif"></A></CENTER></TD></TR>
<TR bgColor=#d3dae3>

What I need to be able to do is pull out these two bits of text:

- dx0017094 (these numbers are different but always start with "dx00")
- 10.0.13.100 (again, different numbers but always start with "10.0.")

I only need one instance of the IP address, because the HTML outputs it twice: once as text, and once in the JavaScript link that connects via VNC.

Once it has found all that info, I would like to output it to a new text file without all the HTML code, just the DX numbers and IP addresses. I've tried fiddling around with strings, FileRead and FileReadLine, but I haven't programmed in about five years, and even then I wasn't all that astute. Any help you guys could offer would be immensely appreciated.
BigDod Posted November 9, 2006

Time you enjoyed wasting is not wasted time ......T.S. Elliot
Suspense is worse than disappointment................Robert Burns
God help the man who won't help himself, because no-one else will...........My Grandmother
oathy Posted November 9, 2006

Care to point me in the right direction then? I've been lurking on here for a week or so and saw a lot of other similar posts.
BigDod Posted November 9, 2006

The support forum would be the correct place, but do not repost, as a mod will probably move this one.
oathy Posted November 9, 2006

Thanks kindly.
Confuzzled Posted November 12, 2006

Are the two items of interest always on the first two lines, and always between the same HTML tags? If so, it should be fairly easy to use string manipulation to find and extract what you want and write it out, formatted, to another text file.

(E.g., open your input file, look for dx00 on the first line and copy all characters up to the </TD> tag into your first variable; then, on the next line you read in, find 10.0 and copy up to the <BR> tag into the second variable. Open your output file and write the two variables to it.)

Show us what you have written and maybe we can offer more specific advice.
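The find-and-copy steps described above can be sketched roughly as follows. This is a minimal illustration in Python rather than AutoIt, so only the logic carries over; the function names are hypothetical. It assumes each record follows the sample in the original post, and it stops the DX number at the "@" rather than at </TD>, since only the dx00... part was asked for.

```python
def extract_between(line, start, end):
    """Return the text from `start` (inclusive) up to `end` (exclusive), or None."""
    begin = line.find(start)
    if begin == -1:
        return None
    stop = line.find(end, begin)
    if stop == -1:
        return None
    return line[begin:stop]


def extract_records(lines):
    """Pair each DX number with the first 10.0.x.x table cell that follows it."""
    records = []
    dx = None
    for line in lines:
        found_dx = extract_between(line, "dx00", "@")
        if found_dx is not None:
            dx = found_dx
        # Only the plain <TD> cell is matched, so the copy of the address
        # inside the VNC link is ignored.
        ip = extract_between(line, "<TD>10.0.", "<BR>")
        if dx is not None and ip is not None:
            records.append((dx, ip[len("<TD>"):]))
            dx = None  # wait for the next DX number before accepting another IP
    return records


def convert(in_path, out_path):
    """Read the saved HTML source and write one 'DX<TAB>IP' line per record."""
    with open(in_path, encoding="utf-8", errors="replace") as src:
        records = extract_records(src)
    with open(out_path, "w", encoding="utf-8") as dst:
        for dx, ip in records:
            dst.write(f"{dx}\t{ip}\n")
    return len(records)
```

Calling convert("source.html", "dx_list.txt") (both file names are placeholders) would write one tab-separated line per branch office. Resetting dx after each pair is what keeps a stray address from being attached to the wrong line number.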
Uten Posted November 12, 2006

Something to play with. Off the top of my head, so you might have to make some corrections.

#include <Array.au3> ; provides _ArrayDisplay
; Load data
Local $data = FileRead("<your file>")
; Create regexp: one alternative for the DX cell, one for the IP cell
Local $regexp = "<TD>(dx\d+@[\w-\.]+)</TD>|<TD>(10\.0\.\d+\.\d+)<BR>"
; Flag 3 returns an array of all global matches
Local $arr = StringRegExp($data, $regexp, 3)
_ArrayDisplay($arr, "DATA")

Please keep your sig. small! Use the help file. Search the forum. Then ask unresolved questions :)
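A variation on the regular-expression idea above, sketched in Python rather than AutoIt, puts both captures in one pattern so that each match is already a (DX number, IP address) pair. It assumes the IP cell directly follows the DX cell, as in the sample from the original post. The names RECORD, find_records and format_records are hypothetical.

```python
import re

# One match per table row: group 1 is the DX number (without the "@..." part),
# group 2 is the 10.0.x.x address from the cell that immediately follows.
RECORD = re.compile(
    r"<TD>(dx00\d+)@[\w.-]+</TD>\s*<TD>(10\.0\.\d+\.\d+)<BR>",
    re.IGNORECASE,
)


def find_records(html):
    """Return a list of (dx_number, ip_address) tuples found in the HTML text."""
    return RECORD.findall(html)


def format_records(records):
    """Render the pairs as tab-separated lines, ready to write to a text file."""
    return "".join(f"{dx}\t{ip}\n" for dx, ip in records)
```

Because the pattern requires the address to sit in a <TD>...<BR> cell, the second copy of the address inside the VNC link is never matched, so no de-duplication step is needed.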