Andrew Posted March 25, 2009
Considering the level of programming knowledge most of you seem to have, I must admit to being embarrassed to ask this question. But frankly I am a programming novice, so I am hopeful that an answer here will be relatively quick and painless. Simply put: I would like to find all instances of a 10-digit number beginning with '360' (the remaining seven digits can be anything) on a web page, and write them to a text file. Your suggestions for doing this would be much appreciated!
Authenticity Posted March 25, 2009
You'll need to either copy the text of the entire page via the clipboard or use the browser libraries (IE.au3, FF.au3) to read the document text (and, if the page contains frames, the text of each frame as well), then parse it using the StringRegExp function.
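A minimal sketch of the approach Authenticity describes, assuming the page opens in Internet Explorer and has no frames. The URL and the output file name are hypothetical placeholders, and the pattern is one possible way to match a 10-digit number starting with 360; none of these come from the thread.

```autoit
#include <IE.au3>

; Hypothetical intranet address; replace it with the real page.
Local $oIE = _IECreate("http://intranet.example/page.html")
If @error Then Exit

; Read the visible text of the document body (text inside frames is not included).
Local $sText = _IEBodyReadText($oIE)

; Flag 3 returns an array of all matches; @error is set when nothing matches.
; \b on both sides keeps the pattern from matching inside a longer run of digits.
Local $aNumbers = StringRegExp($sText, "\b360\d{7}\b", 3)
If @error Then Exit

; Append each match to a hypothetical output file, one number per line.
For $i = 0 To UBound($aNumbers) - 1
    FileWriteLine("numbers.txt", $aNumbers[$i])
Next

_IEQuit($oIE)
```

If the numbers sit inside frames, the text of each frame would have to be read separately and run through the same pattern.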
TerarinK Posted March 25, 2009
What tag name is it in? All you need to do is find the tag name and then read its innertext. Can I ask what web page this is?
Andrew Posted March 26, 2009 Author
Quote (TerarinK): what tagname is it in, because all you need to find is the tagname then innertext it. Can I ask what webpage is this?
TerarinK, the web page is on an intranet site.
Quote (Authenticity): You'll need to use the clipboard to copy the text of the entire page or to use the browser libraries (IE.au3, FF.au3) to read the document text (and if it contains frames then frames as well) and parse it using the StringRegExp function.
Excellent, Authenticity! That helped a great deal. My only problem now is that I need to filter out multiple instances of the same number. In many cases there are extended numbers on the same page that may look like:
3601234567
3601234567-321
3601234567-322
3607654321
3607654321-321
3607654321-322
...etc...
My StringRegExp is picking up the first 10 digits of all of them, and I need only the first instance (minus the -xxx).
logcomptechs Posted March 26, 2009
Quote (Andrew): My StringRegExp is picking up the first 10 digits of all of them, and I need only the first instance (minus the -xxx).
I made a script that removes duplicates from a text file and writes the unique lines to a new text file:
    #include <File.au3>
    #include <Array.au3>
    Dim $oFile, $nFile
    ; _FileReadToArray stores the line count in $oFile[0]; the lines start at index 1.
    _FileReadToArray("old_text.txt", $oFile)
    ; Base 1 skips that count (parameter order as in current AutoIt releases); the result holds its own count in [0].
    $nFile = _ArrayUnique($oFile, 0, 1)
    ; Write from index 1 so the count does not end up in the file.
    _FileWriteFromArray("new_text.txt", $nFile, 1)
You could just run this after you have collected all the numbers.
GEOSoft Posted March 26, 2009
Since StringRegExp returns an array, you can just #include <array.au3> and pass your array through _ArrayUnique().
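A minimal sketch of GEOSoft's suggestion, using Andrew's sample numbers as a stand-in for the page text. The pattern and the output file name are assumptions for illustration, not taken from the thread.

```autoit
#include <Array.au3>
#include <File.au3>

; Sample text standing in for the page text read from the browser.
Local $sText = "3601234567 3601234567-321 3601234567-322 " & _
        "3607654321 3607654321-321 3607654321-322"

; Flag 3 returns all matches; the -xxx suffix is left out because the
; pattern stops after ten digits and the hyphen counts as a word boundary.
Local $aMatches = StringRegExp($sText, "\b360\d{7}\b", 3)
If @error Then Exit MsgBox(16, "No matches", "No 10-digit numbers starting with 360 were found.")

; _ArrayUnique returns a new array: [0] holds the count, values start at [1].
Local $aUnique = _ArrayUnique($aMatches)

; Start at index 1 so the count is not written to the (hypothetical) output file.
_FileWriteFromArray("numbers.txt", $aUnique, 1)
```

With the sample text above, the six matches collapse to the two distinct numbers, so no intermediate text file is needed for the de-duplication step.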