mcfr1es

URLDownloadToFile

4 posts in this topic

#1 ·  Posted (edited)

Forgive my noobness....

Hypothetically speaking, let's say a webpage contains a lot of useless text but is also scattered with usernames (e.g. a forum). Using URLDownloadToFile, is there any way to save only the usernames listed on that page to a text file?

BTW, when I view the source I see that each username is contained within these tags...

<span class="blistSmall">"username"</span>

and also contained within this tag...

<a href="/randomfriend.phtml?user=---USERNAME---">

Is this info of any use to me, or do I need to obtain these usernames by other means? :ph34r:

Edited by mcfr1es

Roger! You son of a big pile o' Monkey Nuts.




Yes, there should be several ways to extract the usernames into a separate file...

Unfortunately, most automatically generated pages don't pretty-print their HTML source. (I've seen some web pages that have everything crammed into a few massive lines of code. Very ugly.)

If you're lucky, and the source has a line feed after each line of code, then you could write a function that reads the downloaded file in a simple while loop with FileReadLine and uses StringInStr to check each line for the tag.
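Here is a rough sketch of that line-by-line idea. It is written in Python rather than AutoIt, so treat it as an outline of the logic only: `str.find` plays roughly the role of StringInStr, and looping over the lines stands in for repeated FileReadLine calls. The function names and the file paths passed to `save_usernames` are made up for illustration, and it assumes each username and its closing `</span>` sit on the same line.

```python
OPEN_TAG = '<span class="blistSmall">'
CLOSE_TAG = '</span>'


def usernames_from_lines(lines):
    """Return the text found between OPEN_TAG and CLOSE_TAG on each line."""
    names = []
    for line in lines:
        start = line.find(OPEN_TAG)
        while start != -1:
            start += len(OPEN_TAG)  # move past the opening tag
            end = line.find(CLOSE_TAG, start)
            if end == -1:
                break  # closing tag is not on this line; this simple version gives up
            names.append(line[start:end].strip())
            start = line.find(OPEN_TAG, end)  # a line may hold several usernames
    return names


def save_usernames(src_path, dst_path):
    """Hypothetical helper: read the downloaded page, write one username per line."""
    with open(src_path, encoding="utf-8", errors="replace") as src:
        names = usernames_from_lines(src)
    with open(dst_path, "w", encoding="utf-8") as dst:
        dst.write("\n".join(names))
    return names
```

Something like `save_usernames("page.html", "users.txt")` would then be called once the page has been downloaded to disk.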

However, if you're unlucky enough to get one of the ugly source pages then you'll have to do more work:

1) You could either cheat and call the DOS find command (probably the easier option if you're short on time)

2) Or build your own native AutoIt routine to search each byte in the file looking for the desired tags.

That is: search until you find the '<' character. Do a quick check to ensure that the next character is not a slash '/' (which marks a closing tag). If it's an opening tag, save the current position and scan ahead to the end-of-tag marker '>'. Use StringInStr to see whether the tag you found is the one you're looking for (<span class="blistSmall">), and if so, copy the characters between the end of the opening tag ('>') and the beginning of its closing tag ('</').
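The scan described in option 2 could look roughly like this. Again, it is a Python sketch of the logic and not AutoIt; the function name is invented for the example. It assumes no attribute value contains a '>' character and that the username is plain text with no nested tags inside the span.

```python
TARGET_TAG = '<span class="blistSmall">'


def scan_for_usernames(html, target=TARGET_TAG):
    """Walk through the text tag by tag and collect what follows each target tag."""
    names = []
    pos = 0
    while True:
        lt = html.find("<", pos)  # search until we find '<'
        if lt == -1:
            break
        if html.startswith("</", lt):  # next character is '/': a closing tag, skip it
            pos = lt + 2
            continue
        gt = html.find(">", lt)  # scan ahead to the end-of-tag marker
        if gt == -1:
            break
        if html[lt:gt + 1] == target:  # is this the tag we are looking for?
            close = html.find("</", gt + 1)  # beginning of the closing tag
            if close == -1:
                break
            names.append(html[gt + 1:close].strip())
            pos = close
        else:
            pos = gt + 1
    return names
```

Because it never relies on line breaks, this version does not care whether the page is pretty-printed or crammed into one line.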

I know this may sound a little confusing, but I hope that it makes at least a little sense. I would normally provide an example, but I don't have the time at the moment. :ph34r:

Hope this helps!


#3 ·  Posted (edited)

Thank you, bartokv, your help is much appreciated...

If anyone has anything else to add (especially an example), feel free.

I am currently at the point where I scan for "<", but I do not know how to check the character after the angle bracket for a slash, or how to do any of the rest, to tell you the truth :ph34r:

Edited by mcfr1es



You could use this routine, with one slight modification: edit the three lines in the section marked ;Break it into chewable bytes.

Just use the token you identified, <span class="blistSmall">, instead of the href= in the routine.

Play around with it :ph34r:
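For anyone reading without the routine at hand, here is a guess at the general idea of splitting on a token, sketched in Python. This is not the routine mentioned above, just an illustration of the same kind of approach; the function name and the `terminator` parameter are invented for the example.

```python
def split_on_token(html, token='<span class="blistSmall">', terminator="<"):
    """Cut the page at every token and keep the text up to the next terminator."""
    chunks = html.split(token)[1:]  # whatever precedes the first token is discarded
    return [chunk.split(terminator, 1)[0].strip() for chunk in chunks]
```

With the default token this picks up the text inside each span. Passing `token="?user="` and `terminator='"'` would pull the names out of the `randomfriend.phtml` links instead.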

