How to read content of a specific website?

sgmailer

I was just wondering how I should approach reading the content of a web page once it has loaded.

For example, let's say I've navigated to the "General Help and Support" section of this forum. How would I make the script take the title of the first non-sticky topic, store it in a string variable, and then output it back to the user?

A related question: how do I read content based on certain triggers?

For example, I want to read only the numbers of a coordinate pair. Given (283,1033), I want to read only the 283 and the 1033. The triggers would be to read from the "(" to the "," and then from the "," to the ")". Assume the number of digits in x and y is unknown (it could be any length).
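One way to express those triggers is a regular expression with two capture groups. The following is a minimal sketch, not a definitive solution: the sample string and variable names are made up for illustration, and it assumes the coordinates are non-negative integers.

```autoit
; Sketch: pull x and y out of text containing a pair such as "(283,1033)".
; The sample text below is an assumption for illustration only.
Local $sText = "Position: (283,1033)"

; Flag 1 returns an array holding the captured groups of the first match.
; \d+ matches one or more digits, so the length of x and y does not matter.
Local $aMatch = StringRegExp($sText, "\((\d+),\s*(\d+)\)", 1)

If @error Then
    ConsoleWrite("No coordinate pair found" & @CRLF)
Else
    Local $iX = Number($aMatch[0]) ; text between "(" and ","
    Local $iY = Number($aMatch[1]) ; text between "," and ")"
    ConsoleWrite("x = " & $iX & ", y = " & $iY & @CRLF)
EndIf
```

The pattern would need adjusting if the values can be negative or contain decimals.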

Thanks in advance for any tips and advice and if you guys need me to clarify anything, please feel free to ask.

DaleHohm

There are lots of approaches, but you'll need to study and understand things a bit more...

There is _IEBodyReadHTML, _IEPropertyGet with the "innertext" property, _IETableWriteToArray, _INetGetSource and more...
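As a rough illustration of two of these functions, the sketch below loads a page and reads its body both as visible text and as HTML. The URL is only a placeholder, and printing the first 200 characters is an arbitrary choice to keep the output short.

```autoit
#include <IE.au3>

; Sketch: read the content of a page once it has loaded.
; The URL is a placeholder; _IECreate waits for the page to load by default.
Local $oIE = _IECreate("http://www.autoitscript.com")

; Visible text of the document body (markup stripped).
Local $sText = _IEBodyReadText($oIE)

; Raw HTML of the document body, useful when the markup itself matters.
Local $sHtml = _IEBodyReadHTML($oIE)

ConsoleWrite("Text starts with: " & StringLeft($sText, 200) & @CRLF)
ConsoleWrite("HTML length: " & StringLen($sHtml) & @CRLF)
```

Either string can then be searched with the usual string functions.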

Dale


Free Internet Tools: DebugBar, AutoIt IE Builder, HTTP UDF, MODIV2, IE Developer Toolbar, IEDocMon, Fiddler, HTML Validator, WGet, curl

MSDN docs: InternetExplorer Object, Document Object, Overviews and Tutorials, DHTML Objects, DHTML Events, WinHttpRequest, XmlHttpRequest, Cross-Frame Scripting, Office object model

Automate input type=file (Related)

Alternative to _IECreateEmbedded? better: _IECreatePseudoEmbedded  Better Better?

IE.au3 issues with Vista - Workarounds

SciTe Debug mode - it's magic: #AutoIt3Wrapper_run_debug_mode=Y

"Doesn't work" needs to be ripped out of the troubleshooting lexicon. It means that what you tried did not produce the results you expected. It begs the questions 1) what did you try?, 2) what did you expect? and 3) what happened instead?

Reproducer: a small (the smallest?) piece of stand-alone code that demonstrates your trouble

sgmailer

Thanks for the help!

I've spent the last few hours working on my program and just ran into something that I have no idea what to do about.

Basically, what I want to do is read the text between the header tags (H1 tags) and convert it to a string. After that, I'll figure out how to trim it down to only what I need.

I've tried the following but unfortunately, it didn't work:

$title = String(_IETagNameGetCollection($oIE,"H1"))
MsgBox(0,"Title",$title)

EDIT: Also, if anyone can point me in the right direction... What are collections and how do I deal with them? I can't seem to find anything in the AutoIt help files. :) For example, I found a post where Dale recommended looping through a collection. How do I do that? Is a collection similar to an array or a matrix? I'm fairly familiar with those.

DaleHohm

A collection is a group of objects (a collection is an object as well). Neither a collection nor an object is a text string, which is how you are trying to treat it. Objects can, however, have properties that hold the string you are looking for.

You ask about looping through a collection -- the function in your code above has an example in the helpfile that shows how to loop through its elements -- you use a For...In...Next loop.

For example:

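#include <IE.au3>
; Open the page and wait for it to load
$oIE = _IECreate("http://www.autoitscript.com")
; Get the collection of all H1 elements in the document
$oH1s = _IETagNameGetCollection($oIE, "H1")
; Each item in the collection is an element object; read its text property
For $oH1 In $oH1s
    ConsoleWrite("H1 text: " & _IEPropertyGet($oH1, "innertext") & @CRLF)
Next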
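If only the first H1 is needed as a string, a variation on the loop is to pass an index to _IETagNameGetCollection, which then returns a single element instead of the whole collection. This is a sketch under the assumption that the page has at least one H1; the variable names are illustrative.

```autoit
#include <IE.au3>

; Sketch: store the text of the first H1 in a string variable.
Local $oIE = _IECreate("http://www.autoitscript.com")

; A 0-based index as the third parameter returns one element object
; rather than the collection.
Local $oH1 = _IETagNameGetCollection($oIE, "H1", 0)

If @error Or Not IsObj($oH1) Then
    MsgBox(0, "Title", "No H1 element found")
Else
    ; The string lives in a property of the object, not in the object itself.
    Local $sTitle = _IEPropertyGet($oH1, "innertext")
    MsgBox(0, "Title", $sTitle)
EndIf
```

This is the same idea as the String(...) attempt earlier in the thread, except the string comes from the element's "innertext" property.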

Dale

