
Getting source from a website



I've been on quite an extended break from scripting, so try not to be too hard on me. I'm trying to automate some events on a website, something I was once reasonably adept at. My problem is that I'm having difficulty mapping the HTML structure. I've been using _IEBodyReadHTML/_IEDocReadHTML, but they return errors, specifically an error executing "document.body.innerHTML" or "document.documentElement.outerHTML". I then ran _INetGetSource, and the result was only a couple of lines that didn't help. I also viewed each page's source with IE's "View Source", but that doesn't help me map the frames, forms, etc. What is commonly used to extract a complete HTML source? I will try a DOM viewer when I get home tonight, but I've never found those viewers easy to understand. I'd prefer to look at the source in the form that _IE*ReadHTML returns. Thanks for any advice. :rolleyes:

IE Dev ToolbarMSDN: InternetExplorer ObjectMSDN: HTML/DHTML Reference Guide[quote]It is surprising what a man can do when he has to, and how little most men will do when they don't have to. - Walter Linn[/quote]--------------------[font="Franklin Gothic Medium"]Post a reproducer with less than 100 lines of code.[/font]

Thanks, but that's not what I'm getting at. I'm more than capable of using most of the _IE* functions, including the _IE*ReadHTML functions (many thanks to Dale for his support). My problem here is that those functions agree that I'm passing them a valid object, but the $object.document.body.innerHTML execution fails. Could this be a trick in the site's HTML that I'm not aware of? For instance (if that's even possible), could there be no <body> tags? I'm just speculating; I don't actually know why the innerHTML wouldn't be available, and that's why I'm asking :rolleyes:

Edit: typos



Is there a link for us to examine to get a better grasp on what it is you "need" and are trying to accomplish?

Common sense plays a role in the basics of understanding AutoIt... If you're lacking in that, do us all a favor, and step away from the computer.


That would be the same thing as IE's "View Source". That's fine for looking up things inside the frame or whatever you click on, but not the parent structure (I'm looking to get everything).


@SmOke_N:

I just want to get the complete source from a site. I want to be able to look through it and map frames, forms, elements, etc. Sorry if what I want is unclear. I usually use _IEDocReadHTML on my IE object to reference frames, then the same function on the frames to reference forms, and I keep going down until I get to where I need to be (if my memory serves me correctly). This time, _IEDocReadHTML isn't working for me (nor can I get that same info from _INetGetSource or View Source). The website is www.ogame.org, but it's the layout after a login that I am interested in (sign up a dummy account if you like). I don't want somebody to go look at the website and tell me a frame name; I'd like to be able to look at them all myself, I'm just having trouble doing that.
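The top-down approach described above (read the document, then each frame, then each nested frame) can be sketched as a small recursive helper. This is only a sketch: _DumpFrameTree is a hypothetical name, not part of IE.au3, and it assumes the IE.au3 functions already mentioned in this thread accept a frame object as well as the browser object. If a frame is served from a different domain than its parent, the read may fail with an access error, which is one possible (unconfirmed) explanation for the failure described here.

```autoit
#include <IE.au3>

; Hypothetical helper (not part of IE.au3): print the URL and HTML of a
; window or frame, then do the same for each of its child frames.
Func _DumpFrameTree($oWindow, $sIndent = "")
    ; URL of this window or frame.
    Local $sUrl = _IEPropertyGet($oWindow, "locationurl")
    If @error Then $sUrl = "<URL not readable>"
    ConsoleWrite($sIndent & "Frame: " & $sUrl & @CRLF)

    ; Full document source of this window or frame.
    Local $sHtml = _IEDocReadHTML($oWindow)
    Local $iErr = @error
    If $iErr Then
        ConsoleWrite($sIndent & "  HTML not readable, @error = " & $iErr & @CRLF)
    Else
        ConsoleWrite($sHtml & @CRLF)
    EndIf

    ; Child frames: the frame count comes back in @extended,
    ; so capture it immediately after the call.
    Local $oFrames = _IEFrameGetCollection($oWindow)
    Local $iFrameErr = @error, $iNumFrames = @extended
    If $iFrameErr Then Return

    For $i = 0 To $iNumFrames - 1
        Local $oFrame = _IEFrameGetCollection($oWindow, $i)
        If @error Then ContinueLoop
        _DumpFrameTree($oFrame, $sIndent & "  ")
    Next
EndFunc   ;==>_DumpFrameTree

; Usage: start from the browser object and walk everything below it.
Local $oIE = _IECreate("http://www.ogame.org")
If @error Then Exit 1
_DumpFrameTree($oIE)
```

Checking @error after every call means an unreadable frame is reported and skipped rather than stopping the whole walk, so whatever can be read is still printed.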

Thanks



Sounds to me like you need to familiarize yourself with _IEFormGetCollection/_IEFrameGetCollection... Then do some error checking, and you should be smooth on your way.
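A minimal sketch of what a form collection pass with error checking might look like. It assumes the page is the login page mentioned in this thread and that its form controls are ordinary inputs, selects and buttons exposing name and type properties; other element kinds may not expose both.

```autoit
#include <IE.au3>

Local $oIE = _IECreate("http://www.ogame.org/home.php")
If @error Then Exit 1

; Get all forms; the form count comes back in @extended.
Local $oForms = _IEFormGetCollection($oIE)
Local $iErr = @error, $iNumForms = @extended
If $iErr Then
    ConsoleWrite("_IEFormGetCollection failed, @error = " & $iErr & @CRLF)
    Exit 1
EndIf
ConsoleWrite("Forms found: " & $iNumForms & @CRLF)

For $oForm In $oForms
    ConsoleWrite("Form: " & $oForm.name & " (" & $oForm.method & ")" & @CRLF)

    ; List the controls inside this form.
    Local $oElements = _IEFormElementGetCollection($oForm)
    If @error Then ContinueLoop
    For $oElement In $oElements
        ConsoleWrite("  " & $oElement.name & " [" & $oElement.type & "]" & @CRLF)
    Next
Next
```

A count of 0 with no error set means the call itself worked and the document simply contains no forms at that level; the forms may then live inside a frame, which has to be queried separately.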

Edit:

I should say this... the "evasive" tactic you're using of not revealing the site isn't going to get you the "absolute" solution any quicker :rolleyes: .


The answer to my question doesn't depend on what site it is. I'm just looking for a general method to get HTML source. Pertaining to your suggestion, I've already done that, I got 0 frames when I ran through a collection. So then I ran a form collection and got 0 forms on the page. That's when I went to _IEDocReadHTML, and you know the rest. Since I gave you the url anyways, did you have any luck on the page? :rolleyes:



I didn't notice the link, sorry.

http://www.ogame.org/
0 forms

http://www.ogame.org/home.php
1 form

Form (columns: Id, Name, Method, Action)
loginForm post

Elements (columns: Index, Id, Name, Type, Value, Label, Size, Maximum Length, State)
0 v hidden 2
1 universe select
2 login 20
3 pass password 20
4 button button image

http://nwl.gameforge.de/?zone=ogame&lang=en
0 forms

about:blank
0 forms

This did get the frame for me (straight from the help file):
#include <IE.au3>

; Open the site and list the URL of every frame on the page.
$oIE = _IECreate("http://www.ogame.org")
$oFrames = _IEFrameGetCollection($oIE)
$iNumFrames = @extended ; _IEFrameGetCollection returns the frame count in @extended
For $i = 0 To ($iNumFrames - 1)
    $oFrame = _IEFrameGetCollection($oIE, $i)
    ConsoleWrite(@CRLF & "Frame Info:" & @CRLF & _IEPropertyGet($oFrame, "locationurl") & @CRLF)
Next


Right, I have no problem with things before the login (which you've shown). I grabbed the frame, form, and form elements names on the login page. It's after I login that I have trouble mapping HTML.



Guess we are left to assume you are doing everything correctly... there's no script, and I'm not going to sign up to test it personally... Are you using _IEAttach() after the login page loads to get the new IE object?


No, I wasn't aware that I needed to re-attach after a page has loaded. Are you sure that's correct, or am I perhaps misunderstanding you? There's no new IE object being created, just the same one redirected. As for a script, when I get home later (the website is blocked here at work), I can write a quick one to log you in with a dummy account and try to grab the HTML. I would do it right now, since I remember the frame/form names, but I can't sign up a new account on account of the web-blocker.



I can't remember if it's 100% necessary, but I remember having to do it with a script I had written. Now I just use _IEAttach() as a habit to make sure.
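A rough sketch of that habit, assuming the post-login page URL still contains "ogame" (an assumption, since the post-login layout isn't shown in this thread). The login step itself is left out.

```autoit
#include <IE.au3>

Local $oIE = _IECreate("http://www.ogame.org/home.php")
If @error Then Exit 1

; ... fill in and submit the login form here (omitted) ...

; Wait for the page that follows the login to finish loading.
_IELoadWait($oIE)

; Re-attach to the browser by URL substring.
; "ogame" is an assumed substring of the post-login URL.
Local $oAttached = _IEAttach("ogame", "url")
If @error Then
    ConsoleWrite("Could not re-attach, keeping the original object." & @CRLF)
    $oAttached = $oIE
EndIf

; Check what the (re)attached object can see.
Local $oFrames = _IEFrameGetCollection($oAttached)
Local $iErr = @error, $iNumFrames = @extended
If $iErr Then
    ConsoleWrite("_IEFrameGetCollection failed, @error = " & $iErr & @CRLF)
Else
    ConsoleWrite("Frames after login: " & $iNumFrames & @CRLF)
EndIf
```

Falling back to the original object when _IEAttach fails keeps the rest of the script usable whether or not re-attaching turns out to be necessary.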


I've never encountered the need to do that before, but I'll try it when I get home. If it still doesn't work I'll post a reproducer...


This should reproduce the situation:

Edit: <dummy user/pass removed>

