dbzfanatic

_IEDocReadText()

#1 ·  Posted (edited)

Here's another IE UDF. I hope Dale doesn't mind or feel I'm trying to horn in on his territory :).

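#include-once
#include <IE.au3>

; Returns the document source with HTML tags removed, or 0 (with @error set) on failure.
Func _IEDocReadText(ByRef $oObject)
    Local $sSource, $sText

    If Not IsObj($oObject) Then
        __IEErrorNotify("Error", "_IEDocReadText", "$_IEStatus_InvalidDataType")
        Return SetError($_IEStatus_InvalidDataType, 1, 0)
    EndIf

    $sSource = _IEDocReadHTML($oObject)
    ; One pass removes opening and closing tags alike; (?s) lets "." span line breaks.
    $sText = StringRegExpReplace($sSource, '(?s)<.*?>', "")
    ; Flag 4 collapses runs of spaces between words.
    $sText = StringStripWS($sText, 4)
    ; Set the status last, because the calls above overwrite @error.
    Return SetError($_IEStatus_Success, 0, $sText)
EndFunc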

Example:

#include <IE.au3>

$oIE = _IECreate("about:blank",1,1,1,1)
_IENavigate($oIE,"www.autoitscript.com",1)
ConsoleWrite(_IEDocReadText($oIE))
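
A note on the '<.*?>' pattern: by default "." does not match newline characters, so a tag that is broken across lines is left in place unless the (?s) option is set. The sketch below shows the difference on a made-up string (the sample HTML and variable names are assumptions for illustration, not part of the UDF):

```autoit
; Hypothetical sample: an anchor tag broken across two lines.
Local $sHtml = '<a' & @CRLF & 'href="x">link</a>'

; Default: "." stops at the line break, so the opening tag is not matched.
Local $sDefault = StringRegExpReplace($sHtml, '<.*?>', "")

; With (?s), "." also matches newline characters, so both tags are removed.
Local $sDotAll = StringRegExpReplace($sHtml, '(?s)<.*?>', "")

ConsoleWrite("Default: " & $sDefault & @CRLF)
ConsoleWrite("DotAll:  " & $sDotAll & @CRLF)
```

Regex-based stripping is still only an approximation; script and style contents, comments, and entities are not handled.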

Edit: updated function to remove excess white space.

Edit 2: typo

Edit 3: typo again


IE.au3 has _IEDocReadHTML, but no _IEDocReadText because _IEDocReadText would return the same result as _IEBodyReadText. The difference between _IEBody* and _IEDoc* is that _IEDoc* includes the page <HEAD> section, which is all HTML and is stripped out by a function returning just the text.

Dale
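
One caveat that may be worth checking: a regex-based tag strip over the full document source removes only the tags, so any text inside <title> survives, whereas a body-only read never sees it. A minimal sketch on hard-coded strings (the helper name and the sample markup are made up for illustration and are not part of IE.au3):

```autoit
; Hypothetical helper for illustration; not part of IE.au3.
Func _StripTagsSketch($sHtml)
    Return StringRegExpReplace($sHtml, '(?s)<.*?>', "")
EndFunc

; Made-up page pieces.
Local $sHead = '<head><title>Demo Page</title></head>'
Local $sBody = '<body><p>Hello</p></body>'

; Whole document: the title text is kept because only the tags are removed.
ConsoleWrite(_StripTagsSketch('<html>' & $sHead & $sBody & '</html>') & @CRLF) ; Demo PageHello
; Body only: no title text.
ConsoleWrite(_StripTagsSketch($sBody) & @CRLF) ; Hello
```

This would account for an extra leading title line when the two approaches are compared on a real page.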


The only reason I wrote this is that I've seen headers with text in them. I wasn't sure whether the two would return different results, so I wrote it to find out. Even if it turns out to be a bit useless, it's good practice if nothing else. Please don't feel I'm trying to undermine you; in fact, I'd say you're the IE king :) I honestly never expected you to come across this topic, and the fact that you posted at all makes me feel that I at least did something right, even if I only succeeded in doing something wrong :).

You didn't expect Dale to read something that had IE in the title? <News flash> He will usually be among the first to read them.</News flash>

George

#6 ·  Posted (edited)

I know I'm not supposed to bump within 24 hours, but I figured I'd do this now before I forgot. I tested whether _IEBodyReadText() and _IEDocReadText() return the same thing, and in two different tests they do not, at least as far as AutoIt's string comparison is concerned. Here are the tests I performed:

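; Assumes IE.au3 and the _IEDocReadText() UDF above are already included.
$oIE = _IECreate("about:blank",1,1,1,1)
_IENavigate($oIE,"www.autoitscript.com",1)

; Test 1: compare with the same whitespace stripping applied to both sides.
If _IEDocReadText($oIE) <> StringStripWS(_IEBodyReadText($oIE),4) Then
    MsgBox(0,"","It's NOT the same! (whitespace stripped)")
EndIf

; Test 2: compare the raw results.
If _IEDocReadText($oIE) <> _IEBodyReadText($oIE) Then
    MsgBox(0,"","It's NOT the same! (raw)")
EndIf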

I at first thought it was the formatting because I use the StringStripWS() function in my UDF but it appears that is not the case. I am about to do a console write between the two for comparison and will post that soon.
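
When two long strings compare as different, it can help to locate the first position where they diverge before reading through full console dumps. A minimal sketch, assuming a hypothetical helper named _FirstDiffSketch (not part of IE.au3 or the UDF above):

```autoit
; Hypothetical helper: returns the 1-based position of the first differing
; character, or 0 if the strings are identical.
Func _FirstDiffSketch($sA, $sB)
    Local $iMax = StringLen($sA)
    If StringLen($sB) < $iMax Then $iMax = StringLen($sB)

    For $i = 1 To $iMax
        ; == is a case-sensitive string comparison in AutoIt.
        If Not (StringMid($sA, $i, 1) == StringMid($sB, $i, 1)) Then Return $i
    Next

    ; One string is a prefix of the other: they differ just past the shorter one.
    If StringLen($sA) <> StringLen($sB) Then Return $iMax + 1
    Return 0
EndFunc

ConsoleWrite(_FirstDiffSketch("AutoIt v3", "AutoIt v2") & @CRLF) ; 9
```

Note that the <> operator used in the tests above compares strings without regard to case, while this sketch is case-sensitive.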

Edit: Here are the results

_IEDocReadText()

==============
AutoIt Script Home Page
AutoIt
AutoIt v3
AutoIt v2 (legacy)
GImageX
GImageX
Misc.
C++ Source Code
Introduction
Welcome to the AutoIt Script home page - the home of AutoIt scripting and related applications. This site provides everything you need to get started with AutoIt and features great user support via the forum.
AutoIt
AutoIt is a freeware Windows automation language. It can be used to script most simple Windows-based tasks (great for PC rollouts or home automation). AutoIt has been in popular use since 1999 and continues to provide users and administrators with an easy way to script the Windows GUI. In February 2004 the latest version of AutoIt - known as AutoIt v3 - was released and added powerful scripting features.
AutoIt v3 was developed in a small team with the help of contributors around the world and this has led to a great set of help files, examples, support forum, mailing list, editor files, and third-party utilities.&nbsp; Oh, and lets not forget some nice graphics and wallpapers too! &nbsp; ImageX GUI (GImageX)
For a graphical user interface for the ImageX tool see this page.
&nbsp;
C++ Source Code
Click here to go to the source code section. There are some useful C++ libraries here including some LZ77 compression routines and some source code from AutoIt v3.
&nbsp;
&nbsp;
&nbsp;
&nbsp;
©1999-2008 Jonathan Bennett
Back To Top
==============

_IEBodyReadText()

==============
 AutoIt
AutoIt v3
AutoIt v2 (legacy)
GImageX
GImageX
Misc.
C++ Source Code
Introduction
Welcome to the AutoIt Script home page - the home of AutoIt scripting and related applications. This site provides everything you need to get started with AutoIt and features great user support via the forum.
AutoIt
AutoIt is a freeware Windows automation language. It can be used to script most simple Windows-based tasks (great for PC rollouts or home automation). AutoIt has been in popular use since 1999 and continues to provide users and administrators with an easy way to script the Windows GUI. In February 2004 the latest version of AutoIt - known as AutoIt v3 - was released and added powerful scripting features.
AutoIt v3 was developed in a small team with the help of contributors around the world and this has led to a great set of help files, examples, support forum, mailing list, editor files, and third-party utilities. Oh, and lets not forget some nice graphics and wallpapers too! ImageX GUI (GImageX)
For a graphical user interface for the ImageX tool see this page.
C++ Source Code
Click here to go to the source code section. There are some useful C++ libraries here including some LZ77 compression routines and some source code from AutoIt v3.
©1999-2008 Jonathan BennettBack To Top
==============

It appears the only difference is that my code returns the HTML entity &nbsp;, which would be useful for formatting purposes but not much else. Mine also returns "Back To Top" on a new line, but that is minor. There are other minor formatting differences as well but, as usual, Dale was right :)
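
If the leftover &nbsp; entities are unwanted, they could be replaced with ordinary spaces after the tags are stripped. A minimal sketch, assuming a hypothetical helper named _DecodeNbspSketch; it handles only this one entity and is not a general HTML entity decoder:

```autoit
; Hypothetical helper: turn the literal "&nbsp;" entity into a space,
; then collapse any resulting runs of spaces between words (flag 4).
Func _DecodeNbspSketch($sText)
    Return StringStripWS(StringReplace($sText, "&nbsp;", " "), 4)
EndFunc

ConsoleWrite(_DecodeNbspSketch("utilities.&nbsp; Oh, and") & @CRLF) ; utilities. Oh, and
```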


but, as usual, Dale was right :)

Could have told you that right up front. There is no other person with as much IE experience around here as Dale has. If he tells you something then you can pretty much take it to the bank.

George
