ShariefPareed Posted March 9, 2005

Hi all,

Can anyone guide me on how to capture all the displayed text from an IE browser page? I have set the "WinDetectHiddenText" option to 1, but it still does not return all of the detail section of the page.

Any help will be highly appreciated.

Thank you,
Sharief Pareed
Intel Corporation

Alterego Posted March 9, 2005

Alternatively, you could grab the page out of the cache...

SvenP Posted March 14, 2005

Hello Sharief Pareed,

Which programming language are you going to try this in? As you posted your question in the ActiveX forum of AutoIt, I assume you are using an object-aware programming language. In that case you don't have to use AutoItX. Just open the "winhttp.winhttprequest.5" object and you are in business.

A short example (yes, this one is in AutoIt script, not AutoItX):

    $httpObj = ObjCreate("winhttp.winhttprequest.5")
    $httpObj.open("GET","http://your-url-here.com/etc/etc")
    $httpObj.send()
    $HTMLSource = $httpObj.Responsetext

The variable $HTMLSource contains the complete ASCII source code of the given HTML page.

More info about this object can be found at: http://msdn.microsoft.com/library/en-us/wi...httprequest.asp

Hope this helps.

Regards,
-Sven

therks Posted March 15, 2005

The winhttp.winhttprequest.5 object doesn't seem to work for me, which I find odd, seeing as according to the manual, "the winhttp.winhttprequest.5 object only exist on computers that have at least Internet Explorer version 5 installed." And I'm running IE6. But whenever I try that code I get: Variable must be of type "Object".

Ah well.

SvenP Posted March 15, 2005

Weird indeed. Maybe it's only present on Win2000/XP/2003 computers?

You could check your registry to see whether this object is present. It is located directly under HKEY_CLASSES_ROOT; search for the key starting with WinHttp. Maybe on your computer it has a lower or higher version, e.g. "WinHttp.WinHttpRequest.5.1".

Regards,
-Sven

therks Posted March 16, 2005

Thanks for the tip on where to check it. It was indeed WinHttp.WinHttpRequest.5.1. And I am running WinXP, btw.

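The versioned-name problem found above can also be handled in the script itself by trying each candidate ProgID in turn. What follows is only a minimal AutoIt sketch, assuming the two names mentioned in this thread are the only ones worth trying; the URL is a placeholder.

```autoit
; Try the versioned ProgID first, then the older name from the earlier post.
Local $aProgIDs[2] = ["WinHttp.WinHttpRequest.5.1", "WinHttp.WinHttpRequest.5"]
Local $oHttp = 0

For $i = 0 To UBound($aProgIDs) - 1
    $oHttp = ObjCreate($aProgIDs[$i])
    ; ObjCreate does not return an object when the ProgID is not registered.
    If IsObj($oHttp) Then ExitLoop
Next

If Not IsObj($oHttp) Then
    MsgBox(16, "WinHTTP", "No WinHttpRequest object is registered on this machine.")
    Exit 1
EndIf

; Placeholder URL (assumption): replace with the page you want to read.
$oHttp.Open("GET", "http://www.autoitscript.com", False)
$oHttp.Send()
MsgBox(0, "Page source length", StringLen($oHttp.ResponseText))
```

Checking the result with IsObj before calling any method is what avoids the 'Variable must be of type "Object"' error reported earlier.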
Fredledingue Posted July 23, 2005

Here is the code to download the source of a web page:

    With CreateObject("MSXML2.XMLHTTP")
        .open "GET", "http://finance.yahoo.com", False
        .send
        t = .responseText
        MsgBox t
    End With

Here is a full script to download the source and save it to a text file. (You have to create temp.htm in the same directory as the script before trying it.)

    '-------By Fredledingue------
    '--------set constants--------
    Const ForReading = 1, ForWriting = 2, ForAppending = 3
    Const TristateUseDefault = -2, TristateTrue = -1, TristateFalse = 0
    Const OPEN_FILE_FOR_APPENDING = 8
    Set fso = CreateObject("Scripting.FileSystemObject")
    '---------------------
    theURL = "http://www.autoitscript.com"
    With CreateObject("MSXML2.XMLHTTP")
        .open "GET", theURL, False
        .send
        t = .responseText
        't = Replace(Replace(t, Chr(10) , "_"), "_", VbCrlf) & VbCrlf
    End With
    t = Replace(t, ">", ">" & VbCrlf)
    MsgBox Len(t) & VbCrlf & t
    Set objOutputFile = fso.OpenTextFile("temp.htm", ForWriting)
    ' Write one character at a time: with On Error Resume Next, a character
    ' that cannot be written is skipped instead of aborting the whole write.
    i = 1
    On Error Resume Next
    Do Until i > Len(t)
        objOutputFile.Write Mid(t, i, 1)
        i = i + 1
    Loop
    objOutputFile.Close
    MsgBox "ok"

Here is partial code to convert this HTML to plain text (f is set to temp.htm):

    '------by Fredledingue--------
    Set f = fso.GetFile("temp.htm")
    Set ts = f.OpenAsTextStream(ForReading, TristateFalse)
    Do Until ts.AtEndOfStream
        t = ts.ReadLine
        '-------lowercase------
        t = Replace(Replace(t,"&nbsp;","")," }","")
        t = Replace(t,"</tr>",VbCrlf)
        t = Replace(t,"</br>",VbCrlf)
        t = Replace(t,"<br>",VbCrlf)
        t = Replace(t,"</dd>",VbCrlf)
        t = Replace(t,"<p", VbCrlf & VbCrlf & VbTab & "<")
        t = Replace(t,"<table", VbCrlf & VbCrlf & "<")
        t = Replace(t,"</table>", VbCrlf & VbCrlf)
        t = Replace(t,"<title>","Page Title: ")
        t = Replace(t,"javascript>","")
        t = Replace(t,"<style>","<")
        t = Replace(t,"</style>",">")
        t = Replace(t,"&quot;","""")
        t = Replace(t,"&amp;","&")
        t = Replace(t,"&bull;","*")
        t = Replace(t,"&mdash;","--")
        t = Replace(t,"World&quot;","""")
        t = Replace(t,"<!-- saved from url=","Saved from url: ")
        t = Replace(t,"onclick=","<")
        '-------uppercase------
        t = Replace(Replace(t,"&NBSP;","")," }","")
        t = Replace(t,"</TR>",VbCrlf)
        t = Replace(t,"</BR>",VbCrlf)
        t = Replace(t,"<BR>",VbCrlf)
        t = Replace(t,"</DD>",VbCrlf)
        t = Replace(t,"<P", VbCrlf & VbCrlf & VbTab & "<")
        t = Replace(t,"<TABLE", VbCrlf & VbCrlf & "<")
        t = Replace(t,"</TABLE>", VbCrlf & VbCrlf)
        t = Replace(t,"<TITLE>","Page Title: ")
        t = Replace(t,"Javascript>","")
        t = Replace(t,"<STYLE type=text/css>","<")
        t = Replace(t,"<STYLE>","<")
        t = Replace(t,"</STYLE>",">")
        t = Replace(t,"<!-- SAVED FROM URL=","Saved from url: ")
        t = Replace(t,"onclick=","<")
        '----------Save internet links?--------------
        If KeepLinks = True Then
            t = Replace(t,"HREF=", VbCrlf & ">_Link: ")
            t = Replace(t,"ID="," <id=")
            t = Replace(t,"href=", VbCrlf & ">_Link: ")
            t = Replace(t,"id=","<id=")
        End If
        '------------------------
        If InStr(t,"<") > 0 Or InStr(t,">") > 0 Then
            ' Copy only the characters that fall outside <...> tags.
            i = 0
            u = ""
            Do Until i = Len(t)
                i = i + 1
                c = Mid(t,i,1)
                If c = "<" Then
                    IsText = False
                Else
                    If c = ">" Then
                        IsText = True
                    End If
                End If
                If IsText = True And c <> ">" Then
                    u = u & c
                End If
            Loop
            i = 0
            t = u
            u = ""
            Text = Text & VbCrlf & t
        Else
            If t <> "-->" And IsText = True Then
                Text = Text & VbCrlf & t
            End If
        End If
    Loop
    t = ""
    ts.Close
    '----------delete useless blank lines------------------
    Do While InStr(Text, "  ")
        Text = Replace(Text, "  ", " ")
    Loop
    Do While InStr(Text, VbTab & " ")
        Text = Replace(Text, VbTab & " ", VbTab)
    Loop
    Do While InStr(Text, " " & VbTab)
        Text = Replace(Text, " " & VbTab, VbTab)
    Loop
    Do While InStr(Text, VbTab & VbCrlf)
        Text = Replace(Text, VbTab & VbCrlf, VbCrlf)
    Loop
    Do While InStr(Text, " " & VbCrlf)
        Text = Replace(Text, " " & VbCrlf, VbCrlf)
    Loop
    Do While InStr(Text, VbCrlf & VbCrlf & VbCrlf & VbCrlf)
        Text = Replace(Text, VbCrlf & VbCrlf & VbCrlf & VbCrlf, VbCrlf & VbCrlf & VbCrlf)
    Loop
    Text = Replace(Text, "_Link: ", VbCrlf & "_Link: ")
    '---------------------write to file----------------------
    Set objOutputFile = fso.CreateTextFile("temp.txt", True)
    objOutputFile.Write Text
    objOutputFile.Close

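The same tag-stripping idea can be sketched more compactly in AutoIt with regular expressions. This is a rough illustration rather than an equivalent of the VBS script above: the function name _HtmlToText is hypothetical, it handles only a few entities, it does not extract links, and regular expressions are not a real HTML parser.

```autoit
; Hypothetical helper: reduce an HTML source string to rough plain text.
Func _HtmlToText($sHtml)
    ; Drop script and style blocks entirely, including their contents.
    $sHtml = StringRegExpReplace($sHtml, "(?is)<(script|style)\b.*?</\1\s*>", "")
    ; Turn common line-breaking tags into newlines before removing tags.
    $sHtml = StringRegExpReplace($sHtml, "(?i)<br\s*/?>|</(p|tr|dd|table)>", @CRLF)
    ; Remove every remaining tag.
    $sHtml = StringRegExpReplace($sHtml, "(?s)<[^>]*>", "")
    ; Decode a few common entities; &amp; goes last to avoid double decoding.
    $sHtml = StringReplace($sHtml, "&nbsp;", " ")
    $sHtml = StringReplace($sHtml, "&quot;", '"')
    $sHtml = StringReplace($sHtml, "&amp;", "&")
    ; Collapse runs of three or more line breaks into a single blank line.
    $sHtml = StringRegExpReplace($sHtml, "(\r\n){3,}", @CRLF & @CRLF)
    Return $sHtml
EndFunc

; Usage with a literal string, so no network access is needed.
MsgBox(0, "Text", _HtmlToText("<p>Hello <b>world</b></p><br>Tom &amp; Jerry"))
```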
therks Posted July 23, 2005 Share Posted July 23, 2005 (edited) Uh... What the hell are you doing? You revive a 4 month dead thread, to reply with code that isn't even AutoIt?? And your code is huge. Why would you download the entire source of the page, then "convert" it into text when you could tap into an Internet Explorer COM object directly and just read it right off the page? No need to dump out your entire toolbox just to hammer a nail. *Edit: Removed a rude comment, replace it with some more nagging dialog. Edited July 23, 2005 by Saunders My AutoIt Stuff | My Github Link to comment Share on other sites More sharing options...
Fredledingue Posted July 23, 2005

Saunders,

First, I didn't notice (and anyway don't care) that the thread is four months old. Would you stop checking this forum after four months? I just noticed it was on the SECOND page of the forum and therefore still current.

The code posted by SvenP is not VBS, and even translated to VBS, it doesn't work. Anyway, his code, if working, would download exactly the entire source of the page, and you would still need my "huge" code to do something with it.

My codes are by no means huge, considering what they do. The first code just pops up a message box, and it's no longer than SvenP's. The second code downloads the source to a text file in an easy-to-edit format and WITHOUT ERROR. The third code sorts out text and links from the HTML source (here saved as text). If you join the second and third codes, you practically have a text-based web browser.

Please try to do it smaller if you can.

therks Posted July 23, 2005 Share Posted July 23, 2005 (edited) Please try to do it smaller if you can.<{POST_SNAPBACK}>Okay.$o_object = ObjCreate("InternetExplorer.Application") If IsObj($o_object) Then $o_object.visible = 0 $o_object.navigate ("http://www.google.com/") While $o_object.busy Sleep(100) WEnd $s_Text = $o_object.document.body.innerText $o_object.quit() EndIf MsgBox(0, 'Page Text', $s_Text)What really bothered me was not that you replied to a dead topic, was not that you provided large functions, but that you replied in an AutoIt forum with strictly VBS code.The code posted by SvenP is not VBS and even translated to vbs, it doesn't work.Wow, it's not VBS? Perhaps it's AutoIt code. Who would have guessed that would show up in an AutoIt forum?First, I didn't notice (and anyway don't care) that the thread is four months old. Would you stop checking this forum after 4 months?I just noticed it was in the SECOND page of the forum and therefore still actual.Uh yeah, maybe it was on the second page, after we both replied to it... But I'll admit to reviving dead threads on occasion, it happens, no biggie. Like I said, it was the content of your post that bothered me more than the fact you brought this back up.Anyway, I'll admit, maybe I flew off the handle a little bit, but I'd just finished reading several other stupid posts (which probably more deserved my flaming) and was in a grumpy sleep deprived mood when I hit yours and kind of just snapped.Btw: The code I got from above was all taken from Dale's IE automation UDF set (I just took out the bits I needed for the example).*Edit: Lots of rewording. Edited July 23, 2005 by Saunders My AutoIt Stuff | My Github Link to comment Share on other sites More sharing options...
Fredledingue Posted July 24, 2005 (edited)

I'm posting in the forum for AutoItX, which is a DLL for use from VBS, at least that's the way I use it. I assume that if you wanted to talk strictly AutoIt code, you would post in another section of the forum.

About your code, I must admit it's shorter, but I know from experience that it's much slower than the .responseText method with the MSXML2.XMLHTTP object, especially when downloading large numbers of pages. It also doesn't allow you to manipulate the source code and extract data other than text.

Edited July 24, 2005 by Fredledingue

therks Posted July 24, 2005

> I'm posting in the forum for AutoItX which is a dll for VBS implementation, at least that's the way I use it.

Perhaps, but it still seemed odd to me that you didn't even use or mention AutoItX.dll. I just would have mentioned in my post that you could do it exclusively in VB. Also, the DLL can be used from many languages, not just VB.

Anyway, I apologize for being rude. It was unbecoming of me. I'd had a bad couple of days, and while that doesn't excuse my behaviour, I hope it provides a little understanding.

WSCPorts Posted July 26, 2005 (edited)

I tried my own version of InetGet and IE automation. I get an error at $iE.document, or if I try to set it like so: $oDoc = $iE.document. I still get an error, something like "can't do that operation on object" :/ But here's my code. This should work; I don't know why it's not.

    func getTextToIE($strURL)
        Dim $objError = @Error
        Dim $oDoc
        Dim $strResult;
        ; Create the WinHTTPRequest ActiveX Object and IEObject.
        dim $WinHttpReq = ObjCreate("WinHttp.WinHttpRequest.5.1")
        dim $iE = ObjCreate("InternetExplorer.Application")
        ; Create an HTTP request.
        Dim $temp = $WinHttpReq.Open("GET", $strURL, false)
        ; Send the HTTP request.
        $WinHttpReq.Send()
        ; Retrieve the response text.
        $strResult = $WinHttpReq.ResponseText
        ; Return the response text To Ie.
        return $strResult
        $iE.RegisterAsBrowser = 1
        ;get to a blank page
        $iE.navigate("about:blank")
        $iE.Visible = 1
        ;make a Doc Object for automation ?
        $iE.document = $oDoc
        $oDoc.body.insertAdjacentHtml(0, $strResult)
        If @Error = 1 then
            $strResult = $objError
            $strResult = $strResult & "WinHTTP returned error: " +_
            ($objError.number & 0xFFFF).toString()
            $strResult = $strResult & $objError.description
            MsgBox(2, "Error Raised", $strResult)
        Endif
    EndFunc

    dim $objText = getTextToIE("http://www.google.com")

Edited July 26, 2005 by WSCPorts

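For readers trying the function above, here is one possible corrected sketch of what it appears to be attempting: fetch a page with WinHTTP and show the result in a blank IE window. It rests on several assumptions: the function name _ShowSourceInIE is hypothetical, the 5.1 ProgID is the one found earlier in this thread, and no COM error handler is installed, so a failed request will still stop the script. The changes from the original are that the Return moves to the end (everything after the early return never ran), the document is read from IE instead of being assigned to it, and insertAdjacentHTML receives a position string.

```autoit
; Hypothetical helper: fetch $sURL with WinHTTP and display it in IE.
Func _ShowSourceInIE($sURL)
    Local $oHttp = ObjCreate("WinHttp.WinHttpRequest.5.1")
    If Not IsObj($oHttp) Then Return SetError(1, 0, "")

    ; Synchronous GET, then read the response body as text.
    $oHttp.Open("GET", $sURL, False)
    $oHttp.Send()
    Local $sResult = $oHttp.ResponseText

    Local $oIE = ObjCreate("InternetExplorer.Application")
    If Not IsObj($oIE) Then Return SetError(2, 0, $sResult)

    ; Load a blank page and wait until IE has finished navigating.
    $oIE.Visible = 1
    $oIE.Navigate("about:blank")
    While $oIE.Busy
        Sleep(100)
    WEnd

    ; Read the document object from IE (not the other way round).
    Local $oDoc = $oIE.document
    ; The first argument is a position string, not a number.
    $oDoc.body.insertAdjacentHTML("beforeEnd", $sResult)

    Return $sResult
EndFunc

Local $sText = _ShowSourceInIE("http://www.google.com")
If @error Then MsgBox(16, "Error", "Could not create a required COM object, code " & @error)
```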
DaleHohm Posted July 26, 2005

You've got some funky syntax in here and I'm really not clear on what you are trying to do. But you know what, I really wouldn't pile onto this post and keep it alive with your reply if I were you... too much water over the dam in this one. If you want some discussion on your code, I'd suggest a new thread.

Dale

Fredledingue Posted July 29, 2005

Saunders,

OK, no problem! BTW, thanks for the piece of code; in fact I didn't know it...

WSCPorts,

The "WinHttp.WinHttpRequest.5.1" object doesn't seem to work. Try the other code posted above by Saunders and me...