cypher175

Open a Webpage Stripped of all Media..?

14 posts in this topic

Is it possible with AutoIt to open a webpage in its basic form, with all of its media (pictures, videos, graphics, etc.) stripped out of it?




Good question. I can only suggest playing with _IECreateEmbedded. It works differently from _IECreate and gives you more options.

This is something more for someone like DaleHohm.




If you open any webpage and save it as an .htm file, it's stripped of all images automatically.




But I don't want to save the webpage or view it embedded in a GUI. I just want to open it up to view, but stripped down to its basic form. Is this even possible in AutoIt at all?


#5 ·  Posted (edited)

But I don't want to save the webpage or view it embedded in a GUI. I just want to open it up to view, but stripped down to its basic form. Is this even possible in AutoIt at all?

If you don't care about the HTML and just want the text, you can use _IEBodyReadText. I would look over the _IE UDFs to see which one matches what you want. "Stripped to its basic form" could mean a few things; when you say stripped, I think text only, with no HTML.

; *******************************************************
; Example 1 - Open a browser with the basic example, read the body Text
;               (the content with all HTML tags removed) and display it in a MsgBox
; *******************************************************
;
#include <IE.au3>
$oIE = _IE_Example ("basic")
$sText = _IEBodyReadText ($oIE)
MsgBox(0, "Body Text", $sText)

The example above is taken from the help file.

And, yes, it can be done in AutoIt. There are more than a few ways to accomplish this. I can think of three off the top of my head, but the _IE functions might be a simpler route for you.

Edited by Ealric



#6 ·  Posted (edited)

Well, basically I just want text, links and forms. That's about it, no images or media at all!

How do you use these parameters with _IECreate, though?

basic = (Default) simple HTML page with text, links and images

form = simple HTML page with multiple form elements

Edited by cypher175


Does anybody know how to open or navigate to a website using these parameters?

Or are these parameters just for the _IE_Example UDF?

basic = (Default) simple HTML page with text, links and images

form = simple HTML page with multiple form elements


what, nobody knows..??


What is your goal? Speed or something else? You can turn off images in Internet Explorer, but for the entire browser, not just a specific page. You can set $f_wait to 0 in _IECreate or _IENavigate and then monitor the page readyState yourself with _IEPropertyGet and use _IEAction($oIE, "stop") when the readyState is >= 3 (interactive).
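
A minimal sketch of that idea, not a tested recipe: create the browser with $f_wait set to 0, poll the readyState, then stop the load. The URL and the 30-second timeout are placeholders chosen for illustration, and "readystate" is assumed to be the property name that _IEPropertyGet accepts for this value.

```autoit
#include <IE.au3>

; Hypothetical URL, used only for illustration.
Local $sUrl = "http://www.example.com/"

; Fourth parameter ($f_wait) is 0, so _IECreate returns without
; waiting for the page to finish loading.
Local $oIE = _IECreate($sUrl, 0, 1, 0)

; Poll the readyState until it reaches 3 (interactive) or higher.
; The 30-second limit is an arbitrary safety net.
Local $hTimer = TimerInit()
While _IEPropertyGet($oIE, "readystate") < 3
    If TimerDiff($hTimer) > 30000 Then ExitLoop
    Sleep(50)
WEnd

; Cancel whatever is still downloading (images, media and so on).
_IEAction($oIE, "stop")
```

How much of the page is usable at the moment of the stop depends on the site, so the threshold may need adjusting.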

If you never want images, check out Lynx... I don't imagine it is really what you want, but it is a really cool terminal-based browser written using curses... back in the early days of the Web it was really cool (still is - based on what it can do in a clean text interface).

http://en.wikipedia.org/wiki/Lynx_(web_browser)

Dale




Isn't there any way at all to use these parameters with _IECreate or _IENavigate, though?

basic = (Default) simple HTML page with text, links and images

form = simple HTML page with multiple form elements

; *******************************************************
; Example 1 - Create browser windows with each of the example pages displayed.
;               The object variable returned can be used just as the object
;               variables returned by _IECreate or _IEAttach
; *******************************************************
;
#include <IE.au3>
$oIE_basic = _IE_Example ("basic")
$oIE_form = _IE_Example ("form")


anybody know much about this..??

@DaleHohm, is there any way to set $f_wait to 0 in _IECreate and then use a Do...Until loop that keeps checking whether all of the named objects needed by the next function in the app are available on the page, e.g. with $button = _IEGetObjByName($IE, "button", -1), and keeps looping until all of the button objects are available to be clicked?

Because that's basically what I need.

I tried the way you mentioned, with _IEPropertyGet and _IEAction($oIE, "stop") when the readyState is >= 3 (interactive), but it always stops with a blank page and my script just fails to collect the objects in the page that are needed. So what else could I do?
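
The Do...Until idea described above might look roughly like this. It is only a sketch: the URL, the element name "button" and the 30-second timeout are assumptions for illustration, and it waits for a single named element rather than a whole set.

```autoit
#include <IE.au3>

; Keep IE.au3 from writing a warning to the console on every failed lookup.
_IEErrorNotify(False)

; Hypothetical URL; $f_wait (fourth parameter) is 0 so we do not block here.
Local $oIE = _IECreate("http://www.example.com/", 0, 1, 0)

; Keep looking for the element until it exists or 30 seconds have passed.
Local $oButton = 0
Local $hTimer = TimerInit()
Do
    Sleep(100)
    ; "button" is a placeholder name; index 0 asks for the first match.
    $oButton = _IEGetObjByName($oIE, "button", 0)
Until IsObj($oButton) Or TimerDiff($hTimer) > 30000

If IsObj($oButton) Then
    ; The element we need is there, so the rest of the load can be cancelled.
    _IEAction($oIE, "stop")
    _IEAction($oButton, "click")
Else
    MsgBox(0, "Timeout", "The element never showed up.")
EndIf
```

Checking for the element itself sidesteps the blank-page problem, because the stop only happens once the object is actually in the document.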


any takers on this subject at all..??


This works:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>Untitled Document</title>

<script type='text/javascript'>
window.onload = breakImages;

function breakImages() {
    var im, newNode;
    var i=document.images.length
    while (--i+1) {
        im = document.images[i];
        im.parentNode.removeChild(im);
    }
}
</script>

</head>

<body>
    Pic 1<br />
    <img src='pic1.gif'>
    <br />
    <br />
    Pic 2<br />
    <img src='pic2.gif'>
    <br />
    <br />
    Pic 3<br />
    <img src='pic3.gif'>
</body>
</html>

But I tried injecting with AutoIt and I get an ActiveX warning.

#include <IE.au3>
;RegWrite("HKCU\SOFTWARE\Microsoft\Internet Explorer\Main","Display Inline Images","REG_SZ","no")

#cs
$s_script = ""
$s_script &= "var im, newNode;" & @CRLF
$s_script &= "var i=document.images.length" & @CRLF
$s_script &= "    while (--i+1) {"& @CRLF
$s_script &= "  im = document.images[i];"& @CRLF
$s_script &= "    im.parentNode.removeChild(im);" & @CRLF
$s_script &= "}"
#ce

$s_script = "var im, newNode;var i=document.images.length;while (--i+1) {im = document.images[i];im.parentNode.removeChild(im);}"

$oIE = _IECreate (@ScriptDir & "\test.html",0,1,1,0)
_IEHeadInsertEventScript ($oIE, "document", "onload", $s_script )
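
One possible way around the script injection is to do the same removal from AutoIt through the document objects, so no script is inserted into the page at all. This is a sketch, assuming the same test.html as above; whether it avoids the warning in every security zone has not been verified here.

```autoit
#include <IE.au3>

; Load the local test page and wait for it to finish.
Local $oIE = _IECreate(@ScriptDir & "\test.html")

; With the default index of -1, _IEImgGetCollection returns the whole
; image collection and puts the image count in @extended.
Local $oImgs = _IEImgGetCollection($oIE)
Local $iCount = @extended
Local $oImg

; Walk the collection backwards, because it shrinks as nodes are removed.
For $i = $iCount - 1 To 0 Step -1
    $oImg = $oImgs.item($i)
    $oImg.parentNode.removeChild($oImg)
Next
```

Note that this runs after the page has loaded, so the images are still downloaded; they are only taken out of the displayed document.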


Is there any way to use _IEBodyReadHTML and _IEBodyWriteHTML to find <img src=""> in a webpage and then change the code in the page so that the images and media don't load? Or am I just speaking out my arse here?
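
A rough sketch of that read-and-rewrite idea, assuming a placeholder URL. The regular expression only targets <img> tags; other media such as <object> or <embed> would need their own patterns.

```autoit
#include <IE.au3>

; Hypothetical URL, used only for illustration.
Local $oIE = _IECreate("http://www.example.com/")

; Read the current body HTML.
Local $sHTML = _IEBodyReadHTML($oIE)

; Drop every <img ...> tag, case-insensitively.
$sHTML = StringRegExpReplace($sHTML, "(?i)<img\b[^>]*>", "")

; Write the stripped HTML back into the page body.
_IEBodyWriteHTML($oIE, $sHTML)
```

Because the body is read after the page has loaded, the browser has already requested the images; this only removes them from what is displayed.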

