koma

To get text from a web page

koma

I'm writing a script that takes different actions depending on the contents of the current web page in my browser.

Since WinGetText doesn't work well with Firefox, I'm using a workaround:

I send Ctrl+A and Ctrl+C to the browser and then use what's in the clipboard.

However, the problem is that sometimes it is as if the Ctrl+C doesn't work. Nothing ends up in the clipboard even though it should have...

Does anyone know what's going on?

phillip123adams


Just a guess, but I have had similar problems with Ctrl+C because of the case of the "C". Here's a quote from the AutoIt help file for the Send function:

N.B. Some programs are very choosy about capital letters and CTRL keys, i.e. "^A" is different to "^a". The first says CTRL+SHIFT+A, the second is CTRL+a. If in doubt, use lowercase!
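
To illustrate the help-file note, here is a minimal sketch of the two spellings. The behaviour described in the comments is simply what the quoted help text states; how a given program reacts to Ctrl+Shift+C is not something this thread establishes.

```autoit
; Lowercase letter: sent as Ctrl+c, the usual Copy shortcut.
Send("^c")

; Uppercase letter: per the help-file note, sent as Ctrl+Shift+C,
; which a program may not treat as Copy. Left commented out on purpose.
; Send("^C")
```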


Phillip

koma


I don't think that is the problem, because it works _sometimes_...

Jos

Just a couple of thoughts:

- Are you sure the page is loaded before doing Ctrl+A and Ctrl+C?

- Are you sure the browser/section has the focus when doing Ctrl+A and Ctrl+C?
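
One way to rule out the second point is to wait until the window is really active before sending any keys. This is only a sketch: "CNN" is a placeholder window title and the 5-second timeout is an arbitrary choice. It confirms that the window is active, not that the page has finished loading or that the page area inside the window has focus.

```autoit
; Placeholder title; adjust to match your browser window.
WinActivate("CNN")

; WinWaitActive returns 0 if the window is not active within the timeout (seconds).
If Not WinWaitActive("CNN", "", 5) Then
    MsgBox(0, "Error", "The browser window never became active.")
    Exit
EndIf

Send("^a") ; select all
Send("^c") ; copy
```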


layer

JdeB is probably right, but about the Ctrl+C: this has happened to me a few times, where I'm unable to use my Windows key or copy things. It was because I used to run a program whose hotkey was "Alt" (Ventrilo, a program for talking to people online by voice). Pressing "Alt" usually disabled my Windows key and copy functions for some reason... :) but pressing "Alt" again fixed it! So I'd give JdeB's idea a try first; if that doesn't work, try pressing "Alt"! :D

koma


I am sure the page is loaded, but I am not sure that the Browser/Section has the focus...

Is there a best practice for setting focus for a specific part of a window?

steveR

Try the ControlFocus() command

Also remember that a lot of the time you can use Ctrl+Insert as Copy and Shift+Insert as Paste as alternatives. Not all software still supports those hotkeys, though.

EDIT: If Firefox doesn't like the ControlFocus() command, pick somewhere on the page where you know there won't be any links or anything and just do MouseMove() and MouseClick() to set the focus.
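
As a sketch of the alternative hotkeys mentioned above, assuming the browser window is already active and the page area has focus:

```autoit
Send("^a")          ; select all
Send("^{INSERT}")   ; Ctrl+Insert: alternative Copy shortcut

; Shift+Insert is the matching Paste shortcut, shown here for reference only.
; Send("+{INSERT}")
```

Whether the target program honours these shortcuts has to be checked by hand first.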

Blue_Drache

What about InetGet() and pulling the page out of the cache?


koma


InetGet may be good for static pages, but this is really dynamic stuff...

(Or did I understand InetGet wrong?)
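
For reference, a minimal sketch of what the InetGet() route could look like. It assumes a current AutoIt v3 release (where FileRead() accepts a file name), and the URL and temp-file name are placeholders. Note that this downloads the HTML source in a separate request, so it would not reflect whatever state the page has reached inside the browser.

```autoit
; Placeholder output file in the temp directory.
Local $sFile = @TempDir & "\page.html"

; In the default (waiting) mode, InetGet returns 0 on failure.
If InetGet("http://www.cnn.com", $sFile) = 0 Then
    MsgBox(0, "Error", "Download failed.")
    Exit
EndIf

; Read the downloaded HTML source and show the first 500 characters.
Local $sHtml = FileRead($sFile)
MsgBox(0, "Start of downloaded source:", StringLeft($sHtml, 500))
```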

koma


I did try MouseClick() without luck. However, I didn't try MouseMove()... Maybe that will do the trick...

koma

Here is a bit of code that works - sometimes:

ClipPut("!")

WinActivate("CNN")
Send("^a")
Send("^c")

MsgBox(0, "Text found was:", ClipGet())

To run it, open either Firefox or IE at www.cnn.com and run the script.

The funny thing is that most of the time, the script returns the text from cnn.com.

But when it fails it returns 1. Not "!" as I would expect! Does anyone know why?

:)

steveR


Well, I meant that you use MouseMove() in conjunction with MouseClick().

Use MouseMove() to move the mouse to an area of the page (off to the side maybe), and then use MouseClick() to let the control (in this case, a fancy edit box) gain focus.

Someplace where there are no links or anything. It only takes a pixel.
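
A minimal sketch of that idea follows. The coordinates are hypothetical and must be replaced with a spot on your page that has no links or form fields; by default AutoIt treats mouse coordinates as screen coordinates. "CNN" is a placeholder window title.

```autoit
WinActivate("CNN")

; Hypothetical "empty" spot on the page.
MouseMove(20, 300)

; With no coordinates given, MouseClick clicks at the current mouse position.
MouseClick("left")

Send("^a") ; select all
Send("^c") ; copy
```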

layer

Try activating it, or "focusing" it:

WinActivate ( "title" [, "text"] )

Although I don't use Firefox...

EDIT: the "text" parameter in WinActivate is optional...

koma


I have a WinActivate in my script... But with no text parameter. But that doesn't matter, does it?

phillip123adams


After allowing the CNN page to fully load in Firefox, I tried your code. Without a Sleep after the Ctrl+C, I got "!". With a Sleep greater than 150 ms, I got the page each of the 10 times I tried it on 3 different pages.

Perhaps a larger value would be even safer.
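
Instead of guessing a fixed Sleep value, one option is to poll the clipboard until it no longer holds the sentinel. The following is a sketch under assumptions, not a tested fix: the 3000 ms timeout and 50 ms polling interval are arbitrary, "CNN" is the window title from the script above, and the variable names are made up for illustration.

```autoit
; Put a known sentinel on the clipboard so a change can be detected.
Local Const $SENTINEL = "!"
ClipPut($SENTINEL)

WinActivate("CNN")
WinWaitActive("CNN", "", 5)
Send("^a")
Send("^c")

Local $hTimer = TimerInit()
Local $sText = ""
Local $bGotText = False

; Poll until the clipboard holds text other than the sentinel, or give up.
While TimerDiff($hTimer) < 3000
    $sText = ClipGet()
    ; ClipGet sets @error when the clipboard is empty or holds no text.
    If @error = 0 And $sText <> $SENTINEL Then
        $bGotText = True
        ExitLoop
    EndIf
    Sleep(50)
WEnd

If $bGotText Then
    MsgBox(0, "Text found was:", $sText)
Else
    MsgBox(0, "Error", "Clipboard did not change within the timeout.")
EndIf
```

Checking @error right after ClipGet() also separates "the copy has not arrived yet" from "the clipboard is empty or holds something that is not text".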


Phillip

Alterego

Use Lynx. You can either use Cygwin or get one of the several builds compiled for Windows. After the Lynx directory has been added to your PATH environment variable, your run command will look something like this:

RunWait(@ComSpec & ' /c ' & 'lynx -dump -accept_all_cookies http://www.website.com > C:\website.html')

Note that -dump instructs Lynx to render the HTML of the page and write the resulting text to standard output, which the > redirects into the file. So your text file will end up with the text that you would normally see in your browser and not the source code. I use this method to scrape Alexa for their rankings, since they are not always directly in the page source and are obfuscated. (And I use Cygwin; it's worth the day it takes to set up and get used to it. Windows-compiled versions of Lynx never seem to work quite right for me.)
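
Once the dump is in a file, the script still has to read it and act on the contents. Here is a sketch of that step, assuming lynx is on the PATH, that its output is redirected to a file with >, and that a current AutoIt v3 release is used. The URL, the output path and the search phrase are placeholders.

```autoit
; Placeholder output file in the temp directory.
Local $sOut = @TempDir & "\page.txt"

; Run lynx hidden and redirect its text dump into the file.
RunWait(@ComSpec & ' /c lynx -dump -accept_all_cookies http://www.website.com > "' & $sOut & '"', "", @SW_HIDE)

Local $sText = FileRead($sOut)
If @error Then
    MsgBox(0, "Error", "Could not read " & $sOut)
    Exit
EndIf

; Take different actions depending on the page contents.
If StringInStr($sText, "some phrase") Then
    MsgBox(0, "Result", "Phrase found on the page.")
Else
    MsgBox(0, "Result", "Phrase not found.")
EndIf
```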

DaleHohm

Or just get surgical about it:

Opt("WinTitleMatchMode", 4); allow ClassName lookup to avoid window confusion
$appWindow = WinGetHandle("classname=MozillaWindowClass")

;Give Focus to the main window in FireFox
;ControlId for main FireFox window is 1 (use the Active Window Info tool)
ControlFocus($appWindow,"",1)
Send("^{a}^{c}")

Dale
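
A variant of the snippet above with basic error checks, as a sketch only. It assumes a current AutoIt release, where a "[CLASS:...]" title can be used without changing WinTitleMatchMode. The class name and control ID 1 come from the post above and may well differ between Firefox versions; check them with the window info tool.

```autoit
; Look up the Firefox window by class name.
Local $hWnd = WinGetHandle("[CLASS:MozillaWindowClass]")
If @error Then
    MsgBox(0, "Error", "No Firefox window found.")
    Exit
EndIf

WinActivate($hWnd)

; ControlFocus returns 0 on failure.
If ControlFocus($hWnd, "", 1) = 0 Then
    MsgBox(0, "Warning", "ControlFocus failed; a mouse click on the page may be needed instead.")
EndIf

Send("^a") ; select all
Send("^c") ; copy
```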

