To get text from a web page


koma

I'm writing a script that takes different actions depending on the contents of the current web page in my browser.

Since WinGetText doesn't work well with Firefox I'm using a workaround:

I send Ctrl+A and Ctrl+C to the browser and then use what's in the clipboard.

However, sometimes it is almost as if the Ctrl+C doesn't work. Nothing ends up in the clipboard even though it should...

Does anyone know what's going on?

Just a guess, but I have had similar problems with Ctrl+C because of the case of the "C". Here's a quote from the AutoIt help file for the Send function:

N.B. Some programs are very choosy about capital letters and CTRL keys, i.e. "^A" is different to "^a". The first says CTRL+SHIFT+A, the second is CTRL+a. If in doubt, use lowercase!

Phillip

I don't think that is the problem, because it works _sometimes_...

Just a couple of thoughts:

- are you sure the page is loaded before doing ctrl+a & ctrl+c?

- are you sure the Browser/Section has the focus when doing ctrl+a & ctrl+c?
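
The focus check can be made explicit in the script. A minimal sketch, assuming the browser title contains "CNN" (a placeholder) and that 10 seconds is a reasonable timeout. It only confirms that the window became active, not that the page has finished loading:

```autoit
; Sketch only: the title "CNN" and the 10-second timeout are assumptions.
Opt("WinTitleMatchMode", 2) ; 2 = match any substring of the window title

Local $sTitle = "CNN"

WinActivate($sTitle)

; WinWaitActive returns 0 if the window is not active when the timeout expires.
If Not WinWaitActive($sTitle, "", 10) Then
    MsgBox(0, "Focus check", "The browser window never became active.")
    Exit
EndIf

Send("^a") ; select all
Send("^c") ; copy
```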

JdeB is probably right, but about the Ctrl+C... This has happened to me a few times: I was unable to use my Windows key or copy anything. It happened because I used to run a program whose hotkey was "Alt" (Ventrilo, a voice chat program). Pressing "Alt" usually disabled my Windows key and copy functions for some reason... :) but pressing "Alt" again fixed it! So I'd give JdeB's idea a try first, and if that doesn't work, try pressing "Alt"! :D

I am sure the page is loaded, but I am not sure that the Browser/Section has the focus...

Is there a best practice for setting focus for a specific part of a window?

Try the ControlFocus() command

Also remember that a lot of the time you can use CTRL-INSERT as Copy and SHIFT-INSERT as Paste as alternatives. Not all software still supports those hotkeys, though.

EDIT: If Firefox doesn't like the ControlFocus() command, pick somewhere on the page where you know there won't be any links or anything and just do MouseMove() and MouseClick() to set the focus.
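
For reference, the alternative copy shortcut can be sent like this. A small sketch, assuming the browser window is already active and has keyboard focus:

```autoit
; Sketch: copy with Ctrl+Insert instead of Ctrl+C.
; Assumes the target window is already active and focused.
Send("^a")        ; Ctrl+A, select all
Send("^{INSERT}") ; Ctrl+Insert, the alternative Copy shortcut

; Shift+Insert (Paste) would be sent as "+{INSERT}".
```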

What about InetGet() and pulling the page out of the cache?

InetGet may be good for static pages, but this is really dynamic stuff...

(Or did I understand InetGet wrong?)

I did try MouseClick() without luck. However, I didn't try MouseMove()... Maybe that will do the trick...

Here is a bit of code that works - sometimes:

ClipPut("!")

WinActivate("CNN")
Send("^a")
Send("^c")

MsgBox(0, "Text found was:", ClipGet())

To run it, open either Firefox or IE at www.cnn.com and run the script.

The funny thing is that most of the time, the script returns the text from cnn.com.

But when it fails it returns 1, not "!" as I would expect! Does anyone know why?

:)

Well, I meant that you use MouseMove() in conjunction with MouseClick().

Use MouseMove() to move the mouse to an area of the page (off to the side maybe), and then use MouseClick() to let the control (in this case, a fancy edit box) gain focus.

Someplace where there are no links or anything. It only takes a pixel.
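
A minimal sketch of that move-then-click approach. The title "CNN" and the offsets 5 and 200 are placeholders; pick a spot with no links on the page you are working with:

```autoit
; Sketch: click an empty spot so the page area gets keyboard focus.
; The window title and the coordinates are assumptions for illustration.
Opt("MouseCoordMode", 0) ; 0 = coordinates relative to the active window

WinActivate("CNN")
WinWaitActive("CNN", "", 10) ; wait up to 10 seconds for activation

MouseMove(5, 200)  ; move to the chosen empty spot
MouseClick("left") ; click at the current mouse position

Send("^a")
Send("^c")
```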

Try activating it, or "focusing" it:

WinActivate ( "title" [, "text"] )

Although I don't use Firefox...

EDIT: the "text" parameter in WinActivate is optional.

I have a WinActivate in my script... But with no text parameter. But that doesn't matter, does it?

After allowing the CNN page to fully load in Firefox, I tried your code. Without a Sleep after the Ctrl+C, I get "!". With a Sleep greater than 150 ms, I got the page text each of the 10 times I tried it on 3 different pages.

Perhaps a larger value would be even safer.

Phillip
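
Instead of one fixed Sleep, the script can poll the clipboard until it changes. A rough sketch, assuming the "!" sentinel and the "CNN" title from the script above; the 100 ms interval and 3000 ms timeout are arbitrary. ClipGet() sets @error when the clipboard is empty or holds no text, so the loop only accepts a read that succeeded and differs from the sentinel:

```autoit
; Sketch: wait for the copy to land instead of guessing a Sleep value.
; Sentinel, poll interval and timeout are assumptions, not tested values.
Local $sSentinel = "!"
Local $sText = ""
Local $bCopied = False

ClipPut($sSentinel)

WinActivate("CNN")
WinWaitActive("CNN", "", 10)
Send("^a")
Send("^c")

Local $hTimer = TimerInit()
While TimerDiff($hTimer) < 3000 ; TimerDiff returns elapsed milliseconds
    Sleep(100)
    $sText = ClipGet()
    ; Accept only a successful read that is no longer the sentinel.
    If @error = 0 And $sText <> $sSentinel Then
        $bCopied = True
        ExitLoop
    EndIf
WEnd

If $bCopied Then
    MsgBox(0, "Text found was:", $sText)
Else
    MsgBox(0, "Copy failed", "The clipboard did not change within the timeout.")
EndIf
```

Checking @error also separates a failed clipboard read from an unchanged one, which may be related to the 1 reported earlier in the thread, though that is only a guess.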

Use Lynx. You can either use Cygwin or get one of the several builds compiled for Windows. Once the Lynx directory is on your PATH environment variable, your Run command will look something like this:

RunWait(@ComSpec & ' /c ' & 'lynx -dump -accept_all_cookies http://www.website.com > C:\website.html')

Note that -dump instructs Lynx to render the HTML of the page, so your text file will end up with the text that you would normally see in your browser and not the source code. I use this method to scrape Alexa for their rankings, since they are not always directly in the page source and are obfuscated. (I use Cygwin; it's worth the day it takes to set up and get used to. Windows-compiled versions of Lynx never seem to work quite right for me.)
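
Once the dump is on disk, the script can branch on its contents. A sketch under a few assumptions: lynx is on the PATH, -dump writes to standard output and the shell redirects it to a file, and the URL, output path and search phrase are all placeholders:

```autoit
; Sketch: dump a page with Lynx, then test the text for a phrase.
; URL, output file and the phrase "Breaking news" are illustrative only.
Local $sOutFile = @TempDir & "\page.txt"
Local $sCmd = 'lynx -dump -accept_all_cookies http://www.website.com > "' & $sOutFile & '"'

; RunWait blocks until the command finishes; @SW_HIDE hides the console window.
RunWait(@ComSpec & ' /c ' & $sCmd, "", @SW_HIDE)

Local $sPage = FileRead($sOutFile)
If @error Then
    MsgBox(0, "Lynx", "Could not read " & $sOutFile)
ElseIf StringInStr($sPage, "Breaking news") Then
    MsgBox(0, "Lynx", "Phrase found on the page.")
Else
    MsgBox(0, "Lynx", "Phrase not found.")
EndIf
```

This avoids the clipboard and window focus entirely, at the cost of fetching the page separately from the browser session.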

Or just get surgical about it:

Opt("WinTitleMatchMode", 4); allow ClassName lookup to avoid window confusion
$appWindow = WinGetHandle("classname=MozillaWindowClass")

;Give Focus to the main window in FireFox
;ControlId for main FireFox window is 1 (use the Active Window Info tool)
ControlFocus($appWindow,"",1)
Send("^{a}^{c}")

Dale

  • 13 years later...
1 hour ago, Poconnor2018 said:

I know this issue is likely solved at this point

After 13 years,  I would hope so.
