WayneCusack

Two screen scraping questions

WayneCusack

I am attempting to write a program that collects data from a web site, which is then pasted into an Excel spreadsheet.  The method I have been using is to select all of the data on the web page, copy it to the Clipboard, then pull it off the Clipboard and paste it into the Excel spreadsheet.  What I am running into is a pop-up message that warns me about allowing Internet Explorer to access my Clipboard.  I have to click a button on the pop-up to indicate that accessing the Clipboard is okay and then the window closes.  The intervention of the pop-up defeats the automation process.  Is there a way to get rid of the pop-up so that I do not have to intervene on each page that I access?

Secondly, when I put the text into the Excel spreadsheet I use the "Send" command.  Is there a faster method of dumping the data into the spreadsheet?

Here's the relevant portion of the code:

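#include <IE.au3>
#include <Clipboard.au3>

; $oIE is the Internet Explorer object created earlier in the script
_IEAction($oIE, "selectall")         ; select the whole page
_IEAction($oIE, "copy")              ; copy the selection to the Clipboard
$text1 = _ClipBoard_GetData()        ; read the Clipboard text into a variable

WinActivate("[CLASS:XLMAIN]", "")    ; bring the Excel window to the front
Send($text1)                         ; type the text into the sheet, key by key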
 

 

NewPlaza

Could the messagebox be controlled by a controlsend command?

kylomas

Post the URL of the site you are interested in and what data you want...


Forum Rules         Procedure for posting code

"I like pigs.  Dogs look up to us.  Cats look down on us.  Pigs treat us as equals."

- Sir Winston Churchill

WayneCusack

Thanks Danp2 - your response solved the issue.  I had not been aware of that setting.

Can anyone offer a suggestion for my 2nd question?  What happens is that the contents of the web page are stored as text on the Clipboard, then pulled off the Clipboard and written into the Excel spreadsheet.  Although the web site contains publicly available information, its design doesn't seem to allow me to select individual data items and copy them - I can only get the data by selecting the whole page.  That means I also get a lot of text that is irrelevant.  When it is written to the Excel spreadsheet it is entered line by line, so the process of writing it is slow.  Is there some way to paste it in more quickly?
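One way to deal with the irrelevant text is to split the copied page into lines and keep only the ones that matter before anything is written to Excel. The following is only a minimal sketch: `_KeepLines` and its `$sMarker` parameter are hypothetical names made up for illustration, and the filter (keep non-blank lines, optionally only those containing a given substring) is an assumption about what "relevant" means for this page.

```autoit
#include <StringConstants.au3>

; Hypothetical helper (not part of any UDF): returns a zero-based array of the
; non-blank lines in $sText. If $sMarker is not empty, only lines containing
; that substring are kept.
Func _KeepLines($sText, $sMarker = "")
    ; Normalise line endings, then split into one array element per line
    Local $aLines = StringSplit(StringStripCR($sText), @LF, $STR_NOCOUNT)
    Local $aKept[UBound($aLines)]
    Local $iKept = 0

    For $sLine In $aLines
        ; Trim leading and trailing whitespace
        $sLine = StringStripWS($sLine, $STR_STRIPLEADING + $STR_STRIPTRAILING)
        If $sLine = "" Then ContinueLoop
        If $sMarker <> "" And Not StringInStr($sLine, $sMarker) Then ContinueLoop
        $aKept[$iKept] = $sLine
        $iKept += 1
    Next

    ; Nothing survived the filter: signal it through @error
    If $iKept = 0 Then Return SetError(1, 0, 0)

    ReDim $aKept[$iKept]
    Return $aKept
EndFunc
```

Called as `_KeepLines(ClipGet())`, this should hand back an array with one entry per useful line, which can then be written to the sheet in a single operation rather than typed in line by line.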

kylomas

Again, need more info, e.g.

1 - URL of the WEB site 

2 - What data you want

3 - How you are doing it now (code, not general description)



WayneCusack

Here is the URL of the site:

http://ucbswww.bank-banque-canada.ca/scripts/search_english.cfm

The process is to insert a text string and then click a control that searches a database.  That produces a list of names that I am trying to capture.  I am not having a problem up to that point - I can input the string, do the search and generate the list.  The difficulty arises in copying the list to the Excel spreadsheet.

The code for that part is as follows:

_IEAction($oIE, "selectall")
_IEAction($oIE, "copy")
$text1 = _ClipBoard_GetData()
WinActivate("[CLASS:XLMAIN]", "")
Send($text1)
 
The 1st line selects the whole page that is produced, i.e. the list of names plus extraneous data.
The 2nd line copies the selected data to the Clipboard.
The 3rd line copies it back from the Clipboard into a variable.
The 4th line activates the Excel spreadsheet.
The last line writes the data to the spreadsheet.
 
It is the last line that I am trying to address.  It is sending the data to the spreadsheet as a string of text, as if it is being typed, and that is a slow way to enter the data.  I am wondering whether there is a faster way to enter it.  When I use the usual Windows paste commands it just enters the actual commands into the spreadsheet, rather than the data that I am trying to record.
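For what it's worth, here is a rough sketch of two alternatives to typing the text with Send($text1). It assumes Excel is already running with the target workbook active, that the selectall/copy steps have just put the page text on the Clipboard, and that the AutoIt version is 3.3.12 or later so that the Excel UDF provides _Excel_Open and _Excel_RangeWrite. The function names `_PasteIntoExcel` and `_WriteClipboardToExcel` are made up for illustration.

```autoit
#include <Excel.au3>
#include <StringConstants.au3>

; Option 1: paste with a single keystroke instead of typing the text out.
Func _PasteIntoExcel()
    WinActivate("[CLASS:XLMAIN]")
    ; Wait up to 5 seconds for Excel to come to the front
    If Not WinWaitActive("[CLASS:XLMAIN]", "", 5) Then Return SetError(1, 0, False)
    Send("^v") ; Ctrl+V. With the raw flag, Send("^v", 1), the characters are typed literally
    Return True
EndFunc

; Option 2: skip the keyboard and write the lines through the Excel UDF.
Func _WriteClipboardToExcel($sTopLeft = "A1")
    Local $sText = ClipGet() ; built-in Clipboard read
    If @error Then Return SetError(1, @error, False)

    ; One zero-based array element per line of the copied page
    Local $aLines = StringSplit(StringStripCR($sText), @LF, $STR_NOCOUNT)

    ; Attaches to a running Excel instance if there is one
    Local $oExcel = _Excel_Open()
    If @error Then Return SetError(2, @error, False)

    ; Default as the worksheet means the active sheet of the given workbook
    _Excel_RangeWrite($oExcel.ActiveWorkbook, Default, $aLines, $sTopLeft)
    If @error Then Return SetError(3, @error, False)
    Return True
EndFunc
```

Option 1 is the smallest change to the existing script. Option 2 does not depend on window focus, and as far as I understand the UDF, a 1D array is written down a single column starting at the given cell, so the whole list should land in one call.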

WayneCusack

I tried _IEBodyReadHTML and _IEBodyReadText.

Neither one would capture the data.

The site permits the data to be captured manually, but so far I have been unable to grab the data via automation.

WayneCusack

No, but I will try it - thanks.

WayneCusack

I tried it - it found 144 links, but none appear to be the ones I am trying to capture.  The results in the list do not appear to be links.  Rather, the list seems to be just a series of text entries, each of which is submitted to the database when it is clicked on.

WayneCusack

Sorry for not responding earlier - other matters became more pressing.

I think I sorted out the problems with a different approach.  Thanks for everyone's input - it was helpful.
