Jump to content

Internet Explorer Excel Screen Scraping help


Recommended Posts

This looks like a great tool and i'm really looking forward to learning more about it.
I do code development, but not Windows or ObjectOriented, so while i'm not a newb, there definitely is an "above my head" level.  if you can keep that in mind, i'd apprecaite it.

I am attempting to write a script that does the following
1) access a web page (it's not publicly visible, so providing the URL is moot)
2) execute a Select All and Copy to "scrape" the contents
3) bring up an Excel spreadsheet
4) Paste the clipboard contents to a worksheet
5) execute a Copy in Excel (the correct select area replans in place from the Paste)
6) switch to a new worksheet tab
7) execute a Paste Special Values (to put only text contents from the web page scrape)
8) repeat

Some steps were pretty clear to set up, but some have me stumped.

I start by activating IE and grabbing the object handle, then using the _IE functions

    WinActivate ("Internet Explorer")
    Global $oIE = _IEAttach("<windows_title>")
    _IENavigate($oIE, <URL>)
           ***
    _IEAction($oIE,"selectall")
    _IEAction($oIE,"copy")

The first problem happens at here at the *** line
 

When the web page renders the context/focus/cursor-point is in a text entry box - hitting CTRL-A or executing the Select All here does nothing - (when doing this operation manually, i have to shift focus by hitting the TAB key a couple of times.  (see the attached graphic)  I tried doing a
 

    Send ("{TAB}{TAB}")
        or

    ControlSend ("{TAB}{TAB}")
 

before executing the SelectAll, but it was not reliable.  Then I tried
 

    _IEAction($oIE,"blur")
 

but would get an error "IE.au3 T3.0-1 Error from function _IEAction(blur), $_IEStatus_COMError (-2147352570)" so evidently, i'm not using this function properly.

#1 question is, how can I reliably get away from the text entry box so i can do the Select All properly?

-----------------------------------------------------------------------------------------------

the second question is how to use the _Excel functions to properly manage/manipulate Excel.

When i run this script, i already have the Excel spreadsheet i want to use open.
i'm tying to use the _ExcelBookAttach function to get the handle of the open Excel workbook so I can reference _Excel functions.
I've tried

 

    $oExcel = _ExcelBookAttach ("Microsoft Excel")
    $oExcel = _ExcelBookAttach ("Microsoft Excel - <worksheet filename>")
    $oExcel = _ExcelBookAttach ("<worksheet filename>")

 

but no mater, the value of $oExcel is equal to 0, not the object handle.

#2 question is - what am I doing wrong with the _ExcelBookAttach function?

-----------------------------------------------------------------------------------------------

#3 question - what's the best way to execute a "Paste Special - Values" command using either Excel or Windows controls.
 

I'd rather not do
 

    Send ("!esv")
 

as using Send or ControlSend has proven to be unreliable.

Thanks in advance and Happy New Year Everyone!!!

BeekerC

post-32797-0-26293300-1388552575_thumb.j

Link to comment
Share on other sites

Hi there BeekerC!, I feel I'm in the same experience level as you and I've had some practice with the functions your questioning, so I'm going to take this opportunity to try and give back stuff that I've learned from here(But I could be wrong!)

 

I start by activating IE and grabbing the object handle, then using the _IE functions

    WinActivate ("Internet Explorer")
    Global $oIE = _IEAttach("<windows_title>")
    _IENavigate($oIE, <URL>)
           ***
    _IEAction($oIE,"selectall")
    _IEAction($oIE,"copy")

The first problem happens at here at the *** line
 

When the web page renders the context/focus/cursor-point is in a text entry box - hitting CTRL-A or executing the Select All here does nothing - (when doing this operation manually, i have to shift focus by hitting the TAB key a couple of times.  (see the attached graphic)  I tried doing a
 

    Send ("{TAB}{TAB}")
        or

    ControlSend ("{TAB}{TAB}")
 

before executing the SelectAll, but it was not reliable.  Then I tried
 

    _IEAction($oIE,"blur")
 

but would get an error "IE.au3 T3.0-1 Error from function _IEAction(blur), $_IEStatus_COMError (-2147352570)" so evidently, i'm not using this function properly.

#1 question is, how can I reliably get away from the text entry box so i can do the Select All properly?

 

I'm going to start with a couple questions of my own. I'm totally confused when you say the webpage is not publicly available, so providing the URL is moot.

it seems as if 

    WinActivate ("Internet Explorer")
    Global $oIE = _IEAttach("<windows_title>")
    _IENavigate($oIE, <URL>)

Could be replaced with 

$window = _IECreate ("the url you use in the navigate function")

 _IEAction($window ,"selectall")
 _IEAction($window ,"copy")

I don't think there's anything wrong with the code your using, it just seems it could be simpler. I've yet to get _IEAttach to work correctly for me, but I think that can be attributed to me using the wrong parameters. Are you confident that _IEAttach is successful from debugging with console or msgbox? If not then that is probably why your _IE actions are not registering. If you are succesfully attaching, I would also try _IEAction 'unselect' before 'selectall' possibly.

Also if your using Send or ControlSend, you need to have a & between each different keyboard command your sending. so  

  Send ("{TAB}" & "{TAB}")

Would send 2 tabs. I'm surprised Scite will even let you run it the way it is now.

 

 

the second question is how to use the _Excel functions to properly manage/manipulate Excel.

 

When i run this script, i already have the Excel spreadsheet i want to use open.
i'm tying to use the _ExcelBookAttach function to get the handle of the open Excel workbook so I can reference _Excel functions.
I've tried

 

    $oExcel = _ExcelBookAttach ("Microsoft Excel")
    $oExcel = _ExcelBookAttach ("Microsoft Excel - <worksheet filename>")
    $oExcel = _ExcelBookAttach ("<worksheet filename>")

 

but no mater, the value of $oExcel is equal to 0, not the object handle.

 

#2 question is - what am I doing wrong with the _ExcelBookAttach function?

 

Again I've yet to have a use for _ExcelAttach, I've always preferred to use _ExcelOpen or _ExcelNew. (Nothing wrong with attach!)

The default string parameter that ExcelAttach uses is the full file location(C:BlahBlahfile.xlsx), so I would either just change your string to the file location rather then the Window title, or change the second paramter of _ExcelAttach to use the Filename of the open workbook, or the title.

  $oExcel = _ExcelBookAttach ("Microsoft Excel" , "title")
    $oExcel = _ExcelBookAttach ("Microsoft Excel - <worksheet filename>" , "title")
    $oExcel = _ExcelBookAttach ("<worksheet filename>" , "filename")

    $oExcel = _ExcelBookAttach ("<worksheet Full Path>" , "filepath")

 

#3 question - what's the best way to execute a "Paste Special - Values" command using either Excel or Windows controls.

 

I'd rather not do
 

    Send ("!esv")
 

as using Send or ControlSend has proven to be unreliable.

 

I don't think there's a way to imply this exactly using an AutoIt function. I think that using a modifier like ! or + only performs on the next character sent, so what you have would only send Alt+E and the S and V independently.  

In this situation I think you need to use this

 Send("{ALT down}" & "{E down}" & "{S down}" & "{V down}" & "{V up}" etc...)   *There might or might not be spaces between ALT and down, the reference for Send shows different ways in 2 examples for holding a key. 

I hope I could help! Happy new year!

Edited by Prime03
Link to comment
Share on other sites

first i would use the IE functions to navigate to the page you want to copy. once there, i would look at the code on the page to see if it is easy enough to pull the info from the source instead of trying to use the send function.

once you have the data from the webpage you can then use the Excel UDF to place the data from the webpage to the spreadsheet of your choice.

 

this would be the most reliable way to do this.

Link to comment
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now
 Share

  • Recently Browsing   0 members

    • No registered users viewing this page.
×
×
  • Create New...