
Need help scraping an IE "Report Viewer" - (Moved)



I have an HTML document under IE that I can't seem to "crack" using VBA. That is, I am trying to scrape it for specific data but can't seem to find the data. I can see it through debug and can programmatically get a handle to the document, but routines like GetElementsByClassName, etc. all return NULL.

I will try to describe the document and data below. Any ideas you may have would be very much appreciated. I am not a programmer; I am an Inspector for an Aerospace company. They handed me this task and I've succeeded up to now. (Note: there are no IDs to select, nor are there tags to select, just class names.)

Please forgive me if I'm being vague; I'm seeing all of this for the first time and am teaching myself.

I simply can't find the correct call sequence to get inside.

The document is defined as:

<!DOCTYPE html><html xmlns="http://www.w3.org/1999/xhtml"><head><title>_8130</title><meta charset="UTF-8" /><style type="text/css">/*<![CDATA[*/

div.html-root>ol, div.html-root>ul, div.html-root>p

The data I need is embedded as shown (see items in bold text):

style="position:absolute;overflow:hidden;left:711px;top:140px;width:234px;height:36px;"><div style="position:absolute;top:1px;white-space:pre;left:82px;">12. ???????? </div><div style="position:absolute;top:19px;white-space:pre;left:84px;">Status/Work</div></div><div title="" class="textBox19 s10-" style="position:absolute;overflow:hidden;left:58px;top:178px;width:157px;height:27px;"><div style="position:absolute;top:10px;white-space:pre;left:55px;">NOZZLE </div></div><div title="" class="textBox20 s11-" style="position:absolute;overflow:hidden;left:221px;top:178px;width:147px;height:27px;"><div style="position:absolute;top:10px;white-space:pre;left:35px;"> 2448M48P07</div></div><div title="" class="textBox21 s10-" style="position:absolute;overflow:hidden;left:375px;top:178px;width:109px;height:27px;"><div style="position:absolute;top:10px;white-space:pre;left:39px;">GENX</div></div><div title="" class="textBox22 s10-" style="position:absolute;overflow:hidden;left:490px;top:178px;width:80px;height:27px;"><div style="position:absolute;top:10px;white-space:pre;left:39px;">1</div></div><div title="" class="textBox23 s11-" style="position:absolute;overflow:hidden;left:576px;top:178px;width:128px;height:27px;"><div style="position:absolute;top:10px;white-space:pre;left:35px;">PAT5891C</div></div><div title="" class="txtStatus s12-" style="position:absolute;overflow:hidden;left:711px;top:178px;width:234px;height:27px;"><div style="position:absolute;top:10px;white-space:pre;left:86px;">REPAIRED</div></div><div title="" class="panel4 s1-"


Moved to the appropriate forum, as the Developer General Discussion forum very clearly states:

Quote

General development and scripting discussions. If it's super geeky and you don't know where to put it - it's probably here.


Do not create AutoIt-related topics here, use the AutoIt General Help and Support or AutoIt Technical Discussion forums.

Moderation Team


#include <string.au3>
#include <array.au3>
#include <Inet.au3>
$sText = 'style="position:absolute;overflow:hidden;left:711px;top:140px;width:234px;height:36px;"><div style="position:absolute;top:1px;white-space:pre;left:82px;">12. ???????? </div><div style="position:absolute;top:19px;white-space:pre;left:84px;">Status/Work</div></div><div title="" class="textBox19 s10-" style="position:absolute;overflow:hidden;left:58px;top:178px;width:157px;height:27px;"><div style="position:absolute;top:10px;white-space:pre;left:55px;">NOZZLE </div></div><div title="" class="textBox20 s11-" style="position:absolute;overflow:hidden;left:221px;top:178px;width:147px;height:27px;"><div style="position:absolute;top:10px;white-space:pre;left:35px;"> 2448M48P07</div></div><div title="" class="textBox21 s10-" style="position:absolute;overflow:hidden;left:375px;top:178px;width:109px;height:27px;"><div style="position:absolute;top:10px;white-space:pre;left:39px;">GENX</div></div><div title="" class="textBox22 s10-" style="position:absolute;overflow:hidden;left:490px;top:178px;width:80px;height:27px;"><div style="position:absolute;top:10px;white-space:pre;left:39px;">1</div></div><div title="" class="textBox23 s11-" style="position:absolute;overflow:hidden;left:576px;top:178px;width:128px;height:27px;"><div style="position:absolute;top:10px;white-space:pre;left:35px;">PAT5891C</div></div><div title="" class="txtStatus s12-" style="position:absolute;overflow:hidden;left:711px;top:178px;width:234px;height:27px;"><div style="position:absolute;top:10px;white-space:pre;left:86px;">REPAIRED</div></div><div title="" class="panel4 s1-"'

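; Uncomment the next line and change it to your URL to fetch the page source instead
;~ $sText = _INetGetSource("https://www.blabla.com")

; Each match starts right after 'white-space', e.g. ':pre;left:55px;">NOZZLE '
$sResult = _StringBetween($sText, 'white-space', '</div>')
; Stop here if nothing matched, otherwise the array accesses below would fail
If Not IsArray($sResult) Then Exit
_ArrayDisplay($sResult)
$sReturn = ""
; Indexes 1 to 7 hold the "Status/Work" label and the data row; index 5 (the value "1") is skipped
For $i = 1 To 7
    If $i = 5 Then ContinueLoop
    ; Drop the 17-character ':pre;left:NNpx;">' prefix (this relies on a two-digit left value)
    $sReturn &= StringTrimLeft($sResult[$i], 17) & @CRLF
Next
ConsoleWrite($sReturn & @CRLF)
MsgBox(64 + 262144, Default, $sReturn, 0)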
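
If the report only exists inside an IE window that is already open (for example, because it needs a logged-in session), fetching the URL may not return the same markup. A minimal sketch of an alternative, assuming the window title contains the document's <title> text "_8130" and that the report is in the top-level document rather than a frame, is to attach to the window with the IE UDF and read its HTML. The resulting string can then be fed into the parsing code in place of the hard-coded $sText:

```autoit
#include <IE.au3>
#include <MsgBoxConstants.au3>

; Attach to an IE window that is already showing the report.
; "_8130" is the <title> from the question (an assumption); adjust it to your window.
Global $oIE = _IEAttach("_8130")
If @error Then
    MsgBox($MB_ICONERROR, "Report scrape", "Could not attach to an IE window with '_8130' in its title.")
    Exit 1
EndIf

; Read the HTML of the document as IE currently holds it.
Global $sHtml = _IEDocReadHTML($oIE)
If @error Or $sHtml = "" Then
    MsgBox($MB_ICONERROR, "Report scrape", "Attached to IE but could not read the document HTML.")
    Exit 2
EndIf

; Show the start of the markup to confirm the right document was read.
ConsoleWrite(StringLeft($sHtml, 500) & @CRLF)
```

If the output does not contain the report markup, the report is probably rendered inside a frame, and the frame's document would have to be read instead.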

 


Hi @ScottNAZ, and welcome to the AutoIt forums :welcome:
You could use something like this:

#include <Array.au3>
#include <StringConstants.au3>

Global $strText, _
       $arrResult


$strText = 'style="position:absolute;overflow:hidden;left:711px;top:140px;width:234px;height:36px;">' & _
           '<div style="position:absolute;top:1px;white-space:pre;left:82px;">12. ???????? </div>' & _
           '<div style="position:absolute;top:19px;white-space:pre;left:84px;">Status/Work</div></div>' & _
           '<div title="" class="textBox19 s10-" style="position:absolute;overflow:hidden;left:58px;top:178px;width:157px;height:27px;">' & _
           '<div style="position:absolute;top:10px;white-space:pre;left:55px;">NOZZLE </div></div>' & _
           '<div title="" class="textBox20 s11-" style="position:absolute;overflow:hidden;left:221px;top:178px;width:147px;height:27px;">' & _
           '<div style="position:absolute;top:10px;white-space:pre;left:35px;"> 2448M48P07</div></div>' & _
           '<div title="" class="textBox21 s10-" style="position:absolute;overflow:hidden;left:375px;top:178px;width:109px;height:27px;">' & _
           '<div style="position:absolute;top:10px;white-space:pre;left:39px;">GENX</div></div>' & _
           '<div title="" class="textBox22 s10-" style="position:absolute;overflow:hidden;left:490px;top:178px;width:80px;height:27px;">' & _
           '<div style="position:absolute;top:10px;white-space:pre;left:39px;">1</div></div>' & _
           '<div title="" class="textBox23 s11-" style="position:absolute;overflow:hidden;left:576px;top:178px;width:128px;height:27px;">' & _
           '<div style="position:absolute;top:10px;white-space:pre;left:35px;">PAT5891C</div></div>' & _
           '<div title="" class="txtStatus s12-" style="position:absolute;overflow:hidden;left:711px;top:178px;width:234px;height:27px;">' & _
           '<div style="position:absolute;top:10px;white-space:pre;left:86px;">REPAIRED</div></div><div title="" class="panel4 s1-"'

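; Capture the text that sits between any tag and the next closing </div>.
; Wrapper divs are followed directly by another tag, so they produce no match.
$arrResult = StringRegExp($strText, '<[^>]+>([^<]+)<\/div>', $STR_REGEXPARRAYGLOBALMATCH)
If IsArray($arrResult) Then _ArrayDisplay($arrResult)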

Then you can filter the array as you like :)
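
One possible way to do that filtering, sketched here rather than given as the definitive approach, is to capture the class name of each outer div together with its text, so that values can be picked by name (textBox19, txtStatus, ...) instead of by position. The sample string below is a shortened copy of the markup from the question, and the variable names are only illustrative:

```autoit
#include <StringConstants.au3>

; Shortened sample based on the markup in the question (styles trimmed for brevity).
Global $sSample = '<div title="" class="textBox19 s10-" style="position:absolute;">' & _
        '<div style="position:absolute;white-space:pre;">NOZZLE </div></div>' & _
        '<div title="" class="textBox20 s11-" style="position:absolute;">' & _
        '<div style="position:absolute;white-space:pre;"> 2448M48P07</div></div>' & _
        '<div title="" class="txtStatus s12-" style="position:absolute;">' & _
        '<div style="position:absolute;white-space:pre;">REPAIRED</div></div>'

; Group 1: first class name of the outer <div>; group 2: text of the inner <div>.
Global $aPairs = StringRegExp($sSample, 'class="(\w+)[^"]*"[^>]*><div[^>]*>([^<]*)</div>', $STR_REGEXPARRAYGLOBALMATCH)

If IsArray($aPairs) Then
    ; With two capture groups the matches come back flat: class, text, class, text, ...
    For $i = 0 To UBound($aPairs) - 2 Step 2
        ; Trim the padding spaces that the report puts around some values.
        ConsoleWrite($aPairs[$i] & " = " & StringStripWS($aPairs[$i + 1], $STR_STRIPLEADING + $STR_STRIPTRAILING) & @CRLF)
    Next
EndIf
```

Each output line pairs a class name with its trimmed value, so a later step can look for "txtStatus" rather than relying on the element order staying fixed.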

 
