JerryD

Need assistance with IE DOM objects

JerryD

First, let me say I'm awed at the IE.au3 library. I'm scripting things with IE that I never dreamed were possible!

That said, I've been working hard over the last week or two trying to learn the IE DOM model - it is INTENSE! I'm working on a script that, at this point, just retrieves information from Web pages - specifically from BlackBerry's firmware download sites (they maintain one for most providers). My script starts IE {$oIEApp=_IECreate($url)} to get to a provider's "site", then loads the DOM of the page {$o_IEDoc=_IEDocGetObj($oIEApp)}, retrieves all the OPTION elements for the major software categories available {$oIE_OptTags=_IETagNameGetCollection($o_IEDoc,'OPTION')}, selects one of the options by setting its .selected property to 1, then uses _IEFormSubmit to retrieve the list of individual packages available. The information for the packages is stored in the second table, which I'm able to retrieve using {_IETagNameGetCollection ( $o_IEDoc, 'TABLE', 1 )} after refreshing the $o_IEDoc object.

This is all great, and with a ludicrous number of get objects and FOR loops, I MIGHT be able to retrieve what I'm looking for. HOWEVER, I'm sure there's got to be an easier way!

The attached text file is a basic diagram of the table object I'm retrieving.

DOM_Diagram.txt

So here's what I can use some help with.

The first thing I need to get is the text within the SPAN class=cMBr object. I currently retrieve it by getting a collection of all the SPAN items within the table {$oIE_ProdSpan = _IETagNameGetCollection ( $oIE_Table_Col, 'SPAN' )}, from which the text I'm looking for is {$oIE_ProdSpan(0).innerText}. Moving further into the table, the second row (TR) is a placeholder, and the information for each software item is in the subsequent rows (one per software item). Within each of those rows, the name of the software item is in a SPAN class=cMB object, and the details of the item (Application version, Software Platform, File Name, and File Size) are contained in LI objects within SPAN class=cM objects. UNFORTUNATELY, there are also SPAN class=cM objects which contain information I'm not looking for (at the moment). Here's the pseudo-code for what I've got so far:

$oIEApp = _IECreate ( $Providers[$iProvider][1], 0, $Visible )
$o_IEDoc = _IEDocGetObj ( $oIEApp )
$oIE_OptTags = _IETagNameGetCollection ( $o_IEDoc, 'OPTION' )
For $Tag In $oIE_OptTags
    $Tag.Selected = 1                                                   ; Select this option in the drop down
    $oProductSelect = _IEFormGetObjByName ( $oIEApp, 'productSelect' )
    _IEFormSubmit ($oProductSelect)                                     ; Submit the form to load the product page
    $o_IEDoc = _IEDocGetObj ( $oIEApp )                                 ; Refresh the document object after the submit
    $oIE_Table_Col = _IETagNameGetCollection ( $o_IEDoc, 'TABLE', 1 )   ; Second table on the page (index 1)
    $oIE_ProdSpan = _IETagNameGetCollection ( $oIE_Table_Col, 'SPAN' )
    MsgBox ( 0, 'Product Name', $oIE_ProdSpan(0).innerText )
    ; Here's where I'm stuck!
Next

So what I'm looking for help with:

  • Is there a way to retrieve the SPAN class=cMBr innerText value from the $o_IEDoc , $oIE_Table_Col, or $oIE_ProdSpan object?
  • What code do I need to get the count of LI objects within each $oIE_ProdSpan object?
  • How do I retrieve the TR objects within the $oIE_Table_Col object?
  • Is there a way to do this by just addressing the original DOM object ($o_IEDoc) and NOT have to make little collections of objects within it, and collections of objects within THOSE objects?
Incidentally, one of the sites I'm using is the site to download BlackBerry software and firmware for AT&T.

The list of all the available sites - most of which are hosted by RIM and use the same format - is here.

The issue at hand is that I support BlackBerries from six different vendors, and checking for firmware updates is a long, involved task of going to each site, selecting each item I need, and comparing the firmware version to what I have. If I could somehow automate the process, I'd save myself hours of work each month.

Also, BlackBerry addicts are obsessed with having THE latest version of firmware, but with the dozens of providers, it's impossible to track all the versions available for a specific model. I'm sure at this point I could create a script that could scan all the sites (hosted by RIM anyway) and glean that info out and get it into an INI file, or maybe a CSV file for analysis. CrackBerry Addicts would LOVE this!

Thanks in advance for your help!

DaleHohm

I know you tried pretty hard to explain yourself well, but I am not understanding what your questions are. So, let me take a stab at a few things...

The class for an element is held in the property .className, so you can test for it like this:

For $oSpan in $oSpans
    If String($oSpan.className) = "cMBr" Then
        ConsoleWrite("Found It: " & _IEPropertyGet($oSpan, "innertext") & @CR)
    EndIf
Next

Have you looked at _IETableWriteToArray to see if it puts your values into array values that you can use?
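
For what it's worth, here is a minimal sketch of what that looks like. It runs against the sample page that ships with IE.au3 (_IE_Example("table")) rather than the BlackBerry site, so the table index and the resulting array layout are assumptions you would need to adjust for your page:

```autoit
#include <IE.au3>
#include <Array.au3>

; _IE_Example("table") opens a sample page bundled with IE.au3 (stand-in for the real site)
Local $oIE = _IE_Example("table")

; Index 1 = second table on the sample page (assumption: adjust for your own page)
Local $oTable = _IETableGetCollection($oIE, 1)

; Copy the table cells into a 2D array; True transposes it so rows read as rows
Local $aTable = _IETableWriteToArray($oTable, True)

; Inspect the result to see whether the values land in usable cells
_ArrayDisplay($aTable)

_IEQuit($oIE)
```

If the cells you need come out cleanly in the array, you can skip most of the nested collections and loops.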

Are you using DebugBar to study the DOM? Highly recommended if not -- see my sig.

Dale


Free Internet Tools: DebugBar, AutoIt IE Builder, HTTP UDF, MODIV2, IE Developer Toolbar, IEDocMon, Fiddler, HTML Validator, WGet, curl

MSDN docs: InternetExplorer Object, Document Object, Overviews and Tutorials, DHTML Objects, DHTML Events, WinHttpRequest, XmlHttpRequest, Cross-Frame Scripting, Office object model

Automate input type=file (Related)

Alternative to _IECreateEmbedded? better: _IECreatePseudoEmbedded  Better Better?

IE.au3 issues with Vista - Workarounds

SciTe Debug mode - it's magic: #AutoIt3Wrapper_run_debug_mode=Y

"Doesn't work" needs to be ripped out of the troubleshooting lexicon. It means that what you tried did not produce the results you expected. It begs the questions 1) what did you try?, 2) what did you expect? and 3) what happened instead?

Reproducer: a small (the smallest?) piece of stand-alone code that demonstrates your trouble

JerryD

Dale,

Thanks for the reply, and YES, the Debug Bar is awesome! What you've suggested is getting there, but once again I find myself frustrated and confused! (BIG SURPRISE!)

Here's the full code of a function to retrieve the info (note my comments):

Func SiteGetProductDetail()
    Local $Products, $oProductSelect, $oIE_ProdSpan, $Rows, $Spans, $LIs
    $Products = IniReadSection ( $INIFile, $Providers[$iProvider][0] & '_ProductsSelected' )
    For $iProduct = 1 To $Products[0][0]
        If Not StringInStr ( $Products[$iProduct][1], '-' ) And $Products[$iProduct][0] <> 'Run' Then
            $o_IEDoc = _IEDocGetObj ( $oIEApp )
            $oIE_OptTags = _IETagNameGetCollection ( $o_IEDoc, 'OPTION' )
            For $Tag In $oIE_OptTags
                If $Products[$iProduct][1] = $Tag.Text Then
                    $Tag.Selected = 1                                                   ; Select the drop down option
                    $oProductSelect = _IEFormGetObjByName ( $oIeApp, 'productSelect' )
                    _IEFormSubmit ($oProductSelect)                                     ; Retrieve the info on that option / product
                    
                    $o_IEDoc = _IEDocGetObj ( $oIEApp )                                 ; Load the page
                    $oIE_Table_Col = _IETagNameGetCollection ( $o_IEDoc, 'TABLE', 1 )   ; NOTE! We're only getting the second Table - Table 1!!!
                    $Spans = _IETagNameGetCollection ( $oIE_Table_Col, 'SPAN' )         ; Info we want is within SPAN objects (for the most part!)

;****                   MsgBox ( 0, 'Number of Spans in Table', $Spans.all )                ; I DON'T GET WHY THIS DOESN'T WORK - WITH OR WITHOUT .length !!!???

                    For $Span In $Spans
                        MsgBox ( 0, '$Span.innerText', $Span.innerText )                ; Why is there text that shows up here
                        If $Span.className = 'cMBr' Then                                ; that doesn't show in ANY of these IF/ELSEIF/ELSE statements???
                            MsgBox ( 0, 'Product Name', $Span.innerText )
                        ElseIf $Span.className = 'cMB' Then
                            MsgBox ( 0, 'Product Item Name', $Span.innerText )
                        ElseIf $Span.className = 'cM' Then                              ; Why don't "Download" and "View" show up here?
                            $LIs = _IETagNameGetCollection ( $Span, 'LI' )
                            For $LI In $LIs
                                MsgBox ( 0, 'innerText', $LI.innerText )
                            Next
                        Else
                            MsgBox ( 0, 'Other innerText', $Span.innerText )
                        EndIf
                    Next
                    
                    $Rows = _IETagNameGetCollection ( $oIE_Table_Col, 'TR' )            ; Another way to grab the data
                    For $Row In $Rows
                        $Spans = _IETagNameGetCollection ( $Row, 'SPAN' )
                        For $Span In $Spans
                            MsgBox ( 0, '$Span.innerText', $Span.innerText )            ; Again, text displays here
                            If $Span.className = 'cMBr' Then                            ; that doesn't display here
                                MsgBox ( 0, 'Product Name', $Span.innerText )
                            ElseIf $Span.className = 'cMB' Then
                                MsgBox ( 0, 'Product Item Name', $Span.innerText )
                            ElseIf $Span.className = 'cM' Then
                                $LIs = _IETagNameGetCollection ( $Span, 'LI' )
                                For $LI In $LIs
                                    MsgBox ( 0, 'innerText', $LI.innerText )
                                Next
                            Else
                                MsgBox ( 0, 'Other innerText', $Span.innerText )
                            EndIf
                        Next
                    Next
                    ExitLoop
                EndIf
            Next
        EndIf
    Next
EndFunc

If you look back at the diagram or go to the web page with Debug bar, you'll see that the second table on the page has three or more rows.

Row 1 has what I call the Product name like "Software For BlackBerry Pearl (8100c) (AT&T)" within a SPAN with class cMBr. OK, got that.

Row 2 is a filler and contains no valuable info.

All subsequent rows lay out like this

  • SPAN with ClassName cMB has the item name
  • The first (usually?) SPAN with ClassName cM has LI objects which have App version, Platform version, File name, and File size info. Not having a problem retrieving that
  • There can also be one or more SPANs with ClassName cM under UL\LI objects which have Download and Upgrade info, but don't show up in the If/ElseIf/Else clauses, but DO show up before those clauses - what's with that???

I'm also confused as to when to use .all rather than .all.length to find out how many objects _IETagNameGetCollection retrieves, and why (as noted in the comments above) neither seems to work sometimes!

Also, I was hoping for some way to address the data directly from the $o_IEDoc collection retrieved from the second _IEDocGetObj so as not to have to create all those collections and have so many nested for loops and if/then clauses, but maybe that's just not possible!

To see what I'm talking about, go to:

https://www.blackberry.com/Downloads/entry....4E4F82F9F00E7D4

and select something like BlackBerry Pearl (8100c) (AT&T) from the drop down list and then see what DebugBar says!

Thanks again for your help.

DaleHohm

After a call to _IETagNameGetCollection, @extended contains the item count.
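
A rough sketch of reading the count, using the sample page bundled with IE.au3 (_IE_Example("basic")) as a stand-in for your site. @extended is overwritten by the next function call, so copy it to a variable right away; the collection's own .length property is printed alongside for comparison:

```autoit
#include <IE.au3>

; _IE_Example("basic") opens a sample page bundled with IE.au3 (stand-in for a real site)
Local $oIE = _IE_Example("basic")

; Collect every <a> element on the page
Local $oLinks = _IETagNameGetCollection($oIE, "a")
Local $iCount = @extended ; copy immediately, the next function call overwrites @extended

ConsoleWrite("Count from @extended: " & $iCount & @CRLF)
ConsoleWrite("Count from .length:   " & $oLinks.length & @CRLF)

_IEQuit($oIE)
```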

Your descriptions are too complex and confuse me. Can you just describe what you are trying to do now that you have provided a URL?

Dale


JerryD

Hey Dale,

Thanks for the reminder that @extended returns the number of items. However, I'd like to understand why, in the code above, neither $Spans.all nor $Spans.all.length give me the same answer.

As for what I'm trying to do - mostly I'm trying to learn how to work with your IE library! I have several ideas about what to do with this new found power!

One is to simply go to each site and glean information from the page displayed for each item selected as defined in an INI file.

Another is to develop a GUI interface that allows one to select a provider and check items to download. This is a particularly troublesome task if you're supporting several vendors as each time you click the Download link (something I'd need to be able to do) you're brought to a form to enter your information and check several boxes, after which you have to agree to terms of use, and THEN you're finally brought to the download page!

One other quick question about the code above. In the For $Span In $Spans loop, how come the innerText "Download" which is a SPAN object with a className of cM doesn't display in the If/ElseIf clause? It doesn't appear in the ElseIf $Span.className = 'cM' Then clause which is where I'd expect it to display, and it doesn't display in the Else clause which REALLY confuses me!

Anyway, thanks again for your help!

Edited by JerryD

DaleHohm

$Spans is a collection object and its item count is obtained with $Spans.length

I do not recommend the use of the .all property as its use is deprecated in the DOM specification.

One likely source of trouble for you is a statement like: If $Span.className = 'cMBr' -- I strongly recommend that you change that to If String($Span.className) = 'cMBr'. If an element has no className then $Span.className will be numeric 0. When you compare numeric 0 to a non-numeric string, AutoIt converts the string to the number 0, so the comparison returns True, which is typically not what you want.
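
Here is a small stand-alone sketch of that pitfall. It needs no browser; $vClass is a made-up variable standing in for a className value that came back as numeric 0:

```autoit
; Hypothetical stand-in for $Span.className on an element with no class
Local $vClass = 0

; Number compared with a string: 'cMBr' is converted to the number 0, so this branch runs
If $vClass = 'cMBr' Then
    ConsoleWrite("Plain compare: treated as a match" & @CRLF)
EndIf

; String() forces a string comparison ("0" against "cMBr"), so the Else branch runs
If String($vClass) = 'cMBr' Then
    ConsoleWrite("String compare: treated as a match" & @CRLF)
Else
    ConsoleWrite("String compare: no match" & @CRLF)
EndIf
```

This would also explain text that shows up before your If/ElseIf chain but never reaches the branch you expect: a span with no class can be swallowed by the first comparison in the chain.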

Jerry, it is difficult to respond to long rambling posts like this. It takes 15-20 minutes to digest what you have asked and try to compose a reply. Most posts in the forum are direct questions that may take a bit to consider a reply, but don't take a long time to understand the question.

Even if you are just learning this and don't know exactly what you want to do, I suggest that you pick something specific to accomplish and then try to work through that and ask specific questions with short code samples when you run into trouble. You'll get more help from more people, faster and you may find you learn faster as well with a more structured approach.

Dale


JerryD

Dale,

Thanks for your help and patience. I'll keep it as short as possible.

When you go to the link in question, and select an item from the drop down list and click Next, a list of items available appears which contains the $Span objects we've been discussing. How do I click one of the Download links?

I've got the Download item's SPAN's innerText and innerHtml as $Span.innerText and $Span.innerHTML and tried using $Span.click, but it didn't do anything.

Thanks again.

DaleHohm

Here's an example. With something like this you really need to study the DOM structure and find the patterns.

#include <IE.au3>
$oIE =_IECreate("https://www.blackberry.com/Downloads/entry.do?code=577BCC914F9E55D5E4E4F82F9F00E7D4")
$oForm = _IEFormGetObjByName($oIE, "productSelect")
$oSelect = _IEFormElementGetObjByName($oForm, "productName")
_IEFormElementOptionselect($oSelect, "BlackBerry Curve 8310 (TM)", 1, "byText")
_IEFormSubmit($oForm)

$oTable = _IETableGetCollection($oIE, 1)
$oSpans = _IETagNameGetCollection($oTable, "span")
$iSpanCnt = @extended

; first Span contains product name
$oSpan = $oSpans.item(0)
ConsoleWrite("Product: " & _IEPropertyGet($oSpan, "innerText") & @CR & "------------------------" & @CR)

; software download info is broken into groups of 3 spans
$iDownloadCnt = Int(($iSpanCnt - 1)/3)

; specify a product version to match on - perhaps you want StringInStr match instead?
$sVersionIWant = "BlackBerry Handheld Software v4.2.2.312 (Multilanguage)"
For $i = 0 to $iDownloadCnt - 1
    ; Show version info and move to the next if it is not the one you want
    $oSpan = $oSpans.item(3 * $i + 1)
    ConsoleWrite("Software Version: " & _IEPropertyGet($oSpan, "innerText") & @CR)
    ; Use <> here: Not binds tighter than =, so "If Not $a = $b" would negate $a first
    If $sVersionIWant <> StringStripWS(_IEPropertyGet($oSpan, "innerText"), 3) Then ContinueLoop
    ; show details
    $oSpan = $oSpans.item(3 * $i + 2)
    ConsoleWrite("Details: " & _IEPropertyGet($oSpan, "innerText") & @CR)
    ; click the Download link
    $oSpan = $oSpans.item(3 * $i + 3)
    $oA = _IETagnameGetCollection($oSpan, "a", 0)
    _IEAction($oA, "click")
    _IELoadWait($oIE)
Next

Dale


JerryD

Dale,

Thanks again, your help has been incredibly enlightening. And just to prove it, here's a quick script I put together to gather the URLs of all the available RIM Download sites into an INI file!

#include <array.au3>
#include <file.au3>
#include <IE.au3>
$INIFile = @ScriptDir & '\GetRIMSites.ini'
$URL = 'http://na.blackberry.com/eng/support/downloads/download_sites.jsp'
Dim $sDrive, $sDir, $sFileName, $sExt
$o_IEApp = _IECreate ( $URL, 0, 0 )
$oIEDoc = _IEDocGetObj ( $o_IEApp )
$o_Temp = _IETagNameGetCollection ( $oIEDoc, 'DIV' )
For $Temp In $o_Temp
    If $Temp.className = 'main' And  $Temp.id = 'content-start' Then
        $o_Divs = _IETagNameAllGetCollection ( $Temp )
        ExitLoop
    EndIf
Next
For $DivItem In $o_Divs
    Select
        Case $DivItem.tagName = 'H3'
            $Header = StringStripWS ( $DivItem.innerText,1+2 )
        Case $DivItem.tagName = 'A' And $DivItem.parentNode.className = 'linked'
            If StringInStr ( $DivItem.href, 'blackberry.com' ) Then
                IniWrite ( $INIFile, $Header, StringStripWS($DivItem.innerText,1+2), $DivItem.href )
            Else
                IniWrite ( $INIFile, $Header & '_NonRIM', StringStripWS($DivItem.innerText,1+2), $DivItem.href )
            EndIf
    EndSelect
Next
_IEQuit ( $o_IEApp )
; The rest is just to make the INI file more readable
_PathSplit ( $INIFile, $sDrive, $sDir, $sFileName, $sExt )
$OldINI = @TempDir & '\' & $sFileName & $sExt
FileMove ( $INIFile, $OldINI, 9 )
$Sections = IniReadSectionNames ( $OldINI )
_ArraySort ( $Sections, 0, 1 )
For $i = 1 To $Sections[0]
    IniWriteSection ( $INIFile, $Sections[$i], IniReadSection($OldINI,$Sections[$i]) )
    FileWriteLine ( $INIFile, '' )
Next
FileRecycle ( $OldINI )
You've not only taught an old dog new tricks, but also to fish!

Thanks again.

Jerry

DaleHohm

Dale,

Thanks again, your help has been incredibly enlightening. And just to prove it, here's a quick script I put together to gather the URLs of all the available RIM Download sites into an INI file!

You've not only taught an old dog new tricks, but also to fish!

Thanks again.

Jerry

I can think of few better compliments. Good job Jerry.

Dale

