jdelaney

IE UDF to get DOM object by XPath

21 posts in this topic

#1 ·  Posted (edited)

Not sure if anyone else was ever looking for a way to get the DOM object of IE by XPath, but I sure was.

I created this UDF: BGe_IEGetDOMObjByXPathWithAttributes

Still needs some work. Here are the limitations and their current status:
1) Solved: no distinction between // and /; all were handled as //
2) Won't address (use syntax like #3 instead): only able to search for elements; the xpath can't end in an attribute, such as ...//@id='test'
3) Solved: when searching for elements with specific attributes, any number of conditions is now handled, such as //test[@id="test2"]
4) Solved: contains() can now be used
5) No use of structures, such as :Parent or :Sibling
6) Solved*: due to number 1, it was possible to return the same object multiple times. Edit: this can still occur if the xpath provided leaves multiple paths to the same element; make the xpath more specific to stop it
7) No use of enumerated nodes, such as //test[4]
8) No use of <> when defining attributes, such as //test[@id<>"test2"]

So it's very simple, but it returns an array of all objects matching the xpath provided. An example xpath to search for on the Google home page:

"//td[@id='gs_tti0']/div/input[@id='gbqfq']" gives the object of the search input.

#region SCITE_CallTipsForFunctions
;BGe_IEGetDOMObjByXPathWithAttributes($oIEObj, $sXPath, [$iMaxWait=2000]) Return array of objects on browser matching callers xpath
#endregion SCITE_CallTipsForFunctions
#include <ie.au3>
#include <array.au3>
#region GLOBALVariables
Global $gbBGe_PerformConsoleWrites = True
; The XPath array to work with will be 2d, with the following
Global Enum $giBGe_XPath_Dim2_sRawNode, _
    $giBGe_XPath_Dim2_sNodeName, _
    $giBGe_XPath_Dim2_bNodeIsRelative, _
    $giBGe_XPath_Dim2_sRawNodeConstraints, _
    $giBGe_XPath_Dim2_bIsConstrainted, _
    $giBGe_XPath_Dim2_aNodeConstraints, _
    $giBGe_XPath_Dim2_UBound
; $giBGe_XPath_Dim2_aNodeConstraints will contain a 2d, with the following
Global Enum $giBGe_Constraint_Dim2_sNodeName, _
    $giBGe_Constraint_Dim2_bIsAttribute, _
    $giBGe_Constraint_Dim2_bIsSelf, _
    $giBGe_Constraint_Dim2_sNodeValue, _
    $giBGe_Constraint_Dim2_bIsContains, _
    $giBGe_Constraint_Dim2_UBound
; Regexp to split xpath
Global $gsBGe_RegExpNodeSplit = "(?U)(.*(?:['""].*['""].*){0,})(?:\/)" ; Split Xpath into nodes...split by / where it is part of x-path
Global $gsBGe_RegExpNodeAndCondSplit = "([^\[\]]+)\[(.*)\]" ; Get node name and conditions...conditions can be empty
Global $gsBGe_RegExpOrSplit = "(?i)(?U)(.*['""].*['""\)])(?:\sor\s)|.{1,}?" ; Split Or statements inside []
Global $gsBGe_RegExpAndSplit = "(?i)(?U)(.*['""].*['""\)])(?:\sand\s)|.{1,}?" ; Split And statements inside []
Global $gsBGe_RegExpSplitContains = "(?i)contains\s*\(\s*(.+)\s*,\s*['""](.+)['""]\s*\)" ; Split contains, remove spaces that are not needed
Global $gsBGe_RegExpSplitNonContains = "(.*)\s*\=\s*['""](.*)['""]" ; Split constraint that is not a contains, remove spaces that are not needed
#endregion GLOBALVariables

#region SAMPLE

; Using multiple levels as an example...made complex on purpose to demonstrate:
$xpathForumLink = "//div[@id='top-menu']/ul[contains(@class,'WONT BE FOUND') or @id='menu-mainmenu']//a[contains(@href,'forum')]"
$xpathGeneralHelpSuprt = "//table[contains(@class,'table') and @summary='Forums within the category 'AutoIt v3'']//h4/a[@title='General Help and Support']"

$xpathGeneralHelpUsers = "//div[@id='forum_active_users']//span[@itemprop='name']"

; Create/navigate to page
$oIE = _IECreate("http://www.autoitscript.com/site/",True,True)
If IsObj($oIE) Then
    ConsoleWrite("Able to _IECreate('http://www.autoitscript.com/site/')" & @CRLF)
Else
    ConsoleWrite("UNable to _IECreate('http://www.autoitscript.com/site/')" & @CRLF)
    Exit 1
EndIf

; Get Forum Link
$aForumLink = BGe_IEGetDOMObjByXPathWithAttributes($oIE,$xpathForumLink)
If IsArray($aForumLink) Then
    ConsoleWrite("Able to BGe_IEGetDOMObjByXPathWithAttributes($oIE, " & $xpathForumLink & ")" & @CRLF)
    For $i = 0 To UBound($aForumLink)-1
        ConsoleWrite("   " & $aForumLink[$i].outerhtml )
    Next
Else
    ConsoleWrite("UNable to BGe_IEGetDOMObjByXPathWithAttributes($oIE, " & $xpathForumLink & ")" & @CRLF)
    Exit 2
EndIf

; Click the link
_IEAction($aForumLink[0], "focus")
If _IEAction($aForumLink[0], "click") Then
    ConsoleWrite("Able to _IEAction($aForumLink[0], 'click')" & @CRLF)
    _IELoadWait($oIE)
Else
    ConsoleWrite("UNable to _IEAction($aForumLink[0], 'click')" & @CRLF)
    Exit 3
EndIf

; Get General help link
$aGenHelpLink = BGe_IEGetDOMObjByXPathWithAttributes($oIE,$xpathGeneralHelpSuprt)
If IsArray($aGenHelpLink) Then
    ConsoleWrite("Able to BGe_IEGetDOMObjByXPathWithAttributes($oIE, " & $xpathGeneralHelpSuprt & ")" & @CRLF)
    For $i = 0 To UBound($aGenHelpLink)-1
        ConsoleWrite("   " & $aGenHelpLink[$i].outerhtml )
    Next
Else
    ConsoleWrite("UNable to BGe_IEGetDOMObjByXPathWithAttributes($oIE, " & $xpathGeneralHelpSuprt & ")" & @CRLF)
    Exit 4
EndIf

; Click the link
_IEAction($aGenHelpLink[0], "focus")
If _IEAction($aGenHelpLink[0], "click") Then
    ConsoleWrite("Able to _IEAction($aGenHelpLink[0], 'click')" & @CRLF)
    _IELoadWait($oIE)
Else
    ConsoleWrite("UNable to _IEAction($aGenHelpLink[0], 'click')" & @CRLF)
    Exit 5
EndIf

; Get current users on page
$aGenHelpUsers = BGe_IEGetDOMObjByXPathWithAttributes($oIE,$xpathGeneralHelpUsers)
If IsArray($aGenHelpUsers) Then
    ConsoleWrite("Able to BGe_IEGetDOMObjByXPathWithAttributes($oIE, " & $xpathGeneralHelpUsers & ")" & @CRLF)
    For $i = 0 To UBound($aGenHelpUsers)-1
        ConsoleWrite("   " & $aGenHelpUsers[$i].outerhtml & @CRLF )
        ConsoleWrite("   " & $aGenHelpUsers[$i].innertext & @CRLF )
    Next
Else
    ConsoleWrite("UNable to BGe_IEGetDOMObjByXPathWithAttributes($oIE, " & $xpathGeneralHelpUsers & ")" & @CRLF)
    Exit 6
EndIf

#endregion SAMPLE

#region ExternalFunctions
Func BGe_IEGetDOMObjByXPathWithAttributes($oIEObject, $sXPath, $iMaxWait=2000) ; Get dom object by XPath
    If $gbBGe_PerformConsoleWrites Then ConsoleWrite("Start Function=[BGe_IEGetDOMObjByXPathWithAttributes] with $sXPath=[" & $sXPath & "]." & @CRLF)
    Local $aReturnObjects = ""

    Local $aSplitXpath = BGe_ParseXPath($sXPath)
    If Not IsArray($aSplitXpath) Then
        ConsoleWrite("BGe_IEGetDOMObjByXPathWithAttributes: Callers XPath/Node/Conditions not well formed=[" & $sXPath & "]" & @CRLF)
        Return SetError(1,0,False)
    EndIf

    Local $iTimer = TimerInit()
    While TimerDiff($iTimer)<$iMaxWait And Not IsArray($aReturnObjects)
        $aReturnObjects = BGe_RecursiveGetObjWithAttributes($oIEObject,$aSplitXpath)
    WEnd

    Return $aReturnObjects
EndFunc   ;==>BGe_IEGetDOMObjByXPathWithAttributes
#endregion ExternalFunctions
#region InternalFunctions
Func BGe_RecursiveGetObjWithAttributes($oParent, $aCallersSplitXPath, $asHolder="", $Level=0)

    $asObjects = $asHolder
    Local $sNodeName            = $aCallersSplitXPath[$Level][$giBGe_XPath_Dim2_sNodeName]
    Local $bNodeIsRelative      = $aCallersSplitXPath[$Level][$giBGe_XPath_Dim2_bNodeIsRelative]    ; true=relative false=absolute
    Local $bIsConstrainted      = $aCallersSplitXPath[$Level][$giBGe_XPath_Dim2_bIsConstrainted]    ; True when the node has constraints inside [...]
    Local $aNodeOrConstraints   = $aCallersSplitXPath[$Level][$giBGe_XPath_Dim2_aNodeConstraints]   ; array[OR] of arrays[AND]; all constraints on the node
    Local $aPossibleNodeMatch   = ""

    If $gbBGe_PerformConsoleWrites Then ConsoleWrite("Start Function=[BGe_RecursiveGetObjWithAttributes] level=[" & $Level & "]: $sNodeName=[" & $sNodeName & "], $bNodeIsRelative=[" & $bNodeIsRelative & "] $bIsConstrainted=[" & $bIsConstrainted & "]."& @CRLF)

    If Not IsObj($oParent) Then Return $asObjects

    ; Get nodes that match
    If $bNodeIsRelative Then
        If $sNodeName = "*" Then
            $oPossibleNodes = _IETagNameAllGetCollection($oParent)
        Else
            $oPossibleNodes = _IETagNameGetCollection($oParent, $sNodeName)
        EndIf
        For $oPossibleNode In $oPossibleNodes
            If $oPossibleNode.NodeType == 1 Then ; only add nodes
                If IsArray($aPossibleNodeMatch) Then
                    _ArrayAdd($aPossibleNodeMatch,$oPossibleNode)
                Else
                    Local $aPossibleNodeMatch[1] = [$oPossibleNode]
                EndIf
            EndIf
        Next
    Else
        $oPossibleNodes = $oParent.childnodes
        For $oPossibleNode In $oPossibleNodes
            If String($oPossibleNode.NodeName) = $sNodeName Or $sNodeName = "*" Then
                If IsArray($aPossibleNodeMatch) Then
                    _ArrayAdd($aPossibleNodeMatch,$oPossibleNode)
                Else
                    Local $aPossibleNodeMatch[1] = [$oPossibleNode]
                EndIf
            EndIf
        Next
    EndIf

    ; Loop through nodes against restraints
    If IsArray($aPossibleNodeMatch) Then

        For $iChild = 0 To UBound($aPossibleNodeMatch) - 1
            Local $oChild = $aPossibleNodeMatch[$iChild]

            ; Find matching conditions, when necessary
            If $bIsConstrainted Then

                ; Loop through OR Conditions
                For $i = 0 To UBound($aNodeOrConstraints) - 1
                    Local $aNodeAndConstraints = $aNodeOrConstraints[$i]
                    Local $bAndConditionsMet = True

                    ; Loop through And Conditions, or conditions are outside of this loop, and will go if current and's are not met
                    For $j = 0 To UBound($aNodeAndConstraints) - 1

                        ; Remove the @...
                        Local $sConstraintName      = StringReplace($aNodeAndConstraints[$j][$giBGe_Constraint_Dim2_sNodeName],"@","")
                        Local $bConstraintIsAtt     = $aNodeAndConstraints[$j][$giBGe_Constraint_Dim2_bIsAttribute]
                        Local $bConstraintIsNode    = $aNodeAndConstraints[$j][$giBGe_Constraint_Dim2_bIsSelf]
                        Local $sConstraintValue     = $aNodeAndConstraints[$j][$giBGe_Constraint_Dim2_sNodeValue]
                        Local $bConstraintIsContains= $aNodeAndConstraints[$j][$giBGe_Constraint_Dim2_bIsContains]

                        If $bConstraintIsNode Then
                            If $bConstraintIsContains Then
                                If Not StringInStr(String($oChild.innertext), $sConstraintValue) Then $bAndConditionsMet = False
                            Else
                                If String($oChild.innertext) <> $sConstraintValue Then $bAndConditionsMet = False
                            EndIf

                        ElseIf $bConstraintIsAtt Then
                            Local $sAttributeValue = ""
                            Switch $sConstraintName
                                Case "class"
                                    $sAttributeValue = $oChild.className()
                                Case "style"
                                    $sAttributeValue = $oChild.style.csstext
                                Case "onclick"
                                    $sAttributeValue = $oChild.getAttributeNode($sConstraintName).value
                                Case Else
                                    $sAttributeValue = $oChild.getAttribute($sConstraintName)
                            EndSwitch

                            If $bConstraintIsContains Then
                                If Not StringInStr(String($sAttributeValue), $sConstraintValue) Then $bAndConditionsMet = False
                            Else
                                If String($sAttributeValue) <> $sConstraintValue Then $bAndConditionsMet = False
                            EndIf
                        Else
                            ; failure
                        EndIf
                        ; Skip looping if a condition of the And array was not met
                        If Not $bAndConditionsMet Then ExitLoop
                    Next

                    If $bAndConditionsMet Then
                        ; If last level, add the object

                        If $Level = UBound($aCallersSplitXPath) - 1 Then
                            If Not IsArray($asObjects) Then
                                Local $asObjects[1]=[$oChild]
                            Else
                                $bUnique = True
                                ; Only add if not present in the array
                                For $iObject = 0 To UBound($asObjects)-1
                                    If $oChild = $asObjects[$iObject] Then
                                        $bUnique=False
                                        ExitLoop
                                    EndIf
                                Next
                                If $bUnique Then _ArrayAdd($asObjects, $oChild)
                            EndIf
                        Else
                            $asObjects = BGe_RecursiveGetObjWithAttributes($oChild, $aCallersSplitXPath, $asObjects, $Level + 1)
                        EndIf
                    EndIf
                    ; No need to loop additional or if already found one and
                    If $bAndConditionsMet Then ExitLoop
                Next
            Else
                ; No constraints, match is implied
                If $Level = UBound($aCallersSplitXPath) - 1 Then
                    ; Final xpath level, so add to final array
                    If Not IsArray($asObjects) Then
                        Local $asObjects[1]=[$oChild]
                    Else
                        Local $bUnique=True
                        ; Only add if not present in the array
                        For $iObject = 0 To UBound($asObjects)-1
                            If $oChild = $asObjects[$iObject] Then
                                $bUnique=False
                                ExitLoop
                            EndIf
                        Next
                        If $bUnique Then _ArrayAdd($asObjects, $oChild)
                    EndIf
                Else
                    ; Continue recursion
                    $asObjects = BGe_RecursiveGetObjWithAttributes($oChild, $aCallersSplitXPath, $asObjects, $Level + 1)
                EndIf
            EndIf
        Next
    EndIf
    Return $asObjects
EndFunc   ;==>BGe_RecursiveGetObjWithAttributes
Func BGe_ParseXPath($sCallersXPath)

    ; RegExp require a trailing "/"
    $sCallersXPath &= "/"
    Local $aReturnParsedXPath=False

    ; Parse all the '/' outside of single, or double, quotes
    Local $aNodesWithQualifiers = StringRegExp($sCallersXPath,$gsBGe_RegExpNodeSplit,3)

    ; Loop through, and determine if the node is direct, or relative.../ vs //
    Local $iSlashCount = 0
    For $i = 0 To UBound($aNodesWithQualifiers) - 1
        If StringLen($aNodesWithQualifiers[$i])=0 Then
            $iSlashCount+=1
        Else
            ; Add dimensions to the return array
            If Not IsArray($aReturnParsedXPath) Then
                Local $aReturnParsedXPath[1][$giBGe_XPath_Dim2_UBound]
            Else
                ReDim $aReturnParsedXPath[UBound($aReturnParsedXPath)+1][$giBGe_XPath_Dim2_UBound]
            EndIf

            $aReturnParsedXPath[UBound($aReturnParsedXPath)-1][$giBGe_XPath_Dim2_sRawNode]  = $aNodesWithQualifiers[$i]
            ; Split current Node
            Local $aSplitNodeAndCond = StringRegExp($aNodesWithQualifiers[$i],$gsBGe_RegExpNodeAndCondSplit,3)
            If UBound($aSplitNodeAndCond) = 2 Then
                Local $sNodeName = $aSplitNodeAndCond[0]
                Local $sNodeConstraints = $aSplitNodeAndCond[1]
                $aNodeConstraints = BGe_ParseXPathConstraints($sNodeConstraints)
                If Not IsArray($aNodeConstraints) Then
                    ConsoleWrite("ParseXPath: Callers XPath/Node/Conditions not well formed=[" & $aNodesWithQualifiers[$i] & "]" & @CRLF)
                    Return SetError(1,1,False)
                EndIf
            ElseIf UBound($aSplitNodeAndCond) = 0 Then
                Local $sNodeName = $aNodesWithQualifiers[$i]
                Local $sNodeConstraints = ""
                Local $aNodeConstraints = ""
            Else
                ConsoleWrite("ParseXPath: Callers XPath/Node/Conditions not well formed=[" & $aNodesWithQualifiers[$i] & "]" & @CRLF)
                Return SetError(1,2,False)
            EndIf
            $aReturnParsedXPath[UBound($aReturnParsedXPath)-1][$giBGe_XPath_Dim2_sNodeName]             = $sNodeName
            $aReturnParsedXPath[UBound($aReturnParsedXPath)-1][$giBGe_XPath_Dim2_sRawNodeConstraints]   = $sNodeConstraints
            $aReturnParsedXPath[UBound($aReturnParsedXPath)-1][$giBGe_XPath_Dim2_bIsConstrainted]       = (StringLen($sNodeConstraints)>0)
            $aReturnParsedXPath[UBound($aReturnParsedXPath)-1][$giBGe_XPath_Dim2_aNodeConstraints]      = $aNodeConstraints
            $aReturnParsedXPath[UBound($aReturnParsedXPath)-1][$giBGe_XPath_Dim2_bNodeIsRelative]       = $iSlashCount>1
            $iSlashCount=1
        EndIf
    Next

    Return $aReturnParsedXPath

EndFunc
Func BGe_ParseXPathConstraints($sCallersXPathConstraints)
    ; Returns array of arrays
    ; Array is split of all 'or' statements, and then includes array of 'and' statements, which are split out into 2d array of name/value/bcontains
    Local $aReturnParsedXPathConstraints[1]

    ; Will always return at least the first condition
    Local $aOrQualifiers = StringRegExp($sCallersXPathConstraints,$gsBGe_RegExpOrSplit,3)
    ReDim $aReturnParsedXPathConstraints[UBound($aOrQualifiers)]
    For $i = 0 To UBound($aReturnParsedXPathConstraints)-1
        Local $aAndQualifiers = StringRegExp($aOrQualifiers[$i],$gsBGe_RegExpAndSplit,3)
        Local $aaSplitQualitfiers = BGe_ParseXPathConstraint($aAndQualifiers)
        If IsArray($aaSplitQualitfiers) Then
            $aReturnParsedXPathConstraints[$i]=$aaSplitQualitfiers
        Else
            ConsoleWrite("ParseXPathConstraints: Callers XPath/Node/Conditions not well formed=[" & $aOrQualifiers[$i] & "]" & @CRLF)
            Return SetError(1,3,False)
        EndIf
    Next

    Return $aReturnParsedXPathConstraints
EndFunc
Func BGe_ParseXPathConstraint($aCallersXPathConstraint)
    Local $aReturnParsedXPathConstraints[UBound($aCallersXPathConstraint)][$giBGe_Constraint_Dim2_UBound]

    For $i = 0 To UBound($aCallersXPathConstraint)-1
        ; Remove leading and trailing spaces
        Local $sCurrentConstraint = StringStripWS($aCallersXPathConstraint[$i], 3)
        ; Check if $sCurrentConstraint makes use of contains()

        Local $aTempContains = StringRegExp($sCurrentConstraint,$gsBGe_RegExpSplitContains,3)
        Local $aTempNonContains = StringRegExp($sCurrentConstraint,$gsBGe_RegExpSplitNonContains,3)

        If UBound($aTempContains)=2 Then
            $aReturnParsedXPathConstraints[$i][$giBGe_Constraint_Dim2_bIsContains]  = True
            $aReturnParsedXPathConstraints[$i][$giBGe_Constraint_Dim2_bIsSelf]      = ($aTempContains[0]=".")
            $aReturnParsedXPathConstraints[$i][$giBGe_Constraint_Dim2_sNodeName]    = $aTempContains[0]
            $aReturnParsedXPathConstraints[$i][$giBGe_Constraint_Dim2_bIsAttribute] = (StringLeft($aTempContains[0],1)="@")
            $aReturnParsedXPathConstraints[$i][$giBGe_Constraint_Dim2_sNodeValue]   = $aTempContains[1]
        ElseIf UBound($aTempNonContains)=2 And Not StringInStr($aTempNonContains[0],"(") Then
            $aReturnParsedXPathConstraints[$i][$giBGe_Constraint_Dim2_bIsContains] = False
            $aReturnParsedXPathConstraints[$i][$giBGe_Constraint_Dim2_bIsSelf]      = ($aTempNonContains[0]=".")
            $aReturnParsedXPathConstraints[$i][$giBGe_Constraint_Dim2_sNodeName]    = $aTempNonContains[0]
            $aReturnParsedXPathConstraints[$i][$giBGe_Constraint_Dim2_bIsAttribute] = (StringLeft($aTempNonContains[0],1)="@")
            $aReturnParsedXPathConstraints[$i][$giBGe_Constraint_Dim2_sNodeValue]   = $aTempNonContains[1]
        Else
            ConsoleWrite("ParseXPathConstraint: Callers XPath/Node/Conditions not well formed=[" & $aCallersXPathConstraint[$i] & "]" & @CRLF)
            Return SetError(1,4,False)
        EndIf
    Next

    Return $aReturnParsedXPathConstraints
EndFunc
#endregion InternalFunctions

Edit 08/12/2015...navigate here to get new functionality which allows for specific types of predicates:

 

Edited by jdelaney

IEbyXPATH-Grab IE DOM objects by XPATH IEscriptRecord-Makings of an IE script recorder ExcelFromXML-Create Excel docs without excel installed GetAllWindowControls-Output all control data on a given window.




Thanks so much! :thumbsup:


Hi, I'd love to use this function, but for some reason it doesn't work.

Maybe, as I've seen in the comments (Requires: #Include "C:QAAutoITEFLVariables.au3"), it needs Variables?

If I test it without the #include, it gives this type of error:

'Variable must be of type "Object".:

For $oChildNode In $oParent

For $oChildNode In $oParent^ ERROR'

Could you please give some more examples to make it work? AutoIt really needs xpath.

Thank you


Well...

without the #Include "C:QAAutoITEFLVariables.au3"

your script is useless...

What a shame... :x


Hi, I'd love to use this function, but for some reason it doesn't work.

Maybe, as I've seen in the comments (Requires: #Include "C:QAAutoITEFLVariables.au3"), it needs Variables?

If I test it without the #include, it gives this type of error:

'Variable must be of type "Object".:

For $oChildNode In $oParent

For $oChildNode In $oParent^ ERROR'

Change this line in the script above.

; from this
$oIE = _IECreate ("[url="http://www.google.com"]www.google.com[/url]")
; to this
$oIE = _IECreate("http://www.google.com")

The forum corrupted the code block in the original script.


If I posted any code, assume that code was written using the latest release version unless stated otherwise. Also, if it doesn't work on XP I can't help with that because I don't have access to XP, and I'm not going to.
Give a programmer the correct code and he can do his work for a day. Teach a programmer to debug and he can do his work for a lifetime - by Chirag Gude
How to ask questions the smart way!

I hereby grant any person the right to use any code I post, that I am the original author of, on the autoitscript.com forums, unless I've specifically stated otherwise in the code or the thread post. If you do use my code all I ask, as a courtesy, is to make note of where you got it from.

Back up and restore Windows user files _Array.au3 - Modified array functions that include support for 2D arrays.  -  ColorChooser - An add-on for SciTE that pops up a color dialog so you can select and paste a color code into a script.  -  Customizable Splashscreen GUI w/Progress Bar - Create a custom "splash screen" GUI with a progress bar and custom label.  -  _FileGetProperty - Retrieve the properties of a file  -  SciTE Toolbar - A toolbar demo for use with the SciTE editor  -  GUIRegisterMsg demo - Demo script to show how to use the Windows messages to interact with controls and your GUI.  -   Latin Square password generator


#6 ·  Posted (edited)

Made major improvements (see the AutoIt script in post 1). It is much quicker when ConsoleWrite output is turned off; it is left on in the script to demonstrate the recursion.

Added failure handling for bad xpaths.

Added an example of navigating to the General Help and Support forum, using a deliberately overcomplicated xpath to demonstrate its ability.

Added a new parameter ($iMaxWait) to allow for looping until the object is found (again, this will work when the console writes are turned off; those really hog resources).

It is now possible to pass in any object, not just a browser object; that object will be the starting point of the xpath.

Turn off the console writes by setting

Global $gbBGe_PerformConsoleWrites = False

Tack this on the end to demonstrate grabbing a collection. It grabs all users viewing the General Help and Support page (now integrated into the script above):

$xpathGeneralHelpUsers = "//div[@id='forum_active_users']//span[@itemprop='name']"
; Get current users on page
$aGenHelpUsers = BGe_IEGetDOMObjByXPathWithAttributes($oIE,$xpathGeneralHelpUsers)
If IsArray($aGenHelpUsers) Then
    ConsoleWrite("Able to BGe_IEGetDOMObjByXPathWithAttributes($oIE, " & $xpathGeneralHelpUsers & ")" & @CRLF)
    For $i = 0 To UBound($aGenHelpUsers)-1
        ConsoleWrite("   " & $aGenHelpUsers[$i].outerhtml & @CRLF )
        ConsoleWrite("   " & $aGenHelpUsers[$i].innertext & @CRLF )
    Next
Else
    ConsoleWrite("UNable to BGe_IEGetDOMObjByXPathWithAttributes($oIE, " & $xpathGeneralHelpUsers & ")" & @CRLF)
    Exit 6
EndIf
Edited by jdelaney


Hello, I'm not that familiar with xpath. I have a page that has a table and an input box called customerID, which I want to click into, or set focus to, so that I can send some info. I've copied your script and tried to edit it for my purposes, but I'm getting stuck. Can you show an example of how to click into a form field? Below is an excerpt from the target section of the page that I am working with.

<tbody id="cibody"    >
  <tr>
    <td align="right" id="customeridtd" width="30%">Customer ID</td>
    <td width="70%"><input type="text" name="customerId" value=""></td>
  </tr>


Nevermind, I was able to use _IEFormElementSetValue


#9 ·  Posted (edited)

$xpath = "//input[@name='customerId']"
$aInput = BGe_IEGetDOMObjByXPathWithAttributes($oIE,$xpath)
_IEFormElementSetValue($aInput[0],"some text")

...though, as you have seen, you don't need my function for an object that simple. If there were multiple instances of customerid, each with a unique xpath, then it would work perfectly.
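The snippet above indexes $aInput[0] directly, which only works when the function actually returned an array. A minimal sketch of a guard follows; _BGe_FirstMatch is a hypothetical helper name, not part of the UDF, and the usage lines assume $oIE is an existing browser object.

```autoit
; Hypothetical helper (not part of the UDF): return the first match,
; or set @error when the UDF did not hand back a usable array.
Func _BGe_FirstMatch($vMatches)
    If Not IsArray($vMatches) Or UBound($vMatches) = 0 Then Return SetError(1, 0, 0)
    Return $vMatches[0]
EndFunc   ;==>_BGe_FirstMatch

; Usage sketch, assuming $oIE comes from _IECreate or _IEAttach:
; Local $oInput = _BGe_FirstMatch(BGe_IEGetDOMObjByXPathWithAttributes($oIE, "//input[@name='customerId']"))
; If @error Then
;     ConsoleWrite("No element matched the xpath" & @CRLF)
; Else
;     _IEFormElementSetValue($oInput, "some text")
; EndIf
```

Storing the element in its own variable first also gives _IEFormElementSetValue a plain variable for its ByRef parameter.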

Edited by jdelaney


I am trying to use XPath to fill in a form, but I am getting the error below. Could anyone please help me?

Error

"C:\Program Files (x86)\AutoIt3\Include\ie.au3" (1032) : ==> Error parsing function call.:
Func _IEFormElementSetValue(ByRef $oObject, $sNewvalue, $iFireEvent = 1)
Func _IEFormElementSetValue(By^ ERROR
>Exit code: 1    Time: 6.706
 

My Code

$xpathDEC = "//*[@id='value(declarationNo)']"
$aInput = BGe_IEGetDOMObjByXPathWithAttributes($g_oIE,$xpathDEC)
_IELoadWait($g_oIE)
 
_IEFormElementSetValue($aInput[0],"3030151715814")

HTML as below

<form name="baseForm" method="post" action="/m2decext/declaration/invokePrintDeclarationSearch.do" class="dtform">
 
<div id="holder">
<div class="top">
</div>
<!--<div class="top" id="topMessage">Fields marked with an asterisk <span class="asteriks">*</span> are required.</div></div>
--><div>
<div class="formRow">
<div id="search-content">
 <div class="mod">
<div class="bd">
<fieldset>
<ol><!--
<legend>Quick Search</legend>
--><li id="declarationNoLi" >
<label for="declarationNo"><span class="asteriks">*</span> Declaration No.:</label>
<input type="text" name="value(declarationNo)" maxlength="13" value="" id="value(declarationNo)">
</li>
</ol>
</fieldset>
</div>
</div>
</div>
</div>
<div class="bottom">
<input type="button" value="Search" class="searchButton" id="searchButton" onclick="submitForm();">
<input type="button" value="Reset" title="Reset form entries" class="resetButton" onclick="resetFields();">
<input type="hidden" name="value(advancedSearch)" value="" id="value(advancedSearch)">
<input type="hidden" name="value(defaultFromDate)" value="04-10-2014" id="value(defaultFromDate)">
<input type="hidden" name="value(defaultToDate)" value="11-10-2014" id="value(defaultToDate)">
<input type="hidden" name="value(submissionChannelId)" value="4">
</div>
</div>
</form>


Not all valid xpaths are covered.  I'm just using some regular expressions to create a way to loop through the HTML DOM.
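As a rough illustration of that regular-expression approach, the sketch below applies two of the patterns from the script in post 1 to a single node. The sample node string is only an example, and the real UDF does more than this (it also splits the xpath on / and handles and, or and contains()).

```autoit
; Both patterns are copied from the global variables of the script in post 1.
Local $sNodeAndCond = "([^\[\]]+)\[(.*)\]"
Local $sNonContains = "(.*)\s*\=\s*['""](.*)['""]"

; Step 1: split one node into its name and the text inside [...]
Local $aNode = StringRegExp("input[@id='gbqfq']", $sNodeAndCond, 3)
ConsoleWrite("node name:  " & $aNode[0] & @CRLF)
ConsoleWrite("constraint: " & $aNode[1] & @CRLF)

; Step 2: split the constraint into the attribute name and the expected value
Local $aCond = StringRegExp($aNode[1], $sNonContains, 3)
ConsoleWrite("attribute:  " & $aCond[0] & @CRLF)
ConsoleWrite("value:      " & $aCond[1] & @CRLF)
```

For this sample node the four values should come out as input, @id='gbqfq', @id and gbqfq. Anything the patterns do not recognise is reported as a badly formed xpath.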

The wildcard is not supported as a node name; you'll have to use the actual node name for now:

$xpathDEC = "//Input[@id='value(declarationNo)']"


How do you create an xpath?

In Chrome's Inspect Element there is an option to copy the xpath; it gives me this:

//*[@id="js_4"]

But this comes up in the console:

Able to _IECreate('https://www.facebook.com/my.page.1')
Start Function=[BGe_IEGetDOMObjByXPathWithAttributes] with $sXPath=[//*[@id="js_4"]].
Start Function=[BGe_RecursiveGetObjWithAttributes] level=[0]: $sNodeName=[*], $bNodeIsRelative=[True] $bIsConstrainted=[True].
UNable to BGe_IEGetDOMObjByXPathWithAttributes($oIE, //*[@id="js_4"])
!>19:29:08 AutoIt3.exe ended.rc:2


#13 ·  Posted (edited)

That would be an enhancement request.  Not every possible xpath is usable.  I'm really just using regexp to parse the string and determine how to loop through the DOM.

You'd have to include the actual node name...like:

$xpath = '//Input[@id="js_4"]'

This function is better used for more complex scenarios...such as when multiple nodes have the same data, but their parent is unique...you can point to the parent, and then down to the child to find exactly your dom element.  The above xpath can be handled by:

_IEGetObjById
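A minimal sketch of that direct lookup, assuming an Internet Explorer window showing the page is already open. The "facebook" title fragment passed to _IEAttach is only a placeholder taken from the URL in the question.

```autoit
#include <IE.au3>

; Assumption: an IE window whose title contains "facebook" is already open.
Local $oIE = _IEAttach("facebook")
If @error Then Exit 1

; Look the element up directly by its id instead of by xpath
Local $oElement = _IEGetObjById($oIE, "js_4")
If @error Then
    ConsoleWrite("No element with id js_4 was found" & @CRLF)
Else
    ConsoleWrite($oElement.outerhtml & @CRLF)
EndIf
```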

I'll update to handle a node name of '*', tonight.

Edited by jdelaney


#14 ·  Posted (edited)

Updated to allow for use of an xpath where the nodename = * (a wildcard...grab all nodes)

Edited by jdelaney


#15 ·  Posted

I cannot figure out how to use your UDF in a script. After it is invoked in the script, this error is thrown: http://take.ms/5NnqB

The code is as follows:

#include <IE.au3>
#include <IEbyXPATH.au3>

I also tried to include only IEbyXPATH, but got the same error. How do I get rid of it and proceed to using the UDF's functions?

 


; #VARIABLES# ===================================================================================================================
#Region Global Variables
Global $__g_iIELoadWaitTimeout = 300000 ; 5 Minutes
Global $__g_bIEAU3Debug = False
Global $__g_bIEErrorNotify = True

So it is. Post your code, the complete code, and then maybe someone will be able to help you.


#17 ·  Posted (edited)

To make things clearer: IE.au3 has not been modified on my machine since the AutoIt "stable" release was installed. The code fragment you posted is the same as in my IE.au3.

No errors are thrown when I use IE.au3 in scripts that do not include IEbyXPATH.au3. The error appears only when I try to include the UDF from this thread, which is why I asked how to use it.

PS: IEbyXPATH.au3 consists of the code I copy-pasted from the first post, except that I deleted:

#region SAMPLE
; all code which is here
#endregion SAMPLE

Maybe I broke the UDF this way, but if this "SAMPLE" region is left in place, the AutoIt site is loaded and the example actions are performed, which I do not need.

Edited by ss26


So, I was wrong. The UDF (without the sample code) loads fine. It is my own ugly code (misuse of the UDF's functions) that causes the error.


#19 ·  Posted (edited)

Hey, I really like your UDF!

It's helping me a lot with my script.

There is one "enhancement" that would be great, if you can find the time for it.

For example, I have this (potential) XPath, which works in Chrome:

Const $xpath_last_offset = '//*[@id="search_results_table"]/div[@class="results-paging"]/ul/li[last()]/a'

As far as I can see, the part that isn't working through the UDF is the xyz[last()] bit.

Being a beginner with XPath, I've been scanning, among others, the https://en.wikipedia.org/wiki/XPath page, and one part addresses this:

Predicates
Predicates, written as expressions in square brackets, can be used to restrict a node-set to select only those nodes for which some condition is true. For example a[@href='help.php'] will select those a elements (among the children of the context node) having an href attribute whose value is help.php.
There is no limit to the number of predicates in a step, and they need not be confined to the last step in an XPath. They can also be nested to any depth. Paths specified in predicates begin at the context of the current step (i.e. that of the immediately preceding node test) and do not alter that context. All predicates must be satisfied for a match to occur.
When the value of the predicate is numeric, it is interpreted as a test on the position of the node. So p[1] selects the first p element child, while p[last()] selects the last.
In other cases, the value of the predicate is automatically converted to a boolean. When the predicate evaluates to a node-set, the result is true when the node-set is non-empty. Thus p[@x] selects those p elements that have an attribute named x.
A more complex example: the expression a[/html/@lang='en'][@href='help.php'][1]/@target selects the value of the target attribute of the first a element among the children of the context node that has its href attribute set to help.php, provided the document's html top-level element also has a lang attribute set to en. The reference to an attribute of the top-level element in the first predicate affects neither the context of other predicates nor that of the location step itself.
Predicate order is significant if predicates test the position of a node. Each predicate 'filters' a location step's selected node-set in turn. So a[1][@href='help.php'] will find a match only if the first a child of the context node satisfies the condition @href='help.php', while a[@href='help.php'][1] will find the first a child that satisfies this condition.

I get the impression that there may be a workaround for last(), but it's all still quite over my head, hence my use of your great UDF (even if it is still a work in progress).

So I just wanted to pass on my compliments and ask whether you could put this on the to-do list, and/or give me a suggestion.

 

Edit/PS: to give some context on what I'm trying to achieve:

I want to get the link to the last page, the "21st" in this case (the number of the last page varies; it won't always be 21).

ahkco.jpg

Edited by guestscripter
Context/Screenshot


#20 ·  Posted (edited)

That would need two enhancements:

1) wildcard searches on the node name

2) enumeration control on the elements returned

For now, your workaround would be to not wildcard the node, i.e. this part: //*

You can also grab all the links and then take the one at UBound - 2 (skipping the 'Next' link), something like this:

; [last()] is left out: positional predicates are not handled by the UDF yet, so this collects every paging link
Const $xpath_last_offset = '//Table[@id="search_results_table"]/div[@class="results-paging"]/ul/li/a'
$aLink_Col = BGe_IEGetDOMObjByXPathWithAttributes($oIE, $xpath_last_offset)

; Example one: assumes the final link is always the 'Next' link, so the last page link sits just before it
$oLastLink = $aLink_Col[UBound($aLink_Col) - 2]

; Example two: if that assumption does not hold, loop through the collection backwards until the link text is a number
For $i = UBound($aLink_Col) - 1 To 0 Step -1
    If Number($aLink_Col[$i].innertext) Then
        $oLastLink = $aLink_Col[$i]
        ExitLoop
    EndIf
Next

In a few days, I'll add the wildcard search for node names.

Edit: Oh wow, the wildcard node search will be super easy. I'll get that in shortly.

Edit2: I thought number 1 was still needed, but I haven't looked at this in so long that I forgot it's already in there. You are correct, though, that predicates are not considered in my functions. Another good thing to include would be axes, such as //something/else/example[@attrib='test']/ancestor::/else/siblingofexample
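
Until positional predicates are supported, one possible workaround for li[last()] is to use the UDF only to reach the list and let the stock IE.au3 functions pick the final item. This is a sketch under assumptions, not part of the UDF: the URL is a placeholder, the XPath is adapted from the one quoted above, IEbyXPATH.au3 is an assumed file name, and the page may put a 'Next' link in the last item, in which case the index needs adjusting.

```autoit
#include <IE.au3>
#include "IEbyXPATH.au3" ; assumption: the UDF from the first post, saved under this name next to the script

; Placeholder URL; substitute the search results page
Local $oIE = _IECreate("http://www.example.com")

; Stop the XPath at the <ul>; the position is resolved below instead of with [last()]
Local $aUl = BGe_IEGetDOMObjByXPathWithAttributes($oIE, '//div[@class="results-paging"]/ul')
If IsArray($aUl) And UBound($aUl) > 0 Then
    Local $oUl = $aUl[0]

    ; With the default index of -1 the whole collection is returned and @extended holds its size.
    ; Note that this counts every <li> below the <ul>, including nested ones.
    Local $oItems = _IETagNameGetCollection($oUl, "li")
    Local $iCount = @extended

    If $iCount > 0 Then
        ; Zero-based index, so the last item is at count - 1
        Local $oLastItem = _IETagNameGetCollection($oUl, "li", $iCount - 1)
        Local $oLink = _IETagNameGetCollection($oLastItem, "a", 0)
        If IsObj($oLink) Then ConsoleWrite("Last link: " & $oLink.href & @CRLF)
    EndIf
EndIf
```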

 

 

Edited by jdelaney