Manipulate 'Object' without ID or name?

Burgs

Howdy,

  I would like to find the screen coordinates (x, y) of an element within an IE browser so that I may be able to automate 'scrolling' of the browser window to that particular coordinate. 

  I know I can use the '_IEGetObjByID' and '_IEGetObjByName' commands to return an 'object variable' which I can then use to access the 'browserx' and 'browsery' properties of an element.

  Question is how do I get an 'object variable' if there is no 'Name' or 'ID' to use with those commands?  Not all elements on a page have 'Name' or 'ID' attributes associated with them.  For example if I search for the string "49 Albert St" on a page...this may be in the form of various tags (<p>, <h2>, <span>, etc) but have no 'Name' or 'ID' attributes assigned...so how can I obtain the 'object variable' for an element such as this...?

  I figure if I can get the 'object variable' I should then be able to get the browser coordinates where that element lies on a webpage...then I should hopefully be able to automate the process of scrolling the page down to that location of the element...but how does one go about sourcing the 'object variable' when there are no attributes to 'key' on...?  I thank you in advance for any advice.  Regards.

Subz

Normally, if it has absolutely no tags, attributes or innertext to key on, I'll use the previous element that has something distinguishable and then use something like $oObject.NextSibling...
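
A minimal sketch of that sibling approach, for illustration only. The window title "Listings" and the id "addressLabel" are hypothetical names that do not come from this thread; it assumes the distinguishable element sits directly before the one you want.

```autoit
#include <IE.au3>

; Assumption: an IE window whose title contains "Listings" is already open.
Local $oIE = _IEAttach("Listings")
If @error Then Exit MsgBox(4096, "Error", "Could not attach to the browser window")

; Assumption: "addressLabel" is the id of the distinguishable element
; that sits just before the one without any id or name.
Local $oAnchor = _IEGetObjById($oIE, "addressLabel")
If @error Then Exit MsgBox(4096, "Error", "Anchor element not found")

; nextSibling can return a text node (for example the whitespace between
; tags), so keep walking until an element node (nodeType = 1) turns up.
Local $oTarget = $oAnchor.nextSibling
While IsObj($oTarget) And $oTarget.nodeType <> 1
    $oTarget = $oTarget.nextSibling
WEnd

If IsObj($oTarget) Then
    MsgBox(4096, "Next element", $oTarget.tagName & ": " & $oTarget.innerText)
Else
    MsgBox(4096, "Next element", "No element follows the anchor")
EndIf
```

The resulting $oTarget is an ordinary element object, so it can be passed to the other IE functions in the same way as an object returned by _IEGetObjById.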

Burgs

Hi,

  Thanks for the reply.  In my example the "49 Albert St" would be the innertext (for example <h2>49 Albert St</h2>)...can the innertext be used in the '_IEGetObjByName' command to return the 'object variable'...?  If so then that would be helpful...

Subz

No, you would use the tag name and then check the InnerText, for example:

;~ Get all H2 Elements
$oH2Tags = _IETagNameGetCollection($oIE, "h2")
;~ Loop through each H2 Object Element and check InnerText
For $oH2Tag In $oH2Tags
    If $oH2Tag.InnerText = "49 Albert St" Then MsgBox(4096, "Address Info", "Address : " & $oH2Tag.InnerText)
Next

 

Burgs

OK, thanks for the information...I think I understand that...seems a bit inefficient, especially if there are many tags/innertexts to check...but I guess I can live with that...so is that used as a 'name' or an 'ID'...so I can get the x/y coordinates of that element?  example:

;~ Get all H2 Elements
$oH2Tags = _IETagNameGetCollection($oIE, "h2")
;~ Loop through each H2 Object Element and check InnerText
For $oH2Tag In $oH2Tags
    If $oH2Tag.InnerText = "49 Albert St" Then
        MsgBox(4096, "Address Info", "Address : " & $oH2Tag.InnerText)
        $_reference = _IEGetObjByName($oIE, $oH2Tag)  ;get the 'object variable'...???

        $_ref_X = _IEPropertyGet($_reference, "browserx") ;x coordinate of <h2>49 Albert St</h2>...???
        $_ref_Y = _IEPropertyGet($_reference, "browsery") ;y coordinate of <h2>49 Albert St</h2>...???
    EndIf
Next

 

thank you again so much for the feedback

Danp2
1 hour ago, Burgs said:

so is that used as a 'name' or an 'ID'...so I can get the x/y coordinates of that element?

You don't need to do anything "extra". Just use $oH2Tag, which has the reference to the element, ie --

$_ref_X = _IEPropertyGet($oH2Tag, "browserx") ;x coordinate of <h2>49 Albert St</h2>
$_ref_Y = _IEPropertyGet($oH2Tag, "browsery") ;y coordinate of <h2>49 Albert St</h2>
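
Put together as a complete script, it might look like the sketch below. The window title "Listings" is a made-up placeholder, and the scrollTo call on the page's window object is my own assumption about how to do the scrolling (it is not part of the IE UDF); it also assumes "browsery" is measured from the top of the document.

```autoit
#include <IE.au3>

; Assumption: the page is already open in an IE window whose title contains "Listings".
Local $oIE = _IEAttach("Listings")
If @error Then Exit MsgBox(4096, "Error", "Could not attach to the browser window")

Local $oH2Tags = _IETagNameGetCollection($oIE, "h2")
For $oH2Tag In $oH2Tags
    ; Strip leading/trailing whitespace before comparing the text.
    If StringStripWS($oH2Tag.innerText, 3) = "49 Albert St" Then
        ; $oH2Tag already is the element object, so pass it straight in.
        Local $iX = _IEPropertyGet($oH2Tag, "browserx")
        Local $iY = _IEPropertyGet($oH2Tag, "browsery")

        ; Assumption: scroll the page vertically to the element's y position
        ; through the window object's scrollTo method.
        $oIE.document.parentWindow.scrollTo(0, $iY)

        MsgBox(4096, "Position", "x = " & $iX & ", y = " & $iY)
        ExitLoop
    EndIf
Next
```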

 

Burgs

Hello,

  Oh OK thanks, I understand.  So the " $oH2Tag " IS the 'object variable' itself. 

  It seems that works fine when the proper tag is called to use in the _IETagNameGetCollection...however what if the tag is unknown?  I can think of a way or two to do it...using 'StringInStr' and possibly 'StringRegExp' to get the tag which can then be plugged into the '_IETagNameGetCollection'...however is there a more elegant or efficient way of doing it...?   Probably not...just asking.

Thanks again for the responses...!

Danp2

Well, there's always _IETagNameAllGetCollection, but that is going to require you to traverse the entire DOM. Other possibilities --

  • Use Regex with _IEDocReadHTML
  • Use NextSibling (as suggested by @Subz) or something similar
  • Use the Xpath UDF
  • etc

There are lots of ways to skin the cat. Maybe we could help you identify the best option if you posted some HTML that demonstrates the scenario you are trying to solve.
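
For the first option, a rough sketch of walking the whole DOM when the tag is unknown. The window title "Listings" is a placeholder, and the "no child elements" test is just one possible way to avoid matching a parent whose innerText happens to equal the child's text.

```autoit
#include <IE.au3>

Local $sWanted = "49 Albert St"

; Assumption: the page is already open in an IE window whose title contains "Listings".
Local $oIE = _IEAttach("Listings")
If @error Then Exit MsgBox(4096, "Error", "Could not attach to the browser window")

Local $oFound = 0
Local $oAll = _IETagNameAllGetCollection($oIE)
For $oElem In $oAll
    ; Defensive check: skip anything that is not an element node.
    If $oElem.nodeType <> 1 Then ContinueLoop

    ; A parent's innerText includes the text of its children, so only
    ; accept elements that have no child elements of their own.
    If $oElem.children.length = 0 And StringStripWS($oElem.innerText, 3) = $sWanted Then
        $oFound = $oElem
        ExitLoop
    EndIf
Next

If IsObj($oFound) Then
    MsgBox(4096, "Match", "Found in a <" & StringLower($oFound.tagName) & "> element")
Else
    MsgBox(4096, "Match", "No element with that exact text was found")
EndIf
```

This stops at the first match in document order; a page that repeats the same text would need the loop to collect all matches instead.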

Burgs

Hello,

  Thanks for the reply.  Well actually I cannot offer any HTML because I'm attempting to generate something of a more 'universal' solution.  I have a need to scrape data, mainly addresses and other related information...off a great many websites.  Due to the large number (millions) of websites on the Internet...there are countless different ways to display it using so many different HTML tags.

  Schematically what I'm doing is basically:

  •   ...reading in the HTML source code from a page. ( _IEDocReadHTML )
  •   ...using 'StringInStr' to locate certain desired text I'm looking for...since I don't know what tags will be used however I generally do know what address information I am seeking from a website.    
  •   ...using additional 'StringInStr' commands to 'strip out' the tag(s) associated with the 'innertext' I have located.
  •   ...performing any needed additional processing (like discovering attributes, parent nodes, child and sibling nodes, etc).
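
As a sketch only, the first three steps could be collapsed into a single StringRegExp call that captures the enclosing tag name, which is then handed to _IETagNameGetCollection. The window title "Listings" is a placeholder, and the pattern only handles the simple case where the text is the sole content of its tag; nested markup or HTML entities in the source would need more work.

```autoit
#include <IE.au3>

Local $sWanted = "49 Albert St"

; Assumption: the page is already open in an IE window whose title contains "Listings".
Local $oIE = _IEAttach("Listings")
If @error Then Exit MsgBox(4096, "Error", "Could not attach to the browser window")

Local $sHTML = _IEDocReadHTML($oIE)

; (?i)      ignore case
; ([a-z]..) group 1 captures the tag name
; \Q...\E   treats the search text literally
; </\1\s*>  requires the matching closing tag
Local $sPattern = '(?i)<([a-z][a-z0-9]*)\b[^>]*>\s*\Q' & $sWanted & '\E\s*</\1\s*>'
Local $aMatch = StringRegExp($sHTML, $sPattern, 1)
If @error Then Exit MsgBox(4096, "Result", "No enclosing tag found for that text")

; Use the discovered tag name to get real element objects.
Local $sTag = $aMatch[0]
Local $oTags = _IETagNameGetCollection($oIE, $sTag)
For $oTag In $oTags
    If StringStripWS($oTag.innerText, 3) = $sWanted Then
        MsgBox(4096, "Result", "Text found in a <" & StringLower($sTag) & "> element")
        ExitLoop
    EndIf
Next
```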

  That is why I asked about removing the need to perform a loop in order to isolate the tag I want to work with...I was simply looking for ways to simplify the process and reduce iterations of code...I apologize I cannot be more specific with actual HTML however there are endless examples I could come up with if you needed to see...right now I was just looking for ideas on how to streamline my process...because I thought the loop is a bit cumbersome however I normally do not have 'ID' or 'name' attributes available to work with so it seems that is the only way to get an 'object variable'.

  I will look further into those ideas you mentioned as well...thanks so much again for the feedback from each who posted replies.  I appreciate it.

 

Subz

If you only want the text, you can use _IEBodyReadText; it drops all of the HTML tags.  Scraping a website will always be specific to a particular web page because:

a. hardly anyone keeps to standards, which means the code will generally differ from website to website
b. it depends on how the pages are rendered; unless a page is straight HTML, you're bound to encounter a number of issues when JavaScript or server-side rendered pages are mixed in.

One thing about objects is that you can usually read everything within that object. For example, given the HTML below, I should be able to use the following code, which should only loop through the <h2> tags inside Div3 (a single iteration here) rather than the entire document.

Anyway Friday night need to go and have a drink

Ciao

;~ Get the Div3 element first, then search only inside it
$oDiv3 = _IEGetObjById($oIE, "Div3")
$oH2Tags = _IETagNameGetCollection($oDiv3, "h2")
;~ Loop through each H2 Object Element and check InnerText
For $oH2Tag In $oH2Tags
    If $oH2Tag.InnerText = "49 Albert St" Then MsgBox(4096, "Address Info", "Address : " & $oH2Tag.InnerText)
Next

<body>
<div id="Div1">
<div id="Div2">
<h2>Blah</h2>
<p>BlahBlah</p>
<h2>SomeOtherData</h2>
</div> <!-- End of Div2 -->
<div id="Div3">
<h2>49 Albert St</h2>
</div> <!-- End of Div3 -->
</div> <!-- End of Div1 -->
</body>

 

Burgs

Yes OK I see how you are confining the loop to only cycle through the "h2" elements within the "Div3" tag only...that is useful for sure to reduce processing...thanks for the tip.

You mentioned some other concerns I had as well about server-side and JavaScript pages...I don't suppose there is anything that can be done about them...?  The techniques being mentioned only work on the HTML source...correct?

Thanks again for the information...have a drink for me as well!  haha

jdelaney

_IEFocus...or _IEAction with "focus" will bring the object into display.  No need to get x/y.  (Away from computer; one of those functions is valid.)
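
A small sketch of that idea using _IEAction. It uses the "scrollintoview" action rather than "focus", since "focus" may only work on elements that can actually take focus; the window title "Listings" is a placeholder.

```autoit
#include <IE.au3>

; Assumption: the page is already open in an IE window whose title contains "Listings".
Local $oIE = _IEAttach("Listings")
If @error Then Exit MsgBox(4096, "Error", "Could not attach to the browser window")

Local $oH2Tags = _IETagNameGetCollection($oIE, "h2")
For $oH2Tag In $oH2Tags
    If StringStripWS($oH2Tag.innerText, 3) = "49 Albert St" Then
        ; Let the browser scroll the element into view; no coordinates needed.
        _IEAction($oH2Tag, "scrollintoview")
        ExitLoop
    EndIf
Next
```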



Burgs

Oh...thanks that command could be useful...did not realize that was there... 
