
Manipulate 'Object' without ID or name?



Howdy,

  I would like to find the screen coordinates (x, y) of an element within an IE browser so that I can automate scrolling the browser window to that particular coordinate.

  I know I can use the '_IEGetObjById' and '_IEGetObjByName' commands to return an 'object variable', which I can then use to access the 'browserx' and 'browsery' properties of an element.

  The question is: how do I get an 'object variable' if there is no 'Name' or 'ID' to use with those commands?  Not all elements on a page have 'Name' or 'ID' attributes.  For example, if I search for the string "49 Albert St" on a page, it may sit inside various tags (<p>, <h2>, <span>, etc.) that have no 'Name' or 'ID' attributes assigned. How can I obtain the 'object variable' for an element such as this?

  I figure that if I can get the 'object variable', I should be able to get the browser coordinates where that element lies on the webpage, and then automate scrolling the page down to that location. But how does one go about obtaining the 'object variable' when there are no attributes to 'key' on?  I thank you in advance for any advice.  Regards.


Hi,

  Thanks for the reply.  In my example, "49 Albert St" would be the innertext (for example <h2>49 Albert St</h2>). Can the innertext be used in the '_IEGetObjByName' command to return the 'object variable'?  If so, that would be helpful.


OK, thanks for the information. I think I understand that. It seems a bit inefficient, especially if there are many tags/innertexts to check, but I guess I can live with that. So is that used as a 'name' or an 'ID', so I can get the x/y coordinates of that element?  Example:

;~ Get all H2 Elements
$oH2Tags = _IETagNameGetCollection($oIE, "h2")
;~ Loop through each H2 Object Element and check InnerText
For $oH2Tag In $oH2Tags
    If $oH2Tag.InnerText = "49 Albert St" Then
        MsgBox(4096, "Address Info", "Address : " & $oH2Tag.InnerText)
        $_reference = _IEGetObjByName($oIE, $oH2Tag)  ;get the 'object variable'...???

        $_ref_X = _IEPropertyGet($_reference, "browserx") ;x coordinate of <h2>49 Albert St</h2>...???
        $_ref_Y = _IEPropertyGet($_reference, "browsery") ;y coordinate of <h2>49 Albert St</h2>...???
    EndIf
Next

 

thank you again so much for the feedback


1 hour ago, Burgs said:

so is that used as a 'name' or an 'ID'...so I can get the x/y coordinates of that element?

You don't need to do anything "extra". Just use $oH2Tag, which already holds the reference to the element, i.e.:

$_ref_X = _IEPropertyGet($oH2Tag, "browserx") ;x coordinate of <h2>49 Albert St</h2>
$_ref_Y = _IEPropertyGet($oH2Tag, "browsery") ;y coordinate of <h2>49 Albert St</h2>
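Putting the pieces of this exchange together, here is a minimal sketch of the whole round trip: find the element by its text, read its coordinates, and scroll to it. It assumes an IE window is already open on the page (attached with _IEAttach in "instance" mode) and that the target sits in an <h2>. The scrollTo call on the document's parent window is my addition rather than something posted above, and it relies on "browsery" being measured from the top of the document.

```autoit
#include <IE.au3>

; Assumption: an IE window is already open on the page of interest.
Local $oIE = _IEAttach("", "instance", 1)
If @error Then Exit MsgBox(4096, "Error", "No IE window found")

Local $oH2Tags = _IETagNameGetCollection($oIE, "h2")
For $oH2Tag In $oH2Tags
    ; Trim leading/trailing whitespace before comparing (flag 3 = both ends).
    If StringStripWS($oH2Tag.innerText, 3) = "49 Albert St" Then
        ; The loop variable is already the element's object variable.
        Local $iX = _IEPropertyGet($oH2Tag, "browserx")
        Local $iY = _IEPropertyGet($oH2Tag, "browsery")
        ConsoleWrite("Element at x=" & $iX & ", y=" & $iY & @CRLF)
        ; Scroll vertically so the element is at the top of the viewport.
        $oIE.document.parentWindow.scrollTo(0, $iY)
        ExitLoop
    EndIf
Next
```

Note that `=` compares strings case-insensitively in AutoIt; use `==` if the match must be case-sensitive.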

 


Hello,

  Oh OK thanks, I understand.  So the " $oH2Tag " IS the 'object variable' itself. 

  That works fine when the right tag is passed to _IETagNameGetCollection. But what if the tag is unknown?  I can think of a way or two to do it, using 'StringInStr' and possibly 'StringRegExp' to get the tag, which can then be plugged into '_IETagNameGetCollection'. Is there a more elegant or efficient way of doing it?  Probably not...just asking.

Thanks again for the responses...!


Well, there's always _IETagNameAllGetCollection, but that is going to require you to traverse the entire DOM. Other possibilities --

  • Use Regex with _IEDocReadHTML
  • Use NextSibling (as suggested by @Subz) or something similar
  • Use the Xpath UDF
  • etc

There are lots of ways to skin the cat. Maybe we could help you identify the best option if you posted some HTML that demonstrates the scenario you are trying to solve.
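To illustrate the _IETagNameAllGetCollection route, here is a rough sketch of a tag-agnostic search. _FindElementByText is a hypothetical helper name, not part of IE.au3, and skipping elements that have child elements is my own assumption for avoiding wrappers (a <div> around the <h2> can report the same innerText). It will miss text that is split across nested tags, such as an address containing a <b> or <br>.

```autoit
#include <IE.au3>

; Hypothetical helper: returns the first leaf element whose text equals
; $sText, whatever its tag. Returns 0 and sets @error if nothing matches.
Func _FindElementByText($oIE, $sText)
    Local $oAll = _IETagNameAllGetCollection($oIE)
    If @error Then Return SetError(1, 0, 0)
    For $oElem In $oAll
        ; Skip containers so a wrapper with the same text is not returned.
        If $oElem.children.length > 0 Then ContinueLoop
        If StringStripWS($oElem.innerText, 3) = $sText Then Return $oElem
    Next
    Return SetError(2, 0, 0)
EndFunc

; Usage, assuming an IE window is already open on the page of interest.
Local $oIE = _IEAttach("", "instance", 1)
If @error Then Exit MsgBox(4096, "Error", "No IE window found")

Local $oHit = _FindElementByText($oIE, "49 Albert St")
If IsObj($oHit) Then
    ConsoleWrite("Tag: " & $oHit.tagName & ", browsery: " & _IEPropertyGet($oHit, "browsery") & @CRLF)
EndIf
```

This still walks the entire DOM, so it trades speed for not having to know the tag in advance.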


Hello,

  Thanks for the reply.  Actually, I cannot offer any HTML because I'm attempting to build a more 'universal' solution.  I need to scrape data, mainly addresses and other related information, from a great many websites.  With so many (millions of) websites on the Internet, there are countless ways to display that data using all sorts of different HTML tags.

  Schematically what I'm doing is basically:

  •   ...reading in the HTML source code from a page. ( _IEDocReadHTML )
  •   ...using 'StringInStr' to locate certain desired text I'm looking for...since I don't know what tags will be used however I generally do know what address information I am seeking from a website.    
  •   ...using additional 'StringInStr' commands to 'strip out' the tag(s) associated with the 'innertext' I have located.
  •   ...performing any needed additional processing (like discovering attributes, parent nodes, child and sibling nodes, etc).
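The steps above could be sketched roughly as follows, with StringRegExp standing in for the chain of StringInStr calls. The pattern is only an assumption about what "the tag associated with the innertext" means: it matches a tag that directly wraps the text, with nothing else inside it. \Q...\E treats the search text literally, which breaks down if the text itself contains "\E".

```autoit
#include <IE.au3>

; Assumption: an IE window is already open on the page of interest.
Local $oIE = _IEAttach("", "instance", 1)
If @error Then Exit MsgBox(4096, "Error", "No IE window found")

Local $sText = "49 Albert St"
Local $sHTML = _IEDocReadHTML($oIE)

; Group 1 captures the tag name; \1 requires the matching closing tag
; directly after the text. (?is) = case-insensitive, dot matches newlines.
Local $sPattern = '(?is)<([a-z][a-z0-9]*)\b[^>]*>\s*\Q' & $sText & '\E\s*</\1\s*>'
Local $aMatch = StringRegExp($sHTML, $sPattern, 1) ; 1 = first match as array
If @error Then Exit MsgBox(4096, "Result", "No tag directly wraps that text")

; $aMatch[0] is the tag name (e.g. "h2"); search only that tag in the DOM.
Local $oTags = _IETagNameGetCollection($oIE, $aMatch[0])
For $oTag In $oTags
    If StringStripWS($oTag.innerText, 3) = $sText Then
        ConsoleWrite("Tag: " & $aMatch[0] & ", browsery: " & _IEPropertyGet($oTag, "browsery") & @CRLF)
        ExitLoop
    EndIf
Next
```

A loop is still needed at the end, but it only covers the elements of one tag type rather than the whole document.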

  That is why I asked about removing the need for a loop to isolate the tag I want to work with. I was simply looking for ways to simplify the process and reduce iterations.  I apologize that I cannot be more specific with actual HTML, although there are endless examples I could come up with if you needed to see some. The loop seems a bit cumbersome, but since I normally do not have 'ID' or 'name' attributes to work with, it appears to be the only way to get an 'object variable'.

  I will look further into those ideas you mentioned as well...thanks so much again for the feedback from each who posted replies.  I appreciate it.

 


If you only want the text, you can use _IEBodyReadText, which drops all of the HTML tags.  Scraping a website will always be specific to a particular web page because:

a. hardly anyone keeps to standards, which means the code will generally differ from website to website
b. it depends on how the pages are rendered; unless a page is straight HTML, you're bound to encounter a number of issues with pages that mix in JavaScript or server-side rendering.
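As a quick illustration of the _IEBodyReadText suggestion, here is a minimal sketch that checks whether the text is on the page at all. It assumes an IE window is already open on the page of interest. It only tells you the text is present, not which element holds it, so it is more of a cheap pre-check than a way to get an object.

```autoit
#include <IE.au3>

; Assumption: an IE window is already open on the page of interest.
Local $oIE = _IEAttach("", "instance", 1)
If @error Then Exit MsgBox(4096, "Error", "No IE window found")

; Text of the <body> with all markup removed.
Local $sText = _IEBodyReadText($oIE)

; StringInStr returns the 1-based position of the match, or 0 if absent.
Local $iPos = StringInStr($sText, "49 Albert St")
If $iPos Then
    ConsoleWrite("Found at character " & $iPos & @CRLF)
Else
    ConsoleWrite("Not found on this page" & @CRLF)
EndIf
```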

One thing about objects is that you can usually read everything within that object. So, for example, given the HTML below, I should be able to use the following code, which should only loop through the <h2> tags inside Div3 rather than the entire document.

Anyway Friday night need to go and have a drink

Ciao

;~ Get the Div3 element; only its descendants are searched below
$oDiv3 = _IEGetObjById($oIE, "Div3")
$oH2Tags = _IETagNameGetCollection($oDiv3, "h2")
;~ Loop through each H2 Object Element and check InnerText
For $oH2Tag In $oH2Tags
    If $oH2Tag.InnerText = "49 Albert St" Then MsgBox(4096, "Address Info", "Address : " & $oH2Tag.InnerText)
Next

<body>
<div id="Div1">
<div id="Div2">
<h2>Blah</h2>
<p>BlahBlah</p>
<h2>SomeOtherData</h2>
</div> <!-- End of Div2 -->
<div id="Div3">
<h2>49 Albert St</h2>
</div> <!-- End of Div3 -->
</div> <!-- End of Div1 -->
</body>

 


Yes, OK, I see how you are confining the loop to the "h2" elements within the "Div3" tag. That is certainly useful for reducing processing. Thanks for the tip.

You also mentioned some other concerns I had about server-side and JavaScript pages. I don't suppose there is anything that can be done about them?  The techniques mentioned here only apply to the HTML source, correct?

Thanks again for the information...have a drink for me as well!  haha


_IEAction($oObject, "focus") will bring the object into display, so there is no need to get x/y.
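A minimal sketch of that idea, assuming an IE window is already open and the target is an <h2>. It uses the "scrollintoview" action of _IEAction instead of "focus", on the assumption that a plain heading may not be focusable; "focus" should be fine for links and form fields.

```autoit
#include <IE.au3>

; Assumption: an IE window is already open on the page of interest.
Local $oIE = _IEAttach("", "instance", 1)
If @error Then Exit MsgBox(4096, "Error", "No IE window found")

Local $oH2Tags = _IETagNameGetCollection($oIE, "h2")
For $oH2Tag In $oH2Tags
    If StringStripWS($oH2Tag.innerText, 3) = "49 Albert St" Then
        ; Let the browser do the scrolling; no coordinates required.
        _IEAction($oH2Tag, "scrollintoview")
        ExitLoop
    EndIf
Next
```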
