Deltron0 Posted August 29, 2012

Analyzing this HTML:

    <TH vAlign=top align=left width="15%">Address</TH>
    <TD width="35%">123 CHERRY RD
    NEW YORK CITY, NY 19001</TD>
    <TH align=left width="15%">EID</TH>

I run:

    $pos = StringInStr($bodyHTML,">Address")
    $start = StringInStr($bodyHTML,"%"">",0,1,$pos) + 3
    $end = StringInStr($bodyHTML,"",0,1,$start)
    $len = $end - $start
    $addy = StringMid($bodyHTML,$start,$len)
    ConsoleWrite($addy)

That dumps:

    123 CHERRY RD NEW YORK CITY, NY 19001

The carriage return (I do not mean the <br> tag) seems to disappear. Any idea why this might happen?
jdelaney Posted August 29, 2012

I still don't get why people go through so much trouble to parse out HTML. You can use the _IETableWriteToArray() function, and then navigate through the array to get your text.

It looks like you missed a char in the $end definition ("<"). Anyway, try taking the final string and running it through StringToASCIIArray(). See if any LF, CR, or CRLF characters are present in the array.

IEbyXPATH-Grab IE DOM objects by XPATH
IEscriptRecord-Makings of an IE script recorder
ExcelFromXML-Create Excel docs without excel installed
GetAllWindowControls-Output all control data on a given window.
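The StringToASCIIArray() check suggested above can be sketched roughly as follows. This is a minimal illustration, not code from the thread: $sSample is a made-up stand-in for the extracted address, with a CRLF inserted on purpose so the loop has something to report.

```autoit
; Sketch: report every CR (13) and LF (10) found in a string.
; $sSample is a hypothetical stand-in for the extracted $addy value.
Local $sSample = "123 CHERRY RD" & @CRLF & "NEW YORK CITY, NY 19001"

; One array element per character, holding that character's code.
Local $aCodes = StringToASCIIArray($sSample)

For $i = 0 To UBound($aCodes) - 1
    Switch $aCodes[$i]
        Case 13
            ConsoleWrite("CR at character " & ($i + 1) & @CRLF)
        Case 10
            ConsoleWrite("LF at character " & ($i + 1) & @CRLF)
    EndSwitch
Next
```

If the loop prints nothing for the real string, the line break was already gone before the string functions ran, which points at how the HTML was read rather than at StringMid().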
Deltron0 Posted August 29, 2012

I normally use _IETableWriteToArray, however this table is an unpredictable size and I want to try to find the right cell using surrounding text.

StringToASCIIArray() came back with "32" where the carriage return should be. For whatever reason it's seen as a space.
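One way to find a cell by its neighbouring header text, without depending on the table's size, is to walk the DOM instead of the HTML string. The sketch below is an assumption-laden illustration: _CellAfterHeader is a hypothetical helper name, and it assumes the value cell is the next element sibling of the matching TH, as in the markup quoted in the first post.

```autoit
#include <IE.au3>

; Hypothetical helper: return the text of the cell that follows the TH
; whose text equals $sLabel. Returns "" and sets @error on failure.
Func _CellAfterHeader($oIE, $sLabel)
    Local $oHeaders = _IETagNameGetCollection($oIE, "th")
    If @error Then Return SetError(1, 0, "")

    For $oTH In $oHeaders
        ; "=" compares strings case-insensitively in AutoIt.
        If StringStripWS($oTH.innerText, 3) = $sLabel Then
            ; Move to the next element node (nodeType 1), skipping any
            ; text nodes that some IE versions place between cells.
            Local $oNext = $oTH.nextSibling
            While IsObj($oNext) And $oNext.nodeType <> 1
                $oNext = $oNext.nextSibling
            WEnd
            If IsObj($oNext) Then Return $oNext.innerText
        EndIf
    Next

    Return SetError(2, 0, "")
EndFunc   ;==>_CellAfterHeader
```

Usage would look like `ConsoleWrite(_CellAfterHeader($ActiveIEobj, "Address") & @CRLF)`. Note that this only locates the cell; the browser treats a line break inside the cell text as ordinary whitespace, so it should not be expected to bring the carriage return back.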
jdelaney Posted August 29, 2012

Where are you reading the HTML... is the reader just out of room?
Deltron0 Posted August 30, 2012 (edited)

    $ActiveIEobj = _IECreate("http://intranetsitewithtableonit.org/casenumber",0,1,0)
    _IELoadWait($ActiveIEobj,500,10000)
    Local $bodyHTML = _IEBodyReadHTML($ActiveIEobj)
    $pos = StringInStr($bodyHTML,">Address")
    $start = StringInStr($bodyHTML,"%"">",0,1,$pos) + 3
    $end = StringInStr($bodyHTML,"</TD>",0,1,$start)
    $len = $end - $start
    $addy = StringMid($bodyHTML,$start,$len)
    ConsoleWrite($addy)

As you can see, we are in the IE world. The captured $addy is 100% correct other than the missing carriage return. If I increase the size of $len I get the rest of the HTML, including the carriage returns. It looks like:

    123 CHERRY RD NEW YORK CITY, NY 19001</TD>
    <TH align=left width="15%">EID</TH>

Notice that after "RD" there is a space (32) instead of a carriage return. Could "view source" in IE7 be displaying the CR, but for some reason _IEBodyReadHTML sees the source differently?
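The hypothesis in the last question can be tested directly. _IEBodyReadHTML() returns the body's innerHTML, which is IE's own re-serialization of the parsed page rather than the bytes the server sent; the uppercase tags and unquoted attributes in the quoted markup are consistent with that, so whitespace inside a cell may well have been normalized too. The sketch below compares the address cell from the raw download with the same cell from IE. It rests on several assumptions: _AddressCell is a hypothetical helper that reuses the search logic from the post, the URL is the placeholder from the post, and _INetGetSource() does not share IE's session, so it may fail on an intranet page that requires a login.

```autoit
#include <IE.au3>
#include <Inet.au3>

; Hypothetical helper: cut out the cell that follows ">Address",
; using the same search steps as the post above.
Func _AddressCell($sHTML)
    Local $iPos = StringInStr($sHTML, ">Address")
    If $iPos = 0 Then Return SetError(1, 0, "")
    Local $iStart = StringInStr($sHTML, '%">', 0, 1, $iPos)
    If $iStart = 0 Then Return SetError(2, 0, "")
    $iStart += 3 ; skip past the three characters %">
    Local $iEnd = StringInStr($sHTML, "</TD>", 0, 1, $iStart)
    If $iEnd = 0 Then Return SetError(3, 0, "")
    Return StringMid($sHTML, $iStart, $iEnd - $iStart)
EndFunc   ;==>_AddressCell

; Placeholder URL taken from the post; substitute the real page.
Local $sURL = "http://intranetsitewithtableonit.org/casenumber"

; Page source as downloaded, before any browser parsing.
Local $sRaw = _INetGetSource($sURL)
If @error Then
    ConsoleWrite("Could not download the raw source." & @CRLF)
    Exit 1
EndIf

; The same page as IE serializes it after parsing.
Local $oIE = _IECreate($sURL, 0, 1, 1)
Local $sDom = _IEBodyReadHTML($oIE)

; StringRegExp returns 1 if the cell contains a CR or LF, otherwise 0.
ConsoleWrite("Raw cell has line break: " & StringRegExp(_AddressCell($sRaw), "[\r\n]") & @CRLF)
ConsoleWrite("IE cell has line break:  " & StringRegExp(_AddressCell($sDom), "[\r\n]") & @CRLF)
```

If the raw cell reports 1 and the IE cell reports 0, the line break is being dropped when IE rebuilds the HTML, and the address would need to be taken from the raw download instead.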