cwem Posted March 11, 2009

I'm looking for "Error: Cannot locate database" in the following HTML. I can obtain 12345678 with:

    $lastInputObj = _IETagNameGetCollection($mainFrame, "INPUT", @extended - 1)
    MsgBox(0, $lastInputObj.value, $lastInputObj.value)

But I don't know how to retrieve the text of the last <font color=red> </font> element. Does anybody have any idea?

    <tr><td ALIGN=CENTER VALIGN=CENTER ><font color=red> Error: Cannot locate database </font></td></tr></table>
    <form action="http://www.google.com" method="post" name="iform" >
    <INPUT TYPE=HIDDEN NAME=action VALUE=2>
    <INPUT TYPE=HIDDEN NAME=src1 VALUE=12345678></form>

weaponx Posted March 11, 2009

If you just want to know whether the text "Error: Cannot locate database" exists in the HTML, just use StringInStr.

cwem Posted March 11, 2009

But I don't know how to refer to this dynamically generated HTML page...

weaponx Posted March 11, 2009

You can use _IEBodyReadHTML or _IEBodyReadText.

    If StringInStr(_IEBodyReadText($o_object), "Error: Cannot locate database") Then
        ;Do Stuff...
    EndIf
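
If the text itself is needed rather than a yes/no check, an alternative is to walk the FONT elements and keep the last red one, which is closer to what the first post asks for. The following is only a sketch: _LastRedFontText is a hypothetical helper name (not part of IE.au3), $oFrame is assumed to be the same frame or browser object as $mainFrame above, and the comparison against both "red" and "#ff0000" is an assumption, because the browser may report the color attribute in either form.

```autoit
#include <IE.au3>

; Hypothetical helper: returns the trimmed text of the last <font> element
; whose color attribute is red, or "" with @error set if none is found.
Func _LastRedFontText($oFrame)
    Local $oFonts = _IETagNameGetCollection($oFrame, "font")
    If @error Then Return SetError(1, 0, "")

    Local $sText = "", $bFound = False
    For $oFont In $oFonts
        ; "=" compares strings case-insensitively in AutoIt
        If $oFont.color = "red" Or $oFont.color = "#ff0000" Then
            $sText = $oFont.innerText ; keep overwriting so the last match wins
            $bFound = True
        EndIf
    Next

    If Not $bFound Then Return SetError(2, 0, "")
    Return StringStripWS($sText, 3) ; 3 = strip leading and trailing whitespace
EndFunc
```

A call such as MsgBox(0, "Error text", _LastRedFontText($mainFrame)) would then show the message, provided the page uses the markup shown in the first post.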

cwem Posted March 21, 2009

So if I want to know the total number of pages (11 in the following text, which appears somewhere in the HTML),

    Page 1 of 11 </td>

how can I use StringInStr to extract "11"?

    _IENavigate($oIE, "www.google.com")
    If StringInStr(_IEBodyReadText($oIE), "Page") Then
        ; How to process this specific line?
        ; And if I want to do some stuff on the 3 lines following "Page 1 of 11 </td>", what should I write here?
    EndIf

Authenticity Posted March 21, 2009

    Dim $aResults = StringRegExp(_IEBodyReadText($oIE), "(?i)page\s*(\d*)\s*of\s*(\d*)", 1)
    If IsArray($aResults) Then
        Dim $iCurrentPage = $aResults[0]
        Dim $iTotalPages = $aResults[1]
    EndIf

It's better to use the _IE* functions, such as _IETableGetCollection, or _IEPropertyGet for the inner or outer text or HTML, to get the element first and then apply the string functions. Unless, of course, you're sure that no other "page" can appear in the body text.
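
The regular expression can be tried without a browser by feeding it a literal string. This is a small sketch under stated assumptions: $sBody stands in for the result of _IEBodyReadText($oIE) and its wording is made up for illustration. The pattern here uses \d+ and \s+ instead of \d* and \s*, which is a slightly stricter variant of the pattern above, so that a bare "page of" with no digits does not count as a match.

```autoit
; Sample text standing in for _IEBodyReadText($oIE); the content is an assumption.
Local $sBody = "Results" & @CRLF & "Page 1 of 11" & @CRLF & "Next"

; Flag 1 returns an array holding the captured groups of the first match.
Local $aPage = StringRegExp($sBody, "(?i)page\s+(\d+)\s+of\s+(\d+)", 1)

If IsArray($aPage) Then
    ; Number() converts the captured digit strings to numeric values
    Local $iCurrentPage = Number($aPage[0])
    Local $iTotalPages = Number($aPage[1])
    ConsoleWrite("Current page: " & $iCurrentPage & @CRLF)
    ConsoleWrite("Total pages: " & $iTotalPages & @CRLF)
Else
    ConsoleWrite("No page indicator found" & @CRLF)
EndIf
```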

cwem Posted March 21, 2009

I agree that _IETableGetCollection is a better implementation, but the problem is that I want to process specific content, lines, or tables. For example, to process the 2nd table following "Page 1 of 11 </td>", what code should I write? _IETableGetCollection only lets us retrieve the nth table by index, not by content. Thanks again.
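
One way to pick a table by content rather than by a fixed index is to walk the table collection, find the first table whose text contains the marker, and then step forward from there. The following is a minimal sketch, assuming $oIE is an already-loaded browser object. _FindTableAfterMarker is a hypothetical helper name, not part of IE.au3, and the error codes are arbitrary.

```autoit
#include <IE.au3>

; Hypothetical helper: returns the table that sits $iOffset tables after the
; first table whose text contains $sMarker, or 0 with @error set on failure.
Func _FindTableAfterMarker($oIE, $sMarker, $iOffset)
    ; With the default index (-1) the whole collection is returned
    ; and @extended holds the number of tables.
    Local $oTables = _IETableGetCollection($oIE)
    Local $iErr = @error, $iCount = @extended
    If $iErr Then Return SetError(1, 0, 0)

    For $i = 0 To $iCount - 1
        Local $oTable = _IETableGetCollection($oIE, $i)
        If StringInStr($oTable.innerText, $sMarker) Then
            ; Make sure the requested table exists before fetching it
            If $i + $iOffset > $iCount - 1 Then Return SetError(2, 0, 0)
            Return _IETableGetCollection($oIE, $i + $iOffset)
        EndIf
    Next

    Return SetError(3, 0, 0) ; marker not found in any table
EndFunc
```

A call might look like $oTable = _FindTableAfterMarker($oIE, "Page 1 of", 2). Note that with nested tables the outermost table containing the marker matches first, so the offset may need adjusting for a particular page.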

cwem Posted March 28, 2009

I have 2 questions:

1) The Help file always describes innertext/innerhtml as equivalent to outertext/outerhtml. Would you please give an example of their difference?

2) The following table data caused my table-writing program to produce multiple rows instead of a single row:

    <td class='img'>
    - <a href='http://www.ebi.ac.uk/interpro/DisplayIproEntry?ac=IPR000719' >IPR000719</a> Protein kinase<br/>
    - <a href='http://www.ebi.ac.uk/interpro/DisplayIproEntry?ac=IPR002290' >IPR002290</a> Serine/threonine protein kinase<br/>
    - <a href='http://www.ebi.ac.uk/interpro/DisplayIproEntry?ac=IPR005543' >IPR005543</a> PASTA<br/>
    </td>

At first I guessed it was because of <br/>, but that doesn't seem to be it. I don't know what is messing up the process now...

    #include <IE.au3>
    #include <Array.au3>
    #include <File.au3>

    $dir = "C:\"
    $murl = "http://img.jgi.doe.gov/"
    $qurl = "cgi-bin/pub/main.cgi?section=GeneDetail&page=geneDetail&"
    $purl = "gene_oid="

    $file = FileOpen("IMGID.txt", 0)
    ; example IMGID.txt (without semicolons)
    ; 637094262
    ; 637094263
    ; 637094264

    ; Check if file opened for reading OK
    If $file = -1 Then
        MsgBox(0, "Error", "Unable to open file.")
        Exit
    EndIf

    $sFile = $dir & "IMG_DB.xls"
    If FileExists($sFile) Then FileDelete($sFile)
    $hFile = FileOpen($sFile, 1) ; 1 = append

    $oIE = _IECreate()
    _IELoadWait($oIE)
    $i = 0

    ; Read in lines of text until the EOF is reached
    While 1
        $rsnum_entry = FileReadLine($file)
        If @error = -1 Then ExitLoop
        ;MsgBox(0, "Line read:", $rsnum_entry)
        _IENavigate($oIE, $murl & $qurl & $purl & $rsnum_entry)
        _IELoadWait($oIE)
        $oTable = _IETableGetCollection($oIE, 1) ;Gene Information
        $aTableData = _IETableWriteToArray($oTable, True)
        ;_ArrayDisplay($aTableData)
        If ($i = 1) Then
            $head = 0
        Else
            $head = 1
        EndIf
        ;MsgBox(0, "Value of $i is:", $head)
        ;Exit
        For $y = 1 To UBound($aTableData, 2) - 1
            For $x = $head To UBound($aTableData) - 1
                ;If $y > 1 Then FileWrite($hFile, ";")
                ;If $x > 1 Then FileWrite($hFile, @Tab)
                $cell = StringReplace($aTableData[$x][$y], "<br/>", "MULTIPLEITEMS")
                ;FileWrite($hFile, $aTableData[$x][$y])
                FileWrite($hFile, $cell)
            Next
            FileWrite($hFile, @crlf)
        Next
    Wend
    FileClose($file)

Authenticity Posted March 28, 2009

MSDN, this is the object page. Most, if not all, of the objects have their own inner/outer text/HTML properties, and they probably don't change their behavior from object to object, but I didn't read them all. ;]

About the data formatting, I guess you want it like this:

    Row1: Col1 @TAB Col2 @TAB Col3 @TAB
    Row2: Col1 @TAB Col2 @TAB Col3 @TAB

Right? If so, then it seems to me that it's already formatted as demonstrated. Correct me where I'm wrong.
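
On the inner/outer question, printing all four properties for one small element makes the difference visible. This is a sketch, assuming Internet Explorer is available; the paragraph with id "demo" is made up for illustration. The expectation is that innerhtml shows only the markup inside the element, outerhtml also includes the element's own <p> tags, and the two text properties read the same (they differ mainly when assigned, since setting the outer text replaces the element itself).

```autoit
#include <IE.au3>

; Open a blank page and give it a tiny body to inspect
Local $oIE = _IECreate("about:blank")
_IEBodyWriteHTML($oIE, '<p id="demo">Hello <b>world</b></p>')

; Grab the paragraph and print each of the four properties
Local $oP = _IEGetObjById($oIE, "demo")
ConsoleWrite("innertext: " & _IEPropertyGet($oP, "innertext") & @CRLF)
ConsoleWrite("outertext: " & _IEPropertyGet($oP, "outertext") & @CRLF)
ConsoleWrite("innerhtml: " & _IEPropertyGet($oP, "innerhtml") & @CRLF)
ConsoleWrite("outerhtml: " & _IEPropertyGet($oP, "outerhtml") & @CRLF)

_IEQuit($oIE)
```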

cwem Posted March 28, 2009

Thx, Authenticity, huhu. In fact,

    <td class='img'>
    - <a href='http://www.ebi.ac.uk/interpro/DisplayIproEntry?ac=IPR000719' >IPR000719</a> Protein kinase<br/>
    - <a href='http://www.ebi.ac.uk/interpro/DisplayIproEntry?ac=IPR002290' >IPR002290</a> Serine/threonine protein kinase<br/>
    - <a href='http://www.ebi.ac.uk/interpro/DisplayIproEntry?ac=IPR005543' >IPR005543</a> PASTA<br/>
    </td>

does NOT produce

    Row1: Col1 @TAB Col2 @TAB Col3 @TAB
    Row2: Col1 @TAB Col2 @TAB Col3 @TAB

but

    Row1:Col1
    Row2:Col2
    Row3:Col3

instead...

cwem Posted March 29, 2009

<br/> messes up the retrieval process by wrapping lines instead of keeping the info on the same line. It turns out that StringReplace fails to replace <br/> because it's a tag. So what else can I do to replace this tag?

    #include <IE.au3>
    #include <Array.au3>
    #include <File.au3>

    $sFile = "C:\IMG_DB.xls"
    $hFile = FileOpen($sFile, 1) ; 1 = append

    $oIE = _IECreate()
    _IELoadWait($oIE)
    _IENavigate($oIE, "http://img.jgi.doe.gov/cgi-bin/pub/main.cgi?section=GeneDetail&page=geneDetail&gene_oid=637094262")
    _IELoadWait($oIE)

    $oTable = _IETableGetCollection($oIE, 1)
    $aTableData = _IETableWriteToArray($oTable, True)
    ;_ArrayDisplay($aTableData)

    For $y = 1 To UBound($aTableData, 2) - 1
        For $x = 1 To UBound($aTableData) - 1
            FileWrite($hFile, @Tab)
            $cell = StringReplace($aTableData[$x][$y], "<br/>", "MULTIPLEITEMS")
            FileWrite($hFile, $cell)
        Next
        FileWrite($hFile, @crlf)
    Next
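
A likely explanation is that _IETableWriteToArray stores each cell's text rather than its HTML, so by the time the data reaches the array every <br/> has already become a line break and there is no literal "<br/>" left for StringReplace to find. Under that assumption, replacing the line-break characters instead should keep the cell on one line. The sketch below uses a made-up $sCell string in place of $aTableData[$x][$y], and the " | " separator is an arbitrary choice.

```autoit
; Simulated cell text: an assumption about what the array holds for the <td>
; shown earlier, with each <br/> already converted to a line break.
Local $sCell = "- IPR000719 Protein kinase" & @CRLF & _
        "- IPR002290 Serine/threonine protein kinase" & @CRLF & _
        "- IPR005543 PASTA" & @CRLF

; Drop leading and trailing whitespace, including the final line break (flag 3)
Local $sFlat = StringStripWS($sCell, 3)

; Collapse every remaining run of CR/LF characters into one separator
$sFlat = StringRegExpReplace($sFlat, "[\r\n]+", " | ")

ConsoleWrite($sFlat & @CRLF)
```

In the loop above, the same two calls would replace the StringReplace line, with $aTableData[$x][$y] as the input.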