Pachan, posted June 18, 2010

Hi Team,

I want to capture text that runs across multiple lines between HTML tags. I have used the following code:

********************************************************
$fileForRead = FileOpen("test.html", 0)

; Check if file opened for reading OK
If $fileForRead = -1 Then
    MsgBox(0, "Error", "Unable to open file test.html.")
    Exit
EndIf

; Read in full character at a time until the EOF is reached
$sText = FileRead($fileForRead)
FileClose($fileForRead)

$subString = StringRegExp($sText, '<div id="likely_problems">(.*?)</div>', 1)
ConsoleWrite($subString[0])
*******************************************************

The text file I use as input is:

***********************************************************
abcdefghijk
<div id="likely_problems"><span class='congrats_msg'><img src='http://test.com/images/feedback.gif' alt='Feedback' /> Congratulations! No likely problems.</span>
ukjkjghjka
ksadfhjhagk;
jadghkl;aosdjl
aklgoruiouporip
</div>
abcdefghijk
**************************************************************

It does not work and throws the following error:

"D:\AccesibilityTesting\regExpsimple.au3 (15) : ==> Subscript used with non-Array variable.:"

The pattern matches if the input file has <div id="likely_problems"> and </div> on the same line. How can I write a pattern that matches across multiple lines (ignoring white space and newlines)?

Thanks,
Thomas
Juvigy, posted June 18, 2010

    $nOffset = 1
    While 1
        $array = StringRegExp('<div id="likely_problems"><span class="congrats_msg"><img src="http://test.com/images/feedback.gif" alt="Feedback"/> Congratulations! No likely problems.</div>', '<(?i)div id="likely_problems">(.*?)</(?i)div>', 1, $nOffset)
        If @error = 0 Then
            $nOffset = @extended
        Else
            ExitLoop
        EndIf
        For $i = 0 To UBound($array) - 1
            MsgBox(0, "RegExp Test with Option 1 - " & $i, $array[$i])
        Next
    WEnd

Keep in mind the " and ' characters in the HTML file. Add a line such as MsgBox(0, "", $sText) to inspect your string before you pass it to StringRegExp.
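Editor's note: the "Subscript used with non-Array variable" error in the question is consistent with how StringRegExp behaves with flag 1. It returns an array only when the pattern matches; otherwise it sets @error and the return value is not an array, so indexing it fails. A minimal sketch of guarding the subscript follows. The sample string and the variable names $sSample and $aResult are made up for illustration and stand in for the contents of test.html.

```autoit
; Hypothetical sample input (an assumption, not the poster's actual file).
; The closing </div> is on a different line from the opening tag.
Local $sSample = '<div id="likely_problems">first line' & @CRLF & 'second line</div>'

; Same pattern as in the question: "." does not cross the line break here.
Local $aResult = StringRegExp($sSample, '<div id="likely_problems">(.*?)</div>', 1)

If @error Then
    ; No match: $aResult is not an array, so $aResult[0] would be a fatal error.
    ConsoleWrite("No match, @error = " & @error & @CRLF)
Else
    ConsoleWrite($aResult[0] & @CRLF)
EndIf
```

Checking @error immediately after the call matters, because the next function call resets it.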
GEOSoft, posted June 18, 2010 (edited)

You didn't really say what portion of that code you wanted returned. I'll assume it's everything within the div tags.

    $sText = FileRead("test.html") ;; No need to open a file in read mode to use FileRead()
    If $sText Then
        $subString = StringRegExp($sText, "(?i)(?s)<div id=.?likely_.+?>(.+?)\s*</div>", 1)
        If Not @error Then
            ConsoleWrite($subString[0] & @CRLF)
            ;; To clean it up even further use this line
            $sCleaned = StringRegExpReplace($subString[0], "<.+?>", "")
            If @extended Then ConsoleWrite($sCleaned & @CRLF)
        EndIf
    EndIf

Look in my signature for a toolkit to test AutoIt PCRE expressions.

EDIT: $subString[0] should contain

    <span class='congrats_msg'><img src='http://test.com/images/feedback.gif' alt='Feedback' /> Congratulations! No likely problems.</span> ukjkjghjka ksadfhjhagk; jadghkl;aosdjl aklgoruiouporip

After cleaning that would become

    Congratulations! No likely problems. ukjkjghjka ksadfhjhagk; jadghkl;aosdjl aklgoruiouporip

Edited June 18, 2010 by GEOSoft

George
The PCRE (Regular Expression) ToolKit for AutoIT - (Updated Oct 20, 2011 ver:3.0.1.13)
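Editor's note: the change that lets GEOSoft's pattern span lines is the (?s) option. By default "." does not match a newline sequence, so a group like (.+?) cannot cross a line break; with (?s) it can. The sketch below compares the two cases. The two-line string and the names $sHtml and $aMatch are assumptions for illustration, not the original test.html.

```autoit
; Hypothetical two-line input, for illustration only.
Local $sHtml = '<div id="likely_problems">line one' & @CRLF & 'line two</div>'

; Without (?s): "." stops at the line break, so the pattern cannot match.
StringRegExp($sHtml, '<div id="likely_problems">(.+?)</div>', 1)
ConsoleWrite("without (?s): @error = " & @error & @CRLF)

; With (?s): "." also matches newline characters, so the capture spans both lines.
Local $aMatch = StringRegExp($sHtml, '(?s)<div id="likely_problems">(.+?)</div>', 1)
If Not @error Then ConsoleWrite("with (?s): " & $aMatch[0] & @CRLF)
```

The lazy quantifier (.+?) is still worth keeping with (?s): a greedy (.+) would run to the last </div> in the file rather than the first.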
Pachan (Author), posted June 21, 2010

Thank you. It worked right away.