armor Posted December 25, 2009

I spent two days debugging this with no clue at all, until I found in the documentation that the PCRE (Perl-compatible) engine, which AutoIt uses, does not allow a pattern like (?<=ab(c|de)). Please refer to the documentation below:

http://www.autoitscript.com/autoit3/pcrepattern.html

"However, if there are several top-level alternatives, they do not all have to have the same fixed length. Thus (?<=bullock|donkey) is permitted, but (?<!dogs?|cats?) causes an error at compile time. Branches that match different length strings are permitted only at the top level of a lookbehind assertion. This is an extension compared with Perl (at least for 5.8), which requires all branches to match the same length of string. An assertion such as (?<=ab(c|de)) is not permitted, because its single top-level branch can match two different lengths, but it is acceptable if rewritten to use two top-level branches: (?<=abc|abde)"

So is there any alternative way to do a task like (?<=\w+)?

Thanks,
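
The restriction quoted above can be checked directly. The following is a minimal sketch, assuming that StringRegExp() sets @error to 2 for a pattern the engine refuses to compile and puts the error offset in @extended, as the AutoIt help file describes. The test string "abdef" and the three patterns are illustrative only, and the result may differ between AutoIt releases, since the bundled PCRE version changes.

```autoit
; Sketch: ask the engine which lookbehind forms it accepts.
; The patterns and the subject string are illustrative, not from a real script.
Local $aPatterns[3] = ["(?<=abc|abde)f", "(?<=ab(c|de))f", "(?<=\w+)f"]
Local $sReport = "", $iResult, $iErr, $iOffset

For $sPattern In $aPatterns
    ; The default flag (0) only tests whether the pattern matches.
    $iResult = StringRegExp("abdef", $sPattern)
    ; Save both macros at once, before any other function call resets them.
    $iErr = @error
    $iOffset = @extended
    If $iErr = 2 Then
        ; @error = 2 means the pattern itself was rejected.
        $sReport &= $sPattern & " -> rejected at offset " & $iOffset & @CRLF
    Else
        $sReport &= $sPattern & " -> compiled, match = " & $iResult & @CRLF
    EndIf
Next

MsgBox(0, "Lookbehind check", $sReport)
```

Checking @error this way separates "the pattern is invalid" from "the pattern is valid but found nothing", which look the same if only the return value is inspected.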
PsaltyDS Posted December 27, 2009

Your "(?<=ab(c|de))" is just rewritten as "(?<=abc|abde)", and then it works as explained in that doc. Your "(?<=\w+)" doesn't make any sense. What are you trying to get?
armor Posted January 1, 2010 (Author)

> Your "(?<=ab(c|de))" is just rewritten as "(?<=abc|abde)", and then it works as explained in that doc. Your "(?<=\w+)" doesn't make any sense. What are you trying to get?

Thanks for replying. Actually, I want to parse HTML code and retrieve the words between the HTML tags. For example,

<a> this is a program</a>
<b> to retrieve the sentences between tags and break down into words</b>

will be broken down into

this is a program to retrieve the sentences between tags and break down into words

by using the regular expression

(?<=>[^<>]*)([^<>\s\n\r]+)(?=[^<>]*<)

The regular expression I am using works well in some parsers but not in AutoIt. Please see the attachment.

Thanks,
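
One possible single-pattern workaround is to drop the variable-length lookbehind and keep only the lookahead. This is a sketch, not the poster's pattern: it assumes the input is simple markup in which "<" and ">" appear only as tag delimiters. Under that assumption, a word inside a tag is always followed by ">" before any "<", so the lookahead alone excludes it. One difference from the original pattern: text that appears before the first tag would also be matched.

```autoit
#include <Array.au3> ; only needed for _ArrayDisplay()

Local $sHtml = "<a> this is a program</a>" & @LF & _
        "<b> to retrieve the sentences between tags and break down into words</b>"

; [^<>\s]+     one word: a run of characters that are neither angle brackets nor whitespace
; (?=[^<>]*<)  the next angle bracket after the word must be "<", so the word is not inside a tag
Local $aWords = StringRegExp($sHtml, "[^<>\s]+(?=[^<>]*<)", 3)

If @error Then
    MsgBox(0, "Error", "No match was found")
Else
    _ArrayDisplay($aWords)
EndIf
```

A lookahead has no fixed-length restriction, which is why the [^<>]* part is accepted here but not inside (?<=...).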
GEOSoft Posted January 1, 2010

#include <Array.au3> ; for _ArrayDisplay() only

$sTxt = "<a> this is a program</a>" & @LF
$sTxt &= "<b> to retrieve the sentences between tags and break down into words</b>"
; Strip the tags first, then collect every word from what is left
$aMatch = StringRegExp(StringRegExpReplace($sTxt, "(?i)(<.+?>)", ""), "(?i)(\b\w+\b)", 3)
If Not @error Then
    _ArrayDisplay($aMatch)
Else
    MsgBox(0, "Error", "No match was found")
EndIf
Malkey Posted January 1, 2010

$sTxt = "<a> this is a program</a>" & @CRLF & _
        "<b> to retrieve the sentences between tags and break down into words</b>"
MsgBox(0, "Word List", StringRegExpReplace(StringRegExpReplace($sTxt, "(?i)(<.+?>)", ""), _
        "\s*([^\s]+)\s+", "\1" & @CRLF))
MvGulik Posted January 1, 2010 (edited)

nevermind