armor

Stupid Perl regexp compatibility on lookbehind assertions like (?<=ab(c|de))

6 posts in this topic

I spent two days debugging this with no clue at all, until I found in the documentation that the Perl-compatible regex engine (PCRE, which AutoIt uses) does not allow patterns like (?<=ab(c|de)). Please refer to the documentation below:

http://www.autoitscript.com/autoit3/pcrepattern.html

"

However, if there are several top-level alternatives, they do not all have to have the same fixed length. Thus

(?<=bullock|donkey)

is permitted, but

(?<!dogs?|cats?)

causes an error at compile time. Branches that match different length strings are permitted only at the top level of a lookbehind assertion. This is an extension compared with Perl (at least for 5.8), which requires all branches to match the same length of string. An assertion such as

(?<=ab(c|de))

is not permitted, because its single top-level branch can match two different lengths, but it is acceptable if rewritten to use two top-level branches:

(?<=abc|abde)

"

So is there any alternative way to do something like (?<=\w+)? Thanks.
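
One common substitute for a variable-length lookbehind is \K, which discards everything matched so far from the reported match. The sketch below assumes the AutoIt build in use ships a PCRE version that supports \K (older builds may not); the sample text and variable names are made up for illustration. Note also that (?<=\w+) on its own asserts nothing more than (?<=\w), since both only require a word character immediately before the current position.

```autoit
; Hypothetical sample text: digits that directly follow a run of letters.
Local $sText = "price42 and cost7"

; [a-z]+ must match first, then \K drops it from the reported match,
; so only the digits are returned. This stands in for (?<=[a-z]+)\d+.
; Flag 3 returns an array of all global matches.
Local $aDigits = StringRegExp($sText, "[a-z]+\K\d+", 3)

If @error Then
    ConsoleWrite("No match, or the pattern was rejected (@error = " & @error & ")" & @CRLF)
Else
    For $sDigits In $aDigits
        ConsoleWrite($sDigits & @CRLF)
    Next
EndIf
```

Unlike a true lookbehind, the text before \K is consumed, so it cannot be reused by the next match.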

Your "(?<=ab(c|de))" is just rewritten as "(?<=abc|abde)", and then it works as explained in that doc.
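
As a minimal sketch of that rewrite (the sample text and variable names here are made up for illustration), the two-branch form can be passed straight to StringRegExp:

```autoit
; Hypothetical sample text with three candidates for the lookbehind.
Local $sText = "abcX abdeY abZ"

; Two top-level branches of different (but each fixed) length are allowed.
; Flag 3 returns an array of all global matches.
Local $aHits = StringRegExp($sText, "(?<=abc|abde)[A-Z]", 3)

If @error Then
    ConsoleWrite("No match, or the pattern was rejected (@error = " & @error & ")" & @CRLF)
Else
    For $sHit In $aHits
        ConsoleWrite($sHit & @CRLF) ; X, then Y ("abZ" has neither prefix)
    Next
EndIf
```

Swapping the pattern for "(?<=ab(c|de))[A-Z]" is a quick way to check how a given AutoIt build reports the rejected form.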

Your "(?<=\w+)" doesn't make any sense. What are you trying to get?

;)




Thanks for replying.

Actually, I want to parse some HTML and retrieve the words between the tags, e.g.

<a> this is a program</a>

<b> to retrieve the sentences bwteen tags and break down into words</b>

will be broken down into

this

is

a

program

to

retrieve

the

sentences

bwteen

tags

and

break

down

into

words

by using the Regular Expression

(?<=>[^<>]*)([^<>\s\n\r]+)(?=[^<>]*<)

This regular expression works in some other regex engines, but not in AutoIt.

Please see the attachment. Thanks.

post-54973-12623258321856_thumb.jpg
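
The lookbehind (?<=>[^<>]*) is rejected because [^<>]* has no fixed length. For this particular task the lookbehind may not be needed at all: text inside a tag is always followed by ">" before the next "<", while text between tags is followed by "<" first, so the lookahead alone can tell the two apart. The following is only a sketch under that assumption (well-formed markup, no stray angle brackets in the text, and a tag after every run of text); the sample string and variable names are illustrative.

```autoit
; Hypothetical sample input, taken from the first example line above.
Local $sHtml = "<a> this is a program</a>"

; [^<>\s]+     one word: no angle brackets, no whitespace
; (?=[^<>]*<)  the next angle bracket after the word must be "<",
;              which rules out tag names and attributes inside <...>
; Flag 3 returns an array of all global matches.
Local $aWords = StringRegExp($sHtml, "[^<>\s]+(?=[^<>]*<)", 3)

If @error Then
    ConsoleWrite("No match was found (@error = " & @error & ")" & @CRLF)
Else
    For $sWord In $aWords
        ConsoleWrite($sWord & @CRLF) ; this, is, a, program
    Next
EndIf
```

For real-world HTML (comments, scripts, entities) a regex like this is fragile, and a proper HTML parser is the safer choice.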


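#include <Array.au3> ; For _ArrayDisplay() only

; Sample input: two tagged lines separated by a line feed
$sTxt = "<a> this is a program</a>" & @LF
$sTxt &= "<b> to retrieve the sentences bwteen tags and break down into words</b>"

; Step 1: strip every tag with a non-greedy <...> match.
; Step 2: collect each remaining word; flag 3 returns all matches as an array.
$aMatch = StringRegExp(StringRegExpReplace($sTxt, "(?i)(<.+?>)", ""), "(?i)(\b\w+\b)", 3)
If Not @error Then
    _ArrayDisplay($aMatch)
Else
    MsgBox(0, "Error", "No match was found")
EndIf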


George



#6 ·  Posted (edited)

nevermind ;)

Edited by MvGulik

