Mithrandir

Help with conditional regular expression

5 posts in this topic

#1 ·  Posted (edited)

I'm trying to extract the message body of an email from an HTML file of my mail. After looking at the source code with Debug Bar, I know that I have to get the <div whose id is "mailContent". So what I want is a regular expression that matches a pattern starting with <div id="mailContent" and ending with </div>, and that, if there are any nested opening <div tags, also finds their closing </div>:

$conditionalregexp = StringRegExp($sRead,'(?i)(?s)<div id="mailContent"(?(?=(<div){1}?)<div.*?</div>)</div>',3)

But that didn't work. I also tried

$conditionalregexp = StringRegExp($sRead,'(?i)(?s)<DIV id="mailContent"((?si)<\h*DIV\h*(.*?)/*\h*</DIV>)(.*?)</DIV>',3)

That didn't return anything either. I'd appreciate any help! I would also appreciate some guidance on the use and syntax of (?si) conditions, because I have seen them in some UDFs but I haven't found any documentation.

Edited by Mithrandir




#2 ·  Posted (edited)

I hate regx, nothing regular about them.

When I was creating regular expressions for JMeter I used http://jakarta.apache.org/oro/demo.html to test my expressions.

You can also download the applet and run it using appletviewer.

I know different programs use different regex flavors, and AutoIt's is a common one (Perl-compatible, I believe). You may be able to find a tester for it out there.

That might help; otherwise, there are some regex boffins around here.

Edit: For more help look at "Tutorial - Regular Expression" in the help file and go to http://www.autoitscript.com/autoit3/pcrepattern.html.

Edited by bo8ster



Disclaimer:

I know very little about REGEX but I had a similar situation and this was given to me....

Local $asDownloadLinks = StringRegExp(BinaryToString($sPageSource), '(?s)(?i)class="download-link"\s*href="(.*?)">', 3)

to get everything between

class="download-link"\s*href="

and

">

So, doing a cut and paste, I think your REGEX should look like this:

$conditionalregexp = StringRegExp($sRead,'(?s)(?i)<div id="mailContent"(.*?)</DIV>', 3)

That should get you the main match; then you could run another regex on each of the results from that.
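A quick way to see what the lazy pattern above captures when the target div contains another div. This is only a sketch: the sample string is made up for illustration, and (?si) is used as shorthand for (?s)(?i) (dot also matches newlines, matching ignores case).

```autoit
; Hypothetical sample input: a mailContent div that contains one nested div
Local $sRead = '<DIV id="mailContent">Hello <div>quoted</div> world</DIV>'

; (?si) combines the two inline options (?s) and (?i)
Local $aLazy = StringRegExp($sRead, '(?si)<div id="mailContent"(.*?)</div>', 3)

If @error Then
    ConsoleWrite('No match' & @CRLF)
Else
    ; The lazy .*? stops at the first </div>, which here closes the nested div
    ConsoleWrite($aLazy[0] & @CRLF)
EndIf
```

With this sample the capture ends at the inner </div>, so " world" is cut off. That is why the simple pattern is only enough when the message body has no nested divs.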

Hope that helps

John


Thanks to everyone! After trying a bit more, I got working code that does what I want:

$conditionalregexp = StringRegExp($sRead, '(?i)(?s)<div id="mailContent".*?(?(?=<div).*?</div>)</div>', 3)

I tried putting the .*? right before the last </div>, but it didn't work. I also found these useful regexp resources in the signature of someone whose name I don't remember (thanks to him/her anyway!):

http://www.regular-expressions.info/lookaround.html

http://regexlib.com/DisplayPatterns.aspx

This is also useful:

http://gskinner.com/RegExr/


#5 ·  Posted (edited)

You should find a partial answer to your question in a thread where I just posted an answer. This solves (sort of) the issue with recursing in HTML tags, but you'll have to adapt the idea to your precise situation.
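As a rough illustration of the recursion idea, here is a minimal sketch. It assumes PCRE subpattern recursion, (?1), is available in StringRegExp. The sample string and variable names are hypothetical, and this is not necessarily the pattern from the other answer.

```autoit
; Hypothetical sample input: the target div contains a nested div
Local $sRead = '<html><div id="mailContent" class="x">Hello <div>nested</div> world</div><div>other</div></html>'

; Group 1 is the balanced body of the target div. It repeats either:
;   (?!</?div\b).            one character that does not start a div tag
;   <div\b[^>]*>(?1)</div>   a complete nested div, matched by recursing into group 1
Local $sPattern = '(?is)<div\b[^>]*\bid="mailContent"[^>]*>((?:(?!</?div\b).|<div\b[^>]*>(?1)</div>)*)</div>'

Local $aBody = StringRegExp($sRead, $sPattern, 3)
If @error Then
    ConsoleWrite('No match' & @CRLF)
Else
    ; With flag 3, element 0 holds group 1 of the first match
    ConsoleWrite($aBody[0] & @CRLF)
EndIf
```

For this sample, group 1 should come out as "Hello <div>nested</div> world", with the nested div kept whole. Real pages with unbalanced or very deeply nested divs may still defeat it, and matching character by character can be slow on large files.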

Edited by jchd

