Mithrandir, posted January 10, 2011 (edited January 10, 2011)

I'm trying to extract the message body of an email from an HTML file of my mail. After looking at the source code with Debug Bar, I know that I have to get the <div whose id is "mailContent". So what I want is a regular expression that matches a pattern starting with <div id="mailContent" and ending with </div>, and that, if there are any opening <div tags inside, also finds their closing </div>.

$conditionalregexp = StringRegExp($sRead,'(?i)(?s)<div id="mailContent"(?(?=(<div){1}?)<div.*?</div>)</div>',3)

But that didn't work. I also tried:

$conditionalregexp = StringRegExp($sRead,'(?i)(?s)<DIV id="mailContent"((?si)<\h*DIV\h*(.*?)/*\h*</DIV>)(.*?)</DIV>',3)

That didn't return anything either. I'd appreciate any help! I would also appreciate some guidance on the use and syntax of (?si) conditions, because I have seen them in some UDFs but haven't found any documentation.
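On the (?si) question: in PCRE, which AutoIt's StringRegExp uses, (?i) and (?s) are inline option settings rather than conditions. (?i) makes the match case-insensitive, (?s) lets the dot match line breaks as well, and (?si) sets both at once. A minimal sketch, assuming a made-up two-line sample string ($sText is hypothetical, not from the post):

```autoit
; Hypothetical sample text: two lines with different letter case.
Local $sText = 'First line' & @CRLF & 'SECOND line'

; No options: the match is case-sensitive and the dot stops at the line break.
ConsoleWrite(StringRegExp($sText, 'first.*second') & @CRLF) ; prints 0

; (?si): case-insensitive, and the dot may cross the line break.
ConsoleWrite(StringRegExp($sText, '(?si)first.*second') & @CRLF) ; prints 1
```

Writing (?i)(?s) as in the patterns above has the same effect as (?si).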
bo8ster, posted January 10, 2011 (edited January 10, 2011)

I hate regex, nothing regular about them.

When I was creating regular expressions for JMeter I used http://jakarta.apache.org/oro/demo.html to test my expressions. You can also download the applet and run it using appletviewer.

I know different programs use different regex flavors, and AutoIt's is a common one (Perl, I believe). You may be able to find a tester for it out there. That might help; otherwise, there are some regex boffins around here.

Edit: For more help, look at "Tutorial - Regular Expression" in the help file and go to http://www.autoitscript.com/autoit3/pcrepattern.html.
storme, posted January 10, 2011

Disclaimer: I know very little about regex, but I had a similar situation and this was given to me:

Local $asDownloadLinks = StringRegExp(BinaryToString($sPageSource), '(?s)(?i)class="download-link"\s*href="(.*?)">', 3)

It gets everything between class="download-link"\s*href=" and ">.

So, doing a cut and paste, I think your regex should look like this:

$conditionalregexp = StringRegExp($sRead,'(?s)(?i)<div id="mailContent"(.*?)</DIV>', 3)

That should get you the main criteria; then you could run another regex on each of the results from that.

Hope that helps,
John
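One caveat with the lazy (.*?) in the suggested pattern: it stops at the first </div> it meets, so a nested <div> cuts the capture short. A small sketch to check this, using a made-up sample string ($sSample is hypothetical, not the real mail page):

```autoit
; Hypothetical sample: the mail body contains one nested <div>.
Local $sSample = '<div id="mailContent">Hello <div>inner</div> world</div>'

; Same pattern as suggested above, with the lazy capture group.
Local $aMatch = StringRegExp($sSample, '(?s)(?i)<div id="mailContent"(.*?)</DIV>', 3)

If @error Then
    ConsoleWrite('No match' & @CRLF)
Else
    ; The capture ends at the first closing tag, i.e. before " world".
    ConsoleWrite($aMatch[0] & @CRLF)
EndIf
```

If the body never contains nested divs, the simple pattern is enough. Otherwise the closing tags have to be paired up with the opening ones.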
Mithrandir (author), posted January 10, 2011

Thanks to everyone! After trying a bit more, I got working code that does what I want:

$conditionalregexp = StringRegExp($sRead, '(?i)(?s)<div id="mailContent".*?(?(?=<div).*?</div>)</div>', 3)

I tried putting the .*? right before the last </div>, but that didn't work.

I also found these useful regexp resources in the signature of someone I don't remember (thanks to him/her anyway!):

http://www.regular-expressions.info/lookaround.html
http://regexlib.com/DisplayPatterns.aspx

This is also useful:

http://gskinner.com/RegExr/
jchd, posted January 10, 2011 (edited January 10, 2011)

You should find a partial answer to your question in a thread where I just posted an answer. This solves (sort of) the issue with recursing in HTML tags, but you'll have to adapt the idea to your precise situation.
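The thread jchd refers to isn't linked here, so the following is not his solution. It is only a minimal sketch of how a recursive PCRE pattern could pair nested <div> tags, assuming the id attribute is written exactly as id="mailContent" and that every <div> in the body is properly closed. The sample string and the variable names are hypothetical.

```autoit
; Hypothetical sample input, not taken from the thread.
Local $sRead = '<div id="mailContent">Hello <div>inner</div> world</div><div>footer</div>'

; Group 1 is the content of the div. Inside it, each step is one of:
;   [^<]++                  a run of text without "<"
;   <(?!/?div\b)            a "<" that does not start a <div or </div tag
;   <div\b[^>]*>(?1)</div>  a nested div, whose content is matched by
;                           calling group 1 again via (?1)
Local $sPattern = '(?is)<div id="mailContent"[^>]*>((?:[^<]++|<(?!/?div\b)|<div\b[^>]*>(?1)</div>)*)</div>'

Local $aBody = StringRegExp($sRead, $sPattern, 3)
If @error Then
    ConsoleWrite('No match' & @CRLF)
Else
    ; $aBody[0] holds the text captured by group 1, nested divs included.
    ConsoleWrite($aBody[0] & @CRLF)
EndIf
```

On very large or badly nested pages a recursive pattern may fail or be slow, so it's worth testing against the real file before relying on it.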