Sparrowlord Posted August 31, 2009
I'm trying to get the content between two tags using StringRegExp, and I'm running into a problem: it's not working.
    <h1 class="r_outline">
    // lots of random crap between
    </h1>
I tried the following with no luck:
    StringRegExp($source, '<(?i)h1 class="r_outline">(.*?)</(?i)h1>')
Help?

jvanegmond Posted August 31, 2009
The . does not match a newline by default. Try adding (?s) in front of your regexp.
github.com/jvanegmond

Sparrowlord Posted August 31, 2009 (Author)
I tried adding "(?s)" in front of my regexp, and that didn't work. Any more suggestions?

jvanegmond Posted August 31, 2009
It works when I try it:
    #include <Array.au3>
    $source = '<h1 class="r_outline">' & @CRLF & _
            ' // lots of random crap between' & @CRLF & _
            '</h1>' & @CRLF
    $regexp = StringRegExp($source, '(?s)<(?i)h1 class="r_outline">(.*?)</(?i)h1>', 3)
    If @error Then MsgBox(0, "", @error)
    _ArrayDisplay($regexp)

AuToItItAlIaNlOv3R Posted August 31, 2009
Try the _StringBetween function:
    #include <String.au3>
    #include <Array.au3>
    $source = '<h1 class="r_outline">// lots of random crap between</h1>'
    $aArray1 = _StringBetween($source, '<h1 class="r_outline">', '</h1>')
    _ArrayDisplay($aArray1)
It works fine on my PC.

Sparrowlord Posted August 31, 2009 (Author)
I can't get any of the suggestions to work, and I'm not quite sure whether it's because there are some sort of spaces in front of the content.
    <h1 class="r_outline">
     // stuff I want here
    </h1>
That's exactly how it appears when I view the page source.

AuToItItAlIaNlOv3R Posted August 31, 2009
Try this:
    #include <String.au3>
    #include <Array.au3>
    #include <INet.au3>
    $source = _INetGetSource("http://www.xxxxx.com")
    $aArray1 = _StringBetween($source, '<h1 class="r_outline">', '</h1>')
    _ArrayDisplay($aArray1)
If you don't show us the page source, it's hard to help you.

SmOke_N (Moderator) Posted August 31, 2009
Don't forget the 3rd parameter of StringRegExp(). If you leave it out, it defaults to zero and only returns a boolean for found or not found. Use it like Manadar has in his example.
And then try this expression:
    "(?s)(?i)<h1\W*class=\x22r_outline\x22>.+?//\W*(.+?)\s*</h1>"

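To illustrate the point about the 3rd parameter, here is a minimal sketch that compares the default flag with flag 3. The sample string and the variable names ($sSource, $sPattern, $iFound, $aMatches) are made up for illustration and are not the original poster's page or code.

```autoit
; Hypothetical sample input, not the original poster's page.
Local $sSource = '<h1 class="r_outline">' & @CRLF & 'text' & @CRLF & '</h1>'
Local $sPattern = '(?is)<h1 class="r_outline">(.*?)</h1>'

; Flag omitted (default 0): returns 1 if the pattern matches, 0 if it does not.
Local $iFound = StringRegExp($sSource, $sPattern)
ConsoleWrite("Flag 0 result: " & $iFound & @CRLF)

; Flag 3: returns an array of all captured groups, or sets @error if nothing matched.
Local $aMatches = StringRegExp($sSource, $sPattern, 3)
If Not @error Then ConsoleWrite("Flag 3, first capture: " & $aMatches[0] & @CRLF)
```

With the default flag there is no array to display, which is one way a correct pattern can still appear to return nothing useful.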
Sparrowlord Posted August 31, 2009 (Author)
I figured out the problem. It appears that _IEDocReadHTML() changes the source code when it is executed (I had copied mine from Firefox). Once I wrote the output of _IEDocReadHTML() to a file, the difference was noticeable: it made my tag all capital letters and removed the quotes around "r_outline". I adjusted for this accordingly and all is working well now. Many thanks.

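For anyone who runs into the same thing, one way to cope with both forms of the markup is a pattern that ignores case and treats the quotes as optional. This is only a sketch under assumptions: the sample string below imitates the upper-case, unquoted output described above and is not the real page, and the variable names are hypothetical.

```autoit
#include <Array.au3>

; Hypothetical sample imitating the _IEDocReadHTML() output described in this thread:
; upper-case tag name and no quotes around the attribute value.
Local $sSource = '<H1 class=r_outline>' & @CRLF & '  stuff I want here' & @CRLF & '</H1>'

; (?i) ignores case, (?s) lets . match newlines, and "? makes each quote optional.
Local $sPattern = '(?is)<h1\s+class="?r_outline"?\s*>\s*(.*?)\s*</h1>'

; Flag 3 returns an array of captures; @error is set when nothing matches.
Local $aMatches = StringRegExp($sSource, $sPattern, 3)
If @error Then
    MsgBox(0, "No match", "StringRegExp @error = " & @error)
Else
    _ArrayDisplay($aMatches)
EndIf
```

Writing the retrieved source to a file first, as was done above, is still the most reliable way to see what the pattern actually has to match.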