martijn
Posted September 26, 2006

I use the following script to parse an HTML file:

    $arr = StringRegExp($text, "(<td.*?>|<td>)", 3)

and that works. But if I use

    $arr = StringRegExp($text, "(<td.*?>)", 3)

it does not work. It leaves out the plain <td>. Can anyone explain?
SmOke_N (Moderator)
Posted September 26, 2006 (edited)

    #include <Array.au3>

    ; $text is the HTML string from your own script
    $arr = _SRE_BetweenEX($text, '<td', '>')
    _ArrayDisplay($arr, 'Array')

    ; Returns an array of everything found between $s_Start and $s_End.
    ; Pass anything other than 'i' as $iCase for a case-sensitive search.
    Func _SRE_BetweenEX($s_String, $s_Start, $s_End, $iCase = 'i')
        If $iCase <> 'i' Then $iCase = ''
        $a_Array = StringRegExp($s_String, '(?' & $iCase & _
                ':' & $s_Start & ')(.*?)(?' & $iCase & _
                ':' & $s_End & ')', 3)
        ; "And" rather than "&": in AutoIt, "&" concatenates strings
        If @extended And IsArray($a_Array) Then Return $a_Array
        Return SetError(1, 0, 0)
    EndFunc   ;==>_SRE_BetweenEX

Edit: Forgot Code Tags

Edited September 26, 2006 by SmOke_N
martijn (Author)
Posted September 26, 2006 (edited)

Thank you for your reply, but my question was about the use of StringRegExp. I need more complex regular expressions, but to start with I used a simple script. This 'simple script' already gives me headaches.

Can someone please explain why it does not match a simple <td> in my second code sample? I thought that a ? would limit the 'greediness' and return the smallest match possible. In combination with a * (zero or more repetitions), I expect it to also match tags with nothing between <td and >. So both <td class="..."> and <td>.

Many thanks in advance.

Edit: I've tried your script, but that doesn't work either. Here's the code I used:

    #include <Array.au3>

    $text = "<td remark=this one shows up><td><td remark=and this one too><td><table><td question=but the empty td's don't show>"
    $arr = _SRE_BetweenEX($text, '<td', '>')
    _ArrayDisplay($arr, 'Array')

    Func _SRE_BetweenEX($s_String, $s_Start, $s_End, $iCase = 'i')
        If $iCase <> 'i' Then $iCase = ''
        $a_Array = StringRegExp($s_String, '(?' & $iCase & _
                ':' & $s_Start & ')(.*?)(?' & $iCase & _
                ':' & $s_End & ')', 3)
        ; "And" rather than "&": in AutoIt, "&" concatenates strings
        If @extended And IsArray($a_Array) Then Return $a_Array
        Return SetError(1, 0, 0)
    EndFunc   ;==>_SRE_BetweenEX

Edited September 26, 2006 by martijn
martijn (Author)
Posted September 26, 2006

Nutster said:
"It should still give the dollar sign. Yes, the .*? should give 0 characters, but that part of the StringRegExp function is not working properly. I am in the midst of tracking down the problem. If all else fails, I may just rewrite the repeater/predictor code."

This seems to be a bug in the StringRegExp function.
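
For anyone who wants to check the behaviour on their own installation, here is a minimal sketch. It assumes a current AutoIt release, where StringRegExp is built on the PCRE engine rather than the 2006 beta engine discussed above. The sample string and the variable names ($sText, $aLazy, $aClass) are made up for illustration. On a PCRE-based engine, the lazy .*? is allowed to match zero characters, so the plain <td> tags should be returned alongside the ones that carry attributes.

```autoit
; Sketch only: $sText and the variable names are hypothetical.
Local $sText = "<td remark=first><td><td remark=second><td><table>"

; Flag 3 returns all matches in one array. With no capturing group
; in the pattern, each element is the full matched text.
Local $aLazy = StringRegExp($sText, "<td.*?>", 3)
If @error Then
    ConsoleWrite("No match for the lazy pattern" & @CRLF)
Else
    For $i = 0 To UBound($aLazy) - 1
        ConsoleWrite($aLazy[$i] & @CRLF)
    Next
EndIf

; Alternative that does not rely on the lazy quantifier:
; [^>]* matches zero or more characters that are not ">".
Local $aClass = StringRegExp($sText, "<td[^>]*>", 3)
If Not @error Then ConsoleWrite(UBound($aClass) & " matches with the character class" & @CRLF)
```

If the first loop prints only the tags with attributes, the engine in use probably still has the zero-length problem Nutster describes. In that case the character-class pattern, or the (<td.*?>|<td>) alternation from the first post, may serve as a workaround.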