jezzzzy

StringRegExp

19 posts in this topic

Trying to parse some HTML and can't seem to return the value between two <span> tags. This is what I came up with based on the help file - but it doesn't work.

$var = StringRegExp($string,'(?:<span id="header">)\S(?: </span>)',3)

I'm trying to return \S (any non-whitespace character) between the two <span></span> tags.
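
For reference, a minimal sketch of why the pattern above cannot return the text between the tags: \S matches exactly one non-whitespace character, the closing group demands a literal space before </span>, and neither (?:...) group captures anything. The sample string in $sHtml is made up for illustration and is not taken from the real page.

```autoit
; Hypothetical sample input, not the poster's actual HTML.
Local $sHtml = '<div><span id="header">Order 1234</span></div>'

; (.*?) is a capturing group that lazily takes everything up to the first </span>.
Local $aMatch = StringRegExp($sHtml, '<span id="header">(.*?)</span>', 3)
Local $iError = @error ; copy @error before any other call can reset it

If IsArray($aMatch) Then
    ConsoleWrite($aMatch[0] & @CRLF) ; the text between the tags
Else
    ConsoleWrite('No match, @error = ' & $iError & @CRLF)
EndIf
```

With flag 3 the function returns an array of all captured groups, so a pattern needs at least one plain (...) group around the part that should come back.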

#2 ·  Posted (edited)

http://www.autoitscript.com/forum/index.ph...te=_SRE_Between

#include <array.au3>;only for _ArrayDisplay()
$var = _SRE_Between($string, '"header">', '<')
_ArrayDisplay($var, 'Title')
Func _SRE_Between($s_String, $s_Start, $s_End, $iCase = 'i')
    If $iCase <> 'i' Then $iCase = ''
    $a_Array = StringRegExp ($s_String, '(?' & $iCase & _
            ':' & $s_Start & ')(.*?)(?' & $iCase & _
            ':' & $s_End & ')', 3)
    If IsArray($a_Array) Then Return $a_Array
    Return SetError(1, 0, 0)
EndFunc   ;==>_SRE_Between

Edit:

Pasted the wrong _SRE out of my library... funny, might be something that broke EnCodeIt now that I'm looking at it.

Edited by SmOke_N

[center]Common sense plays a role in the basics of understanding AutoIt... If you're lacking in that, do us all a favor, and step away from the computer.[/center]

#3 ·  Posted (edited)

Perfect, thanks.

Edit: Was perfect until I upgraded to the new beta (3.2.1.8) and now it says my $var is not an array.

Edited by jezzzzy

#4 · Posted

Now $var just returns 0 in a non array variable. Not sure where stringregexp is failing...
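
A 0 in a non-array variable is what _SRE_Between hands back when StringRegExp finds no match, since the UDF ends with Return SetError(1, 0, 0). A minimal sketch of checking for that before touching the result, assuming the same UDF as posted above; the input string is made up so that the failure path runs.

```autoit
#include <Array.au3> ; only for _ArrayDisplay()

; Made-up input that contains no match, to exercise the failure path.
Local $sHtml = '<p>no span here</p>'

Local $aResult = _SRE_Between($sHtml, '"header">', '<')
Local $iError = @error ; copy @error right away, before another call resets it

If $iError Then
    ConsoleWrite('_SRE_Between found nothing (@error = ' & $iError & ')' & @CRLF)
Else
    _ArrayDisplay($aResult, 'Title')
EndIf

Func _SRE_Between($s_String, $s_Start, $s_End, $iCase = 'i')
    If $iCase <> 'i' Then $iCase = ''
    Local $a_Array = StringRegExp($s_String, '(?' & $iCase & _
            ':' & $s_Start & ')(.*?)(?' & $iCase & _
            ':' & $s_End & ')', 3)
    If IsArray($a_Array) Then Return $a_Array
    Return SetError(1, 0, 0)
EndFunc   ;==>_SRE_Between
```

This does not explain why the match fails, but it separates "the pattern did not match" from "the function is broken".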

#5 · Posted

I'm thinking I might be using characters that need to be escaped... weird that it worked before the beta upgrade.

Here's what I have:

$wWebLink = _SRE_Between($string,'"Order.aspx?Id=','">')

#6 ·  Posted (edited)

OK, it's a UDF that uses SRE; it doesn't check that your "syntax" is correct for SRE. See that question mark? You need to escape it so it is treated as a regular character, not a special one.

Edited by SmOke_N
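
One way to avoid escaping by hand is to wrap each delimiter in \Q...\E so the regex engine reads it as literal text. The sketch below assumes a PCRE-based StringRegExp, as in current AutoIt releases; whether the 3.2.1.8 beta supports \Q...\E has not been verified here. The function name _SRE_BetweenLiteral and the sample string are hypothetical.

```autoit
; Hypothetical sample input for illustration.
Local $sHtml = '<a href="Order.aspx?Id=98765">View order</a>'

Local $aId = _SRE_BetweenLiteral($sHtml, '"Order.aspx?Id=', '">')
If Not @error Then ConsoleWrite($aId[0] & @CRLF)

; Hypothetical variant of _SRE_Between: \Q...\E makes the engine treat the
; delimiters as literal text, so ? . ( ) and friends need no manual escaping.
; Caveat: a delimiter that itself contains \E would end the quoting early.
Func _SRE_BetweenLiteral($s_String, $s_Start, $s_End)
    Local $a_Array = StringRegExp($s_String, '(?i)\Q' & $s_Start & '\E(.*?)\Q' & $s_End & '\E', 3)
    If IsArray($a_Array) Then Return $a_Array
    Return SetError(1, 0, 0)
EndFunc   ;==>_SRE_BetweenLiteral
```

The manual alternative is to keep the original UDF and escape the metacharacters in the call itself, for example '"Order\.aspx\?Id='. The trade-off of the quoted variant is that the delimiters can no longer contain regex syntax such as \s+.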

#7 · Posted

OK. Still not working right. The question mark issue is resolved, but here is a similar problem. My line looks like this:

$wCondition = _SRE_Between($string,'"ConditionSpan">','<')

Still no luck. Any ideas why the new Beta makes this not work? I upgraded because I needed some of the new shell commands. So reverting back to an older version isn't my favorite choice.

#8 · Posted

The UDF I gave you doesn't account for \n newlines.

I copied your example in a txt file, then ran this and it returned "New"

#include <array.au3>;only for _ArrayDisplay()
$var = _SRE_Between(StringStripWS(FileRead(@DesktopDir & '\htmlsretest.txt'), 8), '"ConditionSpan"\>', '\<')
_ArrayDisplay($var, 'Title')
Func _SRE_Between($s_String, $s_Start, $s_End, $iCase = 'i')
    If $iCase <> 'i' Then $iCase = ''
    $a_Array = StringRegExp ($s_String, '(?' & $iCase & _
            ':' & $s_Start & ')(.*?)(?' & $iCase & _
            ':' & $s_End & ')', 3)
    If IsArray($a_Array) Then Return $a_Array
    Return SetError(1, 0, 0)
EndFunc   ;==>_SRE_Between
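
As an alternative to stripping whitespace from the whole input, the pattern itself can be made to tolerate line breaks. By default . does not match newline characters; the (?s) option changes that, and \s* on either side of the group keeps the surrounding whitespace out of the capture. This is a sketch under the assumption of a PCRE-based StringRegExp, and the HTML fragment is made up because the original sample did not survive in the thread.

```autoit
; Hypothetical HTML fragment with the value on its own line.
Local $sHtml = '<span id="ConditionSpan">' & @CRLF & '    New' & @CRLF & '</span>'

; (?s) lets . match newlines too; the \s* on both sides keep the surrounding
; whitespace and line breaks out of the captured group.
Local $aCond = StringRegExp($sHtml, '(?is)"ConditionSpan">\s*(.*?)\s*<', 3)

If IsArray($aCond) Then ConsoleWrite('[' & $aCond[0] & ']' & @CRLF)
```

Unlike StringStripWS with flag 8, which removes every whitespace character in the string, this leaves spaces inside the captured value intact.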

#9 ·  Posted (edited)

This does not work for me. $var returns 0. I've double checked that my _SRE_Between looks like this

$wCondition = _SRE_Between($string,'"ConditionSpan"\>', '\<')
$wCondition = _SRE_Between($string,'"ConditionSpan"\>[\n]','[\n]<')
Edited by jezzzzy

#10 · Posted

Yes, I'm using 3.2.1.8... and you're not using the example as I've shown.


#11 ·  Posted (edited)

Here's another example of what you might try to do, if you can't get \n correctly.

#include <array.au3>;only for _ArrayDisplay()
$File2Read = FileRead(@DesktopDir & '\htmlsretest.txt')
$Strip_NewLines_And_LeadingSpaces = StringStripWS(StringReplace(StringStripCR($File2Read), @LF, ''), 7)
$var = _SRE_Between($Strip_NewLines_And_LeadingSpaces, '"ConditionSpan">', '<')
_ArrayDisplay($var, 'Title')
Func _SRE_Between($s_String, $s_Start, $s_End, $iCase = 'i')
    If $iCase <> 'i' Then $iCase = ''
    $a_Array = StringRegExp ($s_String, '(?' & $iCase & _
            ':' & $s_Start & ')(.*?)(?' & $iCase & _
            ':' & $s_End & ')', 3)
    If IsArray($a_Array) Then Return $a_Array
    Return SetError(1, 0, 0)
EndFunc   ;==>_SRE_Between
Edited by SmOke_N

#12 ·  Posted (edited)

I apologize. I copied your function verbatim and tried to adjust it to my use... I thought I only modified the part where you pulled your string from the text file. I am using a variable that holds the HTML, not a text file. Not sure why that would have broken it. I'm fairly sure I just replaced this

StringStripWS(FileRead(@DesktopDir & '\htmlsretest.txt'), 8)

with this

$string

Is that what broke it? Not sure why it would work reading from a text file and not from a variable. I guess by now you can tell I'm no RegExp expert.

Edited by jezzzzy

#13 · Posted

Look at the last edit I made... and try that.

#14 · Posted

Stripping the newlines seems to work. Thank you.

However, seems like extra work for the app to have to parse through all of the page an extra time. I wish I knew why you're getting the correct result within your function without the extra step and I am not. Can you think of any reason why? I don't imagine the text file is the problem. Shouldn't it work from my variable?

#15 · Posted

The text file was read into a variable. I can only go off what you've provided, if you want to provide the site you are using _InetGetSource() or _InetGet with, I'll look to see possibly why it's happening.

I'm sure I could make conditions for it, but the function does what "I" originally set out for it to do. Maybe one of the SRE gurus could do a better job, as this was originally only set for the "first" SRE. I'm not yet versed enough in the "correct" Perl expressions.

Maybe you could do some homework:

http://perldoc.perl.org/perlre.html#Regular-Expressions

And see if you might be able to make a good add-in to it.


#16 · Posted

I've also tested running from a text file as you did and it worked for me too. What is different as far as the StringRegExp function between feeding it the string as a variable or feeding it the string from a fileread function?

#17 · Posted

I don't really know, maybe it's CR and LF vs CRLF.

#18 · Posted

Thank you for all of your help Smoke. You've given me enough to get it functional. I will run through the perl references you gave me and I will check the wiki again. I appreciate it.

#19 · Posted

Can't tell you how many $ and \n, \n+, \s*, \s+ and other combos I tried.

#include <array.au3>;only for _ArrayDisplay()
$File2Read = FileRead(@DesktopDir & '\htmlsretest.txt')
$var = _SRE_Between($File2Read, '"ConditionSpan"\>\s+', '\s+<')
_ArrayDisplay($var, 'Title')
Func _SRE_Between($s_String, $s_Start, $s_End, $iCase = 'i')
    If $iCase <> 'i' Then $iCase = ''
    $a_Array = StringRegExp ($s_String, '(?' & $iCase & _
            ':' & $s_Start & ')(.*?)(?' & $iCase & _
            ':' & $s_End & ')', 3)
    If IsArray($a_Array) Then Return $a_Array
    Return SetError(1, 0, 0)
EndFunc   ;==>_SRE_Between

Better post this before the server eats my lunch again!

