jvanegmond

Tutorial: Simple regular expression multiple result handling

6 posts in this topic

#1 ·  Posted (edited)

How often have you had some input like this:

<option value=1>Apple</option>
<option value=2>Pear</option>
<option value=3>Banana</option>
<option value=4>Orange</option>

You write a regexp to get the values and fruit names like so:

$ar = StringRegExp($in, "<option value=(.*?)>(.*?)</option>", 3)

And end up with this:

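(Image missing: a flat, one-dimensional array.)

[0] 1
[1] Apple
[2] 2
[3] Pear
[4] 3
[5] Banana
[6] 4
[7] Orange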

While you really just wanted:

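(Image missing: a two-dimensional array, one row per option.)

[0] 1 | Apple
[1] 2 | Pear
[2] 3 | Banana
[3] 4 | Orange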

Now you either have to adjust your code so it works with the one-dimensional array, or you have to loop through all the elements and build a new two-dimensional array yourself. But that is just everyone reinventing the same wheel all over again.

Enter _ArrayCombineElements:

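Func _ArrayCombineElements($arr, $num)
    ; Regroup a flat array into rows of $num columns.
    Local $newArr[Ceiling(UBound($arr) / $num)][$num]
    Local $m = 0 ; current column
    Local $n = 0 ; current row
    For $i = 0 To UBound($arr) - 1
        $newArr[$n][$m] = $arr[$i]
        $m += 1
        If $m >= $num Then ; row is full, move to the next one
            $m = 0
            $n += 1
        EndIf
    Next
    Return $newArr
EndFunc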

You have never seen a function dirtier, but you like getting dirty sometimes.

Example:

#include <Array.au3>

$in = "<option value=1>Apple</option>" & _
"<option value=2>Pear</option>" & _
"<option value=3>Banana</option>" & _
"<option value=4>Orange</option>"

$ar = StringRegExp($in, "<option value=(.*?)>(.*?)</option>", 3)

$br = _ArrayCombineElements($ar, 2)

_ArrayDisplay($br)

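Func _ArrayCombineElements($arr, $num)
    ; Regroup a flat array into rows of $num columns.
    Local $newArr[Ceiling(UBound($arr) / $num)][$num]
    Local $m = 0 ; current column
    Local $n = 0 ; current row
    For $i = 0 To UBound($arr) - 1
        $newArr[$n][$m] = $arr[$i]
        $m += 1
        If $m >= $num Then ; row is full, move to the next one
            $m = 0
            $n += 1
        EndIf
    Next
    Return $newArr
EndFunc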

Script output: What you expected in the first place.
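
The example above assumes the pattern always matches. When it does not, StringRegExp sets @error instead of handing back a usable array, so a guard is worth having. Here is a minimal sketch of that idea; `_RegExpToTable` is a hypothetical name, not something from the original post or from Array.au3, and it computes the row and column from the index instead of keeping two counters.

```autoit
; Hypothetical helper (an assumption, not part of the original tutorial):
; run the pattern with flag 3 and regroup the flat result into
; rows of $iGroups columns.
Func _RegExpToTable($sText, $sPattern, $iGroups)
    Local $aFlat = StringRegExp($sText, $sPattern, 3)
    If @error Then Return SetError(1, 0, 0) ; no match or bad pattern

    Local $iRows = Ceiling(UBound($aFlat) / $iGroups)
    Local $aTable[$iRows][$iGroups]
    For $i = 0 To UBound($aFlat) - 1
        ; Row = index divided by the group count, column = the remainder.
        $aTable[Int($i / $iGroups)][Mod($i, $iGroups)] = $aFlat[$i]
    Next
    Return $aTable
EndFunc

Local $sIn = "<option value=1>Apple</option>" & _
        "<option value=2>Pear</option>"
Local $aRows = _RegExpToTable($sIn, "<option value=(.*?)>(.*?)</option>", 2)
If @error Then
    ConsoleWrite("Nothing matched" & @CRLF)
Else
    For $i = 0 To UBound($aRows, 1) - 1
        ConsoleWrite($aRows[$i][0] & " = " & $aRows[$i][1] & @CRLF)
    Next
EndIf
```

You still have to pass the number of groups yourself, exactly as with _ArrayCombineElements.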

Edited by Manadar

I have used RegExp a few times to look for patterns in strings, but I wasn't aware it could do this. That's really useful in many ways. Nice example, thanks. :)

Very nice example and tutorial. This will seriously come in handy.



I would have used option 4 (the flag that returns an array of arrays, one sub-array per match) like this:

#include<Array.au3>

$in = "<option value=1>Apple</option>" & _
        "<option value=2>Pear</option>" & _
        "<option value=3>Banana</option>" & _
        "<option value=4>Orange</option>"

$ar = StringRegExp($in, "<option value=(.*?)>(.*?)</option>", 4)

Local $aRet[UBound($ar)][UBound($ar[0]) - 1]
Local $aTemp
For $i = 0 To UBound($ar) - 1
    $aTemp = $ar[$i]

    For $n = 0 To UBound($ar[$i]) - 2
        $aRet[$i][$n] = $aTemp[$n + 1]
    Next
Next
$aTemp = 0

_ArrayDisplay($aRet)

Mat


#5 ·  Posted (edited)

I prefer your method. No need to specify the number of groups beforehand.

#include<Array.au3>

$in = "<option value=1>Apple</option>" & _
        "<option value=2>Pear</option>" & _
        "<option value=3>Banana</option>" & _
        "<option value=4>Orange</option>"

$arr = _WinRegExp($in, "<option value=(.*?)>(.*?)</option>")
_ArrayDisplay($arr)

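Func _WinRegExp($test, $pattern)
    ; Local, so the result does not clobber a global $arr in the caller.
    Local $arr = StringRegExp($test, $pattern, 4)
    If @error Then Return SetError(1, 0, 0) ; no match or bad pattern

    ; One row per match; column 0 holds the full match, the rest the groups.
    Local $newArr[UBound($arr)][UBound($arr[0])]
    Local $aTemp
    For $i = 0 To UBound($arr) - 1
        $aTemp = $arr[$i]

        For $n = 0 To UBound($aTemp) - 1
            $newArr[$i][$n] = $aTemp[$n]
        Next
    Next
    $aTemp = 0
    Return $newArr
EndFunc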

Maybe I'll change the tutorial on the first page. This depends on why someone would prefer option 3 over 4.

Oh, and I also kept the global result (the full match, element 0 of each sub-array) in the output... why not.
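
To make the difference between the two options concrete, here is a small sketch that only prints the shape of each result for the same input and pattern used in this thread. The variable names are mine. Flag 3 gives one flat array of captured groups, while flag 4 gives one sub-array per match with the full match in element 0.

```autoit
Local $sIn = "<option value=1>Apple</option>" & _
        "<option value=2>Pear</option>" & _
        "<option value=3>Banana</option>" & _
        "<option value=4>Orange</option>"
Local $sPattern = "<option value=(.*?)>(.*?)</option>"

; Flag 3: every captured group of every match, in one flat array.
Local $aFlat = StringRegExp($sIn, $sPattern, 3)

; Flag 4: an array of arrays, one sub-array per match.
Local $aNested = StringRegExp($sIn, $sPattern, 4)
Local $aFirst = $aNested[0] ; copy the sub-array out before indexing it

ConsoleWrite("Flag 3: " & UBound($aFlat) & " elements in one flat array" & @CRLF)
ConsoleWrite("Flag 4: " & UBound($aNested) & " sub-arrays of " & UBound($aFirst) & " elements" & @CRLF)
ConsoleWrite("First sub-array, element 0: " & $aFirst[0] & @CRLF)
```

With flag 3 you need to know the group count to rebuild the rows; with flag 4 you can read it from the first sub-array, at the cost of an extra copy per match.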

Edited by Manadar


#6 ·  Posted (edited)

I never understood why 4 returned a jagged array rather than a normal multidimensional array... But I think that the sizes of the sub-arrays are not fixed at being the same as the first.

In your example, there will always be exactly 2 matched groups, even if they are blank. What happens if you edit the expression to allow for the possibility that "value" is not set? In a normal regex I would use:

"<option(?:\s+value=(.*?))?>(.*?)</option>"
Or:
"<option>(.*?)</option>|<option value=(.*?)>(.*?)</option>"

I imagine the sub arrays would be of different sizes. Then there is a problem. See Edit

Unfortunately I can't test right now... The other day I was trying to get a version of Au3Int online, like Haskell and Ruby have, so you can play with AutoIt in the browser... but I didn't have much success. I'll have to try again, as I could really use it right now.

Mat

Edit: I was wrong in my assumptions... All the arrays appear to have the same length. Furthermore, they reserve a space for matches even when they cannot be matched. So the question is... Why a jagged array in the first place?
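
For anyone who wants to check this on their own AutoIt version, a minimal sketch: it runs the first of the two patterns above with flag 4 and prints the size of every sub-array. The input string is a made-up example in which the second option has no value attribute, and I deliberately give no expected output here.

```autoit
; Hypothetical input: the second option has no value attribute.
Local $sIn = "<option value=1>Apple</option>" & _
        "<option>Pear</option>"

; Flag 4 returns an array of arrays, one sub-array per match.
Local $aMatches = StringRegExp($sIn, "<option(?:\s+value=(.*?))?>(.*?)</option>", 4)
If @error Then
    ConsoleWrite("No match" & @CRLF)
    Exit
EndIf

Local $aSub
For $i = 0 To UBound($aMatches) - 1
    $aSub = $aMatches[$i] ; copy the sub-array out before measuring it
    ConsoleWrite("Match " & $i & ": " & UBound($aSub) & " elements" & @CRLF)
Next
```

If every line reports the same count, the sub-arrays are not jagged for this pattern on your version.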

Edited by Mat
