RegExp help


Hello,

what I am trying to accomplish is to parse a string (containing song metadata), formatted as an HTTP GET request, into an array.

Example:

Input:

artist=MyArtist&title=MySong&album=TheBestOf&year=2019&type=Music

Required output:

[0]|
[1]|MyArtist
[2]|MySong
[3]|TheBestOf
[4]|2019
[5]|Music

I am able to easily accomplish this with a single StringRegExp command:

$result = StringRegExp($httpGetRequest,"artist=([^&]*)&title=([^&]*)&album=([^&]*)&year=([^&]*)&type=([^&]*)",1)

However, I would like to modify the RegExp pattern so that it parses the input correctly even when the parameters are in a different order or some of them are missing. With the help of https://regex101.com I've come to a working solution with a regexp pattern like this (https://regex101.com/r/OafEQr/7):

RegExp:

(?:artist=([^&]*))|(?:title=([^&]*))|(?:album=([^&]*))|(?:year=([^&]*))|(?:type=([^&]*))

Result:

Match 1
Full match  0-15    artist=MyArtist
Group 1.    7-15    MyArtist
Match 2
Full match  16-29   title=MyTitle
Group 2.    22-29   MyTitle
Match 3
Full match  30-45   album=TheBestOf
Group 3.    36-45   TheBestOf
Match 4
Full match  46-55   year=2019
Group 4.    51-55   2019
Match 5
Full match  56-67   type=Music
Group 5.    61-67   Music

However, when I transferred this regexp pattern into AutoIt, the result was unsatisfactory:

With
$result = StringRegExp($httpGetRequest,"(?:artist=([^&]*))|(?:title=([^&]*))|(?:album=([^&]*))|(?:year=([^&]*))|(?:type=([^&]*))",1)

Result:
[0]|MyArtist

With
$result = StringRegExp($httpGetRequest,"(?:artist=([^&]*))|(?:title=([^&]*))|(?:album=([^&]*))|(?:year=([^&]*))|(?:type=([^&]*))",3)

Result:

[0]|MyArtist
[1]|
[2]|MySong
[3]|
[4]|
[5]|TheBestOf
[6]|
[7]|
[8]|
[9]|2019
[10]|
[11]|
[12]|
[13]|
[14]|Music

In the meantime I've come to the following solution, which works but is not very elegant and is about 10 times slower than the single StringRegExp command:

$result[1] = _ArrayToString(StringRegExp($httpGetRequest,"(?:artist=([^&]*))",1))
$result[2] = _ArrayToString(StringRegExp($httpGetRequest,"(?:title=([^&]*))",1))
$result[3] = _ArrayToString(StringRegExp($httpGetRequest,"(?:album=([^&]*))",1))
$result[4] = _ArrayToString(StringRegExp($httpGetRequest,"(?:year=([^&]*))",1))
$result[5] = _ArrayToString(StringRegExp($httpGetRequest,"(?:type=([^&]*))",1))

So, my question is: Would it be possible to accomplish this in a single StringRegExp command?

Thank you.


1 hour ago, dandz said:

So, my question is: Would it be possible to accomplish this in a single StringRegExp command?

It depends on what your string looks like in other cases.
These two examples extract everything you need to make a further comparison with your array elements.

#include <Array.au3>
#include <StringConstants.au3>

Global $strString = 'artist=MyArtist&title=MySong&album=TheBestOf&year=2019&type=Music', _
       $arrResult

; First way
$arrResult = StringRegExp($strString, '(?:artist|title|album|year|type)=([^&]*)&?', $STR_REGEXPARRAYGLOBALMATCH)
_ArrayDisplay($arrResult)

Global $arrResult[0][2]

; Second way
_ArrayAdd($arrResult, StringRegExpReplace($strString, '([^=]+)=([^&]*)&?', '$1|$2' & @CRLF))
_ArrayDisplay($arrResult)

In the first example, only the values of the various "properties" are extracted, while in the second example, both properties and values are stored in the array, giving a "property-value" association. This assumes that a value may be missing from the string, but the property name should never be missing.
Post your various tests, so we can help you :)


The successive empty captures in the resulting array are there because of a difference between Perl regex and the legacy PCRE1 regex (which AutoIt currently implements). PCRE2 (already four years old) now conforms to Perl with respect to unwanted empty captures, but it isn't the version AutoIt uses.

Yet you can overcome this difference in behavior by using the (?| ... ) branch-reset construct:

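#include <Array.au3> ; needed for _ArrayDisplay

Local $s = "artist=MyArtist&title=MySong&album=TheBestOf&year=2019&type=Music"

; (?| ... ) restarts group numbering in each alternative, so every value lands in group 1
Local $r = StringRegExp($s, "(?|(?:artist=([^&]*))|(?:title=([^&]*))|(?:album=([^&]*))|(?:year=([^&]*))|(?:type=([^&]*)))", 3)
_ArrayDisplay($r)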

But that doesn't solve the issue of unordered, extra or missing elements.
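
To illustrate that limitation, here is a small sketch. It assumes the same branch-reset idea (with the redundant non-capturing groups dropped) and a made-up input in which the keys are reordered and album is missing; the variable names are only for illustration.

```autoit
#include <Array.au3>

; Branch-reset pattern: every alternative captures into group 1.
Local $sPattern = "(?|artist=([^&]*)|title=([^&]*)|album=([^&]*)|year=([^&]*)|type=([^&]*))"

; Hypothetical input: keys reordered, "album" and "type" absent.
Local $aValues = StringRegExp("year=2019&title=MySong&artist=MyArtist", $sPattern, 3)

; The values come back in input order, so index 0 holds the year here,
; not the artist, and nothing marks the missing keys.
ConsoleWrite(_ArrayToString($aValues, ", ") & @LF)
```

The array no longer contains empty slots, but the position of a value says nothing about which key it belonged to.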

To do that you need another idea:

Local $aData = [ _
    "title=MySong&album=TheBestOf&year=2019&artist=MyArtist&type=Music", _
    "comment=this is pure junk&title=MySong&year=2019&artist=MyArtist&album=TheBestOf&type=Music", _
    "title=MySong&year=2019&comment=excellent title&artist=MyArtist&album=TheBestOf" _
]

Local Static $aKeys = ["artist", "album", "year", "title", "type", "comment"]
Local $d = ObjCreate("Scripting.Dictionary"), $dummy

For $s In $aData
    $dummy = Execute(StringRegExpReplace($s, "(\w+)=([^&]*)&?", "$d.Add(""$1"", ""$2"") & ") & '""')
    For $k In $aKeys
        ConsoleWrite($k & " -> " & $d.Item($k) & @LF)
    Next
    ConsoleWrite(@CRLF)
    $d.RemoveAll
Next

This way you can add as many potential elements as you like and pick them in a fixed order.
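
For comparison, the same dictionary idea can probably be written without Execute, by looping over the key/value captures. This is only a sketch under a few assumptions: _ParseQuery is a hypothetical helper name (not part of AutoIt or its UDFs), keys consist of word characters only, and index 0 of the result is left empty to mirror the required output in the first post.

```autoit
; Hypothetical helper: returns the values for $aKeys in that fixed order.
Func _ParseQuery($sQuery, $aKeys)
    Local $oDict = ObjCreate("Scripting.Dictionary")
    ; Flag 3 returns all captures as a flat list: key, value, key, value, ...
    Local $aPairs = StringRegExp($sQuery, "(\w+)=([^&]*)", 3)
    If Not @error Then
        For $i = 0 To UBound($aPairs) - 2 Step 2
            $oDict.Item($aPairs[$i]) = $aPairs[$i + 1] ; a repeated key overwrites the earlier value
        Next
    EndIf
    ; Index 0 stays empty; missing keys stay empty strings.
    Local $aOut[UBound($aKeys) + 1]
    For $i = 0 To UBound($aKeys) - 1
        If $oDict.Exists($aKeys[$i]) Then $aOut[$i + 1] = $oDict.Item($aKeys[$i])
    Next
    Return $aOut
EndFunc

Local $aKeys = ["artist", "title", "album", "year", "type"]
Local $aResult = _ParseQuery("year=2019&comment=junk&title=MySong&artist=MyArtist", $aKeys)
For $i = 1 To UBound($aResult) - 1
    ConsoleWrite($aKeys[$i - 1] & " -> " & $aResult[$i] & @LF)
Next
```

With this input, album and type should come back empty and the extra comment key is simply ignored. The loop costs a few more lines than the Execute one-liner, but it never evaluates text taken from the input.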

This wonderful site allows debugging and testing regular expressions (many flavors available). An absolute must have in your bookmarks.
Another excellent RegExp tutorial. Don't forget downloading your copy of up-to-date pcretest.exe and pcregrep.exe here
RegExp tutorial: enough to get started
PCRE v8.33 regexp documentation latest available release and currently implemented in AutoIt beta.

SQLitespeed is another feature-rich premier SQLite manager (includes import/export). Well worth a try.
SQLite Expert (freeware Personal Edition or payware Pro version) is a very useful SQLite database manager.
An excellent eBook covering almost every aspect of SQLite3: a must-read for anyone doing serious work.
SQL tutorial (covers "generic" SQL, but most of it applies to SQLite as well)
A work-in-progress SQLite3 tutorial. Don't miss other LxyzTHW pages!
SQLite official website with full documentation (may be newer than the SQLite library that comes standard with AutoIt)


I was just trying to explain why a regex can't go back and forth in the subject string to capture elements that are found in varying order but expected to be processed in a fixed order. This was the actual OP question.
The method is generic and doesn't rely on an easy to split pattern, like in this particular case.

And regexes are sooo friendly...

Edited by jchd


damn mikell I was seconds away, I'd still give you a regex, but only in arrayfindall to delete the unwanted fields.

#include<array.au3>

Local $str = "artist=MyArtist&album=TheBestOf&comment=this is junk&title=MySong&year=2019&Comment2=this year was junk&type=Music"

local $a[0][2]
_ArrayAdd($a , $str , 0 , "=" , "&")
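; Delete every row whose key is not exactly one of the wanted names (anchored negative lookahead)
_ArrayDelete($a , _ArrayToString(_ArrayFindAll($a , "^(?!(?:artist|title|album|year|type)$)" , 0 , 0 , 0 , 3) , ";"))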

_ArrayDisplay($a)

 

Edited by iamtheky


From a strict efficiency point of view and with toy cases, mikell's or, even better (sorry mikell), iamtheky's code beats my Execute-plus-regex approach. Yet if the elements arrive in unexpected order and in sufficiently large numbers, you'd have to sort the result array and call _ArraySearch repeatedly. That would eat up more cycles than the automagic lookup done by the associative array.

Applying Execute to a carefully crafted string built with StringRegExpReplace can do wonders. Use but don't abuse.
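
One reason for caution, sketched under the assumption that values may contain double quotes (the sample string is made up): an unescaped quote would end the generated string literal early and make Execute fail. Doubling the quotes before building the expression should keep every value a valid AutoIt string literal.

```autoit
; Hypothetical input whose title contains double quotes.
Local $s = 'title=My "Best" Song&artist=MyArtist'
Local $d = ObjCreate("Scripting.Dictionary")

; Double every quote first so each value stays one string literal
; inside the generated expression.
Local $sSafe = StringReplace($s, '"', '""')
Local $sExpr = StringRegExpReplace($sSafe, "(\w+)=([^&]*)&?", "$d.Add(""$1"", ""$2"") & ") & '""'

Execute($sExpr)
If @error Then ConsoleWrite("Execute failed" & @LF)

ConsoleWrite($d.Item("title") & @LF)
```

Only the values need this treatment, since the (\w+) key pattern cannot match a quote.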
