Need regex help please


Hi all,

I'm struggling to get some regex patterns right and am hoping someone better at them than me can help.

I have a string that looks like:

N: nikink-vm2, nikink-vm, xstf1111a9lt986 M: ABCDEFABCDEG, 12acac12acac O: 123 D: Stuff and details and things ON 1234

O: is the order number, N: is the name(s), M: is the MAC address(es), and D: is old information (all alphanumeric plus colon, dash, and comma).

The O/N/M/D fields are all optional - they may or may not exist, and they might be in any order. (My script is part of a move to ensure all fields are present even if empty)

So if they *do* exist, I want to parse out the values. For example, GetOrderNumber returns "123", GetNames returns "nikink-vm2, nikink-vm, xstf1111a9lt986", and so on.

My regex patterns so far are:

$NamesPattern = "(?:N:\s*([A-Za-z0-9-,\s]*)\s)"
$MacssPattern = "(?:M:\s*([A-Za-z0-9-,\s]*)\s?[^O:|N:|M:|D:])"
$OrderPattern = "(?:O:\s*([0-9]*)\s?)"
$DetailsPattern = "(?:D:\s*([A-Za-z0-9-,\s]*)\s?[^O:|N:|M:])"

And they are very close to working correctly! But, for example, if I move the N: field from the front to the end of that string, the pattern returns only "nikink-vm2, nikink-vm," (completely missing the third entry). If I move the M: field to the end, it returns "ABCDEFABCDEG, 12acac12aca" (cutting off the final 'c').

When I start refining the regex my lack of actual understanding shows.

For example, using (?:(N:\s*([A-Za-z0-9-]*(,?(\s)*)?)*\s*[^M:])) to capture names gives me a leading N: (which I don't want if it can be avoided). The [^M:] at the end is just my attempt to stop the pattern from returning the following 'M'. Because the fields could be in any order, that also needs to account for O and D, and I'm having a hell of a time getting that to work.

In all cases I'm testing this with the stringregexgui.au3 from the StringRegExp help file.

Any help would be greatly appreciated.

:)


Fiddled around with it - seems the problem hangs on detecting the end of the wanted text. How about that?

$teststring1 = "N: nikink-vm2, nikink-vm, xstf1111a9lt986 M: ABCDEFABCDEG, 12acac12acac O: 123 D: Stuff and details and things ON 1234"
$teststring2 = "M: ABCDEFABCDEG, 12acac12acac O: 123 D: Stuff and details and things ON 1234 N: nikink-vm2, nikink-vm, xstf1111a9lt986"
$teststring3 = "N: nikink-vm2, nikink-vm, xstf1111a9lt986 O: 123 D: Stuff and details and things ON 1234 M: ABCDEFABCDEG, 12acac12acac "

Func getvalues($string, $letter)
    $pattern = "(?:" & $letter & ":\s*([A-Za-z0-9-,\s]*))([A-Z]:|$)"
    $matches = StringRegExp($string, $pattern,1)
    ConsoleWrite($pattern & " : " & $matches[0] & @CRLF)
    Return $matches[0]
EndFunc

ConsoleWrite($teststring1 & @CRLF)
getvalues($teststring1, "N")
getvalues($teststring1, "M")
getvalues($teststring1, "O")
getvalues($teststring1, "D")

ConsoleWrite(@CRLF & $teststring2 & @CRLF)
getvalues($teststring2, "N")
getvalues($teststring2, "M")
getvalues($teststring2, "O")
getvalues($teststring2, "D")

ConsoleWrite(@CRLF & $teststring3 & @CRLF)
getvalues($teststring3, "N")
getvalues($teststring3, "M")
getvalues($teststring3, "O")
getvalues($teststring3, "D")
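
A side note on why the original patterns dropped characters, in case it helps: a bracket expression such as [^O:|N:|M:|D:] is a single character class, not a list of alternatives. It matches exactly one character that is not O, N, M, D, ":" or "|", and that character is consumed outside the capture group. A minimal sketch that should reproduce the truncated MAC value from the first post ($sTail and $sOldPattern are names made up for this example):

```autoit
; $sTail is a made-up sample with the M: field at the end of the string.
Local $sTail = "O: 123 M: ABCDEFABCDEG, 12acac12acac"

; The original M pattern. The trailing [^...] must match one real character,
; so the capture group has to give back its last character to satisfy it.
Local $sOldPattern = "M:\s*([A-Za-z0-9-,\s]*)\s?[^O:|N:|M:|D:]"
Local $aOld = StringRegExp($sTail, $sOldPattern, 1)
If IsArray($aOld) Then ConsoleWrite("old:  [" & $aOld[0] & "]" & @CRLF)

; The same class written without the repeated characters behaves identically,
; which shows that "|" never acted as "or" inside the brackets.
Local $aSame = StringRegExp($sTail, "M:\s*([A-Za-z0-9-,\s]*)\s?[^ONMD:|]", 1)
If IsArray($aSame) Then ConsoleWrite("same: [" & $aSame[0] & "]" & @CRLF)
```

An alternation such as ([A-Z]:|$), as used in getvalues above, sidesteps this because $ can match at the end of the string without consuming a character.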


Any of my own codes posted on the forum are free for use by others without any restriction of any kind. (WTFPL)


4 hours ago, nikink said:

The O/N/M/D fields are all optional - they may or may not exist

This needs error checking ;)

$teststring3 = "N: nikink-vm2, nikink-vm, xstf1111a9lt986 D: Stuff and details and things ON 1234 M: ABCDEFABCDEG, 12acac12acac "
; $teststring3 = ""

Func getvalues($string, $letter)
   $pattern = "(?:" & $letter & ":\s*([A-Za-z0-9-,\s]*))([A-Z]:|$)"
   $matches = StringRegExp($string, $pattern,1)
   If not IsArray($matches) Then
      ConsoleWrite($pattern & " : " & "no match" & @CRLF)
      Return "no match"
   EndIf
   ConsoleWrite($pattern & " : " & $matches[0] & @CRLF)
   Return $matches[0]
EndFunc

ConsoleWrite(@CRLF & $teststring3 & @CRLF)
getvalues($teststring3, "N")
getvalues($teststring3, "M")
getvalues($teststring3, "O")
getvalues($teststring3, "D")



Couldn't that pattern be streamlined? And made more robust in case another letter (e.g. Z:) is part of a field?

Local $teststring [] = ["O: 121212 N: nikink-vm2, nikink-vm, xstf1111a9lt986 D: Stuff and details and things ON 1234", _
    "M: ABCDEFABCDEG, 12acac12acac O: 123 D: Stuff and details and things ON 1234 N: nikink-vm2, nikink-vm, xstf1111a9lt986", _
    "N: nikink-vm2, nikink-vm, xstf1111a9lt986 D: Stuff and details and things ON 1234 M: ABCDEFABCDEG, 12acac12acac ", _
    ""]

For $i = 0 To UBound($teststring) - 1
  ConsoleWrite ("N = " & getvalues($teststring[$i], "N") & @CRLF)
  ConsoleWrite ("D = " & getvalues($teststring[$i], "D") & @CRLF)
  ConsoleWrite ("M = " & getvalues($teststring[$i], "M") & @CRLF)
  ConsoleWrite ("O = " & getvalues($teststring[$i], "O") & @CRLF)
  ConsoleWrite ("===================================" & @CRLF)
Next

Func getvalues($string, $letter)
  Local $pattern = $letter & ":\s(.*?)([NDMO]:|$)"
  Local $matches = StringRegExp($string, $pattern, 1)
  If not IsArray($matches) Then Return ""
  Return $matches[0]
EndFunc
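
If the trailing space in the returned value is a nuisance, one possible variation is to end the match with a lookahead, so the next key is tested but not consumed, and to add \b so that a key letter only counts at the start of a word. This is only a sketch: GetField and $sTest are hypothetical names, and the key set NMOD is assumed from the first post.

```autoit
; GetField is a hypothetical helper, not part of the code posted above.
Func GetField($sString, $sLetter)
    ; \b         the key letter must start a word (so "ON:" inside a value is not taken for N:)
    ; \s*(.*?)   skip leading whitespace, then capture the value lazily
    ; \s*(?=...) stop before the next key or at the end of the string, without consuming it
    Local $sPattern = "\b" & $sLetter & ":\s*(.*?)\s*(?=\b[NMOD]:|$)"
    Local $aMatch = StringRegExp($sString, $sPattern, 1)
    If @error Then Return "" ; no match (or bad pattern): treat the field as empty
    Return $aMatch[0]
EndFunc

Local $sTest = "M: ABCDEFABCDEG, 12acac12acac O: 123 D: Stuff and details and things ON 1234 N: nikink-vm2, nikink-vm, xstf1111a9lt986"
ConsoleWrite("N = [" & GetField($sTest, "N") & "]" & @CRLF)
ConsoleWrite("D = [" & GetField($sTest, "D") & "]" & @CRLF)
ConsoleWrite("Z = [" & GetField($sTest, "Z") & "]" & @CRLF) ; key not present
```

With this shape a colon inside the D: text is tolerated, as long as it is not directly preceded by a standalone N, M, O or D.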



As often happens, I pondered this problem over breakfast and came up with a solution only to find it had been satisfactorily answered.

I can't let the pondering go to waste, so regardless of the above solutions and discussions, this is the way I would tackle it. I have borrowed test outlines from @Marc and indicated missing strings and keys (as blank) as suggested by @mikell and @Nine. My solution avoids regex. As an aside, it introduces Eval(), which is a pretty cute feature of AutoIt.

$sTestString1 = "N: nikink-vm2, nikink-vm, xstf1111a9lt986 M: ABCDEFABCDEG, 12acac12acac O: 123 D: Stuff and details and things ON 1234"
$sTestString2 = "M: ABCDEFABCDEG, 12acac12acac O: 123 D: Stuff and details and things ON 1234 N: nikink-vm2, nikink-vm, xstf1111a9lt986"
$sTestString3 = "N: nikink-vm2, nikink-vm, xstf1111a9lt986 O: 123 D: Stuff and details and things ON 1234 M: ABCDEFABCDEG, 12acac12acac "
; Deliberate key/string missing in string 4
$sTestString4 = "N: nikink-vm2, nikink-vm, xstf1111a9lt986 O: 123 D: Stuff and details and things ON 1234"
$sKeys = "NMODX" ; added an unexpected key to check error handling

For $y = 1 To 4
  ConsoleWrite("$sTestString" & $y & " = " & Eval("sTestString" & String($y)) & @CRLF)
  For $i = 1 To StringLen($sKeys)
    ConsoleWrite(StringMid($sKeys, $i, 1) & " = " & GetValue(Eval("sTestString" & String($y)), StringMid($sKeys, $i, 1)) & @CRLF)
  Next
Next

Func GetValue($sStr, $sKey)
  Local $iPos1 = StringInStr($sStr, $sKey & ":", 1)
  If $iPos1 = 0 Then Return ""
  Local $iPos2 = StringInStr($sStr, ":", 0, 1, $iPos1 + 2)
  Return StringMid($sStr, $iPos1 + 2, ($iPos2 = 0) ? (StringLen($sStr) - $iPos1 + 2) : (($iPos2 - 1) - ($iPos1 + 2)))
EndFunc   ;==>GetValue
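
Since the stated goal in the first post is to end up with every field present even when empty, here is a rough sketch of how a record might be rebuilt from the extracted values. NormalizeRecord is a hypothetical name, and the N/M/O/D output order and the "key: value" layout are assumptions, not something the original poster specified. GetValue is repeated only so the sketch runs on its own.

```autoit
; GetValue is copied unchanged from above so this sketch is self-contained.
Func GetValue($sStr, $sKey)
    Local $iPos1 = StringInStr($sStr, $sKey & ":", 1)
    If $iPos1 = 0 Then Return ""
    Local $iPos2 = StringInStr($sStr, ":", 0, 1, $iPos1 + 2)
    Return StringMid($sStr, $iPos1 + 2, ($iPos2 = 0) ? (StringLen($sStr) - $iPos1 + 2) : (($iPos2 - 1) - ($iPos1 + 2)))
EndFunc

; Hypothetical helper: rebuilds the record with all four keys in a fixed order.
Func NormalizeRecord($sStr)
    Local $aKeys[4] = ["N", "M", "O", "D"] ; assumed output order
    Local $sOut = "", $sVal
    For $i = 0 To UBound($aKeys) - 1
        ; Flag 3 strips leading (1) and trailing (2) whitespace from the value.
        $sVal = StringStripWS(GetValue($sStr, $aKeys[$i]), 3)
        ; Missing fields are written as a bare key with an empty value.
        $sOut &= $aKeys[$i] & ":" & (($sVal = "") ? "" : (" " & $sVal)) & " "
    Next
    Return StringStripWS($sOut, 2) ; drop the final separator space
EndFunc

ConsoleWrite(NormalizeRecord("O: 123 N: nikink-vm") & @CRLF)
```

The same caveats as GetValue apply: a colon inside the D: text, or a key letter followed by a colon inside a value, would confuse it.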


Phil Seakins
