Swimming_Bird

Can't figure out how to parse with StringRegExp

I can't figure out how it returns the data I want.

I'm trying to parse this string:

Atom "©nam" contains: Title
Atom "tvsh" contains: TV Show
Atom "stik" contains: Movie
Atom "©alb" contains: Album
Atom "©ART" contains: Artist
Atom "©cmt" contains: test

line2

line3
Atom "©gen" contains: Genre
Atom "trkn" contains: 1 of 1
Atom "tven" contains: 123
Atom "tves" contains: 0
Atom "tvsn" contains: 12
Atom "©day" contains: 1900

I'm going to use something along the lines of 'Atom "tvsh" contains:.+?Atom' to grab the part I want, then use a string trimmer to cut it down to just the value.

However, not only can I not test whether this will work, I also can't figure out how to get StringRegExp to return what it found in any form.

Any help would be much appreciated. Sorry, but the help file entry for this function didn't really explain much.

#2 ·  Posted (edited)

I suck at RegExp, but had you thought of taking a different approach to it?

#include <array.au3>
Local $Strigs = 'Atom "©nam" contains: Title'&Chr(01)&'Atom "tvsh" contains: TV Show'&Chr(01)&'Atom "stik" contains: Movie'&Chr(01)&'Atom "©alb" contains: Album' _
&Chr(01)&'Atom "©ART" contains: Artist'&Chr(01)&'Atom "©cmt" contains: test'&Chr(01)&'Atom "©gen" contains: Genre'&Chr(01)&'Atom "trkn" contains: 1 of 1' _
&Chr(01)&'Atom "tven" contains: 123'&Chr(01)&'Atom "tves" contains: 0'&Chr(01)&'Atom "tvsn" contains: 12'&Chr(01)&'Atom "©day" contains: 1900'
Local $Find = 'Atom "©nam" contains: '&Chr(01)&'Atom "tvsh" contains: '&Chr(01)&'Atom "stik" contains: '&Chr(01)&'Atom "©alb" contains: ' _
&Chr(01)&'Atom "©ART" contains: '&Chr(01)&'Atom "©cmt" contains: '&Chr(01)&'Atom "©gen" contains: '&Chr(01)&'Atom "trkn" contains: ' _
&Chr(01)&'Atom "tven" contains: '&Chr(01)&'Atom "tves" contains: '&Chr(01)&'Atom "tvsn" contains: '&Chr(01)&'Atom "©day" contains: '
Local $SPStrings = StringSplit($Strigs, Chr(01))
Local $SPFind = StringSplit($Find, Chr(01))
Local $StringLook = ''
For $i = 1 To UBound($SPStrings) - 1; assume this is the file you are reading
    For $x = 1 To UBound($SPFind) - 1; this would be your keyword list
        If StringInStr($SPStrings[$i], $SPFind[$x]) Then
            $StringRep = StringReplace($SPStrings[$i], $SPFind[$x], '')
            $StringLook = $StringLook & $StringRep & Chr(01)
            ExitLoop
        EndIf
    Next
Next
$SPSTringLook = StringSplit(StringTrimRight($StringLook, 1), Chr(01))

_ArrayDisplay($SPSTringLook, 'Returned values')

Edited by SmOke_N

Common sense plays a role in the basics of understanding AutoIt... If you're lacking in that, do us all a favor, and step away from the computer.

#3 ·  Posted (edited)

Hi,

I agree, the help file needs fixing. [i know it used to have a working example... ?Nutster?]

I have modified it a bit to test some gear...

Hope it helps.

Best, randall

Local $sPattern, $sTest, $vResult, $nFlag
;$sPattern ="/^(.*\/)?([^\/.]+)\.(\w+)$/"
;$sPattern ="\.(\w+)$/"
;$sPattern ="([a-zA-Z]:[\\a-zA-Z 0-9]+?\\)"
;$sPattern ="\>"
;$sPattern ="[a-zA-Z]:[\\a-zA-Z 0-9]+?\.\w{0,9})"
;$sPattern ="[a-zA-Z]:lmao:[\\a-zA-Z 0-9]+?\.\w{0,9})" ; very funny; line 7 has a ": (" with no space between!
;$sPattern ="[\\a-zA-Z 0-9]+?(\.\w{0,9})"
;$sPattern ="(\.\w{0,9})"; gives ".gif"
$sPattern ="(\\[a-zA-Z 0-9]+?\.)"
;$sPattern = InputBox("StringRegExp Sample", "What is the pattern to test?")
;$sTest="http://example.com/images/test.gif"
$sTest="c:\examplecom\images\test.gif"
;$sTest = InputBox("StringRegExp Sample", "What is the line to test?")

$vResult = StringRegExp($sTest, $sPattern,1)
Select
    Case @Error = 2
        MsgBox(0,"","@Error="&@Error&@CRLF&"$vResult="&$vResult)
        ; Error. The pattern was invalid. $vResult = position in $sPattern where error occurred.
    Case @Error = 0
        if @Extended Then
            if not IsArray($vResult) then
                MsgBox(0,"","@Error="&@Error&@CRLF&"$vResult="&$vResult)
            Else
                MsgBox(0,"","@Error="&@Error&@CRLF&"$vResult[0]="&$vResult[0])
                ; Success. Pattern matched. $vResult matches @Extended
            EndIf
        Else
            MsgBox(0,"","@Error="&@Error&@CRLF&"$vResult="&$vResult)
            ; Failure. Pattern not matched. $vResult = ""
        EndIf
EndSelect
;MsgBox(0,"","$vResult="&$vResult)
exit

; Nothing below this point runs because of the exit above; remove the exit to try the other flags.
$vResult = StringRegExp($sTest, $sPattern,2)
MsgBox(0,"","$vResult="&$vResult)
$vResult = StringRegExp($sTest, $sPattern,3)
MsgBox(0,"","$vResult="&$vResult)

$sPattern = InputBox("StringRegExp Sample", "What is the pattern to test?")
$sTest = InputBox("StringRegExp Sample", "What is the line to test?")
;$nFlag = InputBox("StringRegExp Sample", "What flag to use? 0 - true/false, 1 - single pattern array return, 3 - global pattern array return")
$vResult = StringRegExp($sTest, $sPattern,1); $nFlag)
Select
    Case @Error = 1
        ; Error. Flag is bad. $vResult = ""
    Case @Error = 2
        ; Error. The pattern was invalid. $vResult = position in $sPattern where error occurred.
    Case @Error = 0
        if @Extended Then
            ; Success. Pattern matched. $vResult has the text from the groups or true (1), depending on flag.
        Else
            ; Failure. Pattern not matched. $vResult = "" or false (0), depending on flag.
        EndIf
EndSelect
MsgBox(0,"","$vResult="&$vResult)

Edited by randallc

I suck at RegExp, but had you thought of taking a different approach to it?

The problem is that the string I showed you is returned from an external program. Also, the string won't always have all the values. Lastly, when I tested this on a string that I generated (with all the values I gave in the first post, in the same order), it fell apart after the title. I'm still not sure exactly what your script is doing, but I'll look over it some more.

I agree, the help file needs fixing. [i know it used to have a working example... ?Nutster?]

I don't think the help file should try to teach regexp; it's way too complex an idea for a help file. I just wish I knew how the function returns what.

I just wish I knew how the function returns what.

Do you mean you want to know

1. the internal workings of the function... - I have no idea; written in C for AutoIt I guess, with various parsing calls.

2. How to use it to return appropriate answers.

If the latter, my example script derived from the helpfile just lets you try various things.

Alternatively, Google "regular expression" and there are many tutorial sites.

Best, Randall

No, I understand how regexps work.

What I don't understand is how to get a string returned from the StringRegExp function, and which string is actually returned.

Basically, when I use the regexp:

'Atom "tvsh" contains:.+?Atom'

on the string I posted in the first post, I'd like to get a return of

'Atom "tvsh" contains: TV Show
Atom'

from which I could extract "TV Show".
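
A minimal sketch of one way to get that value back directly, assuming a current AutoIt release (the build discussed in this thread may behave differently). The sample string is a shortened, hypothetical stand-in for the program output. Putting the wanted part in parentheses and passing flag 1 makes StringRegExp return an array whose first element is the captured text, so no trimming is needed afterwards:

```autoit
; Shortened stand-in for the external program's output (assumption for illustration).
Local $sOutput = 'Atom "tvsh" contains: TV Show' & @CRLF & 'Atom "stik" contains: Movie'

; (?s) lets "." cross line breaks; the lookahead stops the capture just
; before the next 'Atom "' line or at the end of the text.
Local $sPattern = '(?s)Atom "tvsh" contains: (.*?)(?=\RAtom "|\s*\z)'

; Flag 1 returns an array holding the captured groups of the first match.
Local $aMatch = StringRegExp($sOutput, $sPattern, 1)
Local $iErr = @error ; save it: @error is reset by the next function call

If $iErr Then
    ConsoleWrite("No usable match, @error = " & $iErr & @CRLF)
Else
    ConsoleWrite($aMatch[0] & @CRLF) ; the text captured by the parentheses
EndIf
```

With this sample input the captured element should be the show title on its own. If the "tvsh" atom is missing from the output, the function sets @error instead of returning an array, which covers the case where not every value is present.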

#8 ·  Posted (edited)

Here's one way, though I'm no expert!

$sPattern ='([Atom "tvsh" contains:]+[\\a-zA-Z 0-9]+?\w{0,20})'

$sTest='Atom "tvsh" contains: Pictures'

randall

PS

only looks like it works; not accurate

Google "regular expression" and there are many tutorial sites

Edited by randallc

Once again, thanks for the help, but I don't have an issue with the actual regular expression. I'm fairly sure mine would work. What I don't understand is which flags I need to set and how my result is returned.

Yet AGAIN: I don't care about the regexp. All I want to know is how I can get what I want returned from the function.

#12 ·  Posted (edited)

Yet AGAIN: I don't care about the regexp. All I want to know is how I can get what I want returned from the function.

Ok, I see a miscommunication. I don't think anyone's getting the question he's asking. If I understand correctly, you're asking how StringRegExp returns matches. It's all in the flag setting.

flag = 0; Just tells you if the pattern is in the test string.

flag = 1; Returns an array of all the matched groups, i.e. matches inside parentheses

flag = 3; Returns an array of the matched groups, and every other match if there are multiples.
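
To make the difference between the flags concrete, here is a small sketch, assuming a current AutoIt release where flag 2 is also available (it is not in the list above, and older builds may differ). The variable names are made up for the example:

```autoit
Local $sLine = 'Atom "tvsh" contains: TV Show'
Local $sPattern = 'Atom "([^"]+)" contains: (.+)'

; Flag 0: returns 1 if the pattern matches, 0 if it does not.
Local $iFound = StringRegExp($sLine, $sPattern, 0)
ConsoleWrite("flag 0: " & $iFound & @CRLF)

; Flag 1: array of the captured groups from the first match.
Local $aGroups = StringRegExp($sLine, $sPattern, 1)
If Not @error Then ConsoleWrite("flag 1: " & $aGroups[0] & " | " & $aGroups[1] & @CRLF)

; Flag 2: like flag 1, but element 0 holds the whole matched text
; and the captured groups follow it.
Local $aFull = StringRegExp($sLine, $sPattern, 2)
If Not @error Then ConsoleWrite("flag 2: " & $aFull[0] & @CRLF)
```

Each array-returning call sets @error when nothing matches, so it is worth checking @error (or IsArray) before indexing into the result.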

For ease of use, I'll stick with flag = 1, because I think that's what you need. Here's my code that works for what you've said you want it to do.

$line1 = 'Atom "©nam" contains: Title'
$line2 = 'Atom "tvsh" contains: TV Show'
$line3 = 'Atom "stik" contains: Movie'
$line4 = 'Atom "©alb" contains: Album'
;The easiest way I can see to do this, is if these lines are returned in a file, just FileReadLine, and parse line by line

$pattern = '(?:Atom ")(.+)(?:" contains: )(.+)'
$display = ""

$display &= StripVars($line1,$pattern)
$display &= StripVars($line2,$pattern)
$display &= StripVars($line3,$pattern)
$display &= StripVars($line4,$pattern)

MsgBox(0,"Results", $display)

Func StripVars($line, $pattern)
    $result = StringRegExp($line, $pattern, 1);Flag 1 returns any matching groups in an array
    If IsArray($result) Then
        Return "Var="&$result[0]&" Data="&$result[1]&@CRLF
    EndIf
    Return "" ;No match: add nothing to the display string
EndFunc

Basically, StringRegExp captures anything in parentheses and returns all the captures in an array. The exception is (?:...), which is a non-capturing group: it takes part in the match but is not returned. Hope that helps, let me know if you have any other questions. I'm discovering the power of regexps more and more all the time.
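
If the program's output arrives as one multi-line string rather than a file, the same idea can be applied in a single call with flag 3. The following is only a sketch under a few assumptions: a current AutoIt release, a script file saved in an encoding that preserves the © character, and sample data copied from the first post. It also copes with values that span several lines, such as the "©cmt" comment, and with atoms that are absent:

```autoit
; Sample output assembled from the first post; a real script would
; capture this text from the external program instead.
Local $sOutput = 'Atom "tvsh" contains: TV Show' & @CRLF & _
        'Atom "©cmt" contains: test' & @CRLF & _
        'line2' & @CRLF & _
        'line3' & @CRLF & _
        'Atom "trkn" contains: 1 of 1' & @CRLF

; (?s) lets "." cross line breaks; the lookahead ends each value just
; before the next 'Atom "' line or at the end of the text.
Local $sPattern = '(?s)Atom "([^"]+)" contains: (.*?)(?=\RAtom "|\s*\z)'

; Flag 3 returns one flat array: name, value, name, value, ...
Local $aPairs = StringRegExp($sOutput, $sPattern, 3)
If @error Then
    ConsoleWrite("No atoms found" & @CRLF)
Else
    For $i = 0 To UBound($aPairs) - 2 Step 2
        ConsoleWrite($aPairs[$i] & " = " & $aPairs[$i + 1] & @CRLF)
    Next
EndIf
```

Because the array only contains the atoms that were actually present, the loop does not need to know in advance which ones the program printed.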

atomregexp.au3

Edited by neogia

My UDFs: Coroutine Multithreading UDF Library, StringRegExp Guide, Random Encryptor, ArrayToDisplayString
"The Brain, expecting disaster, fails to find the obvious solution." -- neogia

SmOke_N

Chr(01) can be chr(1) too


-jaenster

SmOke_N

Chr(01) can be chr(1) too

I'm a creature of habit, jaenster... thanks for the tutelage!! :o
