genius257 Posted August 25, 2017 (edited)

I'm having an issue with StringRegExp when I combine the offset parameter with the start-of-string anchor: the match fails whenever the offset is greater than 1. Is this a bug, or is it supposed to work like that? See the example below.

    StringRegExp("abc", "^[a-z]", 1, 1)
    ConsoleWrite(@error&@CRLF) ;success
    StringRegExp("abc", "^[a-z]", 1, 2)
    ConsoleWrite(@error&@CRLF) ;failure

Thanks in advance.

To show your appreciation. My highlighted topics: AutoIt Package Manager, AutoItObject Pure AutoIt, AutoIt extension for Visual Studio Code. Github: AutoIt HTTP Server, AutoIt HTML Parser
iamtheky Posted August 25, 2017

They should both error; the caret goes on the inside.

    StringRegExp("abc", "[^a-z]", 1, 1)
    ConsoleWrite(@error&@CRLF) ;failure
    StringRegExp("abc", "[^a-z]", 1, 2)
    ConsoleWrite(@error&@CRLF) ;failure

    StringRegExp("abc", "abc", 1, 1)
    ConsoleWrite(@error&@CRLF)
    StringRegExp("abc", "bc", 1, 2)
    ConsoleWrite(@error&@CRLF)
    StringRegExp("abc", "c", 1, 3)
    ConsoleWrite(@error&@CRLF)

    ;errors
    StringRegExp("abc", "ab", 1, 3)
    ConsoleWrite(@error&@CRLF)
    StringRegExp("abc", "a", 1, 2)
    ConsoleWrite(@error&@CRLF)
    StringRegExp("abc", "[^abc]", 1, 1)
    ConsoleWrite(@error&@CRLF)
    StringRegExp("abc", "[^bc]", 1, 2)
    ConsoleWrite(@error&@CRLF)
    StringRegExp("abc", "[^c]", 1, 3)
    ConsoleWrite(@error&@CRLF)
genius257 (Author) Posted August 25, 2017

No. The first RegEx is meant to get the "a", the second RegEx is meant to get the "b".

From the documentation:

  Outside a character class, the caret matches at the start of the subject text, and also just after a non-final newline sequence if option (?m) is active. By default the newline sequence is @CRLF. Inside a character class, a leading ^ complements the class (excludes the characters listed there).
iamtheky Posted August 25, 2017 (edited)

Ahh, I reversed it. Context-free is tough, but that's a start point, so it gets 'abc' and then 'bc'.

Edit: tested real quick, and with dashes I'm getting the largest subset, not the smallest subset of the group. Running more tests.
genius257 (Author) Posted August 25, 2017

Yeah, but it gets "[a-z]" anywhere, not only if it's the first in the string. "^[a-z]" will return "b" when used on "a0bc".
iamtheky Posted August 25, 2017 (edited)

Where'd you get the quote from?

If you put that caret there, you are only getting the first character, and only if it's a letter, and only if it's lowercase. What is the desired end goal?

    #include<array.au3>

    $aMatch = StringRegExp("a0bc", "^[a-z]", 3, 1)
    _ArrayDisplay($aMatch)

    $aMatch = StringRegExp("a0bc", "^[a-z]", 3, 2)
    _ArrayDisplay($aMatch)
    ConsoleWrite(@error&@CRLF)

    $aMatch = StringRegExp("a0bc", "^[a-z]", 3, 3)
    _ArrayDisplay($aMatch)
    ConsoleWrite(@error&@CRLF)

    $aMatch = StringRegExp("a0bc", "^[a-z]", 3, 4)
    _ArrayDisplay($aMatch)
    ConsoleWrite(@error&@CRLF)

    $aMatch = StringRegExp("A0bc", "^[a-z]", 3, 1)
    _ArrayDisplay($aMatch)
    ConsoleWrite(@error&@CRLF)

    $aMatch = StringRegExp("A0bc", "^[a-z]", 3, 4)
    _ArrayDisplay($aMatch)
    ConsoleWrite(@error&@CRLF)
genius257 (Author) Posted August 25, 2017 (edited)

From the StringRegExp documentation, in the Anchors table under Remarks.

I'm iterating through a string, looking for exact matches:

    Global $Types[][] = [ _
        ['^("[^"]*"|''''[^'''']*'''')',"String"], _
        ["^\$[_a-zA-Z0-9]+","Variable"] _
    ]

    $sOutput = ""
    $sInput = '$var = "this is a test"'
    $iOffset = 1

    #include <Array.au3>

    While 1
        StringRegExp($sInput, "^\s*(\S)", 1, $iOffset)
        If @error<>0 Then ExitLoop
        $iOffset = @extended
        For $i=0 To UBound($Types, 1)-1
            $a = StringRegExp($sInput, $Types[$i][0], 1, $iOffset-1)
            If @error=0 Then
                $iOffset=@extended
                $sOutput&=$Types[$i][1]&";"
                ExitLoop
            EndIf
        Next
    WEnd

I do know there are better ways of doing this. I'm just wondering if it's supposed to fail when using "^" and an offset greater than 1.
mikell Posted August 25, 2017

5 minutes ago, genius257 said:
  I'm just wondering if it's supposed to fail when using "^" and offset greater than 1

Obviously yes!

^ matches "at the start of the subject text", while offset is "the string position to start the match".

The first position (just after ^) is offset 1, so the others (offset > 1) won't match if the ^ anchor is used, unless you use a workaround.
genius257 (Author) Posted August 25, 2017 (edited)

Thanks @mikell.

It seems silly to me. As I see it, the offset would define where the string gets trimmed and matched, but I guess not.

Guess I'll have to take the substring myself and just add the @extended to the offset... >.>
mikell Posted August 25, 2017 (edited)

14 minutes ago, genius257 said:
  guess I'll have to sub string myself and just add the return to the offset...

This is the workaround indeed. Using offset, you force the position where the match starts, so you'll run into trouble if you do this with the ^ anchor in the pattern.

    $offset = 1
    $res = StringRegExp("a123b456c", "[a-z]", 1, $offset)
    $offset = @extended
    ConsoleWrite($res[0]&@CRLF)

    $res = StringRegExp("a123b456c", "[a-z]", 1, $offset)
    $offset = @extended
    ConsoleWrite($res[0]&@CRLF)

    $res = StringRegExp("a123b456c", "[a-z]", 1, $offset)
    ConsoleWrite($res[0]&@CRLF)
genius257 (Author) Posted August 25, 2017 (edited)

14 minutes ago, mikell said:
  This is the workaround indeed. Using offset, you force the position where the match starts, so you'll run into trouble if you do this with the ^ anchor in the pattern.

      $offset = 1
      $res = StringRegExp("abc", "[a-z]", 1, $offset)
      ConsoleWrite($res[0]&@CRLF)
      $offset += StringLen($res[0])
      $res = StringRegExp("abc", "[a-z]", 1, $offset)
      ConsoleWrite($res[0]&@CRLF)
      $offset += StringLen($res[0])
      $res = StringRegExp("abc", "[a-z]", 1, $offset)
      ConsoleWrite($res[0]&@CRLF)

Kinda. More like:

    StringRegExp(StringMid($sInput, $offset), "^[a-z]", 1)

But it works now, I guess.

Edited August 25, 2017 by genius257: forgot the anchor in the pattern
mikell Posted August 25, 2017 (edited)

Sorry, I edited my previous example, not sure you saw it... much better anyway...

Edit: ...because it's easy to use in a loop.
genius257 (Author) Posted August 25, 2017

The main problem with your solution is that, without the anchor, it will match anywhere in the string. That makes it useless if the purpose is to iterate through the string and process every char, or do something else should the match(es) fail.

I appreciate all the help.

Anyway, this is my result (I think my offset calculation will be wrong in some cases and should be adjusted at a later time, but it works for now):

    Global $Types[][] = [ _
        ['^("[^"]*"|''''[^'''']*'''')',"String"], _
        ["^\$[_a-zA-Z0-9]+","Variable"] _
    ]

    $sOutput = ""
    $sInput = FileRead(@ScriptFullPath)
    $sInput = '$var="this is a test" &"test"'
    $iOffset = 1

    While 1
        StringRegExp(StringMid($sInput, $iOffset), "^\s*(\S)", 1)
        If @error<>0 Then ExitLoop
        $iOffset += @extended-1
        ConsoleWrite(StringMid($sInput, $iOffset-1)&@CRLF)
        $bMatch=False
        For $i=0 To UBound($Types, 1)-1
            $a = StringRegExp(StringMid($sInput, $iOffset-1), $Types[$i][0], 1)
            If @error=0 Then
                $iOffset+=@extended-2
                $sOutput&=$Types[$i][1]&";"
                $bMatch=True
                ExitLoop
            EndIf
        Next
        If Not $bMatch Then $sOutput&="Unknown"&";"
    WEnd

    MsgBox(0, "", $sOutput)
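Editor's note on the offset calculation mentioned above: when the pattern is run against StringMid($sInput, $iOffset), the @extended value is relative to the substring, so it has to be translated back into a position in the full string. The snippet below is only a minimal sketch of that arithmetic. It assumes, as mikell's example does, that @extended holds the 1-based position at which the next search should start; the input string and starting offset are made up for illustration.

```autoit
; Sketch: mapping a substring-relative @extended back to the full string.
Local $sInput = "a0bc"
Local $iOffset = 3 ; 1-based position of "b" in $sInput (illustrative value)

; The substring is "bc", so ^ is anchored at its first character.
Local $aMatch = StringRegExp(StringMid($sInput, $iOffset), "^[a-z]", 1)
If @error = 0 Then
    ; @extended counts from the start of the substring, which itself
    ; begins at $iOffset, hence the -1 when adding the two together.
    $iOffset += @extended - 1
    ConsoleWrite("Matched " & $aMatch[0] & ", next offset " & $iOffset & @CRLF)
EndIf
```

With this mapping the candidate patterns can be run against StringMid($sInput, $iOffset) directly, without the extra -1 and -2 adjustments.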
AspirinJunkie Posted August 25, 2017

Maybe I misunderstood something, but if you use an offset in StringRegExp and want to match from the beginning of the current position, then you have to use \G instead of ^:

    StringRegExp("abc", "\G[a-z]", 1, 1)
    ConsoleWrite(@error&@CRLF)
    StringRegExp("abc", "\G[a-z]", 1, 2)
    ConsoleWrite(@error&@CRLF)
genius257 (Author) Posted August 25, 2017

2 minutes ago, AspirinJunkie said:
  Maybe I misunderstood something, but if you use an offset in StringRegExp and want to match from the beginning of the current position, then you have to use \G instead of ^:

      StringRegExp("abc", "\G[a-z]", 1, 1)
      ConsoleWrite(@error&@CRLF)
      StringRegExp("abc", "\G[a-z]", 1, 2)
      ConsoleWrite(@error&@CRLF)

Ah! You are right! Thank you ^^' Totally missed that.
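Editor's note: to tie the thread together, here is a rough sketch of how the token scanner from the earlier posts might look with \G in place of ^, so that the offset parameter can be used directly instead of StringMid. It is not genius257's final code. The variable names, the simplified patterns (double-quoted strings only, \w+ for variable names) and the one-character skip for unknown input are assumptions made for illustration; @extended is assumed to hold the position for the next search, as in mikell's example.

```autoit
; Sketch: token scanner anchored with \G, which matches at the given offset.
Global $aTypes[2][2] = [ _
        ['\G"[^"]*"', "String"], _
        ['\G\$\w+', "Variable"]]

Local $sInput = '$var = "this is a test"'
Local $sOutput = ""
Local $iOffset = 1
Local $bMatch = False

While $iOffset <= StringLen($sInput)
    ; Skip whitespace, but only if it starts exactly at $iOffset.
    StringRegExp($sInput, "\G\s+", 1, $iOffset)
    If @error = 0 Then $iOffset = @extended
    If $iOffset > StringLen($sInput) Then ExitLoop

    $bMatch = False
    For $i = 0 To UBound($aTypes, 1) - 1
        StringRegExp($sInput, $aTypes[$i][0], 1, $iOffset)
        If @error = 0 Then
            $iOffset = @extended ; continue right after the matched token
            $sOutput &= $aTypes[$i][1] & ";"
            $bMatch = True
            ExitLoop
        EndIf
    Next

    If Not $bMatch Then
        $sOutput &= "Unknown;"
        $iOffset += 1 ; consume one character so the loop always advances
    EndIf
WEnd

ConsoleWrite($sOutput & @CRLF)
```

Because \G pins each pattern to the current offset, a failed match means "no token of this type starts here", rather than the pattern quietly matching somewhere later in the string.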