nikink Posted December 10, 2007 (edited)

Hi folks,

I have taken on a little project that requires parsing certain strings into mandatory and optional components. Given a test string "aaaa(bbb)c(d)(e)", the optional components are between the parentheses, and the mandatory components are the other bits. I need the output to be every combination of mandatory + optional components (without parentheses). Thus for my test string the output would be:

"aaaac"
"aaaabbbcde"
"aaaabbbcd"
"aaaabbbce"
"aaaacde"
"aaaacd"
"aaaace"
"aaaabbbc"

    #include <array.au3>

    Global $var = "st(pw)n(ta)b"
    Global $optionalvar, $testforerror
    Global $results[1]

    ; Need to get all optional parameters out... all those params between each pair of ( and )
    If StringInStr($var, "(") Then
        $optionalvar = StringTrimLeft($var, StringInStr($var, "("))
        ConsoleWrite("Found a ( : " & $optionalvar & @CR)
        If StringInStr($optionalvar, ")") Then
            $optionalvar = StringTrimRight($optionalvar, (StringLen($var) - StringInStr($var, ")") + 1))
            ConsoleWrite("Found matching )... $optionalvar = " & $optionalvar & @CR)
            If StringInStr($optionalvar, ")") Or StringInStr($optionalvar, "(") Then
                ConsoleWrite("Error! Mismatching parenthesis in $optionalvar: " & $optionalvar & @CR)
                Exit
            EndIf
        Else
            ConsoleWrite("Parsing error! NO ) FOUND!" & @CR)
            Exit
        EndIf
    Else
        If StringInStr($var, ")") Then
            ConsoleWrite("Parsing error! FOUND ), did not find ( to match!" & @CR)
            Exit
        EndIf
        ConsoleWrite("$optionalvar = " & $optionalvar & @CR)
    EndIf

    For $i = 1 To UBound($results) - 1
        ConsoleWrite($results[$i] & @CR)
    Next

But I'm getting in a bit over my head and could use some help or advice. Note I'm also trying to validate the string so that "aaaa(bb(" or "aaaa(bb)" (mismatched brackets, in other words) are not accepted / error out nicely.

I figured storing the results in an array is the sensible option (even if it requires a lot of ReDimming to collect all the permutations), but I am open to other suggestions if someone can see an easier / more efficient way. Can anyone help me?

Edited December 10, 2007 by nikink
erebus Posted December 10, 2007

If I were you, I would divide the data in the script using commas (or any other delimiter), like:

    $var = "aaaa,bbb,c,d,e"

If you want to omit a component, you can use double commas:

    $var = "aaaa,,c,,"

After that you can StringSplit your variable using the comma delimiter, and in this way you will know whether an optional parameter was given (not empty) or not (empty). It's all about string manipulation: check StringSplit in the helpfile and follow some rules in the variable syntax so that your script works correctly.
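As a rough illustration of the comma-delimited suggestion above, the sketch below splits the sample value and reports which fields were left empty. The variable name $aFields and the "(omitted)" label are made up for this example; only StringSplit itself comes from the post.

```autoit
; Sketch only: $aFields is an illustrative name, not something from the thread.
Global $var = "aaaa,,c,,"
Global $aFields = StringSplit($var, ",")

; $aFields[0] holds the number of fields. Consecutive delimiters produce
; empty-string elements, which is how an omitted component shows up.
For $i = 1 To $aFields[0]
    If StringLen($aFields[$i]) = 0 Then
        ConsoleWrite("Field " & $i & ": (omitted)" & @LF)
    Else
        ConsoleWrite("Field " & $i & ": " & $aFields[$i] & @LF)
    EndIf
Next
```

Note that this format only records which components are present; it does not by itself say which positions are optional, so that rule would have to be agreed on separately.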
PsaltyDS Posted December 10, 2007

This version will test for matched parens, and then split the input into parts as you described:

    #include <array.au3>

    Global $var = "st(pw)n(ta)b"
    Global $optionalvar, $testforerror
    Global $results[1]
    Global $fMatchError = False

    ; Test for matched parens. Does not allow nested parens.
    $sTest = $var
    While 1
        $iLParen = StringInStr($sTest, "(")
        If $iLParen Then
            $sTest = StringTrimLeft($sTest, $iLParen)
            $iLParen = StringInStr($sTest, "(")
            $iRParen = StringInStr($sTest, ")")
            If ($iRParen = 0) Or (($iLParen > 0) And ($iLParen < $iRParen)) Then
                $fMatchError = True
                ExitLoop
            Else
                $sTest = StringTrimLeft($sTest, $iRParen)
            EndIf
        Else
            ExitLoop
        EndIf
    WEnd
    If $fMatchError Or StringInStr($sTest, ")") Then
        MsgBox(16, "Error", "Input string has mismatched parens: " & $var)
        Exit
    EndIf

    ; Split string into parts
    $results = StringSplit($var, "()", 0)
    For $i = 1 To UBound($results) - 1
        ConsoleWrite("Debug: $results[" & $i & "] = " & $results[$i] & @LF)
    Next

Valuater's AutoIt 1-2-3, Class... Is now in Session!
For those who want somebody to write the script for them: RentACoder
"Any technology distinguishable from magic is insufficiently advanced." -- Geek's corollary to Clarke's law
nikink (Author) Posted December 11, 2007

Awesome, PsaltyDS! Thanks! I notice your solution doesn't pick up a stray ), but that's a (probably) simple tweak... now all I gotta do is work out how to create all the permutations of mandatory and optional components.

It's a frustratingly 'simple' task... Seems so straightforward, but whenever I attempt to code a solution I end up in a mess... B-)

    For Each Instance Of Optional Component 1
        For Each Instance Of Optional Component 2
            For Each Instance Of Optional Component n-1
                For Each Instance Of Optional Component n
                    Return String

Each Instance Of Optional Component = "string" and ""

So maybe Mandatory components = "string" and "string"

Then the Results Array in PsaltyDS' script could be an array of 2-element arrays... And some kind of recursive function could cycle through the Results Array elements... Hmmm.

Any thoughts from you clever AutoIters? Hints, tips, advice all welcome here!
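The nested "For Each" idea above maps fairly naturally onto recursion. The following is only a sketch under one assumption: the input has already been validated and split with StringSplit($var, "()"), so that mandatory pieces sit at odd indexes and optional pieces at even indexes (an empty string appears wherever two optional groups are adjacent). The function name _Expand and its parameters are hypothetical, not something posted in the thread.

```autoit
; Sketch only: _Expand, $aParts, $iIndex and $sPrefix are illustrative names.
; Assumes validated, non-nested input such as "aaaa(bbb)c(d)(e)".
Func _Expand(ByRef $aParts, $iIndex, $sPrefix)
    ; Past the last piece: one complete combination has been built.
    If $iIndex > $aParts[0] Then
        ConsoleWrite($sPrefix & @LF)
        Return
    EndIf

    If Mod($iIndex, 2) = 1 Then
        ; Odd index: mandatory piece, always appended.
        _Expand($aParts, $iIndex + 1, $sPrefix & $aParts[$iIndex])
    Else
        ; Even index: optional piece, so branch once without it and once with it.
        _Expand($aParts, $iIndex + 1, $sPrefix)
        _Expand($aParts, $iIndex + 1, $sPrefix & $aParts[$iIndex])
    EndIf
EndFunc

Global $aParts = StringSplit("aaaa(bbb)c(d)(e)", "()")
_Expand($aParts, 1, "")
```

Each optional piece doubles the number of branches, so three optional components give eight lines of output, all of which contain every mandatory piece in order.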
PsaltyDS Posted December 11, 2007

"I notice your solution doesn't pick up a stray ), but that's a (probably) simple tweak..."

It picks up stray right parens in my testing. Can you post a value for $var that doesn't work?
nikink (Author) Posted December 11, 2007

Sure,

    Global $var = "st(pw))n(ta)b"

    Debug: $results[1] = st
    Debug: $results[2] = pw
    Debug: $results[3] =
    Debug: $results[4] = n
    Debug: $results[5] = ta
    Debug: $results[6] = b

I was thinking of perhaps checking that there are equal numbers of left and right parentheses before anything else, because your script already finds a mismatched (, which would account for malformed input strings like "st(pw))n(tab".
PsaltyDS Posted December 11, 2007

Sure enough, thanks for the example. This adds exactly that check:

    #include <array.au3>

    Global $var = "st(pw))n(ta)b"
    Global $optionalvar, $testforerror
    Global $results[1]
    Global $fMatchError = False

    ; Test for same number of L/R parens
    $avL = StringSplit($var, "(")
    $avR = StringSplit($var, ")")
    If $avL[0] = $avR[0] Then
        ; Test for matched parens. Does not allow nested parens.
        $sTest = $var
        While 1
            $iLParen = StringInStr($sTest, "(")
            If $iLParen Then
                $sTest = StringTrimLeft($sTest, $iLParen)
                $iLParen = StringInStr($sTest, "(")
                $iRParen = StringInStr($sTest, ")")
                If ($iRParen = 0) Or (($iLParen > 0) And ($iLParen < $iRParen)) Then
                    $fMatchError = True
                    ExitLoop
                Else
                    $sTest = StringTrimLeft($sTest, $iRParen)
                EndIf
            Else
                ExitLoop
            EndIf
        WEnd
    Else
        $fMatchError = True
    EndIf
    If $fMatchError Or StringInStr($sTest, ")") Then
        MsgBox(16, "Error", "Input string has mismatched parens: " & $var)
        Exit
    EndIf

    ; Split string into parts
    $results = StringSplit($var, "()", 0)
    For $i = 1 To UBound($results) - 1
        ConsoleWrite("Debug: $results[" & $i & "] = " & $results[$i] & @LF)
    Next

I don't think this is optimal, and I keep thinking there is a clever RegExp out there for it. But my RegExp-Fu is not strong enough.
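For what it's worth, one pattern that seems to cover the same checks in a single StringRegExp call is sketched below. It is not necessarily the RegExp PsaltyDS had in mind; the names $sPattern, _ParensAreValid and $aTests are invented for this example. The pattern accepts a string only if it consists entirely of non-paren characters and complete "(...)" groups that contain no parens themselves, so stray, unclosed and nested parens all fail.

```autoit
; Sketch only: $sPattern, _ParensAreValid and $aTests are illustrative names.
; Each repetition consumes either one non-paren character or one complete
; "(...)" group with no parens inside it; anything else breaks the match.
Global Const $sPattern = "^(?:[^()]|\([^()]*\))*$"

Func _ParensAreValid($sInput)
    ; With the default flag (0), StringRegExp returns 1 on a match and 0 otherwise.
    Return StringRegExp($sInput, $sPattern) = 1
EndFunc

Global $aTests[4] = ["st(pw)n(ta)b", "st(pw))n(ta)b", "aaaa(bb(", "a(b(c))d"]
For $i = 0 To UBound($aTests) - 1
    ConsoleWrite($aTests[$i] & " -> " & _ParensAreValid($aTests[$i]) & @LF)
Next
```

Of the four sample inputs, only the first should be accepted: the second has a stray ), the third has unclosed parens, and the fourth is nested. Like the loop version, this deliberately rejects nesting; it also accepts an empty group "()", which may or may not be wanted.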
nikink (Author) Posted December 12, 2007

Yeah, I would've thought so too... But my regex-fu is practically non-existent!

Now, does anyone have any ideas on how to generate all the combinations of elements? Anyone? Anyone? Bueller?
PsaltyDS Posted December 12, 2007

"Yeah, I would've thought so too... But my regex-fu is practically non-existent! Now, does anyone have any ideas on how to generate all the combinations of elements? Anyone? Anyone? Bueller?"

Generating every possible combination of a set of n elements is an interesting programming problem. Here's my shot at it:

    ;aaaa(bbb)c(d)e
    Global $results[6] = [5, "aaaa", "bbb", "c", "d", "e"]
    Global $sOut = ""
    Global $iMax = 2^$results[0] - 1
    ConsoleWrite("Debug: There will be " & $iMax & " results." & @LF)
    For $n = 1 To $iMax
        $sOut &= $n & ": "
        For $b = 0 To $results[0] - 1
            If BitAND($n, 2^$b) Then $sOut &= $results[$b + 1]
        Next
        $sOut &= @CRLF
    Next
    MsgBox(64, "Results", $sOut)

Cheers!
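The bit test inside the inner loop can be looked at in isolation for a single value of $n. The sketch below picks $n = 5 arbitrarily and introduces $iMask and $sLine purely for illustration; the array and the BitAND test are the same as in the script above.

```autoit
; Sketch only: $n = 5 is an arbitrary sample; $iMask and $sLine are illustrative names.
Global $results[6] = [5, "aaaa", "bbb", "c", "d", "e"]
Global $n = 5 ; binary 00101, so bits 0 and 2 are set
Global $iMask, $sLine = ""

For $b = 0 To $results[0] - 1
    $iMask = 2 ^ $b ; 1, 2, 4, 8, 16: a number with only bit $b set
    If BitAND($n, $iMask) Then
        ; Non-zero result: bit $b is set in $n, so element $b + 1 is included.
        $sLine &= $results[$b + 1]
    EndIf
Next

ConsoleWrite($n & ": " & $sLine & @LF)
```

For $n = 5 only bits 0 and 2 pass the test, so the line is built from $results[1] and $results[3], giving "aaaac". Running $n through every value from 1 to 31 visits every non-empty subset of the five elements exactly once.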
nikink (Author) Posted December 13, 2007

That's... umm... very cool, and clever... but I've been looking at it for a day now, and can't understand what's happening... Can you (or someone) explain it to me? I guess it's the BitAND that's throwing me, I don't understand what that's doing, what its purpose is...

Thanks though, for all your help so far, your scripts are a lot more efficient than mine (and more importantly, they work!)
PsaltyDS Posted December 13, 2007

"That's... umm... very cool, and clever... but I've been looking at it for a day now, and can't understand what's happening... Can you (or someone) explain it to me? I guess it's the BitAND that's throwing me, I don't understand what that's doing, what its purpose is... Thanks though, for all your help so far, your scripts are a lot more efficient than mine (and more importantly, they work!)"

The loop is based on the fact that you can associate each element with a bit in a binary number. By incrementing a binary number from 0 to all bits set, you produce every possible combination of those bits. But we don't want 0, because that means no bits set and no element selected, so we start from 1 instead.

    ; Create an array to simulate the output of the earlier function
    Global $results[6] = [5, "aaaa", "bbb", "c", "d", "e"]

    ; Declare a variable to hold the output string
    Global $sOut = ""

    ; Create a binary number representing 1 bit set for each element in the set.
    ; The formula is 2 to the nth power, minus one.
    ; In this case 2^5 - 1 = 31, which in binary is 11111 (five ones).
    Global $iMax = 2^$results[0] - 1

    ; Since all zeroes (no element) is not an option, there are going to be 2^n - 1 results, too.
    ConsoleWrite("Debug: There will be " & $iMax & " results." & @LF)

    ; A loop to increment a number from 1 to $iMax (31 in this case). This series of numbers
    ; will represent every possible combination of bits set, and since we associate an element in the set
    ; with each bit, every possible combination of elements.
    For $n = 1 To $iMax
        ; String format for the line, i.e. "1: " thru "31: "
        $sOut &= $n & ": "

        ; For each value of $n, test the bits to see which elements to include
        ; In this case the 5 bits are 0 thru 4
        For $b = 0 To $results[0] - 1
            ; Test the bit with a bitwise AND; if the bit is set, add that element to the output line
            If BitAND($n, 2^$b) Then $sOut &= $results[$b + 1]
        Next

        ; End the current line before going on to the next $n
        $sOut &= @CRLF
    Next

    ; After the loop is done, display the results
    MsgBox(64, "Results", $sOut)
nikink (Author) Posted December 20, 2007 (edited)

Ok, I think I've got it all working. I'm sure it can be optimised though... This problem has been driving me up the wall for weeks now.

This messy script will generate every combination of mandatory and optional components given a string of characters where optional characters are within parentheses and every result contains all mandatory components. All combinations of mandatory and optional components remain in order.

The input string can be of any length and any combination of mandatory and optional components - but the more optional components listed, the slower the script (cuz of all the array juggling and combination generating). I'm sure there should be a faster way... and I know I've confused myself numerous times during this project, as I'm sure someone with fresh eyes and better scripting skills will see immediately upon perusal!

Comments and critiques and suggestions for improvement very welcome. Very very welcome! Ideally I'd love to get rid of the array juggling and do more with regex... anyway:

    #include <array.au3>

    Global $var = "(0)1(0)1(0)1(0)(0)(0)1"
    Global $optionalvar, $testforerror
    Global $results[1]
    Global $fMatchError = False

    #region - test for parentheses validity - Thanks to PsaltyDS for this
    ; Test for same number of L/R parens
    $avL = StringSplit($var, "(")
    $avR = StringSplit($var, ")")
    If $avL[0] = $avR[0] Then
        ; Test for matched parens. Does not allow nested parens.
        $sTest = $var
        While 1
            $iLParen = StringInStr($sTest, "(")
            If $iLParen Then
                $sTest = StringTrimLeft($sTest, $iLParen)
                $iLParen = StringInStr($sTest, "(")
                $iRParen = StringInStr($sTest, ")")
                If ($iRParen = 0) Or (($iLParen > 0) And ($iLParen < $iRParen)) Then
                    $fMatchError = True
                    ExitLoop
                Else
                    $sTest = StringTrimLeft($sTest, $iRParen)
                EndIf
            Else
                ExitLoop
            EndIf
        WEnd
    Else
        $fMatchError = True
    EndIf
    If $fMatchError Or StringInStr($sTest, ")") Then
        MsgBox(16, "Error", "Input string has mismatched parens: " & $var)
        Exit
    EndIf
    #endregion

    ; Split string into parts
    $results = StringSplit($var, "()", 0)
    Global $NumMandatoryComponents = Round((UBound($results) / 2))
    For $i = 1 To UBound($results) - 1
        If Mod($i, 2) = 1 Then
            $results[$i] = "<M>" & $results[$i]
        Else
            $results[$i] = "<O>" & $results[$i]
        EndIf
    Next

    Global $sOut = ""
    Global $iMax = 2^$results[0] - 1
    ConsoleWrite("Debug: There will be " & $iMax & " results." & @LF)

    ; Thanks to PsaltyDS for this construction
    For $n = 1 To $iMax
        For $b = 0 To $results[0] - 1
            If BitAND($n, 2^$b) Then $sOut &= $results[$b + 1] ; working code. nice.
        Next
        $sOut &= @LF ;@CRLF
    Next

    Global $arraySOut = StringSplit($sOut, @LF)
    Global $tempresult = ""
    For $i = 1 To UBound($arraySOut) - 1
        $compare = StringRegExp($arraySOut[$i], "<M>", 3)
        If $NumMandatoryComponents = UBound($compare) Then
            ; Strip the markers from the current combination (not from the empty $tempresult)
            $tempresult = StringRegExpReplace($arraySOut[$i], "<M>|<O>", "")
            ConsoleWrite("Debug: " & $i & ":" & $tempresult & @LF)
        EndIf
    Next

If anyone can see ways to streamline this, or make it faster, that would be very much appreciated! And thanks to PsaltyDS for his/her input!

(Edited to show improvements in efficiency I could find after a night's sleep)

Edited December 20, 2007 by nikink
nikink (Author) Posted December 20, 2007

And here it is commented:

    #include <array.au3>

    Global $var = "(1)(1)(1)(1)m"
    Global $optionalvar, $testforerror
    Global $results[1]
    Global $fMatchError = False

    #region - test for parentheses validity - Thanks to PsaltyDS for this
    ; Test for same number of L/R parens
    $avL = StringSplit($var, "(")
    $avR = StringSplit($var, ")")
    If $avL[0] = $avR[0] Then
        ; Test for matched parens. Does not allow nested parens.
        $sTest = $var
        While 1
            $iLParen = StringInStr($sTest, "(")
            If $iLParen Then
                $sTest = StringTrimLeft($sTest, $iLParen)
                $iLParen = StringInStr($sTest, "(")
                $iRParen = StringInStr($sTest, ")")
                If ($iRParen = 0) Or (($iLParen > 0) And ($iLParen < $iRParen)) Then
                    $fMatchError = True
                    ExitLoop
                Else
                    $sTest = StringTrimLeft($sTest, $iRParen)
                EndIf
            Else
                ExitLoop
            EndIf
        WEnd
    Else
        $fMatchError = True
    EndIf
    If $fMatchError Or StringInStr($sTest, ")") Then
        MsgBox(16, "Error", "Input string has mismatched parens: " & $var)
        Exit
    EndIf
    #endregion

    ; Split string into parts
    $results = StringSplit($var, "()", 0)
    Global $NumMandatoryComponents = Round((UBound($results) / 2))

    ; Mark each component as either Mandatory or Optional
    For $i = 1 To UBound($results) - 1
        If Mod($i, 2) = 1 Then
            ; Mandatory components always fall in odd indexes of $results
            $results[$i] = "<M>" & $results[$i]
        Else
            ; Optional components always fall in even indexes of $results
            $results[$i] = "<O>" & $results[$i]
        EndIf
    Next

    ; Create a string to hold the marked results
    Global $sOut = ""

    ; This is the total number of combinations of the results;
    ; the number of *valid* results is 2^(number of optional components)
    Global $iMax = 2^$results[0] - 1
    ;ConsoleWrite("Debug: There will be " & $iMax & " results." & @LF)

    ; Thanks to PsaltyDS for this construction. Generates every combination of the
    ; components in $results and puts them in a @LF delimited string
    For $n = 1 To $iMax
        For $b = 0 To $results[0] - 1
            If BitAND($n, 2^$b) Then $sOut &= $results[$b + 1] ; working code. nice.
        Next
        $sOut &= @LF
    Next

    ; Split the @LF delimited string of results into an array containing strings of
    ; marked components. This array is $iMax in size!
    Global $arraySOut = StringSplit($sOut, @LF)
    For $i = 1 To UBound($arraySOut) - 1
        ; $compare is the array of Mandatory components found by regex looking for "<M>"
        ; in each element of $arraySOut
        $compare = StringRegExp($arraySOut[$i], "<M>", 3)
        ; If the regex returns an array of Mandatory components whose size equals the
        ; number of Mandatory components, then a valid combination has been found.
        If $NumMandatoryComponents = UBound($compare) Then
            $arraySOut[$i] = StringRegExpReplace($arraySOut[$i], "<M>|<O>", "") ; Strip the implanted markings
            ConsoleWrite("Debug: " & $i & ":" & $arraySOut[$i] & @LF) ; Output
        EndIf
    Next

A single character to mark mandatory and optional would probably be a bit faster - and preferably non-printable, as the string of characters could be ANY characters in theory. It's not likely, but I suppose it's possible that the combination of <M> or <O> could actually be part of the string.
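One possible way to avoid both the markers and the filtering pass is to count only over the optional components, since the script's own comment notes that the number of valid results is 2^(number of optional components). The sketch below assumes the input has already passed the paren validation, and relies on the same odd/even layout that StringSplit($var, "()") produces. The names $sInput, $aParts, $iOptional and $sLine are invented for this example.

```autoit
; Sketch only: assumes $sInput has already been validated (matched, non-nested parens).
; $sInput, $aParts, $iOptional and $sLine are illustrative names.
Global $sInput = "aaaa(bbb)c(d)(e)"
Global $aParts = StringSplit($sInput, "()") ; odd indexes = mandatory, even = optional
Global $iOptional = Int($aParts[0] / 2)     ; number of optional components
Global $sLine

; One pass per subset of the optional components, including the empty subset (0).
For $n = 0 To 2 ^ $iOptional - 1
    $sLine = ""
    For $i = 1 To $aParts[0]
        If Mod($i, 2) = 1 Then
            ; Mandatory piece: always included.
            $sLine &= $aParts[$i]
        ElseIf BitAND($n, 2 ^ ($i / 2 - 1)) Then
            ; Optional piece number $i / 2 uses bit ($i / 2 - 1) of $n.
            $sLine &= $aParts[$i]
        EndIf
    Next
    ConsoleWrite($sLine & @LF)
Next
```

With three optional groups this loops 8 times instead of 127, and because no marker text is ever inserted, input that happens to contain "<M>" or "<O>" is not a concern. It still grows exponentially with the number of optional components, which is unavoidable if every combination has to be listed.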
PsaltyDS Posted December 21, 2007

"And thanks to PsaltyDS for his/her input!"

He/she says "You're welcome."

"And here it is commented: A single character to mark mandatory and optional would probably be a bit faster - and preferably non-printable, as the string of characters could be ANY characters in theory. It's not likely, but I suppose it's possible that the combination of <M> or <O> could actually be part of the string."

Glad you got it working. Merry Christmas!