UEZ Posted December 2, 2013 (edited)

What about this here?

    $sText1 = "[ab][ab][ab][ab]X[cd][cd]"
    $sText2 = "abababababab"
    $sText3 = "aaaaabbbbbbccc"

    ConsoleWrite($sText1 & @LF)
    ShortenRepeatedCharacters($sText1)
    ConsoleWrite(@LF)
    ConsoleWrite($sText2 & @LF)
    ShortenRepeatedCharacters($sText2)
    ConsoleWrite(@LF)
    ConsoleWrite($sText3 & @LF)
    ShortenRepeatedCharacters($sText3)
    ConsoleWrite(@LF)

    Func ShortenRepeatedCharacters($sText)
        Local $aResult = StringSplit($sText, "", 2)
        Local $iStartChar = $aResult[0], $sString = $iStartChar, $i
        If UBound($aResult) = 1 Then
            ConsoleWrite($iStartChar & "{1}")
            Return
        EndIf
        For $i = 1 To UBound($aResult) - 1
            If $aResult[$i] = $iStartChar Then ExitLoop
            $sString &= $aResult[$i]
        Next
        If $i < UBound($aResult) Then
            FindRepeatations($sString, $sText & " ")
        Else
            ConsoleWrite(StringLeft($sString, 1) & "{1}")
            ShortenRepeatedCharacters(StringMid($sString, 2))
        EndIf
    EndFunc

    Func FindRepeatations($sSearch, $sText)
        Local $j, $c = 1, $bExit = False, $iLenSearch = StringLen($sSearch)
        For $j = $iLenSearch + 1 To StringLen($sText) - $iLenSearch Step $iLenSearch
            If StringMid($sText, $j, $iLenSearch) = $sSearch Then
                $c += 1
            Else
                $bExit = True
                ExitLoop
            EndIf
        Next
        If $sSearch <> " " Then ConsoleWrite($sSearch & "{" & $c & "}")
        Local $sNewString = StringMid(StringTrimRight($sText, 1), $j)
        If $bExit Or StringLen($sNewString) = 1 Then ShortenRepeatedCharacters($sNewString)
    EndFunc

Not fully tested!
Edit: added some checks.
Edit2: added some more checks.

Br,
UEZ

Edited December 3, 2013 by UEZ

b0x4it (Author) Posted December 3, 2013

UEZ wrote:

    What about this here?

    #include <Array.au3>

    $sText1 = "[ab][ab][ab][ab][cd][cd]"
    $sText2 = "abababababab"
    $sText3 = "aaaaabbbbbbccc"

    ConsoleWrite($sText1 & @LF)
    ShortenRepeatedCharacters($sText1)
    ConsoleWrite(@LF)
    ConsoleWrite($sText2 & @LF)
    ShortenRepeatedCharacters($sText2)
    ConsoleWrite(@LF)
    ConsoleWrite($sText3 & @LF)
    ShortenRepeatedCharacters($sText3)
    ConsoleWrite(@LF)

    Func ShortenRepeatedCharacters($sText)
        Local $aResult = StringSplit($sText, "", 2)
        Local $iStartChar = $aResult[0], $sString = $iStartChar, $i
        If UBound($aResult) = 1 Then
            ConsoleWrite($iStartChar & "{1}" & @LF)
            Return
        EndIf
        For $i = 1 To UBound($aResult) - 1
            If $aResult[$i] = $iStartChar Then ExitLoop
            $sString &= $aResult[$i]
        Next
        If $i <= UBound($aResult) Then FindRepeatations($sString, $sText & " ")
    EndFunc

    Func FindRepeatations($sSearch, $sText)
        Local $j, $c = 1, $bExit = False, $iLenSearch = StringLen($sSearch)
        For $j = $iLenSearch + 1 To StringLen($sText) - $iLenSearch Step $iLenSearch
            If StringMid($sText, $j, $iLenSearch) = $sSearch Then
                $c += 1
            Else
                $bExit = True
                ExitLoop
            EndIf
        Next
        If $sSearch <> " " Then ConsoleWrite($sSearch & "{" & $c & "}" & @LF)
        Local $sNewString = StringMid(StringTrimRight($sText, 1), $j)
        If $bExit Or StringLen($sNewString) = 1 Then ShortenRepeatedCharacters($sNewString)
    EndFunc

    Not fully tested! Edit: added some checks. Br, UEZ

Many thanks for this. It works, but if I add @CRLF in between, it gets confused:

    $sText1 = "[ab][ab][ab]" & @CRLF & "[ab]<cd><cd>"

produces:

    ab][ab][ab] [ab]<cd><cd> [ab]{3} [ab]<cd><cd>{1}

It also adds newlines between the shortened items. Do you have a method like this in mind for the reverse procedure?

UEZ Posted December 3, 2013 (edited)

Updated the code from post #21. Btw, @CRLF inside the string makes no sense, IMHO.

What do you mean by reverse procedure? Building the string up again from the shortened string?

    #include <String.au3>

    $sShorten = "a{5}b{6}c{3}"
    ConsoleWrite(ExpandString($sShorten) & @LF & @LF)
    $sShorten = "[ab]{4}X{1}[cd]{2}"
    ConsoleWrite(ExpandString($sShorten) & @LF & @LF)

    Func ExpandString($sString)
        Local $aRepetitions = StringRegExp($sString, "\{(\d+)\}", "3")
        If @error Then Return SetError(1, 0, 0)
        Local $aWords = StringRegExp($sString, "(?U)(.+)\{\d*\}", "3")
        If @error Then Return SetError(2, 0, 0)
        If UBound($aRepetitions) <> UBound($aWords) Then Return SetError(3, 0, 0)
        Local $i, $sExpanded
        For $i = 0 To UBound($aWords) - 1
            $sExpanded &= _StringRepeat($aWords[$i], $aRepetitions[$i])
        Next
        Return $sExpanded
    EndFunc

Br,
UEZ

Edited December 3, 2013 by UEZ

Gianni Posted December 3, 2013

a further way

    #include <array.au3>

    Local $items[1], $temp, $x, $y
    $sText1 = "[ab][ab][ab]" & @CRLF & "[ab]<cd><cd>aaaaaaaabb[hello][hello]bbbbbbbcccccccccbn hello everybody bye bye"

    For $i = 1 To StringLen($sText1)
        $x = StringMid($sText1, $i, 1)
        If $x = "[" Then
            $temp = ""
            $y = 1
            ContinueLoop
        EndIf
        If $x = "]" Then
            $y = 0
            _ArrayAdd($items, $temp)
            ContinueLoop
        EndIf
        If $y Then
            $temp &= $x
            ContinueLoop
        EndIf
        _ArrayAdd($items, $x)
    Next
    _ArrayDelete($items, 0)
    $uniqueitems = _ArrayUnique($items)
    _ArrayDelete($uniqueitems, 0)
    For $i = 0 To UBound($uniqueitems) - 1
        $temp = _ArrayFindAll($items, $uniqueitems[$i])
        ConsoleWrite($uniqueitems[$i] & "{" & UBound($temp) & "}" & @CRLF)
    Next

b0x4it (Author) Posted December 3, 2013

Gianni wrote: a further way [...]

Thanks for your code, but it does not work. For the $sText1 that you have defined, it produces:

    ab{4} {1} {1} <{2} c{11} d{3} >{2} a{8} b{13} hello{2} n{1} {4} h{1} e{5} l{2} o{2} v{1} r{1} y{4}

It also removes all of the @CRLF. The shortened version should keep each @CRLF so that it can later be expanded back to the original.

b0x4it (Author) Posted December 3, 2013

UEZ wrote:

    Updated the code from post #21. Btw, @CRLF inside the string makes no sense, IMHO.
    What do you mean by reverse procedure? Building the string up again from the shortened string?

    #include <String.au3>

    $sShorten = "a{5}b{6}c{3}"
    ConsoleWrite(ExpandString($sShorten) & @LF & @LF)
    $sShorten = "[ab]{4}X{1}[cd]{2}"
    ConsoleWrite(ExpandString($sShorten) & @LF & @LF)

    Func ExpandString($sString)
        Local $aRepetitions = StringRegExp($sString, "\{(\d+)\}", "3")
        If @error Then Return SetError(1, 0, 0)
        Local $aWords = StringRegExp($sString, "(?U)(.*)\{\d}", "3")
        If @error Then Return SetError(2, 0, 0)
        If UBound($aRepetitions) <> UBound($aWords) Then Return SetError(3, 0, 0)
        Local $i, $sExpanded
        For $i = 0 To UBound($aWords) - 1
            $sExpanded &= _StringRepeat($aWords[$i], $aRepetitions[$i])
        Next
        Return $sExpanded
    EndFunc

    Br, UEZ

Thanks for your reply. Suppose you have a multiline text in the clipboard and you want to shorten it and later expand it without losing its multiline format. I think the code must be able to handle @CRLF newlines as well. Am I right?

Also, could you please change this code so that it can expand more than 9 repeats? Currently it cannot repeat an item more than 9 times.

UEZ Posted December 3, 2013 (edited)

Well, in this case I would split the text at @CRLF, parse each line, and save each result on its own line. When you build the text up again, you just insert a @CRLF after each row.

What do you mean by 9? This should also work for more.

Br,
UEZ

Edited December 3, 2013 by UEZ

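The split-at-@CRLF idea can be sketched on the expanding side as follows. This is a minimal sketch, not code from the thread: `_ExpandLines` is a hypothetical name, and it assumes every item is a single character followed by `{n}` (no bracketed groups).

```autoit
#include <String.au3>

; Hypothetical helper: expands each line separately so line breaks survive.
Func _ExpandLines($sShort)
    ; Flag 1 = treat @CRLF as one delimiter, flag 2 = no count in element 0.
    Local $aLines = StringSplit($sShort, @CRLF, 1 + 2)
    Local $sOut = ""
    For $i = 0 To UBound($aLines) - 1
        If $i > 0 Then $sOut &= @CRLF ; put the line break back between rows
        ; Flat array of pairs: character, count, character, count, ...
        Local $aPairs = StringRegExp($aLines[$i], "(.)\{(\d+)\}", 3)
        If @error Then ContinueLoop ; empty line, nothing to expand
        For $j = 0 To UBound($aPairs) - 2 Step 2
            $sOut &= _StringRepeat($aPairs[$j], Number($aPairs[$j + 1]))
        Next
    Next
    Return $sOut
EndFunc

; Writes "aaabb" and "cccc" on two lines.
ConsoleWrite(_ExpandLines("a{3}b{2}" & @CRLF & "c{4}") & @LF)
```

The shortening side would mirror this: split at @CRLF, shorten each line, and join the results with @CRLF.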
b0x4it (Author) Posted December 3, 2013

UEZ wrote: Well, I would in this case split the text at @CRLF and parse each line and save the result also to each line. That means if you build it up again you just insert a @CRLF automatically after each row. What do you mean with 9? This should work also for more.

True, thanks for your reply. About 9: this

    $sShorten = "a{5}b{6}c{3}<sdgs>{3}[df]{10}"
    ConsoleWrite(ExpandString($sShorten) & @LF & @LF)

produces 0

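A likely reason for the 0, sketched under the assumption that the `ExpandString` version in use still had the word pattern `(?U)(.*)\{\d}` quoted earlier in the thread: that pattern accepts only a single digit between the braces, so `[df]{10}` yields a count but no word, and the two arrays end up with different sizes.

```autoit
Local $sShorten = "a{5}b{6}c{3}<sdgs>{3}[df]{10}"

; Counts: one or more digits between braces, so {10} is included (5 matches).
Local $aCounts = StringRegExp($sShorten, "\{(\d+)\}", 3)

; Earlier word pattern: exactly one digit, so "[df]{10}" is skipped (4 matches).
Local $aWordsOld = StringRegExp($sShorten, "(?U)(.*)\{\d}", 3)

; Allowing several digits brings the word count back in line (5 matches).
Local $aWordsNew = StringRegExp($sShorten, "(?U)(.+)\{\d+\}", 3)

; Prints the three array sizes: 5 / 4 / 5
ConsoleWrite(UBound($aCounts) & " / " & UBound($aWordsOld) & " / " & UBound($aWordsNew) & @LF)
```

With 5 counts and 4 words, the `UBound` comparison in `ExpandString` fails and the function returns 0 with @error set to 3.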
UEZ Posted December 3, 2013

It returns error = 3, which means I have to check the regex. I will post an updated version when the issue is fixed.

Br,
UEZ

b0x4it (Author) Posted December 3, 2013

UEZ wrote: It returns error = 3 -> that means I've to check the regex stuff. I will post an updated version when issue is fixed.

Thanks

UEZ Posted December 3, 2013

Updated post #23 - should work now.

Br,
UEZ

Gianni Posted December 3, 2013 (edited)

Hi b0x4it,

I have not fully understood the rules. Are only the symbols "[" and "]" used to delimit groups, or also others such as "<" and ">", or many more?

How can you recreate the original string? Once identical letters are clustered together, you lose their original position within the string, especially if they were not side by side. For example, if you have aaabbbaaa, you obtain a(6) b(3), but you no longer know where they were located in the original string.

Edited December 3, 2013 by PincoPanco

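The position problem goes away if only consecutive repeats are counted, so that the output keeps the order of the runs. A minimal sketch for single characters (the name `_ShortenRuns` is made up for illustration, and bracketed groups are not handled):

```autoit
; Hypothetical helper: run-length encodes consecutive characters only.
Func _ShortenRuns($sText)
    Local $sOut = "", $iLen = StringLen($sText), $i = 1
    Local $sChar, $iCount
    While $i <= $iLen
        $sChar = StringMid($sText, $i, 1)
        $iCount = 1
        ; == compares case-sensitively, so "a" and "A" stay separate runs.
        While $i + $iCount <= $iLen And StringMid($sText, $i + $iCount, 1) == $sChar
            $iCount += 1
        WEnd
        $sOut &= $sChar & "{" & $iCount & "}"
        $i += $iCount ; jump to the first character of the next run
    WEnd
    Return $sOut
EndFunc

ConsoleWrite(_ShortenRuns("aaabbbaaa") & @LF) ; a{3}b{3}a{3}
```

Because the second run of "a" is written as its own item, expanding the result left to right gives back aaabbbaaa.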
b0x4it (Author) Posted December 3, 2013

UEZ wrote: Updated post#23 - should work now.

Thanks for your update. I found another issue:

    $sShorten = "[ab]{4}X{1}[cd]{2}<3>{10} a{10}"

produces:

    [ab][ab][ab][ab]X[cd][cd]<3><3><3><3><3><3><3><3><3><3> a a a a a a a a a a

It should produce this:

    [ab][ab][ab][ab]X[cd][cd]<3><3><3><3><3><3><3><3><3><3> aaaaaaaaaa

b0x4it (Author) Posted December 3, 2013

Gianni wrote: Hi b0x4it the rules are not well understood by me .... are used, only the symbols "[" and "]" to delimit groups or even others, such as "<" and ">" or many others? how you can recreate the original string? after the letters clustered together, you lose their original position within the string, especially if they are not side by side Within the string, for example if you have aaabbbaaa, is obtained a(6) b(3) but do not know how they were located in the original string

Thanks for your nice question. What I am trying to achieve here is to shorten a text file that can include any character as well as newlines. The idea I came up with is to find repeated characters or groups of characters and replace each run with one instance of the repeated item followed by something like {number}. For example, if we have:

    aaabbbbccc [as][as][as][as] [3456][3456][3456][3456][3456]

we can save space and still be able to reproduce the original text by changing it to:

    a{3}b{4}c{3} [as]{4} [3456]{5}

Think of it this way: a text file of size 200kb may be decreased to 200b after the shortening, but the important point is that this must be reversible.

The small issue is that if the original text already contains text in the {number} format, the reversing procedure also repeats whatever comes before it, which it shouldn't. As a solution, I propose using another format for shortening, to lower the chance of having exactly the same text in the original. For example, we could use _{number}_ or ~{number}~ or ::{number}:: or //{number}.

Several excellent pieces of code have been shared in this topic, but none of them covers all characters and all cases. Please let me know if any part of this is still unclear.

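One way to act on the alternative-format idea is to check the original text before shortening and pick an opening marker that does not occur in it. The sketch below assumes the four formats proposed above as candidates; `_PickMarker` is a hypothetical name, and the shortening and expanding code would still have to be adapted to whichever marker is returned.

```autoit
; Hypothetical helper: returns the first opening marker that is absent from
; the text, or sets @error = 1 if all candidates already appear in it.
Func _PickMarker($sText)
    Local $aCandidates[4] = ["~{", "::{", "//{", "_{"]
    For $i = 0 To UBound($aCandidates) - 1
        ; StringInStr returns 0 when the substring is not found.
        If Not StringInStr($sText, $aCandidates[$i]) Then Return $aCandidates[$i]
    Next
    Return SetError(1, 0, "")
EndFunc

ConsoleWrite(_PickMarker("plain text with {3} in it") & @LF) ; ~{
ConsoleWrite(_PickMarker("text with ~{3}~ in it") & @LF)     ; ::{
```

The chosen marker would have to be stored with the shortened text (for example on its first line) so that the expanding side knows which format to look for.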
jdelaney Posted December 4, 2013 (edited)

This will build it back out:

    #include <array.au3>

    $sShorten = "[ab]{4}X{1}[cd]{2}<3>{10} a{10}"
    $a = StringRegExp($sShorten,"[\[{<][^\]}>]+[\]}>]|[^\[{<]",3)
    Local $string,$iRepeat
    For $i = 0 to UBound($a)-1
        If StringRegExp($a[$i],"{.*",0) Then
            $iRepeat = StringRegExpReplace($a[$i],"({)(\d+)(})","\2") - 1
        Else
            $iRepeat = 1
            $sub = $a[$i]
        EndIf
        For $j = 1 To $iRepeat
            $string &= $sub
        Next
    Next
    ConsoleWrite($string & @CRLF)

Edited December 4, 2013 by jdelaney

jdelaney Posted December 4, 2013 (edited)

This will do it all (not sure why you don't want to count the spaces, but I removed that counting anyway):

    #include <array.au3>

    $sShorten = "[ab]{4}X{1}[cd]{2}<3>{10} a{10}"
    ;~ $sShorten = "a{3}b{4}c{3} [as]{4} [3456]{5}"
    ConsoleWrite("Start:" & @TAB & $sShorten & @CRLF)
    $sBuilt = Build($sShorten)
    ConsoleWrite("Build:" & @TAB & $sBuilt & @CRLF)
    $sNewShorten = Destruct($sBuilt)
    ConsoleWrite("Short:" & @TAB & $sNewShorten & @CRLF)

    Func Build($sShorten)
        Local $a = StringRegExp($sShorten,"[\[{<][^\]}>]+[\]}>]|[^\[{<]",3)
        Local $string,$iRepeat
        For $i = 0 to UBound($a)-1
            If StringRegExp($a[$i],"{.*",0) Then
                $iRepeat = StringRegExpReplace($a[$i],"({)(\d+)(})","\2") - 1
            Else
                $iRepeat = 1
                $sub = $a[$i]
            EndIf
            For $j = 1 To $iRepeat
                $string &= $sub
            Next
        Next
        Return $string
    EndFunc

    Func Destruct($sLengthen)
        Local $a = StringRegExp($sLengthen,"[\[<][^\]>]+[\]}>]|[^\[<]",3)
        Local $last, $iCount=0, $string
        For $i = 0 To UBound($a) - 1
            If $a[$i] = $last Then
                $iCount+=1
            Else
                If $iCount>0 Then
                    If Not StringRegExp($last,"\s", 0) Then
                        $string&="{" & $iCount & "}"
                    EndIf
                EndIf
                $string&=$a[$i]
                $last=$a[$i]
                $iCount=1
            EndIf
        Next
        If $iCount > 1 And Not StringRegExp($last,"\s", 0) Then $string&="{" & $iCount & "}"
        Return $string
    EndFunc

Returns:

    Start:  [ab]{4}X{1}[cd]{2}<3>{10} a{10}
    Build:  [ab][ab][ab][ab]X[cd][cd]<3><3><3><3><3><3><3><3><3><3> aaaaaaaaaa
    Short:  [ab]{4}X{1}[cd]{2}<3>{10} a{10}

I think this route is a lot more straightforward, since there are only 2 regexps in total (to break out the strings).

Edited December 4, 2013 by jdelaney

b0x4it (Author) Posted December 4, 2013

jdelaney wrote: This will build it back out: [...]

Excellent, I will do some testing and get back to you if there is any issue. Thank you very much!

UEZ Posted December 4, 2013

b0x4it wrote: Thanks for your update. I found another issue: $sShorten = "[ab]{4}X{1}[cd]{2}<3>{10} a{10}" produces: [ab][ab][ab][ab]X[cd][cd]<3><3><3><3><3><3><3><3><3><3> a a a a a a a a a a it should produce this: [ab][ab][ab][ab]X[cd][cd]<3><3><3><3><3><3><3><3><3><3> aaaaaaaaaa

You have a space in between! -> {10} a{10} should be {10}a{10}

    [ab]{4}X{1}[cd]{2}<3>{10}a{10}

works.

Br,
UEZ

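If the space has to stay in the shortened text, one option is to tokenize explicitly instead of capturing "everything up to the next brace", so that a lone space becomes its own item with an implied count of 1. The following is only a sketch under assumptions: `_ExpandTokens` is a hypothetical name, items are taken to be a `[...]` group, a `<...>` group, or a single character, and an item without a `{n}` suffix is taken to occur once.

```autoit
#include <String.au3>

; Hypothetical helper, not the code from post #23.
Func _ExpandTokens($sShort)
    ; Group 1: a [..] group, a <..> group, or any single character.
    ; Group 2: an optional {n} suffix; it always takes part in the match and
    ;          is an empty string when the suffix is missing.
    ; (?s) lets "." match line breaks too, so they are not dropped.
    Local $aTok = StringRegExp($sShort, "(?s)(\[[^\]]*\]|<[^>]*>|.)((?:\{\d+\})?)", 3)
    If @error Then Return SetError(1, 0, "")
    Local $sOut = "", $iCount
    For $i = 0 To UBound($aTok) - 2 Step 2
        $iCount = 1 ; no suffix: the item appears once
        ; Strip the braces and keep only the digits of the suffix.
        If $aTok[$i + 1] <> "" Then $iCount = Number(StringRegExpReplace($aTok[$i + 1], "\D", ""))
        $sOut &= _StringRepeat($aTok[$i], $iCount)
    Next
    Return $sOut
EndFunc

ConsoleWrite(_ExpandTokens("[ab]{4}X{1}[cd]{2}<3>{10} a{10}") & @LF)
```

With this tokenization the space before `a{10}` is copied once rather than repeated with the `a`, which is the result asked for earlier in the thread. Text that itself contains stray braces would still need an escaping scheme.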