Basement

Duplicating trees of an Array and rebuilding new array

18 posts in this topic

Hi there,

The following "problem" is driving me crazy... maybe somebody can put a stop to it ;-)

I've got an Array which contains a parts list and looks like this

Level   | Entry | Quantity
1       | a     | 2
   2    | b     | 1
   2    | c     | 1
      3 | d     | 2
1       | e     | 1
   2    | f     | 2

and so on...

The "Level" column contains the level of the component in the parts list tree, "Entry" is the name of the component and "Quantity" is ... yes! the quantity.

So for instance "a" is the main component, "b" and "c" are subcomponents on the second level and "d" is a subcomponent of "c".

So far..so good.

the "Problem" which needs to be solved:

I want to create a new array, based on the array above, which contains duplicates of the individual components according to the given quantity while preserving the tree structure of the parts list:

So the example parts list should look like this after a successful operation:

Level   | Entry | Quantity
1       | a     | 2
   2    | b     | 1
   2    | c     | 1
      3 | d     | 2
      3 | d     | 2
1       | a     | 2
   2    | b     | 1
   2    | c     | 1
      3 | d     | 2
      3 | d     | 2
1       | e     | 1
   2    | f     | 2
   2    | f     | 2

I already have a solution using a function named "_InsertItem", which inserts an item at any position in a 2-D array and then rebuilds the whole array by pushing the contents below the inserted item down.

The problem: the "real" array is much bigger than the example array above (about 8000 rows and 20 columns).

With every duplication of a tree or subtree the array grows, so the insert operations become slower and slower, because rebuilding the array after each insertion takes more and more time.

So the question is: Is there an easier way to build the final array (containing the duplicates)?

My idea was to first define only the structure of the final array in some way (which I don't know yet) and, at the very end of the script, build the final array by copying item after item from the original array following that structure. I think adding an element at the end of an array is much less time-consuming than inserting an element between two others and rebuilding the whole array again and again.

Much text, sorry....

Every help / idea is appreciated

Thanx

Daniel

#2 ·  Posted (edited)

Basement,

Welcome to the AutoIt forum - and thanks for a fun little project. :D

This script counts the number of parts in the first array and then creates an array of the correct size which you then fill:

#include <Array.au3>

Global $aParts[6][3] = [[1, "a", 2 ], _
                [2, "b", 1], _
                [2, "c", 1], _
                [3, "d", 2], _
                [1, "e", 1], _
                [2, "f", 2]]

; Determine total number of parts
$iMax = 0
For $i = 0 To UBound($aParts) - 1
    $iMax += $aParts[$i][2]
Next

; Create new array of the correct size
Global $aNewParts[$iMax][3]

; Now loop through the arrays extracting from one and filling the other
$j = 0                            ; Index row for filling array
For $i = 0 To UBound($aParts) - 1 ; Index row for extracting array
    ; Add as many lines as there are parts
    For $k = 1 To $aParts[$i][2]
        For $n = 0 To 2
            $aNewParts[$j][$n] = $aParts[$i][$n]
        Next
        $j += 1
    Next
Next

_ArrayDisplay($aNewParts, "", Default, 8)
That should be very much faster as it does not use ReDim which is an extremely slow function when used on large arrays. ;)

M23

Edit:

I have just realised that the result is not quite what you want. Thinking cap back on! :(

Edited by Melba23

Any of my own code posted anywhere on the forum is available for use by others without any restriction of any kind._______My UDFs:

Spoiler

ArrayMultiColSort ---- Sort arrays on multiple columns
ChooseFileFolder ---- Single and multiple selections from specified path treeview listing
Date_Time_Convert -- Easily convert date/time formats, including the language used
ExtMsgBox --------- A highly customisable replacement for MsgBox
GUIExtender -------- Extend and retract multiple sections within a GUI
GUIFrame ---------- Subdivide GUIs into many adjustable frames
GUIListViewEx ------- Insert, delete, move, drag, sort, edit and colour ListView items
GUITreeViewEx ------ Check/clear parent and child checkboxes in a TreeView
Marquee ----------- Scrolling tickertape GUIs
NoFocusLines ------- Remove the dotted focus lines from buttons, sliders, radios and checkboxes
Notify ------------- Small notifications on the edge of the display
Scrollbars ----------Automatically sized scrollbars with a single command
StringSize ---------- Automatically size controls to fit text
Toast -------------- Small GUIs which pop out of the notification area

 


Hi,

Thanks for the fast answer  :bye:

I tested your script, but it generates:

1|a|2
1|a|2
2|b|1
2|c|1
3|d|2
3|d|2
1|e|1
2|f|2
2|f|2
 
 
So the original tree structure is not preserved. Your script duplicates each line on its own, but not the whole tree below the line.

The correct result would be:

1       | a     | 2
   2    | b     | 1
   2    | c     | 1
      3 | d     | 2
      3 | d     | 2
1       | a     | 2
   2    | b     | 1
   2    | c     | 1
      3 | d     | 2
      3 | d     | 2
1       | e     | 1
   2    | f     | 2
   2    | f     | 2

 

But I'm not able to program such a recursive algorithm, which I think is needed here.

Can you help me again ...  :pirate:

 

Best regards

 

Daniel


Basement,

Our posts crossed - see my edit above. ;)

Thinking cap firmly in place. ;)

M23



Basement,

What is the purpose of this array expansion? If we have an idea of why you need to do this, it might help us come up with an algorithm which gets to the end product more quickly. ;)

M23



Hi,

the Array contains a part list of a machine.

the part list will be printed out and then every component is manually updated (signed) "by human hand" with a special number.

So if the very first component of a tree has the quantity 3, this means that the whole component subtree has to be duplicated three times.

Then within the tree all subtrees have also to be duplicated, if quantity is more than one, and so on.
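
The duplication rule described here (repeat a row together with everything below it as many times as its quantity says) could be sketched as a small recursive routine. This is only a sketch under assumptions: `_ExpandTree` and `__ExpandNode` are made-up names, the level is expected in column 0 and the quantity in column 2 as in the example, and levels start at 1 and never jump down by more than one step. The output size is counted first, so the result array is created once and never resized.

```autoit
Global $aParts[6][3] = [[1, "a", 2], [2, "b", 1], [2, "c", 1], _
        [3, "d", 2], [1, "e", 1], [2, "f", 2]]

Global $aExpanded = _ExpandTree($aParts)
For $i = 0 To UBound($aExpanded) - 1
    ConsoleWrite($aExpanded[$i][0] & "|" & $aExpanded[$i][1] & "|" & $aExpanded[$i][2] & @CRLF)
Next

; Returns a new array in which every subtree is repeated "quantity" times
Func _ExpandTree(ByRef $aIn)
    Local $iRows = UBound($aIn), $iCols = UBound($aIn, 2)

    ; Pass 1: count the output rows. A row is written once for every copy
    ; of its ancestors, so its count is the product of the quantities on
    ; the path from the top level down to the row itself.
    Local $aMult[$iRows + 1], $iTotal = 0, $iLevel
    $aMult[0] = 1
    For $i = 0 To $iRows - 1
        $iLevel = $aIn[$i][0]
        $aMult[$iLevel] = $aMult[$iLevel - 1] * $aIn[$i][2]
        $iTotal += $aMult[$iLevel]
    Next

    ; Pass 2: fill the pre-sized array, one top-level tree after the other
    Local $aOut[$iTotal][$iCols], $iOut = 0, $iNext = 0
    While $iNext < $iRows
        $iNext = __ExpandNode($aIn, $iNext, $aOut, $iOut)
    WEnd
    Return $aOut
EndFunc

; Writes row $iRow plus its expanded children once per unit of quantity.
; Returns the index of the first input row behind the subtree.
Func __ExpandNode(ByRef $aIn, $iRow, ByRef $aOut, ByRef $iOut)
    Local $iLevel = $aIn[$iRow][0], $iNext

    ; The subtree ends at the next row that is not deeper than this one
    Local $iEnd = $iRow + 1
    While $iEnd < UBound($aIn) And $aIn[$iEnd][0] > $iLevel
        $iEnd += 1
    WEnd

    For $iCopy = 1 To $aIn[$iRow][2]
        ; Copy the row itself
        For $iCol = 0 To UBound($aIn, 2) - 1
            $aOut[$iOut][$iCol] = $aIn[$iRow][$iCol]
        Next
        $iOut += 1
        ; Then expand each direct child in turn
        $iNext = $iRow + 1
        While $iNext < $iEnd
            $iNext = __ExpandNode($aIn, $iNext, $aOut, $iOut)
        WEnd
    Next
    Return $iEnd
EndFunc
```

For the six-row example this writes 13 rows in the order a, b, c, d, d, a, b, c, d, d, e, f, f. The recursion depth equals the depth of the tree, not the number of rows, so it should stay far away from AutoIt's recursion limit for a normal parts list.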

 

The problem: if the parts list is printed out as it is (without duplicating the components/trees), it is not possible to sign every duplicate of a component.

I hope you understand what I mean (my English is not the best on the planet ;-)

Best regards

Daniel


Basement,

You say the actual array is some 8000 lines long - so expanded it will be in the many tens of thousands of lines. And you really want to print it out and get your staff manually searching and signing for components in this enormous list? What do your quality control people have to say about that? :wacko:

M23



Hi,

No, I explained that unclearly:

Not all of the lines have to be filled out. There is another filter (in a later step, directly in Excel) which reduces the list to about 200 - 300 lines.

But the basic list must still duplicate all trees, because there are other filters in Excel which select other lines.

Best regards

Daniel


Basement,

That sounds more reasonable. ;)

Still thinking. :sweating:

M23



That sounds like a job for a lite database.

Just saying.


This wonderful site allows debugging and testing regular expressions (many flavors available). An absolute must have in your bookmarks.
Another excellent RegExp tutorial. Don't forget downloading your copy of up-to-date pcretest.exe and pcregrep.exe here
RegExp tutorial: enough to get started
PCRE v8.33 regexp documentation latest available release and currently implemented in AutoIt beta.

SQLitespeed is another feature-rich premier SQLite manager (includes import/export). Well worth a try.
SQLite Expert (freeware Personal Edition or payware Pro version) is a very useful SQLite database manager.
An excellent eBook covering almost every aspect of SQLite3: a must-read for anyone doing serious work.
SQL tutorial (covers "generic" SQL, but most of it applies to SQLite as well)
A work-in-progress SQLite3 tutorial. Don't miss other LxyzTHW pages!
SQLite official website with full documentation (may be newer than the SQLite library that comes standard with AutoIt)


#11 ·  Posted (edited)

Hi,

A solution using only AutoIt would be better... any further suggestions?

(Or do you have a concrete example solution using SQLite? Unfortunately I'm not familiar with this technique...)

What I've done in the meantime:

As the ReDim function is very slow (thanks Melba), I googled and found a few functions which check whether the array must be resized at all and, if so, ReDim it not by just one item but to 1.5 times its current size.
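
The linked functions are not reproduced in this thread, so the following is only a rough sketch of the idea and not the code that was found: `_AppendRow` is a hypothetical helper that keeps a separate "rows used" counter and calls ReDim only when the array is full, growing it by a factor of about 1.5 each time.

```autoit
Global $aList[4][3], $iUsed = 0
Global $aRow[3] = [1, "a", 2]

For $i = 1 To 10
    _AppendRow($aList, $iUsed, $aRow)
Next
ReDim $aList[$iUsed][3] ; trim the unused rows once, at the very end
ConsoleWrite("Rows: " & UBound($aList) & @CRLF)

; Hypothetical helper: appends $aRow to $aData and grows $aData only when full
Func _AppendRow(ByRef $aData, ByRef $iUsed, ByRef $aRow)
    If $iUsed >= UBound($aData) Then
        ; Grow to roughly 1.5 times the current size (and by at least one row)
        ReDim $aData[Ceiling(UBound($aData) * 1.5) + 1][UBound($aData, 2)]
    EndIf
    For $iCol = 0 To UBound($aRow) - 1
        $aData[$iUsed][$iCol] = $aRow[$iCol]
    Next
    $iUsed += 1
EndFunc
```

This reduces the number of ReDim calls for appends, but it does nothing about the cost of shifting rows down when an item is inserted in the middle of the array.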


The script is now a LITTLE BIT faster, but still not fast enough for anybody to work with.

Any further ideas?

best regards

Daniel

Edited by Basement


#12 ·  Posted (edited)

Here's my go at it. I have a script that formats the new array properly, and I've included _ArrayDisplay() calls so you can see the starting array and the ending array and easily verify that this does what you wanted. This script could be faster if I knew more about the data you are working with. First I'll show you the code:

#Region ;**** Directives created by AutoIt3Wrapper_GUI ****
#AutoIt3Wrapper_Run_Au3Stripper=y
#Au3Stripper_Parameters=/RM /SF = 1 /SV = 1 /PE
#EndRegion ;**** Directives created by AutoIt3Wrapper_GUI ****
Opt("MustDeclareVars",1)
#include <array.au3>
Local Const $Array[6][3] = [[1,"a",2],[2,"b",1],[2,"c",1],[3,"d",2],[1,"e",1],[2,"f",2]]
_ArrayDisplay($Array,"$Array")
Local $NewArray = _ExpandDuplicates($Array)
_ArrayDisplay($NewArray,"$NewArray")
Func _ExpandDuplicates(ByRef Const $Array)
    Local Const $LevelsString = "ghi", $Zeros = "0000000000000000"
    Local $Rows = UBound($Array) - 1, $String = "", $Binary = "", $PreviousLevel, $C2End, $CountElements[4]
    #Region Convert this array to a string
    For $C = 0 to $Rows
        $CountElements[$Array[$C][0]] += 1;Count how many elements are in each level
        If $Array[$C][0] <= $PreviousLevel Then
            $C2End = $Array[$C][0]
            For $C2 = $PreviousLevel to $C2End Step - 1
                $String &= StringMid($LevelsString,$C2,1)
            Next
        EndIf
        $Binary = StringFormat("%x",Hex($Array[$C][2],4))
        $String &= StringMid($LevelsString,$Array[$C][0],1) & StringLeft($Zeros,4 - StringLen($Binary)) & $Binary
        $Binary = StringFormat("%x",StringToBinary($Array[$C][1]))
        $String &= StringLeft($Zeros,16 - StringLen($Binary)) & $Binary
        $PreviousLevel = $Array[$C][0]
    Next
    ;Close any levels
    For $C = $PreviousLevel to 1 Step -1
        $String &= StringMid($LevelsString,$C,1)
    Next
    #EndRegion Convert this array to a string
    ;ConsoleWrite($String & @CR)
    #Region Get the maximum number of elements in a level
    Local $Max = 0
    For $C = 1 to 3
        If $CountElements[$C] > $Max Then $Max = $CountElements[$C];Get the maximum value
    Next
    #EndRegion Get the maximum number of elements in a level
    #Region Expand the string
    Local $Match, $Offset, $ReplacementString, $Duplicate, $C2Max, $Replacements[$Max + 1], $Replacements2[$Max], $C3
    For $C = 3 to 1 step -1
        $C3 = -1
        $Offset = 1
        While 1
            $Match = StringRegExp($String,StringMid($LevelsString,$C,1) & "(.*?)" & StringMid($LevelsString,$C,1),1,$Offset)
            $Offset = @Extended
            If @Error Then
                ExitLoop
            EndIf
            $Duplicate = Int("0x" & StringLeft($Match[0],4))
            If $Duplicate > 1 Then
                $Match[0] = StringMid($LevelsString,$C,1) & $Match[0] & StringMid($LevelsString,$C,1)
                $ReplacementString = $Match[0]
                $C2Max = Floor(log($Duplicate)/log(2))
                For $C2 = 1 to $C2Max
                    $ReplacementString &= $ReplacementString
                Next
                ;ConsoleWrite("$Duplicate = " & $Duplicate & @CR & "$C2Max = " & $C2Max & @CR & "$C2Max = " & $C2Max & @CR)
                $C2Max = $Duplicate - 2 ^ $C2Max
                For $C2 = 1 to $C2Max
                    $ReplacementString &= $Match[0]
                Next
                ;ConsoleWrite($ReplacementString & @CR)
                $C3 += 1
                $Replacements[$C3] = $Match[0]
                $Replacements2[$C3] = $ReplacementString
            EndIf
            ;_ArrayDisplay($Match,"$Match: " & StringMid($LevelsString,$C,1))
        WEnd
        ;_ArrayDisplay($Replacements,"$Replacements, $C2 = " & $C2)
        For $C2 = 0 to $C3
            $String = StringReplace($String,$Replacements[$C2],$Replacements2[$C2])
            ;ConsoleWrite($String & @CR)
        Next
    Next
    ;ConsoleWrite($String & @CR)
    #EndRegion Expand the string
    #Region Convert the string back into an array
    ;Count how many elements are in each level to figure out how many elements we need total
    Local $CMax = 0
    For $C = 1 to 3
        StringReplace($String,StringMid($LevelsString,$C,1),"")
        Local $Replacements = @Extended
        If $Replacements Then $CMax += $Replacements
    Next
    $CMax = $CMax/2
    Local $NewArray[$CMax][3]
    ;ConsoleWrite("$Elements = " & $CMax & @CR)
    $CMax -= 1
    $C2 = 1
    Local $Skip = 0
    For $C = 0 to $CMax
        $C3 = $C2
        While 1;Skip redundant level markers
            $C3 += 1
            If StringInStr($LevelsString,StringMid($String,$C3,1)) Then
                $C2 = $C3
            Else
                ExitLoop
            EndIf
        WEnd
        $NewArray[$C][0] = StringInStr($LevelsString,StringMid($String,$C2,1))
        ;ConsoleWrite(StringMid($String,$C2,1) & @TAB)
        $C2 += 1
        $NewArray[$C][2] = Int("0x" & StringMid($String,$C2,4))
        ;ConsoleWrite(StringMid($String,$C2,4) & @TAB)
        $C2 += 4
        $NewArray[$C][1] = BinaryToString("0x" & StringReplace(StringMid($String,$C2,16),"00",""))
        ;ConsoleWrite(StringMid($String,$C2,16) & @CR)
        $C2 += 16
    Next
    Return $NewArray
    #EndRegion Convert the string back into an array
EndFunc
;This can support up to 46 levels. They will be represented by the letters g - Z
;Each entry will be represented by 16 hexadecimal characters. This means it can be up to 8 characters long.
;Each quantity will be represented by 4 hexadecimal characters. This means it can represent a number from 0 to 65535
#cs Reserved characters
0
1
2
3
4
5
6
7
8
9
a
b
c
d
e
f
Characters for levels
g - 1
h - 2
i - 3
j - 4
k - 5
l - 6
m - 7
n - 8
o - 9
p - 10
q - 11
r - 12
s - 13
t - 14
u - 15
v - 16
w - 17
x - 18
y - 19
z - 20
A - 21
B - 22
C - 23
D - 24
E - 25
F - 26
G - 27
H - 28
I - 29
J - 30
K - 31
L - 32
M - 33
N - 34
O - 35
P - 36
Q - 37
R - 38
S - 39
T - 40
U - 41
V - 42
W - 43
X - 44
Y - 45
Z - 46
#ce Reserved Characters 

1) How many levels will the data you're working with use? This script can work for up to 46 levels, but that's probably overkill. If you need < 21 levels, then I can use all uppercase letters, and get rid of StringFormat(). I can also use a basic comparison for StringInStr and StringReplace.

2) How many characters long are the component names, at maximum? This script can accept input up to 8 characters, but this may be too short. I can increase the length, but this will make the final string the program uses longer, so it's a good idea to get close to the real maximum while keeping a safe allowable length.

3) How many of any given item do you expect to ever have? The maximum value that can be in column 3 is 65,535, which is probably overkill. If we lower this number, the resulting string the program uses will be shorter.

4) This is a very small array to test this with. I'm sure you can't show me the actual data you're working with, but can you describe the data more? You mentioned it's 20 columns and 8,000 rows. Can you describe what values are in each column, or perhaps give a list of components so I can generate a bunch of random data to try this script on? I'd like to make sure this script can work for the real data you're using.

This script still needs some work, but it's easy to improve. Let me know what you think.

Edited by Oscis


Basement, you should take a look at the DllStructCreate command. With this command you can create a large structure to build the array. Using a structure means that you can copy a row with 20 columns, or even an entire block with 50 rows and 20 columns, with just two commands: a copy and a paste command. You copy a memory block in the structure from one position to another. This will be much, much faster than using arrays, where you can only copy a single cell at a time. When you have built the data in the structure, you can copy the data to an array.

I have made a little test with this code, where data is copied from a structure to an array. It runs in about 15 seconds.

Global $tData = DllStructCreate( "byte;byte;char[32];byte;byte" )
Global $aArray[100000][20]
For $i = 0 To 99999
  For $j = 0 To 19
    $aArray[$i][$j] = DllStructGetData( $tData, 3 )
  Next
Next

Since the start array is "only" 8000 rows and 20 columns, it should be possible to insert data into the structure within a foreseeable period.

I'm pretty sure it's the absolute fastest way to create the array.
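
The block copy described in this post might look roughly like the sketch below. Everything in it is an assumption made for illustration: the fixed-width row layout in `$sTagRow`, the helper names `_SetRow` and `_GetRowText`, and the use of the Windows function RtlMoveMemory through DllCall to move the bytes. It only works if every row has the same size in bytes, so names longer than the `char` field would have to be cut or the field enlarged.

```autoit
; Assumed row layout: level, quantity and a fixed-width name (34 bytes per row)
Global $sTagRow = "byte Level;byte Qty;char Name[32]"
Global $iRowSize = DllStructGetSize(DllStructCreate($sTagRow))
Global $iMaxRows = 8

; One flat byte buffer that holds all rows back to back
Global $tBuf = DllStructCreate("byte[" & $iRowSize * $iMaxRows & "]")
Global $pBuf = DllStructGetPtr($tBuf)

; Fill rows 0 and 1
_SetRow(0, 2, 1, "c")
_SetRow(1, 3, 2, "d")

; Copy the two-row block (rows 0-1) to rows 2-3 with a single memory move.
; Parameter order of RtlMoveMemory: destination, source, length in bytes.
DllCall("kernel32.dll", "none", "RtlMoveMemory", "ptr", $pBuf + 2 * $iRowSize, "ptr", $pBuf, "ulong_ptr", 2 * $iRowSize)

For $i = 0 To 3
    ConsoleWrite(_GetRowText($i) & @CRLF)
Next

; Writes one row by overlaying a row-shaped structure on the buffer
Func _SetRow($iRow, $iLevel, $iQty, $sName)
    Local $tRow = DllStructCreate($sTagRow, $pBuf + $iRow * $iRowSize)
    DllStructSetData($tRow, "Level", $iLevel)
    DllStructSetData($tRow, "Qty", $iQty)
    DllStructSetData($tRow, "Name", $sName)
EndFunc

; Reads one row back as "level|name|quantity"
Func _GetRowText($iRow)
    Local $tRow = DllStructCreate($sTagRow, $pBuf + $iRow * $iRowSize)
    Return DllStructGetData($tRow, "Level") & "|" & DllStructGetData($tRow, "Name") & "|" & DllStructGetData($tRow, "Qty")
EndFunc
```

Copying a whole subtree is then one call whose length is the number of rows in the subtree times `$iRowSize`. Whether this beats a plain pre-sized array for 20 columns of mixed data would have to be measured.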


#14 ·  Posted (edited)

I don't know whether this works properly for your needs or how it performs:

#include <Array.au3>

Global $aParts[6][3] = [[1, "a", 2], _
                        [2, "b", 1], _
                        [2, "c", 1], _
                        [3, "d", 2], _
                        [1, "e", 1], _
                        [2, "f", 2]]

Global $iH, $iW, $i, $sString, $sLine, $sDelim = ";"
For $iH = 0 To UBound($aParts) - 1
    For $iW = 0 To UBound($aParts, 2) - 1
        $sLine &= $aParts[$iH][$iW] & $sDelim
    Next
    $sLine = StringTrimRight($sLine, 1)
    If StringLeft($sLine, 1) > 1 And StringRight($sLine, 1) > 1 Then
        For $i = 0 To StringRight($sLine, 1) - 1
            $sString &= $sLine & @CRLF
        Next
    Else
        $sString &= $sLine & @CRLF
    EndIf
    $sLine = ""
Next
Global $sResult, $j
$a1stLevels = StringRegExp($sString, "\b(1.+)", 3)

For $i = 0 To UBound($a1stLevels) - 2
    If StringRight($a1stLevels[$i], 1) > 1 Then
        For $j = 0 To StringRight($a1stLevels[$i], 1) - 1
            $sResult &= StringRegExpReplace($sString, "(?s)\b(" & $a1stLevels[$i] & ".+)\b" & $a1stLevels[$i + 1] & ".*", "$1")
        Next
    Else
        $sResult &= StringRegExpReplace($sString, "(?s)\b(" & $a1stLevels[$i] & ".+)\b" & $a1stLevels[$i + 1] & ".*", "$1")
    EndIf
Next

If StringRight($a1stLevels[$i], 1) > 1 Then
    For $j = 0 To StringRight($a1stLevels[$i], 1) - 1
        $sResult &= StringRegExpReplace($sString, "(?s).*\b(" & $a1stLevels[$i] & ".*)", "$1")
    Next
Else
    $sResult &= StringRegExpReplace($sString, "(?s).*\b(" & $a1stLevels[$i] & ".*)", "$1")
EndIf

Global $aResult = StringSplitW($sResult)
_ArrayDisplay($aResult)



; #FUNCTION# ========================================================================================================================================
; Name .................:   StringSplitW()
; Description ..........:   Splits  a string into columns instead of rows as it is done by SplitString(), like a csv file to a 2d array ;-)
; Syntax ...............:   StringSplitW($sString, $sDelimiter, $iWidthLen)
; Parameters ...........:   $sString - string to split
;                           $sDelimiter - [optional] the delimter how to split the string
;                           $iWidthLen - [optional] length of the row (amount of columns - default is 256)
; Return values .......:    Success - 2d array
;                           Error 1 - either $sString or $delimter is not set
;                           Error 2 - array width exceeded
;                           Error 3 - error splitting string
;
; Version .............:    v0.96 build 2015-01-20 beta
; Author ..............:    UEZ
; Modified ............:
; Remarks .............:    RegEx take from http://stackoverflow.com/questions/4476812/regular-expressions-how-to-replace-a-character-within-quotes
; Related .............:    StringSplit, StringReplace, StringRegExpReplace, StringLen, StringStripCR
; ===================================================================================================================================================
Func StringSplitW($sString, $sDelimiter = ";", $sQuotationMark = '"', $sDummy = "¦", $iWidthLen = 256)
    If $sString = "" Or $sDelimiter = "" Then Return SetError(1, 0, 0)
    Local $chk, $iWidth, $i, $j, $k, $iLen, $iMax = 1, $iMaxWidth
    Local $aPos[1], $l = 0
    Local $aSplit =  StringSplit(StringStripCR($sString), @LF)
    If @error Then Return SetError(3, 0, 0)
    Local $aVertical[$aSplit[0]][$iWidthLen], $iDelimiterLen = StringLen($sDelimiter) - 1, $sLine
    For $k = 1 To $aSplit[0]
        $iLen = StringLen($aSplit[$k])
        If $iLen > 1 Then
            $sLine = StringRegExpReplace($aSplit[$k], '(?m)\' & $sDelimiter & '(?=[^' & $sQuotationMark & ']*' & $sQuotationMark & '(?:[^' & $sQuotationMark & '\r\n]*' & $sQuotationMark & '[^' & $sQuotationMark & ']*' & $sQuotationMark & ')*[^' & $sQuotationMark & '\r\n]*$)', $sDummy)
            $chk = StringReplace($sLine, $sDelimiter, $sDelimiter)
            $iWidth = @extended
            If $iWidth > $iWidthLen Then Return SetError(2, 0, 0)
            If $iWidth >= $iMax Then $iMax = $iWidth + 1
            Switch $iWidth
                Case 0
                    $aVertical[$l][0] = $sLine
                Case Else
                    Dim $aPos[$iWidth * 2 + 2]
                    $j = 1
                    $aPos[0] = 1
                    For $i = 0 To $iWidth - 1
                        $aPos[$j] = StringInStr($sLine, $sDelimiter, 0, $i + 1) - 1
                        $aPos[$j + 1] = $aPos[$j] + 2 + $iDelimiterLen
                        $j += 2
                    Next
                    $aPos[UBound($aPos) - 1] = StringLen($sLine)
                    $j = 0
                    For $i = 0 To UBound($aPos) - 1 Step 2
                        $aVertical[$l][$j] = StringMid(StringReplace($sLine, $sDummy, $sDelimiter), $aPos[$i], $aPos[$i + 1] - $aPos[$i] + 1)
                        $j += 1
                    Next
                EndSwitch
                $l += 1
        EndIf
    Next
    ReDim $aVertical[$l][$iMax]
    Return $aVertical
EndFunc

Br,

UEZ

Edited by UEZ

Please don't send me any personal message and ask for support! I will not reply!

Selection of finest graphical examples at Codepen.io

The own fart smells best!
Her 'sikim hıyar' diyene bir avuç tuz alıp koşma!
¯\_(ツ)_/¯  ٩(●̮̮̃•̃)۶ ٩(-̮̮̃-̃)۶ૐ


Hey,

@UEZ: It looks as if this would work absolutely great for me... I will test it in my script and check the performance.

@LarsJ: If the performance is still not acceptable, I will check your version... but I think I will then have a few questions left...

Best regards

Daniel


Here's my go at it. I have a script that does format the new array properly, and I've included _ArrayDisplay() functions so you can see the starting array and ending array to easily verify that this does what you wanted. This script can be faster if I knew more about the data you are working with. First I'll show you the code:

#Region ;**** Directives created by AutoIt3Wrapper_GUI ****
#AutoIt3Wrapper_Run_Au3Stripper=y
#Au3Stripper_Parameters=/RM /SF = 1 /SV = 1 /PE
#EndRegion ;**** Directives created by AutoIt3Wrapper_GUI ****
Opt("MustDeclareVars",1)
#include <array.au3>
Local Const $Array[6][3] = [[1,"a",2],[2,"b",1],[2,"c",1],[3,"d",2],[1,"e",1],[2,"f",2]]
_ArrayDisplay($Array,"$Array")
Local $NewArray = _ExpandDuplicates($Array)
_ArrayDisplay($NewArray,"$NewArray")
Func _ExpandDuplicates(ByRef Const $Array)
Local Const $LevelsString = "ghi", $Zeros = "0000000000000000"
Local $Rows = UBound($Array) - 1, $String = "", $Binary = "", $PreviousLevel, $C2End, $CountElements[4]
#Region Convert this array to a string
For $C = 0 to $Rows
$CountElements[$Array[$C][0]] += 1;Count how many elements are in each level
If $Array[$C][0] <= $PreviousLevel Then
$C2End = $Array[$C][0]
For $C2 = $PreviousLevel to $C2End Step - 1
$String &= StringMid($LevelsString,$C2,1)
Next
EndIf
$Binary = StringFormat("%x",Hex($Array[$C][2],4))
$String &= StringMid($LevelsString,$Array[$C][0],1) & StringLeft($Zeros,4 - StringLen($Binary)) & $Binary
$Binary = StringFormat("%x",StringToBinary($Array[$C][1]))
$String &= StringLeft($Zeros,16 - StringLen($Binary)) & $Binary
$PreviousLevel = $Array[$C][0]
Next
;Close any levels
For $C = $PreviousLevel to 1 Step -1
$String &= StringMid($LevelsString,$C,1)
Next
#EndRegion Convert this array to a string
;ConsoleWrite($String & @CR)
#Region Get the maximum number of elements in a level
Local $Max = 0
For $C = 1 to 3
If $CountElements[$C] > $Max Then $Max = $CountElements[$C];Get the maximum value
Next
#EndRegion Get the maximum number of elements in a level
#Region Expand the string
Local $Match, $Offset, $ReplacementString, $Duplicate, $C2Max, $Replacements[$Max + 1], $Replacements2[$Max], $C3
For $C = 3 to 1 step -1
$C3 = -1
$Offset = 1
While 1
$Match = StringRegExp($String,StringMid($LevelsString,$C,1) & "(.*?)" & StringMid($LevelsString,$C,1),1,$Offset)
$Offset = @Extended
If @Error Then
ExitLoop
EndIf
$Duplicate = Int("0x" & StringLeft($Match[0],4))
If $Duplicate > 1 Then
$Match[0] = StringMid($LevelsString,$C,1) & $Match[0] & StringMid($LevelsString,$C,1)
$ReplacementString = $Match[0]
$C2Max = Floor(log($Duplicate)/log(2))
For $C2 = 1 to $C2Max
$ReplacementString &= $ReplacementString
Next
;ConsoleWrite("$Duplicate = " & $Duplicate & @CR & "$C2Max = " & $C2Max & @CR & "$C2Max = " & $C2Max & @CR)
$C2Max = $Duplicate - 2 ^ $C2Max
For $C2 = 1 to $C2Max
$ReplacementString &= $Match[0]
Next
;ConsoleWrite($ReplacementString & @CR)
$C3 += 1
$Replacements[$C3] = $Match[0]
$Replacements2[$C3] = $ReplacementString
EndIf
;_ArrayDisplay($Match,"$Match: " & StringMid($LevelsString,$C,1))
WEnd
;_ArrayDisplay($Replacements,"$Replacements, $C2 = " & $C2)
For $C2 = 0 to $C3
$String = StringReplace($String,$Replacements[$C2],$Replacements2[$C2])
;ConsoleWrite($String & @CR)
Next
Next
;ConsoleWrite($String & @CR)
#EndRegion Expand the string
#Region Convert the string back into an array
;Count how many elements are in each level to figure out how many elements we need total
Local $CMax = 0
For $C = 1 to 3
StringReplace($String,StringMid($LevelsString,$C,1),"")
Local $Replacements = @Extended
If $Replacements Then $CMax += $Replacements
Next
$CMax = $CMax/2
Local $NewArray[$CMax][3]
;ConsoleWrite("$Elements = " & $CMax & @CR)
$CMax -= 1
$C2 = 1
Local $Skip = 0
For $C = 0 to $CMax
$C3 = $C2
While 1;Skip redundant level markers
$C3 += 1
If StringInStr($LevelsString,StringMid($String,$C3,1)) Then
$C2 = $C3
Else
ExitLoop
EndIf
WEnd
$NewArray[$C][0] = StringInStr($LevelsString,StringMid($String,$C2,1))
;ConsoleWrite(StringMid($String,$C2,1) & @TAB)
$C2 += 1
$NewArray[$C][2] = Int("0x" & StringMid($String,$C2,4))
;ConsoleWrite(StringMid($String,$C2,4) & @TAB)
$C2 += 4
$NewArray[$C][1] = BinaryToString("0x" & StringReplace(StringMid($String,$C2,16),"00",""))
;ConsoleWrite(StringMid($String,$C2,16) & @CR)
$C2 += 16
Next
Return $NewArray
#EndRegion Convert the string back into an array
EndFunc
;This can support up to 46 levels. They will be represented by the letters g - Z
;Each entry will be represented by 16 hexadecimal characters. This means it can be up to 8 characters long.
;Each quantity will be represented by 4 hexadecimal characters. This means it can represent a number from 0 to 65535
#cs Reserved characters
0
1
2
3
4
5
6
7
8
9
a
b
c
d
e
f
Characters for levels
g - 1
h - 2
i - 3
j - 4
k - 5
l - 6
m - 7
n - 8
o - 9
p - 10
q - 11
r - 12
s - 13
t - 14
u - 15
v - 16
w - 17
x - 18
y - 19
z - 20
A - 21
B - 22
C - 23
D - 24
E - 25
F - 26
G - 27
H - 28
I - 29
J - 30
K - 31
L - 32
M - 33
N - 34
O - 35
P - 36
Q - 37
R - 38
S - 39
T - 40
U - 41
V - 42
W - 43
X - 44
Y - 45
Z - 46
#ce Reserved Characters 

1) How many levels will the data you're working with use? This script can work for up to 46 levels, but that's probably overkill. If you need < 21 levels, then I can use all uppercase letters, and get rid of StringFormat(). I can also use a basic comparison for StringInStr and StringReplace.

2) How many characters long are the component names, at maximum? This script can accept input up to 8 characters, but this may be too short. I can increase the length, but this will make the final string the program uses longer, so it's best to pick a maximum length that is safe but close to what you actually need.

3) How many of any given item do you expect to ever have? The maximum value that can be in column 3 is 65,535, which is probably overkill. If we lower this number, the resulting string the program uses will be shorter.

4) This is a very small array to test this with. I'm sure you can't show me the actual data you're working with, but can you describe the data more? You mentioned it's 20 columns and 8,000 rows. Can you describe what values are in each column, or perhaps give a list of components so I can generate a bunch of random data to try this script on? I'd like to make sure this script can work for the real data you're using.

This script still needs some work, but it's easy to improve. Let me know what you think.
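
To make questions 2) and 3) more concrete, here is a minimal sketch of the record layout described in the script's comments: one level marker, 4 hex characters for the quantity and 16 hex characters for the name. The helper names _PackRow and _UnpackRow are made up for illustration, and right-padding short names with "00" is an assumption; the posted script may build its string differently.

```autoit
; Sketch only: _PackRow and _UnpackRow are hypothetical helpers.
; Assumed layout: 1 level marker + 4 hex chars (quantity) + 16 hex chars (name)
Global Const $g_sLevels = "ghi" ; markers for levels 1 to 3

Func _PackRow($iLevel, $sName, $iQty)
    ; Hex() on binary data yields two hex digits per byte
    Local $sNameHex = StringLower(Hex(StringToBinary($sName)))
    ; Assumption: names shorter than 8 characters are right-padded with "00"
    While StringLen($sNameHex) < 16
        $sNameHex &= "00"
    WEnd
    Return StringMid($g_sLevels, $iLevel, 1) & StringLower(Hex($iQty, 4)) & $sNameHex
EndFunc

Func _UnpackRow($sRecord)
    Local $aRow[3]
    ; Position of the marker in the marker list is the level (case-sensitive search)
    $aRow[0] = StringInStr($g_sLevels, StringLeft($sRecord, 1), 1)
    ; The next 4 hex characters are the quantity
    $aRow[2] = Dec(StringMid($sRecord, 2, 4))
    ; Strip the padding one whole byte at a time, then decode the name
    Local $sNameHex = StringMid($sRecord, 6, 16)
    While StringRight($sNameHex, 2) == "00"
        $sNameHex = StringTrimRight($sNameHex, 2)
    WEnd
    $aRow[1] = BinaryToString(Binary("0x" & $sNameHex))
    Return $aRow
EndFunc

Local $sRecord = _PackRow(3, "d", 2)
Local $aRow = _UnpackRow($sRecord)
ConsoleWrite(StringLen($sRecord) & @CRLF)                            ; 21
ConsoleWrite($aRow[0] & " | " & $aRow[1] & " | " & $aRow[2] & @CRLF) ; 3 | d | 2
```

Every row costs 21 characters in this layout, so a shorter name or quantity field shrinks the working string in proportion to the number of rows.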

 

I will try your script as well... I did not expect so many answers... wow!

What a great forum. I've got a lot of testing to do to find out which of the scripts works best for me.


#17 ·  Posted (edited)

Hey,

here's another attempt, which I wrote myself and which is very fast (tested in the original script with 8000 rows).

The comments were originally written in German, and parts of the script could probably be optimized further (time is running... quick and dirty is the rule for me ;-)

I also built in a piece of code that lets me see which entries were duplicated and which were not.

global $ArrayIn
local $i=0
#include <Array.au3>
;Test array
Global $arr_ergebnisliste_final[8][4] = [[" ", ".1", "a", 2], _
                                        [" ", "..2", "b", 1], _
                                        [" ", "..2", "c", 2], _
                                        [" ", "...3", "d", 1], _
                                        [" ", "...3", "e", 2], _
                                        [" ", ".1", "f", 1], _
                                        [" ", "..2", "g", 2], _
                                        [" ", "..2", "h", 1]]

_ArrayDisplay ($arr_ergebnisliste_final, "Original")

While $i < UBound($arr_ergebnisliste_final)
    _DuplTree($arr_ergebnisliste_final, $i)
    $i+=1
WEnd

_ArrayDisplay ($arr_ergebnisliste_final, "After Copying")

;Change every "c" marker to "duplicated"
For $i=0 to UBound($arr_ergebnisliste_final)-1 Step +1
    if $arr_ergebnisliste_final[$i][0]="c" then $arr_ergebnisliste_final[$i][0]="duplicated"
Next

_ArrayDisplay ($arr_ergebnisliste_final, "Done")


Func _DuplTree(ByRef $ArrayIn, $Start)
    Local $stufenspalte=1 ; column that holds the level
    Local $mengenspalte=3 ; column that holds the quantity
    Local $multiplier=$ArrayIn[$Start][$mengenspalte]
    ;Only start working if the branch's multiplier is greater than 1 and the branch has not been duplicated already!
    if $multiplier>1 AND Not ($ArrayIn[$Start][0]="b" OR $ArrayIn[$Start][0]="c") Then
        ;First mark the entry row with a "b" as already processed!
        ;Branches that have been duplicated once must not be duplicated again on the next pass!
        if NOT($ArrayIn[$Start][0]="b" OR $ArrayIn[$Start][0]="duplicated" OR $ArrayIn[$Start][0]="c") Then
            ;This row has NEVER been processed or duplicated --> clear case for a "b"
            $ArrayIn[$Start][0]="b"
        Else
            ;The record has already been duplicated... it is marked with a "c", because rows marked "c" are not processed again and are later changed to "duplicated"
            $ArrayIn[$Start][0]="c"
        EndIf
        local $treeSize=1 ; the tree size starts at one, because at least one row is copied!
        ;Number() forces a numeric comparison, so that level 10 sorts above level 2
        local $startstufe=Number(StringStripWS(StringReplace($ArrayIn[$Start][$stufenspalte], ".", ""),8))
        ;Walk forward from the start row until an entry with a smaller or equal level appears.
        ;That is the end of the branch, and it tells us how large the branch to duplicate is.
        For $i=$Start+1 to UBound($ArrayIn)-1 Step +1
            local $currstufe=Number(StringStripWS(StringReplace($ArrayIn[$i][$stufenspalte], ".", ""),8))
            ;MsgBox(0, "Comparison", "Start level=" & $startstufe & " --> Current level:" & $currstufe)
            If $currstufe>$startstufe then
                ;Increase the branch size by one
                $treeSize+=1
            Else
                ExitLoop
            EndIf
        Next
        ;treeSize is known and now has to be multiplied by the given quantity, since we may need the branch several times
        ;together with the current size of the array, this tells us how large the whole new array will be.
        Local $arrSize=(UBound($ArrayIn)-$treeSize)+($treeSize*$multiplier)
        ;Dimension the temporary array
        Local $arrTmp[$arrSize][UBound($ArrayIn, 2)]

        ;Walk through the array from 0 and copy 1:1 into the new array until we reach the start row ($Start).
        For $i=0 to $Start-1 Step +1
            For $y=0 to UBound($ArrayIn, 2)-1 Step +1
                $arrTmp[$i][$y]=$ArrayIn[$i][$y]
            Next
        Next
        Local $Start_dupl=$Start
        ;Now run through the branch again and shovel it into the new array
        ;but only up to the branch end found above, and as many times as the quantity requires
        For $z=1 to $multiplier Step +1
            ;MsgBox(0, "Pass z=" & $z, "from " & $Start & " to " & $Start+($z*$treeSize))
            For $i=$Start_dupl to ($Start_dupl+$treeSize)-1 Step +1
                For $y=0 to UBound($ArrayIn, 2)-1 Step +1
                    $arrTmp[$i][$y]=$ArrayIn[$i-(($z-1)*$treeSize)][$y]
                Next
                ;If there is NO "b" or "c" in column 0, put the note "duplicated" into column 0 right away!
                ;This prevents rows that were already duplicated from being duplicated AGAIN later on!
                ;if NOT ($ArrayIn[$i-(($z-1)*$treeSize)][0]="b") then $arrTmp[$i][0]="duplicated"
                if NOT($ArrayIn[$i-(($z-1)*$treeSize)][0]="b" OR $ArrayIn[$i-(($z-1)*$treeSize)][0]="c") AND $z>1 then $arrTmp[$i][0]="duplicated"
            Next
            $Start_dupl=$Start_dupl+$treeSize
        Next
        local $dupl=0
        $Start_dupl=$Start+($multiplier*$treeSize)-1
        ;Finally append the "old" elements that were not duplicated
        For $i=$Start+$treeSize to UBound($ArrayIn)-1 Step +1
                $dupl+=1
                For $y=0 to UBound($ArrayIn, 2)-1 Step +1
                    $arrTmp[($Start_dupl+$dupl)][$y]=$ArrayIn[$i][$y]
                Next
        Next
        $ArrayIn=$arrTmp
    Else
        ;The multiplier is 1 (so there is nothing to multiply) OR the row has a "b", which means it has already been processed!
        If $multiplier=1 then
            ;If the multiplier is 1, the row ITSELF (even if it was copied along with another branch) has definitely
            ;never been multiplied and never will be (e.g. twice by accident) --> so if it already says "duplicated",
            ;do NOT overwrite it with "b"
            if NOT ($ArrayIn[$Start][0]="duplicated") Then
                $ArrayIn[$Start][0]="b"
            EndIf
        Else
            ;If the multiplier is greater than 1 but the row has a "b", it has already been processed and also duplicated --> the "b" is overwritten with "duplicated"
            $ArrayIn[$Start][0]="duplicated"
        EndIf

    EndIf
    Return $ArrayIn
EndFunc
Edited by Basement


#18 ·  Posted (edited)

I know I'm pretty late to the party: I found this little problem interesting, but I was distracted elsewhere in the meantime.

Here's my try. It's very easy to manage more columns and I'm confident that the speed will be good. The basic idea is to first scan the array to compute the final number of array elements we'll need. Then a recursive copy takes care of nested duplication with the required multiplicity.

#include <Array.au3>
#include <Math.au3>
#include <String.au3>

; sample data
Local $aData = [ _
    [1, "a", 1], _
    [2, "b", 2], _
    [2, "c", 1], _
    [3, "d", 2], _
    [4, "e", 5], _
    [3, "f", 3], _
    [3, "g", 4], _
    [1, "h", 1], _
    [2, "i", 2], _
    [3, "j", 2], _
    [1, "k", 1] _
]

; first scan of the array to compute the number of rows we shall need after duplication
; we compute partial rows at every level 1 and accumulate into total rows
Local $iTotRows = 0, $sPartRows = "", $iLevel = 0
For $i = 0 To UBound($aData) - 1
    If $aData[$i][0] < $iLevel Then
        If $iLevel Then $sPartRows &= _StringRepeat(')', $iLevel - $aData[$i][0])
        If $aData[$i][0] = 1 Then
            $iTotRows += Execute($sPartRows)
            $sPartRows = $aData[$i][2]
            $iLevel = 1
        Else
            $sPartRows &= "+" & $aData[$i][2]
            $iLevel = $aData[$i][0]
        EndIf
    ElseIf $aData[$i][0] = $iLevel Then
        $sPartRows &= "+" & $aData[$i][2]
    ElseIf $aData[$i][0] > $iLevel Then
        If $iLevel Then $sPartRows &= "*(1+"
        $sPartRows &= $aData[$i][2]
        $iLevel = $aData[$i][0]
    EndIf
Next
; produce the last entry
If $iLevel > 1 Then $sPartRows &= _StringRepeat(')', $iLevel - 1)
$iTotRows += Execute($sPartRows)

; create the final array once for all to avoid ReDim and such
Local $aDupData[$iTotRows][3]

; now populate it
_Copy(0)
_ArrayDisplay($aDupData)

Func _Copy($iIndex)
    Local Static $iDupIndex = 0
    Local Static $iLevel = 0
    Local Static $iMaxIndex = 0
    While $iIndex < UBound($aData)
        For $n = 1 To $aData[$iIndex][2]
            For $j = 0 To UBound($aDupData, 2) - 1
                $aDupData[$iDupIndex][$j] = $aData[$iIndex][$j]
            Next
            $iDupIndex += 1
            If $iIndex < UBound($aData) - 1 And $aData[$iIndex + 1][0] > $aData[$iIndex][0] Then
                $iLevel += 1
                _Copy($iIndex + 1)
            EndIf
        Next
        $iMaxIndex = _Max($iIndex, $iMaxIndex)
        If $iIndex < UBound($aData) - 1 And $aData[$iIndex + 1][0] < $aData[$iIndex][0] And $iLevel Then
            $iLevel -= 1
            $iIndex = $iMaxIndex
            Return
        Else
            $iIndex = $iMaxIndex + 1
        EndIf
    WEnd
EndFunc   ;==>_Copy
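
As a cross-check for the expression that the first scan builds, the same total can be computed without Execute(): each source row appears in the result as many times as the product of its own quantity and the quantities of all its ancestors. The sketch below assumes that levels start at 1 and never increase by more than one from a row to the next; _CountRows is a hypothetical helper name and not part of the script above.

```autoit
; Sketch only: _CountRows is a hypothetical helper for cross-checking $iTotRows.
; Assumes levels start at 1 and never increase by more than 1 between rows.
Func _CountRows(ByRef $aRows)
    ; $aMult[L] = number of copies of the most recent level-L row
    Local $aMult[UBound($aRows) + 1]
    $aMult[0] = 1 ; virtual root above level 1
    Local $iTotal = 0
    For $i = 0 To UBound($aRows) - 1
        Local $iLvl = $aRows[$i][0]
        ; copies of this row = copies of its parent * its own quantity
        $aMult[$iLvl] = $aMult[$iLvl - 1] * $aRows[$i][2]
        $iTotal += $aMult[$iLvl]
    Next
    Return $iTotal
EndFunc

Local $aSample = [[1, "a", 1], [2, "b", 2], [2, "c", 1], [3, "d", 2]]
ConsoleWrite(_CountRows($aSample) & @CRLF) ; 1 + 2 + 1 + 2 = 6
```

Applied to the 11-row sample array of the script, the same rule gives 31 rows, which should match $iTotRows.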
Edited by jchd

