Sign in to follow this  
Followers 0
Myicq

Convert CSV coloumn count

4 posts in this topic

#1 ·  Posted (edited)

Just a small example, hope it will be useful to someone. (and not too trivial / simple)

If the method can be improved, I would be happy to learn about it.

Task : take a csv file with data, possibly organized in sets. Each line contains 1..n sets of data. Think 1 line of is a paper of labels. Each paper may have 1.. n labels where each label is 1 or more fields. Transform this to different column count of data set. Purpose is to transform customer data into different applications where different organization of data is needed. Source can of course be 1 column only.

Example

Spoiler
http://myurl.com/ABC;123;TEST1;http://myurl.com/ABC;122;TEST1
http://myurl.com/CAC;124;TEST1;http://myurl.com/BBC;121;TEST1
http://myurl.com/AQC;125;TEST1;http://myurl.com/CBC;126;TEST1

This is 3 lines with 2 set of data each. Each data set has 3 items.

If this is transformed into 4 columns, result should be

Spoiler
http://myurl.com/ABC;123;TEST1;http://myurl.com/ABC;122;TEST1;http://myurl.com/CAC;124;TEST1;http://myurl.com/BBC;121;TEST1
http://myurl.com/AQC;125;TEST1;http://myurl.com/CBC;126;TEST1

 

The code:

Spoiler
; convert column count X to Y
; A,B
; C,D
; E,F
;
; to
; A,B,C
; D,E,F
; -----------
; needed for function
; (library routine)
#include <File.au3>

; how many columns in OUTPUT
$no_of_col_out = 5

; how many columns in input data, each set.
$col_in_each_set = 3

; separator for INPUT (each line fields split by this)
$input_separator = ";"

; separator for OUTPUT
$out_separator = ";"

; input file name
$ifn = "myinputfile.txt"

; this is a variable for data
$aData = ""

; we read ALL datafile into ONE array of arrays.
; each line is an array and element in $aData
$readResult = _FileReadToArray($ifn, $aData,2, $input_separator)


; we open file for output here
$of = FileOpen("out.txt", 2) ; overwrite mode


; this is the main loop
; go over each line in field
$internal_column_counter = 1
for $i = 0 to UBound($aData)-1
        ; then loop over each field in the line
        local $a = $aData[$i]

        for $j = 0 to UBound($a)-1
                FileWrite($of, $a[$j])  ; output FIRST field

                ; if we reach number of fields in total (sets * fields per set) in a line,
                ; write a newline
                ; and start over
                if $internal_column_counter = $no_of_col_out * $col_in_each_set then
                    FileWrite($of, @CRLF)
                    $internal_column_counter = 1
                else
                    ; else just write a separator for next field
                    FileWrite($of, $out_separator)
                    $internal_column_counter+=1
                EndIf
        Next
Next

; add missing separators
; but not on lines that are blank
; ensures that files have same amount of fields on each line.
if $internal_column_counter >1 then
    local $missing_columns_count = ($no_of_col_out * $col_in_each_set)-$internal_column_counter
    for $i = 1 to $missing_columns_count
        FileWrite($of, $out_separator)
    Next
    FileWrite($of, @CRLF)
endif

; clean up
FileClose($of)

 

Is this example script too simple to be allowed here ?

 

 

Edited by Myicq
Updated script. Added clarification.

I am just a hobby programmer, and nothing great to publish right now.

Share this post


Link to post
Share on other sites



excuse me, is me or it seems it doesn't works correctly?
how can you say this is a 4-column output?

http://myurl.com/ABC;123;TEST1;http://myurl.com/ABC;122;TEST1;http://myurl.com/CAC;124;TEST1;http://myurl.com/BBC;121;TEST1
http://myurl.com/AQC;125;TEST1;http://myurl.com/CBC;126;TEST1

p.s.

if I haven't misunderstud the script's purpose, then it should be like this:

; convert column count X to Y
; A,B
; C,D
; E,F
;
; to
; A,B,C
; D,E,F
; -----------
; needed for function
; (library routine)
#include <File.au3>

; how many columns in OUTPUT
$no_of_col_out = 5

; how many columns in input data, each set.
; $col_in_each_set = 3 ; <-- not needed

; separator for INPUT (each line fields split by this)
$input_separator = ","

; separator for OUTPUT
$out_separator = ";"

; input file name
$ifn = "myinputfile.txt"

; this is a variable for data
$aData = ""

; we read ALL datafile into ONE array of arrays.
; each line is an array and element in $aData
$readResult = _FileReadToArray($ifn, $aData, 2, $input_separator)

; we open file for output here
$of = FileOpen("out.txt", 2) ; overwrite mode

Local $internal_column_counter = 0

; this is the main loop
; go over each line in field
For $i = 0 To UBound($aData) - 1
    ; then loop over each field in the line
    For $j = 0 To UBound(($aData[$i])) - 1 ; for each field in subarray
        $internal_column_counter += 1
        ; if last field in output OR we reached the last field in input
        If $internal_column_counter = $no_of_col_out Or ($i = UBound($aData) - 1 And $j = UBound(($aData[$i])) - 1) Then
            FileWrite($of, ($aData[$i])[$j] & @CRLF)
            $internal_column_counter = 0 ; reset output fields counter
        Else
            FileWrite($of, ($aData[$i])[$j] & $out_separator) ; just write this field
        EndIf
    Next
Next

 


small minds discuss people average minds discuss events great minds discuss ideas.... and use AutoIt....

Share this post


Link to post
Share on other sites
5 hours ago, Chimp said:

excuse me, is me or it seems it doesn't works correctly?
how can you say this is a 4-column output?

; edit : myicq
|----------------------------| |----------------------------| |----------------------------| |----------------------------| 
http://myurl.com/ABC;123;TEST1;http://myurl.com/ABC;122;TEST1;http://myurl.com/CAC;124;TEST1;http://myurl.com/BBC;121;TEST1
http://myurl.com/AQC;125;TEST1;http://myurl.com/CBC;126;TEST1

 

Hello Chimp,

No, it is obviously 12 columns, but 4 datasets instead of 2. Think "sheets of labels", where each line is a sheet. Sometimes you have less, other times more labels. Each label has 1, maybe 4, maybe 3 fields.

Your change correctly puts the number of fields on each line, but invalidates the datasets.

I have updated the script so that the correct number of blank fields are added.


I am just a hobby programmer, and nothing great to publish right now.

Share this post


Link to post
Share on other sites

Hi Myicq,

thanks for the clarification, with the dotted lines I can better understand the concept,
I had obviously misunderstood the final goal.
Thanks for sharing, may be handy sometime.


small minds discuss people average minds discuss events great minds discuss ideas.... and use AutoIt....

Share this post


Link to post
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!


Register a new account

Sign in

Already have an account? Sign in here.


Sign In Now
Sign in to follow this  
Followers 0