Jump to content

Manipulating a CSV after import to array


Chimaera
 Share

Recommended Posts

I was wondering if you also tried my >_CSVSplit function.  If so, did you encounter any problems with it? You said other functions that you tried didn't seem to work so well, which is the reason I ask. You would need to read the file with FileRead().

Edit

Also potentially useful ==> >Array_Search_All_Dimensions. I don't know why you are getting errors, so you might like to try. Just check which returned entries ...(edit2 oops - I said the wrong thing)... contain the name in the first column by looking for the following types of matching indices: 1,0 - 24,0 - 74,0 (all ending in zero = 1st column). That solves your column number problem. It should be easy.

Edit2

Hmm, after taking another look at your previous post, I see you are passing a 2 dimensional array to _ArrayUnique which only caters for 1 dimensional arrays. Bah the online documentation is contradictory.

Edited by czardas
Link to comment
Share on other sites

  • Replies 56
  • Created
  • Last Reply

Top Posters In This Topic

Top Posters In This Topic

Posted Images

Im not sure im explaining this correctly

This iis what im thinking about

post-60350-0-44322500-1381907175_thumb.p

At the min using the arrayunique it grabs the names in the col0 (A) and then parses them according to that name which is fine but i wondered what would i need to change or use to be able to select and search by type  or stock code [C]

So what i am trying to select is the column to then let array unique use as the basis of the rest of the script and then isolate as before

Global $aSuppliers = _ArrayUnique($MainArray)


For $i = 1 To $aSuppliers[0]
    If IsArray($aSuppliers) Then
        $FindSupplier = _IsolateRecord($i)
        Sleep(100)
    EndIf

Next
    MsgBox(64, "All Done", "Splitting Finished")

Func _IsolateRecord($i)
    Local $FoundRecords = _ArrayFindAll($MainArray, $aSuppliers[$i])
    Local $NewArray[UBound($FoundRecords)][UBound($MainArray, 2)]

    For $x = 0 To UBound($FoundRecords) - 1
        For $y = 0 To UBound($MainArray, 2) - 1
            $NewArray[$x][$y] = $MainArray[$FoundRecords[$x]][$y]
        Next
;~      _ArrayDisplay($NewArray)
        _WriteCSV(@ScriptDir & "\" & $aSuppliers[$i] & ".csv", $NewArray, ",")
    Next
EndFunc   ;==>_IsolateSupplier

So still only 1 dimension i think just a different column for the search

Or is this a whole load of work?

Edited by Chimaera
Link to comment
Share on other sites

Im assuming by that you mean use ArraySort in Beta with the $SubItem and pass the changed array through the $IsolateRecord?

Edited by Chimaera
Link to comment
Share on other sites

I was refering to _ArrayFindAll(). So for exmple. Get unique Entries for the array, search for a particular entry in column 24 ($subitem = 23) or whatever.

If that doesn't work then use my _Array_Search_All_Dimensions and look for all indices ending ',23'. Use the first index  to read the rows of the main array and then populate your new arrays.

For example you find the entry '£200' at positions 1,23 | 5,5 | 12,11 | 16,23 Then all items matching '$200' appearing in column 24 (0-based remember) will end with ',23'. The new array needs two rows and needs to be populated with rows 1 and 16 from the original array.

Code example below.

Edited by czardas
Link to comment
Share on other sites

the thing is i need to split the CSV with the chosen column not an entry within it so instead of Array Unique being column 0 i can choose it to be column 9 for eg so it splits the csv by the product type

I suppose array unique will have to make way for a different type of selection

Does your CSV split allow for the type of split ive mentioned?

To answer your previous question whilst i remember i couldnt get yours to work on splitting thats why i didnt use it

Link to comment
Share on other sites

Is this something like what you want?

;

#include <Array.au3>

Local $aMain[9][5]

; ################ This is not part of the example ##################

; Populate the erray for the example code later
For $i = 0 To 8
    $aMain[$i][0] = "ADAMS" ; Single Entry
    For $j = 1 To 4
        $aMain[$i][$j] = Random(1, 200, 1)
    Next
Next
For $i = 0 To 8 Step 3
    $aMain[$i][3] = "kippers"
    $aMain[$i +1][3] = "fish"
    $aMain[$i +2][3] = "eggs"
Next
$aMain[2][1] = "kippers"
$aMain[3][2] = "fish"
$aMain[8][4] = "fish"
$aMain[2][4] = "fish"

_ArrayDisplay($aMain, "Main Array") ; This is your main array

; ###################################################################

; Now lets create a new array that only contains the item 'fish' in the 4th column:
Local $aIndices = _Array_Search_All_Dimensions($aMain, "fish") ; Find elements containing fish
_ArrayDisplay($aIndices, "Matching Indices") ; These are all the elements containing fish

Local $iBound = $aIndices[0]
If $iBound = 0 Then Exit ; No fish today sorry :(

Local $aNewArray[$iBound][5], $iCount = 0, $iRow, $aSplit

For $i = 1 To $aIndices[0]
    $aSplit = StringSplit($aIndices[$i], ",", 2)
    If $aSplit[1] = 3 Then
        $iRow = $aSplit[0]
        For $j = 0 To 4
            $aNewArray[$iCount][$j] = $aMain[$iRow][$j]
        Next
        $iCount += 1
    EndIf
Next

If $iCount > 0 Then
    ReDim $aNewArray[$iCount][5]
Else
    MsgBox(0, "", "No fish today sorry")
    Exit
EndIf

_ArrayDisplay($aNewArray, "Fish in 4th column")

; #FUNCTION# ====================================================================================================================
; Name...........: _Array_Search_All_Dimensions
; Description ...: Searches through an array of up to 16 dimensions.
; Syntax.........: _Array_Search_All_Dimensions(ByRef $aArray, $vFind [, $iCasesense = 0])
; Parameters ....: $aArray     - [ByRef] The array to search
;                  $vFind      - The string to search for
;                  $iCasesense - Case sensitivity ==> 1 = case sensitive, 0 = case insensitive, default = 0
; Return values .: Success   - Returns a 1 dimensional array of matching indices with each dimension separated by a comma
;                  Failure   - Returns zero and sets @error to the following values:
;                  |@error = 1 : $aArray is not an array
; Author ........: czardas
; Modified.......:
; Remarks .......: This function can handle arrays of up to 16 dimensions
; Related .......:
; Link ..........:
; Example .......:
; ===============================================================================================================================

Func _Array_Search_All_Dimensions(ByRef $aArray, $vFind, $iCasesense = 0)
    #forceref $vFind
    If Not IsArray($aArray) Then Return SetError(1, 0, 0)

    Local $aBound[17] = [UBound($aArray, 0),0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]
    If $aBound[0] > 16 Then Return SetError(2, 0, 0) ; Only 16 dimensions are supported
    For $i = 1 To $aBound[0]
        $aBound[$i] = UBound($aArray, $i) -1
    Next

    Local $sExpression = "$aArray" & StringLeft("[$0][$1][$2][$3][$4][$5][$6][$7][$8][$9][$a][$b][$c][$d][$e][$f]", $aBound[0] *4)
    If $iCasesense Then $sExpression &= "="
    $sExpression &= "=$vFind"

    Local $sElements = "", $sPosition = StringLeft("$0&','&$1&','&$2&','&$3&','&$4&','&$5&','&$6&','&$7&','&$8&','&$9&','&$a&','&$b&','&$c&','&$d&','&$e&','&$f", $aBound[0]*7 -5)
    For $f = 0 To $aBound[16]
        For $e = 0 To $aBound[15]
            For $d = 0 To $aBound[14]
                For $c = 0 To $aBound[13]
                    For $b = 0 To $aBound[12]
                        For $a = 0 To $aBound[11]
                            For $9 = 0 To $aBound[10]
                                For $8 = 0 To $aBound[9]
                                    For $7 = 0 To $aBound[8]
                                        For $6 = 0 To $aBound[7]
                                            For $5 = 0 To $aBound[6]
                                                For $4 = 0 To $aBound[5]
                                                    For $3 = 0 To $aBound[4]
                                                        For $2 = 0 To $aBound[3]
                                                            For $1 = 0 To $aBound[2]
                                                                For $0 = 0 To $aBound[1]
                                                                    If Execute($sExpression) Then $sElements &= Execute($sPosition) & "|"
                                                                Next
                                                            Next
                                                        Next
                                                    Next
                                                Next
                                            Next
                                        Next
                                    Next
                                Next
                            Next
                        Next
                    Next
                Next
            Next
        Next
    Next
    $sElements = StringTrimRight($sElements, 1)

    Local $aResults[1] = [0]
    If $sElements Then $aResults = StringSplit($sElements, "|")
    Return $aResults
EndFunc ;==> _Array_Search_All_Dimensions

;

_Array_Search_All_Dimensions may seem a bit OTT, but you wouldn't notice that if it was part of a UDF. The extra loops (for extra dimensions) do not add any overhead. I haven't used _ArrayFindAll, so maybe that's faster. I haven't tested it. The helpfile only gives a 1 dimensional array example.

Edit - Improved the example code. I was a bit rushed earlier. _CSVSplit requires the file to be read using FileRead() and then the full string needs to be passed to the function

Edited by czardas
Link to comment
Share on other sites

Im not sure im explaining this correctly

This iis what im thinking about

attachicon.gifarray.PNG

At the min using the arrayunique it grabs the names in the col0 (A) and then parses them according to that name which is fine but i wondered what would i need to change or use to be able to select and search by type  or stock code [C]

So what i am trying to select is the column to then let array unique use as the basis of the rest of the script and then isolate as before

Global $aSuppliers = _ArrayUnique($MainArray)


For $i = 1 To $aSuppliers[0]
    If IsArray($aSuppliers) Then
        $FindSupplier = _IsolateRecord($i)
        Sleep(100)
    EndIf

Next
    MsgBox(64, "All Done", "Splitting Finished")

Func _IsolateRecord($i)
    Local $FoundRecords = _ArrayFindAll($MainArray, $aSuppliers[$i])
    Local $NewArray[UBound($FoundRecords)][UBound($MainArray, 2)]

    For $x = 0 To UBound($FoundRecords) - 1
        For $y = 0 To UBound($MainArray, 2) - 1
            $NewArray[$x][$y] = $MainArray[$FoundRecords[$x]][$y]
        Next
;~      _ArrayDisplay($NewArray)
        _WriteCSV(@ScriptDir & "\" & $aSuppliers[$i] & ".csv", $NewArray, ",")
    Next
EndFunc   ;==>_IsolateSupplier

So still only 1 dimension i think just a different column for the search

Or is this a whole load of work?

 

Hi Chimaera,

If you want to use the column 3 (you indicated with B ) to populate the array, just add the column number 4 (first column is nr.1 by default) and so _ArrayUnique() will extract unique elements from that column instead of from the first:

Global $aSuppliers = _ArrayUnique($MainArray, 4) ; <- will peek from column 4 of the 2D array $MainArray

(I hope I have understood your purpose)

bye

 

image.jpeg.9f1a974c98e9f77d824b358729b089b0.jpeg Chimp

small minds discuss people average minds discuss events great minds discuss ideas.... and use AutoIt....

Link to comment
Share on other sites

Thats getting there now

Global $aSuppliers = _ArrayUnique($MainArray, 2)

A test run with this reveals the correct parts of the array for that column

But it throws an error

 

Array variable subscript badly formatted.:

Local $NewArray[uBound($FoundRecords)][uBound($MainArray, 2)]
Local $NewArray[^ ERROR

My guess would be because it cant make the array because it looked at column 2 not column 1

How do i get it to still create the arrays when it doesn't start on the first column?

Edited by Chimaera
Link to comment
Share on other sites

Thats getting there now

Global $aSuppliers = _ArrayUnique($MainArray, 2)

A test run with this reveals the correct parts of the array for that column

But it throws an error

My guess would be because it cant make the array because it looked at column 2 not column 1

How do i get it to still create the arrays when it doesn't start on the first column?

 

because if _arrayfindall() find nothing, then you try to dim an array with [-1[

 

image.jpeg.9f1a974c98e9f77d824b358729b089b0.jpeg Chimp

small minds discuss people average minds discuss events great minds discuss ideas.... and use AutoIt....

Link to comment
Share on other sites

if you changed the column in wich the statement _arrayunique() has to search (column 2),

then you should modify the command
  Local $FoundRecords = _ArrayFindAll ($ MainArray, $ aSuppliers [$ i])
and add the column where to look, which is 1 in this case (because _ArrayFindAll() considers 0 the first column)

like this:

Local $FoundRecords = _ArrayFindAll($MainArray, $aSuppliers[$i], 0, 0, 0, 0, 1)

I can not test it on real data ..........try it :)

edit

corrected typos

Edited by PincoPanco

 

image.jpeg.9f1a974c98e9f77d824b358729b089b0.jpeg Chimp

small minds discuss people average minds discuss events great minds discuss ideas.... and use AutoIt....

Link to comment
Share on other sites

After countless telephone interruptions. I had time to come back to this. Yes PincoPanco, that's what I mentioned in post #23 and #25. I've now had time to look at what these functions do. _ArrayFindAll() is very similar to _Array_Search_All_Dimensions() excluding the start, stop and partial search parameters (it doesn't return precise locations though). I don't understand the problem here, so I'll let you folks get on with it.

Edited by czardas
Link to comment
Share on other sites

It is working now, thanks everyone for your patience.

czardus, i will look at your advanced search dimensions part later and see how to utilise it, many thanks

Link to comment
Share on other sites

Some weirdness i have noticed when splitting the CSV files.

If i open the csv in excel 2013 / 2007 it shows like this A & B are the same cell except they display differently

post-60350-0-52658800-1382019637_thumb.p

If i click B it then corrects A to match but i have to resave the csv.

But when the program accesses it and parses the csv it shows the cells correctly

post-60350-0-76653400-1382019644_thumb.p

Except it will not handle that data no matter what i try, it bangs on trying to do something but returns no files after the split other than the header labels

Is there a way to get it to deal with those date codes?

Link to comment
Share on other sites

It is working now, thanks everyone for your patience.

czardus, i will look at your advanced search dimensions part later and see how to utilise it, many thanks

 

I'm glad you are getting further. The array search function was intended to be as basic as possible - therefore only 3 parameters are used. Anyway _ArrayFindAll() seems suitable here.

Some weirdness i have noticed when splitting the CSV files.

If i open the csv in excel 2013 / 2007 it shows like this A & B are the same cell except they display differently

 

I gave up using Excel because of this kind of wierdness. CSV has to be one of the easiest possible formats to parse. There are so few rules that nothing can go wrong, unless you use a complicated program like Excel. It is beyond my comprehension why this kind of thing occurs.

Perhaps changing the delimiters in the date format will help, or maybe it just can't fit the information in the cell display area, so it blanks it. I think the changes it wants to save are changes to things like column width. When it tells you the format is incompatible with the changes you didn't actually make, my head just explodes. Of course it can't save those kind of changes - it's a csv and there are no display options in the format. Duh! Excel is just spooky.

I think our MVP water knows a lot more about Excel than most people around here. It's an amazing program really, but things like this do give me cause for concern.

Edit

I notice a lot of my csv files use yyyymmdd format with no delimiters and display correctly in Excel. What happens when you expand the width of the first column?

Edited by czardas
Link to comment
Share on other sites

You can record macros to find out how to change formats:

Columns("A:A").Select
    Selection.NumberFormat = "m/d/yyyy"

My excel was displaying just like yours...00:00:00.000, above works to change to the format in the quotes.

This is straight copied from excel, so you will need to add the $oExcel object before it, for each of the line items.

Edited by jdelaney
IEbyXPATH-Grab IE DOM objects by XPATH IEscriptRecord-Makings of an IE script recorder ExcelFromXML-Create Excel docs without excel installed GetAllWindowControls-Output all control data on a given window.
Link to comment
Share on other sites

This one isn't just date though its like this

2012-03-08 00:00:00.000

 

and i cant change the format as he needs the timestamps so he can track when things go wrong, i suggested he might be better to remove the timestamp into a different column from the date when he exports it from the mysql.

He is looking into it atm.

As for column width i already have really long ones 5 times the standard width of a column in the same sheet and they work fine, it is only this date related one that fails :( Or did you mean something else?

Its very strange that it shows it in the AutoIt array window correctly yet wont deal with it when it comes to the export

Link to comment
Share on other sites

I'm not sure how to solve this. Is it just the display, or is it also corrupting the csv data (which I doubt)? I had similar issues and discovered that by dragging the column border I could make the correct date display. Try saving the csv with Excel and open in Notepad to check if it has been corrupted.

Edited by czardas
Link to comment
Share on other sites

Excel picks what it thinks will be the best format for your raw data.

My prior post will show your date string, which is like 2012-10-15 00:00:00.000 (which for some reason, excel displays as only a time value by default), as 10152012, in excel.

The raw data will remain as you see it in your array read out from excel (or wrote into excel).

Dates

Date Times

Times

Integers

can all be formatted look like any of the others.

Edited by jdelaney
IEbyXPATH-Grab IE DOM objects by XPATH IEscriptRecord-Makings of an IE script recorder ExcelFromXML-Create Excel docs without excel installed GetAllWindowControls-Output all control data on a given window.
Link to comment
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now
 Share

  • Recently Browsing   0 members

    • No registered users viewing this page.
×
×
  • Create New...