Jump to content

csv udf


ivan
 Share

Recommended Posts

@John C.:

Thanks. Today's the first time I've had to llok at this post. I hope to have an upgrade implementing some suggestions and cleaning a bug in the line break conversion function in v1.3.

IVAN

Thanks. That'll be great. I've nearly run out of ideas on this subject.

IVAN

Link to comment
Share on other sites

@John C.:

My first thought was to do a straight replacement of line break chars, that is, @CR, @LF or the combined @CRLF, as was done in the UDF. However, playing a bit with Microsoft excel csv files, I realized that line breaks are in fact embedded as part of the encoding, allowing for combinations of line breaks within fields.

While this should not pose a problem for the strict definition of csv (each record is @CR separated), there still is a problem of how to make the udf compliant with excel files that have multiple lines within a field. So far the UDF reads each line as a record and then the field, column, etc operations can be executed.

To acheive the compliance with Microsoft excel formats, the delimitation criteria must be extended to the whole file, so that a non delimited line break is the criteria for defining a record, while delimited line breaks are embedded breaks within fields.

The repercussion of this on the csv udf is that basic functions are affected, as the delimitation has to be executed first so that records and fields can be defined... I don't know if there's a way around, but it makes sense. It would make the initialization of the csv file slower, but operating will be much faster.

If you have any ideas or if you'd like to take a chance at modifying the udf, please do so.

IVAN

Link to comment
Share on other sites

  • 2 weeks later...

Hi ivan, this will be my first post in this forum.

I am pretty much a complete noob to programming but have managed to hack

together some code to extract data from a proprietry database and put it into excel.

The idea is to manipulate the data and get it back into the database. This is why I

am here.

I thought CSV might be the way to go and I found your script librarys. The function i

think I need is _CSVRecordsGetFieldValue and here is my code:-

#include <CSVLib_V1.3.au3>

;variables

$path = "C:\Documents and Settings\User1\Desktop\New Folder 

(2)\AutoIT-scripts\autoit csv libs\trident-test.csv"
$pColumnNumber = 3
$pRowNumber = 3
$result = 0



$result = _CSVRecordsGetFieldValue($path, $pColumnNumber, $pRowNumber)


MsgBox(0, "result =", $result)

I can't seem to make it work and I get the folowing error

Line 935 (File "C:\autoit csv libs\CSVLib_V1.3.au3"):

If $pRowNumber <= $lColumn[0] Then

If $pRowNumber <= $lColumn^ ERROR

Error: Subscript used with non-Array variable.

Can someone tell me where I am going wrong?

Link to comment
Share on other sites

  • 2 weeks later...

Hi ivan, this will be my first post in this forum.

I am pretty much a complete noob to programming but have managed to hack

together some code to extract data from a proprietry database and put it into excel.

The idea is to manipulate the data and get it back into the database. This is why I

am here.

I thought CSV might be the way to go and I found your script librarys. The function i

think I need is _CSVRecordsGetFieldValue and here is my code:-

#include <CSVLib_V1.3.au3>

;variables

$path = "C:\Documents and Settings\User1\Desktop\New Folder 

(2)\AutoIT-scripts\autoit csv libs\trident-test.csv"
$pColumnNumber = 3
$pRowNumber = 3
$result = 0
$result = _CSVRecordsGetFieldValue($path, $pColumnNumber, $pRowNumber)
MsgBox(0, "result =", $result)

I can't seem to make it work and I get the folowing error

Line 935 (File "C:\autoit csv libs\CSVLib_V1.3.au3"):

If $pRowNumber <= $lColumn[0] Then

If $pRowNumber <= $lColumn^ ERROR

Error: Subscript used with non-Array variable.

Can someone tell me where I am going wrong?

Sorry, been away for a while. Thanks for the post, and you found a documentation error.

The function documentation states that :

; Syntax: _CSVRecordsGetFieldValue(pFullPath, $pColumnNumber, $pRowNumber [,$pDelimiter] [,$pEnclose])

However, the code says:

Func _CSVRecordsGetFieldValue($pRecords, $pColumnNumber, $pRowNumber, $pDelimiter = -1, $pEnclose = -1)

To get it to work you need to retreive the records as follows:

$path = "C:\Documents and Settings\User1\Desktop\New Folder(2)\AutoIT-scripts\autoit csv libs\trident-test.csv"

$pColumnNumber = 3

$pRowNumber = 3

$result = 0

$records=_CSVFileReadRecords($path)

; then send the records instead of the file as follow:

$result = _CSVRecordsGetFieldValue($path, $pColumnNumber, $pRowNumber)

MsgBox(0, "result =", $result)

Link to comment
Share on other sites

Hi Ivan, thanks for your reply, but there was a mistake in it!

You wrote:-

To get it to work you need to retreive the records as follows:

$path = "C:\Documents and Settings\User1\Desktop\New Folder(2)\AutoIT-scripts\autoit csv libs\trident-test.csv"

$pColumnNumber = 3

$pRowNumber = 3

$result = 0

$records=_CSVFileReadRecords($path)

; then send the records instead of the file as follow:

$result = _CSVRecordsGetFieldValue($path, $pColumnNumber, $pRowNumber)

MsgBox(0, "result =", $result)

Change $path to $records then it works. And I thought I was the noob :)

You put me on the right tracks though. I am also glad to have helped in finding this error.

Cheers,

Shadowfiend.

Edited by shadowfiend
Link to comment
Share on other sites

Hi Ivan, thanks for your reply, but there was a mistake in it!

You wrote:-

To get it to work you need to retreive the records as follows:

$path = "C:\Documents and Settings\User1\Desktop\New Folder(2)\AutoIT-scripts\autoit csv libs\trident-test.csv"

$pColumnNumber = 3

$pRowNumber = 3

$result = 0

$records=_CSVFileReadRecords($path)

; then send the records instead of the file as follow:

$result = _CSVRecordsGetFieldValue($path, $pColumnNumber, $pRowNumber)

MsgBox(0, "result =", $result)

Change $path to $records then it works. And I thought I was the noob :)

You put me on the right tracks though. I am also glad to have helped in finding this error.

Cheers,

Shadowfiend.

Thanks for pointing out the gug. I must admit, I only took a glance at your initial question and then tried to figure out the problem. I think I'll try out the problem next time.

Correction to file will be uploaded on monday.

IVAN

Link to comment
Share on other sites

  • 3 weeks later...

Hey Ivan, your CSV UDF is very interesting, could you provide us with updated examples..

CODE

>Running AU3Check C:\Program Files\AutoIt3\SciTE\Defs\Au3Check\au3check.dat

C:\Desktop\CSVGuiExample_V1.1.au3(59,48) : ERROR: _CSVRecordToFields(): undefined function.

$OriginalFields = _CSVRecordToFields ($CSV[$i])

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^

C:\Desktop\CSVGuiExample_V1.1.au3(66,55) : ERROR: _CSVFieldToString(): undefined function.

$TextFields[$j] = _CSVFieldToString ($TextFields[$j])

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^

C:\Desktop\CSVGuiExample_V1.1.au3(188,88) : ERROR: _CSVReadRecords(): undefined function.

$lRecords = _CSVReadRecords (@ScriptDir & '\' & $FileName & $NextTest - 1 & $FileExt)

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^

C:\Desktop\CSVGuiExample_V1.1.au3(227,86) : ERROR: _CSVGetColumn(): undefined function.

$lColArr = _CSVGetColumn ($lRecords, $lColumnSelection, $SetDelimiter, $SetEnclose)

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^

C:\Desktop\CSVGuiExample_V1.1.au3(516,103) : ERROR: _CSVGetField(): undefined function.

$lField = _CSVGetField ($lRecords, $lColumnSelection, $lRecordSelection, $SetDelimiter, $SetEnclose)

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^

C:\Desktop\CSVGuiExample_V1.1.au3(519,101) : ERROR: _CSVStringToField(): undefined function.

MsgBox(0, 'Your field back to csv format', _CSVStringToField ($lField, $SetDelimiter, $SetEnclose)

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^

C:\Desktop\CSVGuiExample_V1.1.au3 - 9 error(s), 0 warning(s)

Don't bother, It's inside your monitor!------GUISetOnEvent should behave more like HotKeySet()
Link to comment
Share on other sites

  • 1 year later...

Thanks for that library, Ivan.

I have one idea:

in some version of Excel the default value of delimter is ';'.

Is it able to replace

$pDelimiter = ','
in each function for
$pDelimiter = $constDefaultDelimter
and place
Global $constDefaultDelimter = ','
at beginning of module.

So it's possible to replace $constDefaultDelimter at once.

Are there any problem?

Thanks,

Alexey

Link to comment
Share on other sites

  • 3 weeks later...

Hi,

I think I found error in _CSVRecordsAppendFields in string

$lAppendRecord = _CSVFieldsToRecord($pFields, -1, -1, $pDelimiter, $pEnclose, $pMode)

Let's look at declaration of _CSVFieldsToRecord:

Func _CSVFieldsToRecord($pFields, $pStartIndex = 1, $pNumCols = -1, $pDelimiter = -1, $pEnclose = -1, $pMode = 0)

So, we assign $pStartIndex = -1 and thts an error

Need to correct:

$lAppendRecord = _CSVFieldsToRecord($pFields, 1, -1, $pDelimiter, $pEnclose, $pMode)
Link to comment
Share on other sites

  • 3 weeks later...
  • 1 year later...

This project appears to be untouched in the past few years.

I did want to thank the author (Ivan) for a wonderful piece of work.

I have some unusual CSV files that need to be processed regularly and this project was very useful to automate this.

Just to clarify, the original CSV files are extracted from a system that I do not control. The CSV files have some problems and are very difficult to automate. For example, in many of the fields, users are allowed to type in a description. Often they start their description with a plus sign (+), minus sign (-) or equal sign (=). All three of these characters will generate an alert in Microsoft Excel (#NAME). The second major problem is that most fields also contain a comma which is also the delimiter character. Trying to automate this to post directly to a database is a nightmare.

Based in Ivan's work (the original version), I have modified the script to convert the +, -, and = signs to their HTML character code equivalents. I have also modified the script to convert any commas within fields to its HTML character code equivalent.

I have also created code that automatically copies the original CSV file to a directory of its own (so that there is a record in case the client reports errors or problems), writes out a new file (with a file name that includes the original plus the row count and field count plus the date.

Ivan, it is clear that you have invested a lot of time into your CSV UDF library. Many thanks from this user -- I appreciate the effort and time and want to again thank you for your contribution to solving a major problem for me.

[edit] PS. I have also added a progress bar so that it visually indicates progress on the CSV conversion.

Andy

Edited by ajpmirc
Link to comment
Share on other sites

  • 4 months later...

This UDF is a good starting point. Though I wanted to comment that using FileReadLine in _CSVFileReadRecords is a bad idea if you care about performance. Even using an SSD drive with similar read and write times comparable to a 15000 RPM U320 SCSI HDD, trying to reading in a small 500KB CSV takes almost a minute (that's really bad). I'd recommend changing it to something like,

$buffer = FileRead($lCVSFileHandle)
If @error = 1 Then ; Reading error, file open errors subsumed in FileOpen() above.
    FileClose($lCVSFileHandle)
    SetError(3)
    $lRecords[0] = 0
    Return 0
EndIf
$lRecords = StringSplit($buffer, @CRLF, 1)

This dramatically improves load times. Later in functions like _CSVSearch(...). Then consider scanning the full string in the $pRecord and once you've identified a successful candidate. At that point do the _CSVRecordGetFields(). i.e.

For $j = 1 To 1   ;UBound($lFields) - 1 Step 1  ;1
    If $pMode = 1 Then
        If Not _CSVConvertFieldToString($lFields[$j], $pEnclose) Then
            SetError(3, @extended)
            Return 0
        EndIf
    EndIf
    ;If StringInStr($lFields[$j], $pSearchStr, $pCaseSense) Then
    If StringInStr($pRecords[$i], $pSearchStr, $pCaseSense) Then
        ReDim $lResult[UBound($lResult) + 1][4]
        Local $max = UBound($lResult)
        $lResult[$max - 1][0] = $i ; record num
        $lResult[$max - 1][1] = $j ; col num
        $lResult[$max - 1][2] = $pRecords[$i] ; record
        $lFields = _CSVRecordGetFields($pRecords[$i], $pDelimiter, $pEnclose)
        If @error Then
            SetError(2, @error)
            Return 0
        EndIf
        $lResult[$max - 1][3] = $lFields[$j] ; field
        ;The following is useful for a person who just wants one record:
        ;ExitLoop 2
        ;Consider adding in another parameter specifying a hitcount
    EndIf
Next

I'm not trying to be a nit, just wanted to pass along some possible improvements. Thanks for sharing with us Ivan. :mellow:

Edited by Xtraeme
Link to comment
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now
 Share

  • Recently Browsing   0 members

    • No registered users viewing this page.
×
×
  • Create New...