18 posts in this topic
Good morning guys
I'd like to know if there is a way to convert a PDF in CSV or, eventually, in TXT, in order to read from it, like a database...
I have a PDF and I think ( I dind't search a lot on the forum ) with AutoIt, but I'd like work with Excel styles...
Does anyone know a good program which convert PDF to CSV?
PS: the PDF file is 5 MB, and it contains 439 pages...
Thanks everyone for the help
I'd like to write a piece of tool that would allow me to update a certain field in our Active Directory from a comma separated csv file composed like this:
This file, automatically generated, can hold more than 10k lines.
Thus, I need column A to be in one variable, column B in a second one and column C in a third one.
I'm really missing this part as updating the AD is fairly easy once the 3 variable are populated.
I see things like this:
Here's my attempts at the moment:
#include <File.au3> #include <Array.au3> Global $csv_file = @DesktopDir & "\Book1.csv" Global $aRecords If Not _FileReadToArray($csv_file,$aRecords) Then MsgBox(4096,"Error", " Error reading log to Array error:" & @error) Exit EndIf For $x = 1 to $aRecords Msgbox(0,'Record:' & $x, $aRecords[$x]) ; Shows the line that was read from file $csv_line_values = StringSplit($aRecords[$x], ",",1) ; Splits the line into 2 or more variables and puts them in an array ; _ArrayDisplay($csv_line_values) ; Shows what's in the array you just created. ; $csv_line_values holds the number of elements in array ; $csv_line_values holds the value ; $csv_line_values holds the value ; etc Msgbox(0, 0, $csv_line_values) Next Any help on this please?
Thanks in advance
I've had issues in the past dealing with excel so I decided to cut out the middle man and build a script that would take any file that opens in excel (csv, xml, xls, etc) and convert it into an array so I can handle the raw data in a cleaner way. I used czardas' CSV parser to do this and added a simple save in excel to save it as a csv to parse.
Func _CreateCSV($fnImportFile) $oExcel = ObjCreate("Excel.Application") $oExcel.Visible=False $oBook= $oExcel.Workbooks.Open($fnImportFile) $oSheet=$oBook.ActiveSheet $fnMaster=@TempDir&"\"&Chr(Round(Random(0,24),0)+64)&Random(0,24)&Chr(Round(Random(0,24),0)+64)&Random(0,24)&Chr(Round(Random(0,24),0)+64)&".csv" ConsoleWrite($fnMaster&@CRLF) $oSheet.SaveAs($fnMaster, 6) $oBook.Close(False) $oExcel.Quit $aReturnArray=_CSVSplit(FileRead($fnMaster)) FileDelete($fnMaster) If not @error Then Return $aReturnArray Else Return -1 EndIf EndFunc ; #FUNCTION# ==================================================================================================================== ; Name...........: _CSVSplit ; Description ...: Converts a string in CSV format to a two dimensional array (see comments) ; Syntax.........: CSVSplit ( $aArray [, $sDelim ] ) ; Parameters ....: $aArray - The array to convert ; $sDelim - Optional - Delimiter set to comma by default (see 2nd comment) ; Return values .: Success - Returns a two dimensional array or a one dimensional array (see 1st comment) ; Failure - Sets @error to: ; |@error = 1 - First parameter is not a valid string ; |@error = 2 - Second parameter is not a valid string ; |@error = 3 - Could not find suitable delimiter replacements ; Author ........: czardas ; Comments ......; Returns a one dimensional array if the input string does not contain the delimiter string ; ; Some CSV formats use semicolon as a delimiter instead of a comma ; ; Set the second parameter to @TAB To convert to TSV ; =============================================================================================================================== Func _CSVSplit($string, $sDelim = ",") ; Parses csv string input and returns a one or two dimensional array If Not IsString($string) Or $string = "" Then Return SetError(1, 0, 0) ; Invalid string If Not IsString($sDelim) Or $sDelim = "" Then Return SetError(2, 0, 0) ; Invalid string $string = StringRegExpReplace($string, "[\r\n]+\z", "") ; [Line Added] Remove training breaks Local $iOverride = 63743, $asDelim ; $asDelim => replacements for comma, new line and double quote For $i = 0 To 2 $asDelim[$i] = __GetSubstitute($string, $iOverride) ; Choose a suitable substitution character If @error Then Return SetError(3, 0, 0) ; String contains too many unsuitable characters Next $iOverride = 0 Local $aArray = StringRegExp($string, '\A[^"]+|("+[^"]+)|"+\z', 3) ; Split string using double quotes delim - largest match $string = "" Local $iBound = UBound($aArray) For $i = 0 To $iBound -1 $iOverride += StringInStr($aArray[$i], '"', 0, -1) ; Increment by the number of adjacent double quotes per element If Mod ($iOverride +2, 2) = 0 Then ; Acts as an on/off switch $aArray[$i] = StringReplace($aArray[$i], $sDelim, $asDelim) ; Replace comma delimeters $aArray[$i] = StringRegExpReplace($aArray[$i], "(\r\n)|[\r\n]", $asDelim) ; Replace new line delimeters EndIf $aArray[$i] = StringReplace($aArray[$i], '""', $asDelim) ; Replace double quote pairs $aArray[$i] = StringReplace($aArray[$i], '"', '') ; Delete enclosing double quotes - not paired $aArray[$i] = StringReplace($aArray[$i], $asDelim, '"') ; Reintroduce double quote pairs as single characters $string &= $aArray[$i] ; Rebuild the string, which includes two different delimiters Next $iOverride = 0 $aArray = StringSplit($string, $asDelim, 2) ; Split to get rows $iBound = UBound($aArray) Local $aCSV[$iBound], $aTemp For $i = 0 To $iBound -1 $aTemp = StringSplit($aArray[$i], $asDelim) ; Split to get row items If Not @error Then If $aTemp > $iOverride Then $iOverride = $aTemp ReDim $aCSV[$iBound][$iOverride] ; Add columns to accomodate more items EndIf EndIf For $j = 1 To $aTemp If StringLen($aTemp[$j]) Then If Not StringRegExp($aTemp[$j], '[^"]') Then ; Field only contains double quotes $aTemp[$j] = StringTrimLeft($aTemp[$j], 1) ; Delete enclosing double quote single char EndIf $aCSV[$i][$j -1] = $aTemp[$j] ; Populate each row EndIf Next Next If $iOverride > 1 Then Return $aCSV ; Multiple Columns Else For $i = 0 To $iBound -1 If StringLen($aArray[$i]) And (Not StringRegExp($aArray[$i], '[^"]')) Then ; Only contains double quotes $aArray[$i] = StringTrimLeft($aArray[$i], 1) ; Delete enclosing double quote single char EndIf Next Return $aArray ; Single column EndIf EndFunc ;==> _CSVSplit ; #INTERNAL_USE_ONLY# =========================================================================================================== ; Name...........: __GetSubstitute ; Description ...: Searches for a character to be used for substitution, ie one not contained within the input string ; Syntax.........: __GetSubstitute($string, ByRef $iCountdown) ; Parameters ....: $string - The string of characters to avoid ; $iCountdown - The first code point to begin checking ; Return values .: Success - Returns a suitable substitution character not found within the first parameter ; Failure - Sets @error to 1 => No substitution character available ; Author ........: czardas ; Comments ......; This function is connected to the function _CSVSplit and was not intended for general use ; $iCountdown is returned ByRef to avoid selecting the same character on subsequent calls to this function ; Initially $iCountown should be passed with a value = 63743 ; =============================================================================================================================== Func __GetSubstitute($string, ByRef $iCountdown) If $iCountdown < 57344 Then Return SetError(1, 0, "") ; Out of options Local $sTestChar For $i = $iCountdown To 57344 Step -1 $sTestChar = ChrW($i) $iCountdown -= 1 If Not StringInStr($string, $sTestChar) Then Return $sTestChar EndIf Next Return SetError(1, 0, "") ; Out of options EndFunc ;==> __GetSubstitute
The code above is pretty junk (my part at least) so I wanted to improve it... This will return an array of arrays based on the excel file
#include <Array.au3> Func _GetExcelArrays($fnImportFile) $oExcel = ObjCreate("Excel.Application") $oExcel.Visible=False $oBook= $oExcel.Workbooks.Open($fnImportFile) $sheetCount=$oBook.Worksheets.Count Local $aReturnArray[$sheetCount] For $x=1 to $sheetCount $oSheet=$oBook.Worksheets($x) $oSheet.Activate $fnMaster=@TempDir&"\"&Chr(Round(Random(0,24),0)+64)&Random(0,24)&Chr(Round(Random(0,24),0)+64)&Random(0,24)&Chr(Round(Random(0,24),0)+64)&".csv" $oSheet.SaveAs($fnMaster, 6) $aReturnArray[$x-1]=_CSVSplit(FileRead($fnMaster)) FileDelete($fnMaster) Next $oBook.Close(False) $oExcel.Quit Return $aReturnArray EndFunc
I have searched and found many posts on READING CSV. What I need to do is WRITE CSV.
Typically I have a Listview, I use @guinness's excellent _GUICtrlListView_CreateArray to read that LV to Array. Currently I use _FileWriteFromArray to write the output. It is fast and it is easy.
The benefits of using these Array functions are their inherit flexibility. The code is portable. Plug it into LV report and you can write to file instantly. No formatting issues.
To be sure, my problem is not with the Listview or Array. My problem is with the file write to CSV: it should be fast, accurate and efficient.
Part of the problem is Microsoft Excel's insistence that my CSVs are not what they appear to be...
I am inspired by the simplicity of SQLITE3.EXE's dot command options:
.headers on .mode csv .once output.csv
Problem is that I want to export the text packed in characters, such as ' " ' with fields delimited by another character, such as ' , '.
Current method using _FileWriteFromArray outputs data like this
12-04-2016| 0.00| 131.00|131|Description 12-04-2016| 0.00| 132.00|132|Description 132 rece 12-04-2016| 998.00| 0.00|998|Description Receipt 12-04-2016| 0.00| 900.00|900|Description What I want to achieve is this:
"12-04-2016"," 0.00"," 131.00","131","Description" "12-04-2016"," 0.00"," 132.00","132","Description 132 rece" "12-04-2016"," 998.00"," 0.00","998","Description Receipt" "12-04-2016"," 0.00"," 900.00","900","Description" I have done this. The long way. I have written the output one $aResult[$i] item at a time, spaced with the $text & $field markers. I have placed this in an array with additional columns just to fit in the formatting. I have also played with the idea of inserting the array into a sqlite db, then run sqlite3's dot commands on that.
Is there a different method? Is there an efficient method to specify FileWriteFromArraytoCSV? Does anyone have ideas? I am sure I will know how to code this, I just need advice on the most efficient way of going about it. If I missed any threads, please enlighten me.
Thanks for reading
I have been playing around with AutoIt for a couple weeks now, just making random things for fun. There is one thing I cannot wrap my head around for the life of me..
Here is what I am wanting to do:
I have a csv file with this following data/columns:
(let's say this csv file is located C:\Test Folder\Test.csv)
So the csv is three rows(for this example) and has four columns.
I cannot figure out how to get this data out and store it into an array that I can use to type each column into 4 different word docs that are open. I do not need help with moving the mouse to each word doc, but i just don't know how to get the data out, put it into an array.
Once I have the data in an array, I know I can figure out how to loop through, but getting to that point is my problem.
Any help would be much appreciated.