12 posts in this topic
I have a script that takes a large excel file, pulls out and reorganizes certain information I need, and spits out a trimmed down csv file which I uses to upload the information on my website. Some of this information contains characters with accents or em dashes. By default it would create a csv file in ANSI which I then uploaded but had to tell my website import system it was windows-1252 in order for it to look correct.
This was all working fine except now I need to add in a non-breaking space and non-breaking hyphen into parts of my output. At first I tried using ChrW(0xA0) and ChrW(0x2011) as replacements. A quick test in the console looked correct, however opening the csv output in notepad++ showed the space correctly but a ? for the hyphen and the file was still encoded as ANSI. I tried to view it as UTF-8 instead but this just made the space appear as xAO and also other characters appeared that way like my em dashes appeared as x97 and another symbol as xA7 etc.
If I instead do a convert to UTF-8 from notepad++ then those problems go away except the hyphen still displays as ?. I then noticed on the page I linked for the non-breaking hyphen it lists the UTF-8 hex as 0xE2 0x80 0x91 (e28091). I was unsure how to enter this in autoit but several things i tried all failed to get the hyphen inserted.
I need a way to get both the space and hyphen added correctly as either ANSI or UTF-8, but if it is UTF-8 then I need a way to convert all of the other data I extracted from the excel file.
I've included a test excel file with a single line and test script to create a csv demonstrating the problem.
Haven't had much time to code recently. However the following thread inspired me.
The debate about linear, parallel and binary search methods was rather interesting and, in an attempt to be diplomatic, I decided to combine @jchd's suggestion with @LarsJ's binary search example. I decided that the binary search algorithm required modification to make it more linear. As usual, 'if you invent something, it probably already exists and if it already exists, it exists for a reason'. My first attempt was not all that good. The code worked but was really a mess. I blame peer pressure (to post an example of a parallel search method). I will delete that old code in due course.
With a little memory jogging and a glance at the help file, the solution turned out to be quite easy: I just needed a better understanding of Euler. Further modification will be needed to work with more complicated unicode strings. The output could be returned as an array or a delimitered string. I'm not so interested in those details. I'm just going to post the algorithm for now and anyone, who wants to, can modify it to suit their needs. Both arrays must contain at least 1 element.
Local $aFoo = [0,1,2,3,4,5,6,7,9,10,11,12,13,14,15,16,19,20,23,24,26,30,35,39,40,41] Local $aBar = [0,1,5,6,7,8,9,10,11,12,13,14,17,18,19,21,24,25,26,27,34,35,38,40] ParallelExponetialSearch($aFoo, $aBar) ; Compares two lists - returning positive matches. Each input array must be unique (individually) and in alphabetical order. Func ParallelExponetialSearch($aFoo, $aBar) Local $sFind, _ $iMin_F = -1, $iMax_F = UBound($aFoo) -1, $Lo_F = $iMin_F, $Hi_F, _ $iMin_B = -1, $iMax_B = UBound($aBar) -1, $Lo_B = $iMin_B, $Hi_B While $iMin_F < $iMax_F And $iMin_B < $iMax_B ; Toggle Arrays - Which array has most untested elements? This is the one we want to search next, ; so we can bypass more comparisons because (in theory) mismatches have a greater chance of being skipped. If $iMax_F - $iMin_F >= $iMax_B - $iMin_B Then ; $aFoo has more (or an equal number of) untested elements $Hi_F = $iMax_F $iMin_B += 1 $sFind = $aBar[$iMin_B] While $Lo_F < $Hi_F ; search $aFoo For $i = 0 To Floor(Log($Hi_F - $Lo_F) / Log(2)) $Lo_F = $iMin_F + 2^$i If $aFoo[$Lo_F] = $sFind Then $iMin_F = $Lo_F ; each match should be added to the output [perhaps an array] ConsoleWrite($sFind & " found at $aFoo[" & $Lo_F & "] = $aBar[" & $iMin_B & "]" & @LF) ExitLoop 2 ElseIf $aFoo[$Lo_F] > $sFind Then $Hi_F = $Lo_F -1 $iMin_F += Floor(2^($i -1)) $Lo_F = $iMin_F ContinueLoop 2 EndIf Next $iMin_F = $Lo_F ; minimum increment is one WEnd Else ; $aBar has more untested elements $Hi_B = $iMax_B $iMin_F += 1 $sFind = $aFoo[$iMin_F] While $Lo_B < $Hi_B ; search $aBar For $i = 0 To Floor(Log($Hi_B - $Lo_B) / Log(2)) $Lo_B = $iMin_B + 2^$i If $aBar[$Lo_B] = $sFind Then $iMin_B = $Lo_B ; each match should be added to the output [perhaps an array] ConsoleWrite($sFind & " found at $aFoo[" & $iMin_F & "] = $aBar[" & $Lo_B & "]" & @LF) ExitLoop 2 ElseIf $aBar[$Lo_B] > $sFind Then $Hi_B = $Lo_B -1 $iMin_B += Floor(2^($i -1)) $Lo_B = $iMin_B ContinueLoop 2 EndIf Next $iMin_B = $Lo_B ; minimum increment is one WEnd EndIf WEnd EndFunc ;==> ParallelExponetialSearch I hope this will be useful to someone. I believe it deserved a thread of its own!
I have a csv file with delimiters "," and @CRLF
After trying several different examples (simple ones!) I still haven't resolved an answer
#include <Array.au3> #include <File.au3> ;#include "ArrayMultiColSort.au3" Global $BatchDir = "C:\ncat\" FileChangeDir($BatchDir) $sFile = "mod1.csv" ; Read file into a 2D array Global $aArray _FileReadToArray($sFile, $aArray, $FRTA_NOCOUNT, ",") ; And here it is _ArrayDisplay($aArray, "Original", Default, 8) Firstly it throws a MsgBox error "No array variable passed to function" _ArrayDisplay(), so no displayed data
Second and probably more important how do I get _FileReadToArray() to split the imported array into rows and columns?
I tried "," & @CRLF without success.
We can get a list of file using the below code.
Local $aFileList = _FileListToArray(@DesktopDir, "*") Is there any option to use the above one recursively to get sub folders and their contents also.??
And also, is there any way to serialize the above array locally to some file and load it later when we want in another program on another machine so that we can compare its contents with a folder in different machine, which is not network connected also.?
Good morning guys
I'd like to know if there is a way to convert a PDF in CSV or, eventually, in TXT, in order to read from it, like a database...
I have a PDF and I think ( I dind't search a lot on the forum ) with AutoIt, but I'd like work with Excel styles...
Does anyone know a good program which convert PDF to CSV?
PS: the PDF file is 5 MB, and it contains 439 pages...
Thanks everyone for the help