Jump to content
ShawnW

Need help with File Encoding

Recommended Posts

ShawnW

I have a script that takes a large excel file, pulls out and reorganizes certain information I need, and spits out a trimmed down csv file which I uses to upload the information on my website. Some of this information contains characters with accents or em dashes. By default it would create a csv file in ANSI which I then uploaded but had to tell my website import system it was windows-1252 in order for it to look correct.

This was all working fine except now I need to add in a non-breaking space and non-breaking hyphen into parts of my output. At first I tried using ChrW(0xA0) and ChrW(0x2011) as replacements. A quick test in the console looked correct, however opening the csv output in notepad++ showed the space correctly but a ? for the hyphen and the file was still encoded as ANSI. I tried to view it as UTF-8 instead but this just made the space appear as xAO and also other characters appeared that way like my em dashes appeared as x97 and another symbol as xA7 etc.

If I instead do a convert to UTF-8 from notepad++ then those problems go away except the hyphen still displays as ?. I then noticed on the page I linked for the non-breaking hyphen it lists the UTF-8 hex as 0xE2 0x80 0x91 (e28091). I was unsure how to enter this in autoit but several things i tried all failed to get the hyphen inserted.

I need a way to get both the space and hyphen added correctly as either ANSI or UTF-8, but if it is UTF-8 then I need a way to convert all of the other data I extracted from the excel file.

I've included a test excel file with a single line and test script to create a csv demonstrating the problem.

test.xlsx

test.au3

Share this post


Link to post
Share on other sites
Subz

What about:

#include <Array.au3>
#include <Excel.au3>

$excelFile = FileOpenDialog("Select test file", @DesktopDir, "Excel files (*.xls*)")

$oExcel = _Excel_Open(False, False, False, False, True)
$oWorkbook = _Excel_BookOpen($oExcel, $excelFile, True, False)
$table = _Excel_RangeRead($oWorkbook, Default, Default, 1, True)
_Excel_BookClose($oWorkbook)

$table[0][0] &= ". pp." & ChrW(0xA0) & $table[0][1] & ChrW(0x2014) & $table[0][2]
_ArrayColDelete($table, 2)
_ArrayColDelete($table, 1)

$sTable = _ArrayToString($table, ",")
$hFileOpen = FileOpen(@DesktopDir & "\test.csv", 258)
FileWrite($hFileOpen, $sTable)
FileClose($hFileOpen)

 

Share this post


Link to post
Share on other sites
ShawnW

Thanks for the suggestion.

The first problem there is that 0x2014 is the Em Dash that although present in the excel file data, is not what i need when I add the hyphen in that line. I need the shorter dash like on the keyboard except one that is treated as part of the word when wrapping. I'm thinking it is 0x2011 but I cannot get that to display. In researching the dashes I should honestly be using an En Dash (0x2013) for showing a range of numbers, but that breaks in wrapping, and I didn't get to set the requirements for this, I just have to come up with the solution.

Secondly, FileWrite on the array does add commas for the csv but that is all it does. It does not add quotes around cells that contain a comma already, and if it did, it wouldn't escape existing quotes either. I didn't have a line like this in my example because I did not consider a different way of creating the csv, but I do have lines that would require a more accurate csv creation.

Share this post


Link to post
Share on other sites
Subz

What happens if you use the following after Excel has created the file?  Not sure about the hyphen, if I get a chance will look into it further.

$hFileOpen = FileOpen(@DesktopDir & "\test.csv", 257)
FileClose($hFileOpen)

 

Share this post


Link to post
Share on other sites
ShawnW

That didn't seem to change the encoding. I did play with it some more though based on this idea. It seems neither that nor writing to the file in that method changed it to UTF8. I did find that if i read it, then open it in overwrite mode it and write it then it changes to UTF8.

$str = FileRead(@DesktopDir & "\test.csv")
$hFileOpen = FileOpen(@DesktopDir & "\test.csv",130)
FileWrite($hFileOpen, $str)
FileClose($hFileOpen)

But that still leaves the problem of the hyphen.

Share this post


Link to post
Share on other sites
Subz

Sorry was in a rush to get out the door this morning, have you tried uploading the file using ChrW(0x2011)?  If you open it in Excel the spreadsheet renders it correctly, in Notepad it will show special characters because not all fonts support ChrW(0x2011) for example Sans-Serif.

Share this post


Link to post
Share on other sites
jchd

I highly recommend DejaVu Sans Mono to be used in Notepad++ (and Scite as well). It covers a large range of Unicode and is particularly readable for programming and general text edit. Both mentionned codepoints show correctly, among many others.


This wonderful site allows debugging and testing regular expressions (many flavors available). An absolute must have in your bookmarks.
Another excellent RegExp tutorial. Don't forget downloading your copy of up-to-date pcretest.exe and pcregrep.exe here
RegExp tutorial: enough to get started
PCRE v8.33 regexp documentation latest available release and currently implemented in AutoIt beta.

SQLitespeed is another feature-rich premier SQLite manager (includes import/export). Well worth a try.
SQLite Expert (freeware Personal Edition or payware Pro version) is a very useful SQLite database manager.
An excellent eBook covering almost every aspect of SQLite3: a must-read for anyone doing serious work.
SQL tutorial (covers "generic" SQL, but most of it applies to SQLite as well)
A work-in-progress SQLite3 tutorial. Don't miss other LxyzTHW pages!
SQLite official website with full documentation (may be newer than the SQLite library that comes standard with AutoIt)

Share this post


Link to post
Share on other sites
ShawnW
15 hours ago, Subz said:

Sorry was in a rush to get out the door this morning, have you tried uploading the file using ChrW(0x2011)?  If you open it in Excel the spreadsheet renders it correctly, in Notepad it will show special characters because not all fonts support ChrW(0x2011) for example Sans-Serif.

If try and upload it still shows as '?'. Also, I don't get the same result after opening in Excel 2016. Running on test file and opening in Excel shows the '?' on both the UTF8 and ANSI versions.

I tried something simpler and copied the actual hyphen and just added it in string form. It looks correct in the scite editor, and if I output it to the console when running it shows correctly there as well. Yet after Excel writes it to the csv it breaks, and no matter where I open it, or what font I use, it shows as a '?'.

It seems the problem is in the way excel makes the csv, I may just do a more complex csv manual creation building on your 1st example. It seems the display problem i had with the hyphen after writing the file without excel was indeed the font problem you refer to. If I can't do it that way though I may just have to fight every OCD bone in my body and add it as the html code &#x2011; since it is only ever going to be displayed on a webpage that way.

Share this post


Link to post
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now

  • Similar Content

    • MadhaN
      By MadhaN
      Hi all,
      I have a csv file as below, I wand to find srno from csv and send corresponding ip and pass to commend cmd prompt. 
      Please guide me to create script .
      srno,name,ip,pass
      1,name1,ip1,pass1
      2,name2,ip2,pass2
       
       
       
       
    • ahha
      By ahha
      I think this is a very basic question, but I'm stumped after trying to solve it for weeks.  The program below illustrates the issue.  I have several instances of Excel open, each instance having several books open, each book with several sheets.  I'm able to list all this information, however I can't seem to figure out the sheet and workbook for a user selected range.  Any hints appreciated because at this point as I feel like a blind squirrel looking for a nut
       
      #AutoIt3Wrapper_run_debug_mode=Y #include <Array.au3> #include <Excel.au3> #include <MsgBoxConstants.au3> #include <Debug.au3> ;Illustrate issue I'm having. For a user seletecd range (possibly multiple $oWorkbooks open with multiple sheets), I need to determine the Excel application object of the selected cells and the sheet ;I need $oWorkbook, $WorkSheet, $Range ;Simulate issue - in real world user may have opened Excel and I have no knowledge of the object $oExcel1 = _Excel_Open() ;open first instance _Excel_BookNew($oExcel1) ;workbook with 3 sheets _Excel_BookNew($oExcel1) ;another workbook in same instance with 3 sheets $oExcel2 = _Excel_Open(Default, Default, Default, Default, True) ;open second instance _Excel_BookNew($oExcel2) ;workbook with 3 sheets _Excel_BookNew($oExcel2) ;another workbook in same instance with 3 sheets $oExcel3 = _Excel_Open(Default, Default, Default, Default, True) ;open third instance _Excel_BookNew($oExcel3) ;workbook with 3 sheets _Excel_BookNew($oExcel3) ;another workbook in same instance with 3 sheets ;now here's what I know without a priori knowledge of the objects ;the workbook names are unigue - that is there will never be a Book1 in any but one of the instances or filename (i.e. single open instance of a particular file) $aWorkBooks = _Excel_BookList() ;get an array of all workbooks open ;Success: a two-dimensional zero based array with the following information: ;col 0 - Object of the workbook ;col 1 - Name of the workbook/file ;col 2 - Complete path to the workbook/file If @error Then Exit MsgBox($MB_SYSTEMMODAL, "Error listing workbooks.", "@error = '" & @error & "'" & @CRLF &"@extended = '" & @extended & "'") _DebugArrayDisplay($aWorkBooks, "List of all workbooks open. Col 0 = Object, Col 1 = workbook name, Col 2 = full filename path") ;at this point we have the Object associated with the book name but no full filename path as not saved yet ;now list the sheets for each Object Workbook For $i = 0 to UBound($aWorkBooks, $UBOUND_ROWS) - 1 ;0 based $aWorkSheets = _Excel_SheetList($aWorkBooks[$i][0]) ;Col 0 = Workbook object ;Success: a two-dimensional zero based array with the following information: ; 0 - Name of the worksheet ; 1 - Object of the worksheet If @error Then Exit MsgBox($MB_SYSTEMMODAL, "Error listing Worksheets.", "@error = '" & @error & "'" & @CRLF & "@extended = '" & @extended & "'") _ArrayDisplay($aWorkSheets, "$aWorkSheets for $aWorkBooks[" & $i & "]") Next MsgBox($MB_SYSTEMMODAL + $MB_OK, "Info", "Select a range in any Excel instance, any Workbook, and any sheet. Then click OK.") ;I have spent weeks trying to figure this out. Looked at Water's UDF (excellent tight code) and got nothing using a default. All need $oExcel ;********** all this is attempts to get it and they all failed ;********** ;from this: https://www.autoitscript.com/autoit3/docs/functions/ObjGet.htm ;found a possible clue in comment "Error Getting an active Excel Object. <------- **ACTIVE** - so try it Local $oDefaultActiveExcelObject = ObjGet("", "Excel.Application") ; Get an existing Excel Object If @error Then MsgBox($MB_SYSTEMMODAL, "DEBUG", "Error Getting an active Excel Object. Error code: " & Hex(@error, 8)) Else MsgBox($MB_SYSTEMMODAL, "DEBUG", "Success - we got an active Excel Object") EndIf ;Now I have the object so get the rest of the info. We could check this instance against the opened ones. ;hard coded for testing. If $oDefaultActiveExcelObject = $oExcel1 Then MsgBox($MB_SYSTEMMODAL, "DEBUG", "$oExcel1 is the active Excel Object") Else If $oDefaultActiveExcelObject = $oExcel2 Then MsgBox($MB_SYSTEMMODAL, "DEBUG", "$oExcel2 is the active Excel Object") Else If $oDefaultActiveExcelObject = $oExcel3 Then MsgBox($MB_SYSTEMMODAL, "DEBUG", "$oExcel3 is the active Excel Object") Else MsgBox($MB_SYSTEMMODAL, "DEBUG", "ERROR - I have no idea what the active Excel Object is.") EndIf EndIf EndIf ;go ahead and get information MsgBox($MB_SYSTEMMODAL, "DEBUG", "$oDefaultActiveExcelObject.ActiveWorkbook.Name = '" & $oDefaultActiveExcelObject.ActiveWorkbook.Name & "'") ; <<<<<<<<<<<<<------------ this picked the wrong one. **So it looks like each instance has an active workbook.** ;At this point I'm really stumped. I probably should submit to the experts. ;I need to find $oExcel, $oWorkbook, $vWorkSheet, for the user selected range because I want to use ;$vRange = _Excel_RangeRead($oWorkbook, $vWorksheet, $oExcel.Selection.Address) ;this returns absolute like $D$3:$E$5 UNLESS it's a single cell then it returns only single absolute like $E$2 $vRange = _Excel_RangeRead("Book4", "Sheet2", "C2") ;this returns absolute like $D$3:$E$5 UNLESS it's a single cell then it returns only single absolute like $E$2 MsgBox(0, "Info", "The name of the active sheet is '" & $oExcel1.ActiveSheet.Name & "'") ;still need application object $oExcel1 MsgBox($MB_SYSTEMMODAL, "Info", "$vRange = '" & $vRange & "'") ;knowing $oExcel instance might be helpful ;$vRange = $oExcel.Selection.Address ;this returns absolute like $D$3:$E$5 UNLESS it's a single cell then it returns only single absolute like $E$2 ;I need to know the $oExcel ;I don't think I can use _Excel_BookAttach in any way as I need to know in advance a string, a filename, or an instance ;Au3Info not showing any distinctions - I'm stuck. MsgBox($MB_SYSTEMMODAL, "Info", "Pause before exit.") Exit  
    • XinYoung
      By XinYoung
      HI! ... this is a big one (at least for me) 
      You guys previously helped me copy the used range in column A and paste them into a Website one at a time in a loop. Cool! Now, for another function, I have 2 columns, A and B, and two input boxes in the Website. I'm having a hard time replicating the loop for the 2 columns. 
      This is how I'm opening the Excel workbook (copied from the previous function that only had 1 column). I need to also get the used range in column B.
      Func OpenExcelForCopy() Global $aBBTableData Global $oExcel = _Excel_Open() Global $oWorkbook = _Excel_BookOpen($oExcel, $ChosenFileName, Default, True, True) $oExcel.Sheets("CopyCourses").Activate ;~ Get all used cells in column A:A Global $aSearchItems = _Excel_RangeRead($oWorkbook, 1, $oWorkbook.Sheets("CopyCourses").Usedrange.Columns("A:A")) ;~ Duplicate the $aSearchItems Array Global $aSearchResult = $aSearchItems ;~ Loop through the array starting at 0 until the end of the array which is (Ubound($aSearchItems) - 1) For $i = 0 To UBound($aSearchItems) - 1 $aSearchResult[$i] = SearchCourseForCopy($aSearchItems[$i]) Next _Excel_RangeWrite($oWorkbook, Default, $aSearchResult, "C1") Finished() EndFunc ;==>OpenExcelForCopy Then we eventually get here. I don't think anything needs to change here but I'm not sure. This is where I paste the data from Column A into an input field (which is a search tool in a website). If the search is good, then we get to the tricky part...
      ;~ OK, we logged in and we searched for a course. Lets COPY it! Func CopyCourseBegin() Local $sResult $iSearchIndex = _ArraySearch($aBBTableData, "Course ID", 0, 0, 0, 1, 1, 0) ;~ If the course was not found, do this. If $iSearchIndex = -1 Then ;~ MsgBox(4096, "Search Error", "Item not found") $sResult = "Source Not Found" _Excel_RangeWrite($oWorkbook, Default, $aSearchResult, "C1") ;~ Now go back to the Excel sheet and search for the next one. ;~ If the course was found, begin the COPY process. Else For $i = 0 To UBound($aSearchItems) - 1 $aSearchResult[$i] = CopyCourseNow($aSearchItems[$i]) Next $sResult = "Copied" _Excel_RangeWrite($oWorkbook, Default, $aSearchResult, "C1") EndIf Return $sResult EndFunc ;==>CopyCourseBegin This is the "tricky part" where I'm confused. I can copy and paste what's in column A just fine, but I can't manage to replicate it for column B. I need to paste whats in Column B into "destinationCourseId"
      ;~ The course search was successful. COPY the course now. Func CopyCourseNow($_sSearchResult) ;~ Navigate to the course copy page. _IENavigate($oIE, $urlBBCourseCopy) ;~ Copy the SOURCE course ID from the Excel sheet ;~ Paste whats copied from column A into the Source Course ID text box Local $oForm = _IEGetObjByName($oIE, "selectCourse") Local $oSearchString = _IEFormElementGetObjByName($oForm, "sourceCourseId") _IEFormElementSetValue($oSearchString, $_sSearchResult) ;~ Paste whats copied from column B into the Destination Course ID text box ?!?!?!?! Local $oForm = _IEGetObjByName($oIE, "selectCourse") Local $oSearchString = _IEFormElementGetObjByName($oForm, "destinationCourseId") _IEFormElementSetValue($oSearchString, $_sSearchResult) ;~ Just exit cause im stuck :( _Exit() EndFunc ;==>CopyCourseNow After I paste the data from column A into "sourceCourseId" and column B into "destinationCourseId", I'll make it do some stuff. Then I need it to loop around until the used ranges in column A & B is finished.
      Does the entire code need to change now that there's two columns?
       
       
    • AnonymousX
      By AnonymousX
      Hello,
      I'm trying to be able to switch back and forth between multiple excel spreadsheets and I can't seem to get the WinActivate function to work, and bring the desired window the be the active window.
      Could I please get some assistance, I've tried a few things and nothing seems to work quite right. Below is a test case where I'm just trying to make the first excel sheet that was opened become the active window, and testing it by grabbing a cell value off that workbook. The message box produces the correct answer if both files are closed before running but the 2nd test file will appear to be the active window. If the code is run again without closing the excel files, nothing works (file does not appear to be active and message box will not give an answer).
       
      #include <Excel.au3> Opt("WinTitleMatchMode", 2) ;1=start, 2=subStr, 3=exact, 4=advanced, -1 to -4=Nocase ;Open Test1 Excel Workbook local $oExcel = _Excel_open() Local $ofile = @ScriptDir & "\test1.xlsx" Local $oWorkbook = _Excel_BookOpen($oExcel,$ofile) ;Open Test2 Excel Workbook local $mExcel = _Excel_open() Local $mfile = @ScriptDir & "\test2.xlsx" Local $mWorkbook = _Excel_BookOpen($mExcel,$mfile) ; This workbook is completely blank WinActivate($oWorkbook); should make Test1 the active window local $read1 = _Excel_RangeRead($oWorkbook,Default,"B2"); Cell B1 in Test1 workbook contains the word Test MsgBox(0,0,$read1);Should returns the word test  
    • nooneclose
      By nooneclose
      I want to check some Excel data against data on a website in Chrome. I use Chrome because the site I use does not function properly in Internet Explorer or Firefox. I know how to do the Excel stuff I just can not figure out how to send to Chrome, let alone check to see if the data matches or not. I am also having trouble finding any help online while searching for Chrome functions for Autoit. I have a Chrome UDF installed but I still can not figure out how to get my code to properly function. (I am not posting code because I am  sure my code isn't right, to begin with)
      As usual, any and all help would be greatly appreciated. 
×