turbox Posted August 23, 2008
Does anybody know of a script, or how to write a script, that converts a file from ANSI to Unicode?
zFrank Posted August 23, 2008
You can easily convert your script from ANSI to Unicode using the AutoIt compiler. It can be found at C:\Program Files\AutoIt3\Aut2Exe. Aut2Exe.exe is for Unicode and Aut2ExeA.exe is for ANSI...
zFrank Posted August 23, 2008
What type of file? exe or au3?
AdmiralAlkex Posted August 23, 2008
What about using FileOpen() to read the file as ANSI and then writing it back as Unicode? Or do you want to do something else?
turbox Posted August 24, 2008
Actually, it is a .dat file. I tried FileOpen(x, 32) but it doesn't work. It reads only half of the file, but if I save the file as Unicode then it reads all of it.
SmOke_N (Moderator) Posted August 24, 2008 (edited August 24, 2008 by SmOke_N)
If you are trying to convert an ANSI file to Unicode, then you wouldn't open it in Unicode mode; you'd open it with mode 0 or just use FileRead().
http://www.autoitscript.com/forum/index.ph...ic=21815&hl may help you.
turbox Posted August 25, 2008
I posted the file that I want to read: settings.txt
Andreik Posted August 25, 2008
Try this:
; Read the source file in the default (ANSI) read mode
$FILE_OPEN = FileOpen(@ScriptDir & "\SETTINGS.TXT",0)
$DATA = FileRead($FILE_OPEN)
FileClose($FILE_OPEN)
; Write it back out: 32 = Unicode, 2 = overwrite
$FILE_WRITE = FileOpen(@ScriptDir & "\Unicode_settings.txt",32+2)
FileWrite($FILE_WRITE,$DATA)
FileClose($FILE_WRITE)
turbox Posted August 25, 2008
It reads until Ύ and then stops. Only when I save it as Unicode can it be read.
PsaltyDS Posted August 25, 2008
Quoting turbox: "It reads until Ύ and then stops. Only when I save it as Unicode can it be read."
If there are any null characters in the ANSI version, then you will have to read it in binary and remove or replace the nulls; then you can do BinaryToString() and save it in any format you want.
Here is a similar situation where nulls had to be removed: Demo
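The steps described in the post above (binary read, strip nulls, BinaryToString(), save) could be sketched roughly as follows. This is only an illustration, not a tested solution for the attached file: the file names are placeholders, and it assumes the source really is ANSI text with stray null bytes that are safe to drop.

```autoit
; Sketch only. File names are placeholders; adjust to your own files.
Local $sSource = @ScriptDir & "\settings.txt"
Local $sTarget = @ScriptDir & "\Unicode_settings.txt"

; Mode 16 = binary read, so reading does not stop at the first null byte.
Local $hIn = FileOpen($sSource, 16)
If $hIn = -1 Then Exit 1
Local $bData = FileRead($hIn)
FileClose($hIn)

; String() on binary data gives its hex text ("0x...").
; Walk that text two characters (one byte) at a time and drop every "00" pair.
; The "0x" prefix is itself a two-character pair, so it is kept and alignment is preserved.
Local $sHex = StringRegExpReplace(String($bData), "(00)|(.{2})", "\2")

; Turn the hex text back into binary, then into a string (flag 1 = treat bytes as ANSI).
Local $sText = BinaryToString(Binary($sHex), 1)

; Mode 2 + 32 = overwrite, write as Unicode (UTF-16 LE).
Local $hOut = FileOpen($sTarget, 2 + 32)
If $hOut = -1 Then Exit 2
FileWrite($hOut, $sText)
FileClose($hOut)
```

Dropping the nulls changes the content, so if the nulls carry meaning in the file format, replace them with a marker instead of deleting them.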
rudi Posted August 25, 2008
Hi.
I've read through your demo, thanks.
I'm just wondering: why is Chr(0) possible/allowed in ANSI, but not in Unicode? Or, asked differently: what represents Chr(0) in Unicode?
And: by replacing Chr(0) with "<null>", isn't the content of the file being improperly altered? (I don't know, I'm just asking.)
Regards, Rudi.
PsaltyDS Posted August 25, 2008
Quoting rudi: "Why is Chr(0) possible/allowed in ANSI, but not in Unicode? Or, asked differently: what represents Chr(0) in Unicode? And: by replacing Chr(0) with "<null>", isn't the content of the file being improperly altered?"
It has nothing to do with being "allowed" in ANSI. Chr(0) indicates EOF (End Of File) to AutoIt when encountered in a string, so it stops processing the string (or file) at that point. Reading the file in binary avoids that issue and gets the whole file read into memory, so that you can do something with the nulls before continuing string processing.
Demo:
$sStart = ":Start:"
$sEnd = ":End:"
$sString = $sStart & Chr(0) & $sEnd
ConsoleWrite("Length = " & StringLen($sString) & @LF) ; 13 characters long
ConsoleWrite("$sString = " & $sString & @LF) ; String seems to end at the null
ConsoleWrite(@LF) ; Extra LF because the previous one gets cut off by the null
SmOke_N (Moderator) Posted August 25, 2008
Just to reiterate what Psalty is saying:
Local $s_string = "I am a " & Chr(0) & "string with " & Chr(0) & "nulls"
MsgBox(64, "Info", "$s_string = " & $s_string)
Local $s_binary = StringToBinary($s_string)
Local $s_strip_nulls = StringRegExpReplace($s_binary, "(00)|(.{2})", "\2")
Local $s_convert_non_null = BinaryToString($s_strip_nulls)
MsgBox(64, "Info", _
    "$s_binary = " & $s_binary & @CRLF & @CRLF & _
    "$s_strip_nulls = " & $s_strip_nulls & @CRLF & @CRLF & _
    "$s_convert_non_null = " & $s_convert_non_null)
rudi Posted August 25, 2008
Hi.
1.) OK, I can see that AutoIt treats even longer strings as null-terminated.
2.) When a given ANSI file includes Chr(0) characters and is to be converted to Unicode, I would have expected those Chr(0) characters to be converted to Unicode as well. Stripping them out of the file alters the file's content, doesn't it?
3.) I don't really get the regex:
Quoting SmOke_N: StringRegExpReplace($s_binary, "(00)|(.{2})", "\2")
I can see that it works, so let me try to understand it:
"(00)" seems to represent Chr(0). From the help file I thought the syntax should be "\x##", which would give "\x00"?
"|" means "or". (OK)
"(.{2})": I don't get this one. "." = any character, {2} = repeated exactly 2 times?
And why "\2"? That is a back-reference to match #2, isn't it?
Honestly, you lose me here.
Regards, Rudi.
PsaltyDS Posted August 26, 2008
Quoting rudi: "2.) When a given ANSI file includes Chr(0) characters and is to be converted to Unicode, I would have expected those Chr(0) characters to be converted to Unicode as well. Stripping them out of the file alters the file's content, doesn't it?"
Yes, if you wanted to preserve file formatting with nulls, you would have to substitute a marker for the nulls, convert to Unicode, then put the nulls back. This would actually be easier with StringSplit($sString, Chr(0)). You just put the null back when you reassemble the string from the array with _ArrayToString().

Quoting rudi: "3.) I don't really get the regex: [...] And why "\2"? That is a back-reference to match #2, isn't it?"
The regex is working on hex digits rather than characters. The "(00)|(.{2})" means match 00 or any two hex digits (one byte). If it matches 00, then the back-reference to that match is "\1", and "\2" is nothing, because the rest of the alternatives don't get evaluated once there is a match. Therefore, 00 gets replaced with nothing.
In the case where there is something else there (e.g. 58 for 'X'), the first part doesn't match, but the second part does, because .{2} means any two digits. So "\1" is nothing and "\2" is 58, and 58 gets replaced with 58.
That last step may seem like a waste of time, but consider the string 'P' & @LF, which is 500A in hex. If you don't make sure everything is handled two digits at a time, the 00 in the middle gets removed and you wind up with 5A ('Z').
SmOke_N is such a geek...
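The 'P' & @LF case from the explanation above can be tried directly on the hex text. A small sketch, assuming only the standard StringReplace() and StringRegExpReplace() functions; the variable names are made up for illustration:

```autoit
; Hex text for 'P' & @LF: byte 50 followed by byte 0A. Neither byte is a null.
Local $sHex = "500A"

; Naive replace ignores byte boundaries: it finds "00" straddling the two bytes.
ConsoleWrite("Naive:   " & StringReplace($sHex, "00", "") & @LF) ; 5A, i.e. 'Z' (wrong)

; Pairwise regex consumes two hex digits per step, so "50" and "0A" are each kept whole.
ConsoleWrite("Aligned: " & StringRegExpReplace($sHex, "(00)|(.{2})", "\2") & @LF) ; 500A (unchanged)

; With a real null byte in the middle ('P', null, 'A'), only the aligned "00" pair should be
; dropped, leaving 5041 according to the explanation above.
ConsoleWrite("Null:    " & StringRegExpReplace("500041", "(00)|(.{2})", "\2") & @LF)
```

The second alternative exists only to keep the match position aligned on byte boundaries; it copies every non-null byte through unchanged.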
rudi Posted September 4, 2008
Quoting PsaltyDS: "That last step may seem like a waste of time, but consider the string 'P' & @LF, which is 500A in hex. If you don't make sure everything is handled two digits at a time, the 00 in the middle gets removed and you wind up with 5A ('Z')."
Back from a few days of holiday (France, from Colmar down to the Côte d'Azur), I find your reply: really well explained, thanks.
BTW: even though I prefer freeware where available, I've spent some bucks on RegexBuddy. (It's just a pity that such an ingenious tool doesn't seem to be available for free, at least so far.) It makes it very easy to understand the regex examples found here and in other places. I'm now working through several examples, even complex ones, with ease.
Regards and thanks again, Rudi.