Ahmedmb Posted March 26, 2005
How do I write a UTF-8 text file?

somh Posted October 7, 2005
I have the same question: how do I write a UTF-8 text file?

peethebee Posted October 7, 2005
Hi! I don't think that it is possible :-(.
peethebee

footswitch Posted April 10, 2006
Quote (Lar): Where is the UTF-8 text coming from? A file? I could demonstrate some binary file read/write... if I knew some specifics.
Hello. In my case, the UTF-8 text comes from .txt files saved from the standard Windows Notepad in UTF-8 format. I need this format in order to read variables into Macromedia Flash correctly.
A UTF-8 file read with FileReadLine() returns strange characters at the start of the first line, and every character outside the a...z, A...Z, 0...9 range comes out wrong. I use characters like à À Á á é í ó ú â ã ...
Can you provide that binary file read/write demonstration, please?
Thanks in advance for your support.

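The symptoms described here are consistent with UTF-8 bytes being interpreted one byte at a time in a single-byte code page. A minimal Python sketch of that effect, assuming Notepad wrote a UTF-8 byte order mark (EF BB BF) at the start of the file and using Latin-1 as a stand-in for the ANSI code page (both are assumptions, not details given in the post):

```python
# Sample line; the characters are taken from the post, the variable names are illustrative.
text = "à À Á á é"

# "utf-8-sig" prepends the byte order mark EF BB BF (assumed to match Notepad's UTF-8 save).
raw = text.encode("utf-8-sig")

# Reading the bytes as a single-byte code page (Latin-1 here, an assumption)
# turns the BOM into three stray characters and each accented letter into two.
garbled = raw.decode("latin-1")

# Decoding as UTF-8 and dropping the BOM recovers the original text.
restored = raw.decode("utf-8-sig")

print(raw[:3].hex(" "))    # the three BOM bytes
print(repr(garbled[:3]))   # the stray characters at the start of the first line
print(restored == text)
```

If this is what is happening, the fix is to decode the raw bytes as UTF-8 and skip the three leading bytes, not to read the file as ordinary text.
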
___Kevin___ Posted April 21, 2006
I would like to know how to read UTF-8 text, too...
Kevin

___Kevin___ Posted October 23, 2006
Quote (footswitch): can you provide that binary file read/write demonstration, please?
Any news on that? Is it still not possible to read UTF-8 encoded text files?
Thanks,
Kevin.

Lazycat Posted October 23, 2006
Quote (___Kevin___): Any news on that? Is it still not possible to read UTF8 encoded text files?
Since AutoIt supports binary strings, this is possible. See this topic: http://www.autoitscript.com/forum/index.php?showtopic=21815

svkhtn Posted October 23, 2006
I am also interested in this, but haven't found any solution yet.
1. I created a UTF-8 text file, test.txt, with Notepad. The file contains this line: "Kiểm tra Tiếng Việt."
2. Read it in raw mode.
3. Used ControlSetText to send it back to the open Notepad window (test.txt is currently open), but the text appeared wrong.

    $strFile = "test.txt"
    $file = FileOpen($strFile, 4) ; 4 = raw mode
    $text = FileRead($file, FileGetSize($strFile)) ; read the whole file
    FileClose($file)
    ControlSetText("test.txt", "", "Edit1", $text) ; put the text into Notepad's edit control

Do I need to convert $text before sending it back to Notepad with ControlSetText? If so, how can I convert it?
Many thanks!

sulfurious Posted October 23, 2006
UTF-8, friend or foe?
Here is an ASCII string: "I am not"
Here is the ASCII hex:
49 20 61 6D 20 6E 6F 74
Now UTF-8, being a variable-length encoding, uses 8 to 32 bits per character. The above string would, in hex, be exactly the same.
UTF-16 uses 16 bits per character here, so the above would look like this:
49 00 20 00 61 00 6D 00 20 00 6E 00 6F 00 74 00
The problem is that UTF-8 works in 8-bit units, so an extended character is spread over several bytes. Consider the sample string "Kiểm tra Tiếng Việt". In UTF-16, it would be:
4B 00 69 00 C3 1E 6D 00 20 00 74 00 72 00 61 00 20 00 54 00 69 00 BF 1E 6E 00 67 00 20 00 56 00 69 00 C7 1E 74 00
Look at the same string in UTF-8, and see:
4B 69 E1 BB 83 6D 20 74 72 61 20 54 69 E1 BA BF 6E 67 20 56 69 E1 BB 87 74
Meanwhile, ASCII has no extended characters, so you would see the string as "Ki?m tra Ti?ng Vi?t" and the hex as:
4B 69 3F 6D 20 74 72 61 20 54 69 3F 6E 67 20 56 69 3F 74
So, is UTF-8 friend or foe? In the world of AutoIt, I would consider it a foe. The reason? AutoIt has no true Unicode functionality. With a UTF-16 file, you can check for extended characters because you will definitely see them in the hex. For example, FF is the highest value for the first byte of a non-extended UTF-16 character, and the second byte is always 00. A second byte that is NOT 00, meaning a value above FF, indicates an extended character.
How are you going to tell a parsing routine that a certain hex value is extended? Look at UTF-8 in a hex editor. Short of knowing what the text is, or seeing the text representation on the side, you would not know. There is no marker that I know of that you could use to change your script logic. UTF-16 at least gives you the capability to check bits.
later,
Sul

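On the question of a marker: the UTF-8 byte layout does mark multi-byte characters in the high bits of each byte. Bytes below 0x80 are plain ASCII, a lead byte of the form 110xxxxx, 1110xxxx, or 11110xxx opens a 2-, 3-, or 4-byte sequence, and every continuation byte has the form 10xxxxxx. A rough Python sketch of a parser built on that rule, run on the UTF-8 hex quoted above (the function names are hypothetical, and continuation bytes are not validated):

```python
def utf8_sequence_length(lead):
    """Return how many bytes the character starting with this lead byte occupies."""
    if lead < 0x80:              # 0xxxxxxx: plain ASCII, one byte
        return 1
    if lead >> 5 == 0b110:       # 110xxxxx: two-byte sequence
        return 2
    if lead >> 4 == 0b1110:      # 1110xxxx: three-byte sequence
        return 3
    if lead >> 3 == 0b11110:     # 11110xxx: four-byte sequence
        return 4
    raise ValueError(f"0x{lead:02X} is not a valid UTF-8 lead byte")


def split_utf8(data):
    """Split a UTF-8 byte string into one bytes object per character."""
    chunks = []
    i = 0
    while i < len(data):
        n = utf8_sequence_length(data[i])
        chunks.append(data[i:i + n])
        i += n
    return chunks


# UTF-8 hex for "Kiểm tra Tiếng Việt", copied from the post.
data = bytes.fromhex(
    "4B 69 E1 BB 83 6D 20 74 72 61 20 54 69 E1 BA BF 6E 67 20 56 69 E1 BB 87 74"
)
chunks = split_utf8(data)
extended = [c for c in chunks if len(c) > 1]

print([c.hex(" ") for c in extended])       # byte groups flagged as extended
print([c.decode("utf-8") for c in extended])
```

In this sample every extended character starts with E1, which matches the 1110xxxx pattern, so a script can detect it by checking the top bits of the byte.
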
sulfurious Posted October 23, 2006
Hmm. Here is a test string in UTF-8:
I am not going to give in to Kiểm tra Tiếng Việt
à À Á á é í ó ú â ã
And here is the hex for that:
49 20 61 6D 20 6E 6F 74 20 67 6F 69 6E 67 20 74 6F 20 67 69 76 65 20 69 6E 20 74 6F 20 4B 69 E1 BB 83 6D 20 74 72 61 20 54 69 E1 BA BF 6E 67 20 56 69 E1 BB 87 74 0D 0A C3 A0 20 C3 80 20 C3 81 20 C3 A1 20 C3 A9 20 C3 AD 20 C3 B3 20 C3 BA 20 C3 A2 20 C3 A3
After looking at it some more, I am not sure how you could convert it. To make the character ể, it takes 3 hex values, E1 BB 83. I thought maybe there was a marker somewhere, but I don't see one. Convert it to UTF-16 and then you can manipulate it.
later,
Sul

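The suggested conversion to UTF-16 can be sketched in Python (not AutoIt; the variable names are illustrative), using the Vietnamese part of the hex above. Decoding the UTF-8 bytes and re-encoding them as little-endian UTF-16 without a byte order mark gives two bytes per character for this sample, after which the "second byte is not 00" check becomes possible:

```python
# UTF-8 bytes for "Kiểm tra Tiếng Việt", taken from the hex in the post.
utf8_bytes = bytes.fromhex(
    "4B 69 E1 BB 83 6D 20 74 72 61 20 54 69 E1 BA BF 6E 67 20 56 69 E1 BB 87 74"
)

# Decode to text, then encode as UTF-16 little-endian (no byte order mark).
text = utf8_bytes.decode("utf-8")
utf16_bytes = text.encode("utf-16-le")

# Pair up the bytes: (low byte, high byte) for each 16-bit unit.
units = [utf16_bytes[i:i + 2] for i in range(0, len(utf16_bytes), 2)]

# A unit whose high byte is not 00 is an extended character.
extended = [u.decode("utf-16-le") for u in units if u[1] != 0x00]

print(utf16_bytes.hex(" ").upper())
print(extended)
```

Note that characters outside the Basic Multilingual Plane take two 16-bit units in UTF-16, so the fixed two-byte pairing in this sketch only holds for text like this sample.
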
svkhtn Posted October 24, 2006
Hi sulfurious,
Thank you very much for your explanation. By the way, if I just want to read the text from a raw file (either UTF-8 or UTF-16) and use ControlSetText to put that Unicode text into Notepad, can I get the Unicode text to display correctly? I only want to read and set the text; I don't need to change or manipulate it. I tried, but it didn't show properly with ControlSetText.
Any ideas, please?

svkhtn Posted October 25, 2006
I think the main reason is that AutoIt is not able to handle Unicode text properly at the moment.
I asked something similar about Unicode in another thread: http://www.autoitscript.com/forum/index.ph...03&hl=UTF-8
In the current version of AutoIt, I think you can only read the Unicode text from a raw file, manipulate it as a binary string, and write it out again to another file. I tried to put the Unicode text on the Clipboard, and to use ControlSetText to set the text in Notepad (which obviously can display Unicode text properly), without any success so far. I think AutoIt somehow messes up the Unicode text before it is assigned to a clipboard, textbox, control, etc.
I hope the next version of AutoIt will handle Unicode better.

xian7479 Posted October 25, 2006
AutoIt cannot handle Unicode or UTF-8 by itself, but it can call system DLLs that can (the OS has to be Windows 2000 or later, which supports Unicode). This works as long as you convert the Unicode or UTF-8 text into ASCII before you process it at a non-binary level, and convert it back before you write it in binary mode. Obviously, though, I don't think you can put Unicode text directly into the Clipboard or Notepad, because AutoIt does not support Unicode at the non-binary level.

Aticsia Posted June 20, 2020
On 3/27/2005 at 1:31 AM, Ahmedmb said: how write UTF-8 TEXT file?
Just open that file in Notepad++ and select Encoding -> UTF-8. After that, go back to the AutoIt editor and you can type as much UTF-8 text as you want.

Jos (Developers) Posted June 20, 2020
12 hours ago, Aticsia said: Just use notepad++ open that file, select encoding -> UTF8 after that return back autoit editor, you can type UTF-8 as must as you want
Care to share why you thought it would be a good idea to answer a 14-year-old post as your first post in our forums?
Jos