iCode Posted November 15, 2013
Just noticed this in the help file: "The recommended script format is UTF-8 with BOM. ANSI formats are not recommended for languages other than English as they can cause problems when run on machines with different locales." I understand that to mean that the actual au3 scripts to be compiled and used on non-EN systems should be UTF-8 with BOM, correct? If I am correct, then why are all of the include files I checked encoded in ANSI?
czardas Posted November 15, 2013 (edited)
One of the best questions I've heard for a while. The first 128 characters (0 - 127) are the same in both Unicode and ANSI, so there should be no conflicts there. Bytes 128 - 255 are a different matter: what they mean depends on the code page (Windows-1252 is only one of several), and the same characters are stored as different bytes in UTF-8, so this range will cause conflicts between systems. In other words, you are safe to use characters 0 - 127 in both encoding systems. I hope this answers part of your question. I believe UTF-8 is recommended for the web and I think UTF-16 is more associated with Windows. I don't understand the need for a BOM. I think the BOM may be a misunderstanding or unresolved issue between developers (I'm probably wrong, but personally I think it's unfortunate). Hopefully someone will have further information to add.
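The point about the 0 - 127 range can be checked byte by byte. What follows is a minimal sketch in Python (chosen only for illustration, not AutoIt), assuming "ANSI" means Windows-1252; the two sample strings are made up for the demonstration.

```python
# Compare how the same text is stored as "ANSI" (assumed here to be
# Windows-1252) and as UTF-8. Both sample strings are hypothetical.
ascii_text = 'ConsoleWrite("Hello")'
accented_text = "demandé"

# Characters 0-127: both encodings produce identical bytes.
ascii_same = ascii_text.encode("cp1252") == ascii_text.encode("utf-8")

# Characters above 127: one byte in Windows-1252, two bytes for é in UTF-8.
ansi_bytes = accented_text.encode("cp1252")
utf8_bytes = accented_text.encode("utf-8")

print(ascii_same)        # True
print(ansi_bytes.hex())  # 64656d616e64e9
print(utf8_bytes.hex())  # 64656d616e64c3a9
```

A script that contains only characters from the 0 - 127 range is therefore byte-for-byte identical in both encodings, apart from any BOM placed in front of it.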
jchd Posted November 15, 2013 (edited)
A BOM is a necessary evil. This is a consequence of the sad fact that every UTF-8 file without a BOM is a valid but erroneous ANSI file in almost all(*) variants of so-called ANSI. To convince yourself, compare these two readings of the exact same script, whose meaning differs depending on whether you interpret it as UTF-8 w/o BOM
ConsoleWrite("Nous avons demandé au vendeur d'expédier l'objet. Connectez-vous à votre compte PayPal pour consulter les détails de la transaction." & @LF)
or emasculated as Windows western (latin-1)
ConsoleWrite("Nous avons demandÃ© au vendeur d'expÃ©dier l'objet. Connectez-vous Ã  votre compte PayPal pour consulter les dÃ©tails de la transaction." & @LF)
(*) A number of non-UTF encodings widely used in Asia use a double-byte representation where not all binary combinations (hi-lo) are valid.
EDIT: forgot to mention that I strongly advocate for the whole AutoIt tool chain to only recognize and process UTF-8 + BOM files, #includes included. That would definitely solve all questions about source encodings and promote universal non-ambiguity.
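The misreading described above is easy to reproduce. Here is a minimal Python sketch (again only for illustration, not AutoIt), assuming the "ANSI" code page is Windows-1252. The `read_script` helper is hypothetical and is not how the AutoIt tool chain is implemented; it only shows why a BOM removes the ambiguity.

```python
import codecs

line = "Nous avons demandé au vendeur d'expédier l'objet."

# UTF-8 bytes without a BOM, as a BOM-less editor would save the script.
raw = line.encode("utf-8")

# Reading those bytes as Windows-1252 raises no error, but every accented
# letter turns into two wrong characters.
misread = raw.decode("cp1252")


def read_script(data: bytes) -> str:
    """Hypothetical loader: UTF-8 if a BOM is present, otherwise "ANSI"."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):].decode("utf-8")
    return data.decode("cp1252")


with_bom = codecs.BOM_UTF8 + raw

print(misread)
print(read_script(with_bom) == line)  # True: the BOM selects UTF-8
print(read_script(raw) == line)       # False: silently read as ANSI
```

Without the three BOM bytes, a loader has nothing to go on except guessing, because the BOM-less bytes decode without complaint under either interpretation.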
czardas Posted November 15, 2013 (edited)
Okay, thanks for that explanation. I wasn't too far wrong. Let's go with the necessary evil of including byte order marks.
iCode Posted November 15, 2013 (Author)
Thanks for the answers. That explains why the include files are ANSI and also, I think, why my script size did not change substantially when I converted one to UTF-8.
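The observation about file size can be illustrated with a rough Python sketch (for illustration only, not AutoIt), assuming a script that contains nothing outside the 0 - 127 range and that "ANSI" means Windows-1252. The sample script text is made up.

```python
import codecs

# A made-up script body that uses only characters 0-127.
script = 'MsgBox(0, "Title", "Plain English text only")\r\n' * 100

ansi_size = len(script.encode("cp1252"))
utf8_bom_size = len(codecs.BOM_UTF8 + script.encode("utf-8"))

# The only growth is the 3-byte BOM at the start of the file.
print(utf8_bom_size - ansi_size)  # 3
```

Each accented character in a script would add at least one further byte in UTF-8, so the difference grows only with the amount of non-English text.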