AutoIt Forums: Latest Beta - AutoIt Forums

Latest Beta

#321 jchd

  • Whatever your capacity, resistance is futile.
  • Group: AutoIt MVPs(MVP)
  • Posts: 1,344
  • Joined: 10-January 09
  • Gender:Male
  • Location:South of France

Posted 15 February 2010 - 01:58 AM

There is a bug in StringToASCIIArray when called with the UTF-16 encoding parameter: the values in the returned array are ANDed with 0x00FF.
Script to reproduce:
[ autoIt ]
#include <Array.au3> ; needed for _ArrayToString

;
;; correct console display needs decent programming font with Unicode characters (e.g. DejaVu Sans Mono) and Unicode codepage in SciTE:
;
; in SciTEGlobalProperties change to code.page=65001:
;~ # Internationalisation
;~ # Japanese input code page 932 and ShiftJIS character set 128
;~ #code.page=932
;~ #character.set=128
;~ # Unicode
;~ code.page=65001                  <<<<<<<<<<<<<<<<<<<<<<<<<<<
;~ #code.page=0

; $str = e-acute 0x00E9, e-circumflex 0x00EA, e-grave 0x00E8, space 0x0020, A-caron 0x01CD, a-caron 0x01CE, I-caron 0x01CF, fi ligature 0xFB01
Local $str = "éêè ǍǎǏﬁ"
__ConsoleWrite('StringLen("' & $str & '"): ' & StringLen($str) & @LF)   ; correct but only for characters < 0x010000 (~UCS-2 charset)

Local $a = StringToASCIIArray($str, Default, Default, 0)                ; length correct but values incorrectly masked with 0x00FF
Local $b = StringSplit($str, '', 2)
__ConsoleWrite('Glyph ' & @TAB & _ArrayToString($b, @TAB) & @LF)
ConsoleWrite('UTF-16 ' & @TAB & _ArrayToString($a, @TAB) & @LF)
For $i = 0 To UBound($a) - 1
    $a[$i] = Hex($a[$i], 4)
Next
ConsoleWrite('  ' & @TAB & _ArrayToString($a, @TAB) & @LF)

;; UTF-8 is all correct
$a = StringToASCIIArray($str, Default, Default, 2)                      ; length and contents are correct
ConsoleWrite('UTF-8 ' & @TAB & _ArrayToString($a, @TAB) & @LF)
For $i = 0 To UBound($a) - 1
    $a[$i] = Hex($a[$i], 2)
Next
ConsoleWrite('  ' & @TAB & _ArrayToString($a, @TAB) & @LF)
Exit

; Convert the UTF-16 text to UTF-8 before writing it to the console
Func __ConsoleWrite($sText)
    ; First call: ask WideCharToMultiByte for the required buffer size
    Local $aResult = DllCall("kernel32.dll", "int", "WideCharToMultiByte", "uint", 65001, "dword", 0, "wstr", $sText, "int", -1, _
                                "ptr", 0, "int", 0, "ptr", 0, "ptr", 0)
    Local $tText = DllStructCreate("char[" & $aResult[0] & "]")
    ; Second call: perform the conversion into the buffer
    DllCall("Kernel32.dll", "int", "WideCharToMultiByte", "uint", 65001, "dword", 0, "wstr", $sText, "int", -1, _
                            "ptr", DllStructGetPtr($tText), "int", $aResult[0], "ptr", 0, "ptr", 0)
    ConsoleWrite(DllStructGetData($tText, 1))
EndFunc ;==>__ConsoleWrite

You may have to follow the instructions included in the script for correct display.
Here's the result:
[ autoIt ]
;>Running:(3.3.5.4):C:\Program Files\AutoIt3\beta\autoit3.exe "D:\XLequit\AutoMAT\Test\try.au3"
;StringLen("éêè ǍǎǏﬁ"): 8
;Glyph  é ê è   Ǎ ǎ Ǐ ﬁ
;UTF-16 233 234 232 32 205 206 207 1
;  00E9 00EA 00E8 0020 00CD 00CE 00CF 0001
;UTF-8  195 169 195 170 195 168 32 199 141 199 142 199 143 239 172 129
;  C3 A9 C3 AA C3 A8 20 C7 8D C7 8E C7 8F EF AC 81

XP SP3 x86 (if that matters)

Edit: I didn't find the right markup that won't destroy spaces alignment in the result above. Giving up trying for today.
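
For readers without the beta at hand, the reported numbers can be cross-checked outside AutoIt. The following is a minimal Python sketch, not AutoIt code: the helper name `utf16_code_units` is made up for illustration, and the `& 0x00FF` step only mimics the behaviour described in this report, it is not taken from the AutoIt source.

```python
# Test string from the report, written with escapes so the source file
# encoding does not matter: e-acute, e-circumflex, e-grave, space,
# A-caron, a-caron, I-caron, fi ligature.
text = "\u00e9\u00ea\u00e8 \u01cd\u01ce\u01cf\ufb01"


def utf16_code_units(s):
    """Return the 16-bit UTF-16 code units of s (hypothetical helper)."""
    data = s.encode("utf-16-le")
    return [int.from_bytes(data[i:i + 2], "little") for i in range(0, len(data), 2)]


units = utf16_code_units(text)          # what the UTF-16 array should hold
masked = [u & 0x00FF for u in units]    # what the report says it holds
utf8 = list(text.encode("utf-8"))       # the UTF-8 bytes, for comparison

print("expected UTF-16:", [format(u, "04X") for u in units])
print("masked   UTF-16:", [format(m, "04X") for m in masked])
print("UTF-8          :", [format(b, "02X") for b in utf8])
```

The masked list comes out as 233 234 232 32 205 206 207 1, which is exactly the UTF-16 row in the result above, while the unmasked values keep the high byte (01CD, 01CE, 01CF, FB01).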

This post has been edited by jchd: 15 February 2010 - 02:07 AM


#322 jchd

Posted 15 February 2010 - 02:55 AM

This is not the same issue as previous post!

I'm following up here on Valik's answer.

Yes, there are issues with multi-'code unit' Unicode characters >= 0x010000 and, like you said, they are probably AutoIt-wide issues. It seems that the native string functions use a class (or equivalent) where one character = one 16-bit code unit in all cases. So, terminology aside, current AutoIt UTF-16 is in fact essentially UCS-2.
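
To make the UCS-2 versus UTF-16 distinction concrete, here is a small Python sketch of how a character >= 0x010000 is split into two 16-bit code units (a surrogate pair). The function names are hypothetical and chosen for this illustration only; the arithmetic is the standard UTF-16 encoding rule.

```python
def to_surrogate_pair(code_point):
    """Split a code point >= 0x10000 into its UTF-16 high/low surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("code point fits in one code unit or is out of range")
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)     # top 10 bits of the offset
    low = 0xDC00 + (offset & 0x3FF)    # bottom 10 bits of the offset
    return high, low


def utf16_length(s):
    """Number of 16-bit code units, i.e. what a UCS-2 style length reports."""
    return len(s.encode("utf-16-le")) // 2


clef = "\U0001D11E"                    # MUSICAL SYMBOL G CLEF, outside the BMP
high, low = to_surrogate_pair(ord(clef))
print(format(high, "04X"), format(low, "04X"))
print("code points:", len(clef), "code units:", utf16_length(clef))
```

A string function that assumes one character = one 16-bit code unit would count this single character as two and could cut it in half when splitting or trimming.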

There is nothing wrong with this as long as the product doesn't claim Unicode compliance under UTF-16 representation, or at least as long as the documentation clearly says so. Changing all the documentation to replace Unicode by UCS-2 would be technically correct, but probably more confusing to most users. A clear note in the features and datatype sections could be enough to warn concerned people.

Anyway, AutoIt and its 'natural companion' (customized SciTE) can't be said to be horribly wrong. As I previously mentioned, few editors treat the full Unicode range under UTF-16 correctly, even today. FYI, I found only one that does: the VC++ editor (I use 2008 Express but no doubt newer versions do as well) but there are obviously many others.

It's likely that changing the AutoIt core to support the full UTF-16 encoding with multiple code units would prove a delicate task and lead to even more issues (DllCall, DllStructs, ...). OTOH AutoIt doesn't appear to be under much pressure from users to support the 'exotic' (said with full respect to people who live there) character ranges, the new huge Asian block or language tags, for instance.

Again, I'm not asking for full support tomorrow! I simply wanted to let you know.

If someone needs a ready-to-use script + font + instructions to look at that more closely, just ask, I'll be pleased to help.

#323 Valik

  • Do You Wanna Date My Avatar?
  • Group: Developers(Dev)
  • Posts: 14,944
  • Joined: 05-December 03
  • Gender:Male
  • Location:Silent Hill

Posted 15 February 2010 - 04:10 AM

Jon is retarded. I knew that already. Now you do, too. I've fixed his silly casts.

#324 jchd

Posted 15 February 2010 - 05:08 AM

;-) OK this one was most probably very easy. BTW I suppose you mean the ushort -> uchar emasculation (Jon called the encoding joys "balls breaking" for a reason!).

#325 Valik

Posted 15 February 2010 - 07:02 AM

Yes, I was referring to that. He was casting everything through UCHAR, instead of using UCHAR for ANSI/UTF-8 and USHORT for UTF-16.
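
The effect of that cast can be imitated with fixed-width integer types. This is only a Python sketch using `ctypes`, whose integer types truncate silently in the same way a C cast does; the function names and the encoding labels are invented for the illustration and do not reflect the actual AutoIt source.

```python
import ctypes


def store_buggy(value, encoding):
    """Every value goes through an 8-bit unsigned type, whatever the encoding."""
    return ctypes.c_ubyte(value).value


def store_fixed(value, encoding):
    """8-bit elements for ANSI/UTF-8, 16-bit elements for UTF-16."""
    element_type = ctypes.c_ushort if encoding == "utf-16" else ctypes.c_ubyte
    return element_type(value).value


a_caron = 0x01CD  # one of the code units from the bug report
print(format(store_buggy(a_caron, "utf-16"), "04X"))  # high byte is lost
print(format(store_fixed(a_caron, "utf-16"), "04X"))  # full code unit survives
print(format(store_fixed(0xC7, "utf-8"), "02X"))      # bytes are unaffected
```

ANSI and UTF-8 elements never exceed 0xFF, which would explain why only the UTF-16 output was affected.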

#326 Jon

  • Do you wanna get punched in the face by my avatar?
  • Group: Admin
  • Posts: 8,467
  • Joined: 02-December 03
  • Gender:Male

Posted 15 February 2010 - 09:32 AM

Valik, on 15 February 2010 - 06:02 AM, said:

Yes, I was referring to that. He was casting everything through UCHAR, instead of using UCHAR for ANSI/UTF-8 and USHORT for UTF-16.

omg, I did the hard stuff and forgot the most basic cast. FFS.

#327 Valik

Posted 15 February 2010 - 10:27 AM

Yeah... that was some messed up shit.

#328 Jon

Posted 28 February 2010 - 11:45 AM

AutoIt v3.3.5.5 (Beta) Released:

There has been a significant rewrite of the Send/ControlSend code to better cope with Unicode characters. Those using characters <127 (USA/English/UK/etc) shouldn't notice any difference unless something has gone very wrong. Extended/Unicode users should test this release to see if there are any improvements/disasters...

Spoiler



The following changes are script breaking changes:
Spoiler


Discuss the beta here.
Report issues here.
Download here.









#329 GEOSoft

  • Mr. Nice Guy = False
  • Group: AutoIt MVPs(MVP)
  • Posts: 8,208
  • Joined: 08-December 03
  • Gender:Male
  • Location:Nanaimo, BC, Canada

Posted 28 February 2010 - 05:28 PM

It's broken, Jon.
The first script I tried returned an "Unknown function name" error for StringRegExpReplace().

#330 Jon

Posted 28 February 2010 - 06:07 PM

Uploading a fixed version now.

#331 GEOSoft

Posted 28 February 2010 - 06:08 PM

Jon, on 28 February 2010 - 09:07 AM, said:

Uploading a fixed version now.

Thanks.

EDIT: Working fine now.

This post has been edited by GEOSoft: 28 February 2010 - 06:17 PM


#332 wraithdu

  • Mass Spammer!
  • Group: Full Members
  • Posts: 1,311
  • Joined: 21-November 07

Posted 01 March 2010 - 06:04 PM

I guess that would be 3.3.5.6 then. Might want to update the announcements as well.
