String: Encountering a new question


Xwolf

Question: How do I find the number of bytes in a string?

For example:

$str = "abc好abc"

NOTE:

"好" is a Chinese character, which takes 2 bytes.

How can I get the number of bytes in the variable $str?

Thanks, everyone. :)

Edited by Xwolf

Write it to a file and...

$o = FileOpen("test.txt", 16) ; 16 = open for reading in binary mode
$f = FileRead($o) ; returns the raw bytes of the file
MsgBox(0, "", BinaryLen($f) - 2) ; -2 for the @CR and @LF
FileClose($o)

Btw, that string is Unicode and all characters are 2 bytes, so it's actually 14 bytes.

Only two things are infinite, the universe and human stupidity, and I'm not sure about the former. -Albert Einstein
Practice makes perfect! But nobody's perfect, so why practice at all?
http://forum.ambrozie.ro

No, each character is 1 byte. If the topic starter uses Chinese characters, that's his case. We can't determine whether a character is Chinese, Korean, from outer space, or whatever... the standard is: 1 char == 1 byte.
I think not. If you know the string is ASCII, then your statement is true. But Unicode characters are not necessarily one byte, and according to the help file, AutoIt supports Unicode.

Bob

I have no idea what code set outer space uses.

Edited by bobchernow

And your point is? :|

Try it yourself: copy his string and save it to a file as Unicode. Then, without even running my example, open the file's Properties in Windows and see what the file size is... just subtract 2 (that's the @CRLF).


a b c 好 a b c @CR @LF

1 1 1 2 1 1 1 1 1

1+1+1+2+1+1+1+1+1=10

Total is 10.

Am I right? :)

What encoding did you use?

ANSI won't recognize the character.

Unicode (UTF-16) uses 2 bytes each for a-z and 好,

and UTF-8 actually uses 3 bytes for 好 and 1 byte each for a-z.

Edited by TheMadman


You could do something like this:

;  flag = 1 (default), binary data will be ANSI
;  flag = 2, binary data will be UTF16 Little Endian
;  flag = 3, binary data will be UTF16 Big Endian
;  flag = 4, binary data will be UTF8
$Encoding = 4
$string = "abc好abc"
$Bytes = BinaryLen(StringToBinary($string,$Encoding))
MsgBox(0,"Number of bytes", $Bytes)

This returns the number of bytes depending on the encoding :)
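
To make the dependence on the encoding concrete, here is a minimal sketch that loops over all four flags for the same string. The label array $aNames and the use of ChrW(0x597D) to build 好 from its code point (so the result does not depend on how the script file is saved) are my own additions, not part of the snippet above.

```autoit
; Sketch only: compare the byte count of one string under each StringToBinary flag.
; ChrW(0x597D) is the character 好, built from its Unicode code point.
Local $sString = "abc" & ChrW(0x597D) & "abc"
Local $aNames[5] = ["", "ANSI", "UTF-16 LE", "UTF-16 BE", "UTF-8"] ; index = flag value

For $iFlag = 1 To 4
    Local $iBytes = BinaryLen(StringToBinary($sString, $iFlag))
    ConsoleWrite($aNames[$iFlag] & ": " & $iBytes & " bytes" & @CRLF)
Next
```

Both UTF-16 flags should report 14 bytes (7 characters at 2 bytes each) and UTF-8 should report 9 (6 ASCII bytes plus 3 for 好). The ANSI count depends on the system code page: the 8-byte file reported in this thread suggests a Chinese code page that stores 好 in 2 bytes, and I would expect a different count on systems whose code page cannot represent the character.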

Edited by ProgAndy

:) Great!

A new way to deal with this problem.

Cheers!

Peter

ASCII or UTF-8...?

In fact, I didn't know what I had used.

I just put "abc好abc" into the file test.txt.

I found that the file size was 8 bytes (without @CRLF).

If I press Enter (adding @CRLF), the file size is 10 bytes.

Edited by Xwolf

$string = "abc好abc"
$z1 = StringToBinary($string,4)
ConsoleWrite($z1 & @CRLF)
$z2 = BinaryLen($z1)
ConsoleWrite($z2 & @CRLF)

;Result1
;0x616263E5A5BD616263
;9

[The second code example and its Result2 output are garbled in the source and cannot be recovered.]

Note:

Result2 looked strange.

Look at Result2: "abc好abc8"

The "8" comes right after the "c".

In my opinion, when I use the code "ConsoleWrite($z1 & @CRLF)",

the output should look like this:

abc好abc

9

Help, help...
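
One possible way to get the text on one line and the byte count on the next is sketched below. It assumes the data is UTF-8 binary as produced by StringToBinary with flag 4; the variable name $bData is hypothetical, and whether 好 displays correctly depends on the console's own encoding.

```autoit
; Sketch only: print binary data as hex, as decoded text, and as a byte count.
Local $bData = StringToBinary("abc" & ChrW(0x597D) & "abc", 4) ; flag 4 = UTF-8

ConsoleWrite($bData & @CRLF)                    ; binary is written as a hex string (0x...)
ConsoleWrite(BinaryToString($bData, 4) & @CRLF) ; decode the UTF-8 bytes back into text
ConsoleWrite(BinaryLen($bData) & @CRLF)         ; number of bytes in the binary data
```

The key point is that writing a binary variable prints its hex representation, so BinaryToString with the matching flag is needed to get the readable string back.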
