Bug: Tidy ignores keyword in a script


Here is an example. This is what I give Tidy as input:

FuNc myfunc ( )
DiM $A
EnDFuNc
[text garbled in the original post]
Func mfUnC()
    Dim $A
EndFunc
However, that bug is really not critical and is low priority. I just wanted to drop a note about it.

Edited by Robinson1

  • Developers

Strange... I copied your source and tested it successfully without any problems.

Can you PM me the actual file you have the problem with?

Jos

SciTE4AutoIt3 Full installer Download page   - Beta files       Read before posting     How to post scriptsource   Forum etiquette  Forum Rules 
 
Live for the present,
Dream of the future,
Learn from the past.
  :)


That's strange.

Works fine for me (v2.0.28.2 too)

Really strange thing: now it works for me too!

But since I know computers, I also know that this can't be, so I dug deeper. And yes, I finally found the hidden trigger!

The error only comes up if you save the file as UTF-8 (with BOM marker) in SciTE.

If you just create a new file and paste the code in, it is saved as a 'normal' ASCII file, which works fine.
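
One plausible explanation, sketched in Python rather than AutoIt and not taken from Tidy's source: a tool that reads the file as plain single-byte text sees the three BOM bytes glued to the front of the first keyword, so the first line no longer starts with `Func`. The code page (cp1252) and the sample bytes are assumptions for illustration only.

```python
import codecs

# A script saved as UTF-8 with BOM starts with the bytes EF BB BF.
raw = codecs.BOM_UTF8 + b"FuNc myfunc ( )\r\nDiM $A\r\nEnDFuNc\r\n"

# Assumption: the tool reads the bytes as single-byte ANSI text (cp1252 here).
as_ansi = raw.decode("cp1252")
first_line = as_ansi.splitlines()[0]

# The BOM shows up as three junk characters in front of the keyword,
# so a case-insensitive keyword check on the first line fails.
starts_with_keyword = first_line.lower().startswith("func")

# Decoding with "utf-8-sig" strips the BOM, and the keyword is found again.
clean_first_line = raw.decode("utf-8-sig").splitlines()[0]
```

A file saved without the BOM contains only ASCII bytes in this example, which would explain why the pasted copy worked.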

Edited by Robinson1

  • Developers

Tidy doesn't support any Unicode-encoded files (yet).

I have added the test for the UTF-8 BOM, which was missing, and uploaded the new version of Tidy to the beta directory.
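
A test for the UTF-8 BOM can be as small as a three-byte prefix check. The following is a minimal Python sketch of such a check, not Tidy's actual code; the helper names `has_utf8_bom` and `strip_utf8_bom` are hypothetical.

```python
import codecs

def has_utf8_bom(data: bytes) -> bool:
    """Return True if the raw file content starts with the UTF-8 BOM (EF BB BF)."""
    return data.startswith(codecs.BOM_UTF8)

def strip_utf8_bom(data: bytes) -> bytes:
    """Return the content without a leading UTF-8 BOM, or unchanged if there is none."""
    if has_utf8_bom(data):
        return data[len(codecs.BOM_UTF8):]
    return data
```

A tool that does not support Unicode files could use the first helper to refuse the file with a clear message instead of silently producing wrong output.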


  • 6 months later...

Encoding/decoding UTF-8 is not so difficult; just look up the API reference for these two functions:

kernel32.MultiByteToWideChar and

kernel32.WideCharToMultiByte

Const CP_ACP = 0

Const CP_UTF8 = 0xFDE9

To decode UTF-8

call MultiByteToWideChar(UTF8-String, ..., with CP_UTF8) to get a 16-bit Unicode string and then, if an ANSI string is needed,

WideCharToMultiByte(16bit-UnicodeString, ..., with CP_ACP)

To encode UTF8

just call WideCharToMultiByte(16bit-UnicodeString,..., CP_UTF8)
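
The two directions can be sketched in Python, using its built-in codecs in place of the Win32 calls. This is an illustration only: the comments name the Win32 function that plays the corresponding role, and cp1252 stands in for the system ANSI code page (CP_ACP), which is an assumption.

```python
# "Grüße" encoded as UTF-8: ü is C3 BC, ß is C3 9F.
utf8_bytes = b"Gr\xc3\xbc\xc3\x9fe"

# Decode: UTF-8 bytes -> Unicode string
# (the role of MultiByteToWideChar with CP_UTF8).
text = utf8_bytes.decode("utf-8")

# Optional second step: Unicode string -> ANSI bytes
# (the role of WideCharToMultiByte with CP_ACP; cp1252 is assumed here).
ansi_bytes = text.encode("cp1252")

# Encode: Unicode string -> UTF-8 bytes
# (the role of WideCharToMultiByte with CP_UTF8).
round_trip = text.encode("utf-8")
```

The ANSI step loses every character the code page cannot represent, so it is only safe when the text is known to fit.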

edit: Corrected some errors.

Edited by Robinson1

  • Developers

I understand that it is easy to convert the double-byte text to WideChar, but Tidy has to update the file, which means it needs to properly recognize the different double-byte file formats and, after the Tidy operation, write the file back in the same format.

I haven't really looked at everything that needs to be done yet, so I will first have to understand exactly how the above needs to be handled before I start supporting it in Tidy and possibly Obfuscator.

Another one that needs to be looked at is Au3Check, but that only reads the file.
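
The requirement described above (recognize the file format, process the text, write it back in the same format) could look roughly like this in Python. It is a sketch under stated assumptions, not Tidy's design: only the UTF-8 and UTF-16 BOMs are recognized, a file without a BOM is assumed to be ANSI (cp1252), and the names `decode_script`, `encode_script` and `tidy_bytes` are hypothetical.

```python
import codecs

# (BOM bytes, codec name) pairs, checked in order.
_BOMS = [
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]

def decode_script(data: bytes, fallback: str = "cp1252"):
    """Return (text, bom, codec). Files without a BOM are assumed to be ANSI."""
    for bom, codec in _BOMS:
        if data.startswith(bom):
            return data[len(bom):].decode(codec), bom, codec
    return data.decode(fallback), b"", fallback

def encode_script(text: str, bom: bytes, codec: str) -> bytes:
    """Write the text back with the same BOM and codec it was read with."""
    return bom + text.encode(codec)

def tidy_bytes(data: bytes, transform) -> bytes:
    """Decode, apply a text-level transform, and re-encode in the original format."""
    text, bom, codec = decode_script(data)
    return encode_script(transform(text), bom, codec)
```

For example, `tidy_bytes(raw, str.upper)` upper-cases a script while keeping its BOM and encoding. A read-only tool such as Au3Check would need only the `decode_script` half.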

Jos


Let me just clarify (and simplify) how I understood that:

MultiByte = variable-width encoding = UTF-8 => a char can be 1..4 bytes

WideChar = fixed width = Unicode (UTF-16) => a char is 2 bytes (characters outside the Basic Multilingual Plane take two 2-byte units, but those are rare)

Okay, I see that UTF-8 can only be accessed sequentially, as a stream, and is not useful for internal string operations. It's good for storage, as it saves space and provides backward compatibility with (7-bit) ASCII.

However, you convert it to Unicode, and then you can freely read and write it as an array (random access).

Afterward, convert it back to UTF-8 and you're done. B)
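
To make the width difference concrete, here is a small Python sketch with sample characters chosen for illustration. It measures how many bytes each character takes in UTF-8 and in UTF-16, and shows that indexing works on the decoded string but not on the raw UTF-8 bytes. Note that Python strings are indexed by code point, which is not exactly how a 16-bit wchar array behaves for characters outside the Basic Multilingual Plane.

```python
# Sample characters: A, e-acute, euro sign, and an emoji outside the BMP.
samples = ["A", "\u00e9", "\u20ac", "\U0001F600"]

# Bytes per character in each encoding.
utf8_sizes = [len(ch.encode("utf-8")) for ch in samples]
utf16_sizes = [len(ch.encode("utf-16-le")) for ch in samples]

# Random access: index the decoded text, not the raw UTF-8 bytes.
raw = b"na\xc3\xafve"            # "naive" with i-diaeresis, as UTF-8
text = raw.decode("utf-8")
third_char = text[2]             # the i-diaeresis as one character
third_byte = raw[2]              # only the first byte of its 2-byte sequence

# Convert back to UTF-8 after working on the text.
back = text.encode("utf-8")
```

Here `utf8_sizes` comes out as 1, 2, 3 and 4 bytes, while `utf16_sizes` is 2, 2, 2 and 4 bytes: the emoji needs two 16-bit units even in UTF-16.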

All you need to pay attention to is that 1 char is now 2 bytes, but normally the Unicode type (wchar) already encapsulates this, so you use MyUnicodeData[i] as before instead of changing it to MyUnicodeData[i*2] (or MyUnicodeData[i+i]... or MyUnicodeData[i<<1] <- ;) oops, no, that's the 'optimiser' part).

All that needs to change is the type.

I see this conversion process as separate from the string processing. When you use the APIs or another library, you don't need to know all the details of how to handle UTF-8; it's done for you.

So much for the theory :).

But it would be boring without the 'little' details/surprises of daily life: things you didn't think of that occur when you actually implement it.

... and things I would be eager to hear about, to get a better picture of the 'reality' beside the 'theoretical' view.

Edited by Robinson1