
One more UniCode thread



I may be hitting my head against the wall too much here, but the challenge is interesting.

I know there are various ways to convert Unicode to ASCII, the simplest being Type run through @ComSpec. That is fine if you know that all of your values will convert properly to ASCII. However, if you have a value such as this

@="Confusion ىىىىىىىىىىىىىىىىىىىىى"

then every ى ends up converted to a ?, so the original characters are lost. That is not really a proper conversion.
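To make the loss concrete, here is a minimal sketch in Python (not AutoIt, and not what Type does internally; it only assumes that a lossy ASCII conversion substitutes ? for anything it cannot represent, which matches the result described above). The sample value is shortened for readability.

```python
# Illustration only: a lossy ASCII conversion versus a UTF-16 round trip.
value = '@="Confusion ىىى"'

# errors="replace" substitutes "?" for every character outside ASCII,
# so the Arabic letters cannot be recovered afterwards.
lossy = value.encode("ascii", errors="replace")
print(lossy)

# Encoding as UTF-16 LE (the encoding assumed here for the .reg file)
# and decoding again gives back the identical string.
restored = value.encode("utf-16-le").decode("utf-16-le")
print(restored == value)
```

Once the ? is written, nothing in the output says which character it replaced, which is why the data has to stay in a Unicode encoding from read to write.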

There seems to be no trick to reading or writing a whole Unicode file. However, I have not been able to find out how to parse a Unicode file line by line. I am assuming that what I want is to open the Unicode file, read one line, convert it, parse it however the script calls for, and then write that line back as Unicode. Since these characters are Unicode, they need to be written as Unicode or they are not usable. That goes without saying, I guess. So, how can one read only one line of a Unicode file when you have to GetSize of the whole file in order to do any work with it?
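The read, parse, write-back loop described above could be sketched like this. It is Python rather than AutoIt, so it only illustrates the process; the function name `rewrite_lines` and the `transform` callback are hypothetical names made up for this example, and the input is assumed to be UTF-16 with a byte order mark and CRLF line endings.

```python
def rewrite_lines(src_path, dst_path, transform):
    """Copy a UTF-16 text file line by line, applying transform to each line.

    Assumptions (not confirmed by anything above): the input starts with a
    byte order mark, and the output should be UTF-16 LE with a BOM and
    CRLF line endings.
    """
    # encoding="utf-16" reads the BOM to pick the byte order and strips it.
    # Iterating over the file object yields one decoded line at a time.
    with open(src_path, "r", encoding="utf-16") as src, \
         open(dst_path, "w", encoding="utf-16-le", newline="\r\n") as dst:
        dst.write("\ufeff")  # write the byte order mark explicitly
        for line in src:
            text = line.rstrip("\n")  # CRLF was already translated to "\n"
            # transform receives and returns an ordinary string, so normal
            # string functions (length, slicing, search) all work on it.
            dst.write(transform(text) + "\n")
```

A call such as `rewrite_lines("epsilon.reg", "eptest.txt", lambda s: s[:-1])` would drop the last character of every line, and the file size is never needed. Every output line gets a line ending, even if the last input line had none.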

I have tried the process in VBS, and I can get to a point with it, but then I run out of ability when it comes to API calls. I am sure there is a call that can handle it somewhere, but I am not at that point yet. So I came back to AutoIt, with its many functions and UDFs available. But no fruit yet. So now I go back to batch.

Writing the whole thing in batch would require supplying additional .com files not shipped with XP (such as choice.com) or third-party ones, and the code would be very painstaking to keep track of. Short of that, the only thing I could think of was a for or for /f command. I found 2 ways to get at the Unicode file WITHOUT losing the Unicode in a standard Type sequence. They are,

RunWait(@ComSpec & ' /c chcp 65001&& for %a in (epsilon.reg) do type "%a">eptest.txt&&chcp 437')

which has issues when run back to back from a script, although it works when typed at the command prompt. Also

RunWait(@ComSpec & ' /u /c for %a in (epsilon.reg) do type "%a">eptest.txt')

This one does work, because /u makes the output Unicode. Both of these redirect to a text file, which is also in Unicode. They should give one the opportunity to run a for /f tokens command, and even a Findstr within that, but I cannot get it to work. DOS is short on anything like len() for finding the end of a string, much less chopping it off (unless you supply choice.com). Next I went to Edlin, an oldie but sometimes a goodie. However, once again, there is no real way to edit only the last character.

Which leads me back to square one: how do I read a line from a Unicode file, parse it, and write a Unicode line back to that file or to a new one?

I am dealing with .reg files here. I know I could export as 9x. I know I could find 3rd-party stuff, or maybe start learning another language. Most of those options, though, require (AFAIK) more than just an exe or script that will just work. Which is my little thing: to make it just work. Supply the .exe and you need nothing else.

So, not to kick a dead horse, but is there anything on the horizon that could possibly accomplish this, or should one just quit kicking the horse?

Thanks for any input,

Sul
