In short, my advice is to read the codepage text, convert it into one or more Unicode strings, and stick with the Unicode representation to ensure every character will be interpreted as intended, whatever underlying system/user setting is used.
But that is exactly what I am trying to do! I probably didn't make it clear that I was asking how to do exactly that. I'm grateful to you for confirming that what I want to do is exactly what I should be doing.
I should have made this clearer. My script will start by reading a text file that has been output by WordPerfect for DOS in either codepage 437 or codepage 850 - the early versions of WordPerfect can't output ANSI or Unicode text.
If the text from WordPerfect is in codepage 437, and if the Windows system is in North America, then a simple OEM-to-ANSI conversion will make it easy for me to get Unicode text. That's because Windows checks the OEMCP setting in the registry to see what the local DOS code page should be. (Changing this setting takes more than writing to the registry - it also requires a reboot.)
Similarly, if the text from WordPerfect is in codepage 850, and if the Windows system is in Western Europe, then a simple OEM to ANSI conversion will make it easy for me to get Unicode text.
However, for various reasons, the user of this script may not have the technical ability to force his WordPerfect setup into using the correct code page. So it's possible that the user will output codepage 437 text in a Western European system, or he might output codepage 850 text in a North American system. In that case, a simple OEM to ANSI conversion won't work, and I want to be able to handle that situation also.
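To show why the mismatch matters, here is a small sketch in Python (not the language my script is written in; I'm using it only because its codec names `cp437` and `cp850` make the point compactly). The sample bytes are made up for illustration. Byte 0x82 is "é" in both code pages, but byte 0x9B is the cent sign in 437 and "ø" in 850, so decoding with the wrong code page silently changes the text.

```python
# Illustrative only: the same two bytes decoded under each DOS code page.
# 0x82 maps to "é" in both code pages; 0x9B maps to different characters.
sample = b"\x82\x9b"

as_437 = sample.decode("cp437")  # "é" followed by the cent sign
as_850 = sample.decode("cp850")  # "é" followed by "ø"

# The first character agrees, the second does not.
print(as_437[0] == as_850[0])
print(as_437[1] == as_850[1])
```

Accented letters in the lower part of the upper range tend to survive the mismatch, which is why the error is easy to miss until a character like "ø" turns up.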
So my question still is: is there a way to convert the contents of the WordPerfect-created text file from codepage 850 or 437 to ANSI (which in this case is directly convertible into Unicode)? In other words, what I want to do is 100 percent exactly what you are suggesting that I do. I am asking how to do it reliably.
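To make the goal concrete, this is roughly the operation I am after, sketched in Python rather than in my script's own language. The function names and the choice of Windows-1252 as the "ANSI" target are my own assumptions for illustration. The point is that the source code page is named explicitly, so the result does not depend on the system's OEMCP setting.

```python
# Hypothetical helpers: decode DOS text with an explicitly named code page.
SUPPORTED_CODEPAGES = {437: "cp437", 850: "cp850"}


def dos_bytes_to_unicode(data: bytes, codepage: int) -> str:
    """Decode bytes written in code page 437 or 850 into a Unicode string."""
    try:
        codec = SUPPORTED_CODEPAGES[codepage]
    except KeyError:
        raise ValueError(f"unsupported code page: {codepage}") from None
    return data.decode(codec)


def dos_bytes_to_ansi(data: bytes, codepage: int) -> bytes:
    """Re-encode the decoded text as Windows-1252 (assumed ANSI code page).

    Characters with no Windows-1252 equivalent, such as box-drawing
    characters, are replaced with "?".
    """
    text = dos_bytes_to_unicode(data, codepage)
    return text.encode("cp1252", errors="replace")
```

With this, `dos_bytes_to_unicode(raw, 850)` gives the same string on any machine, provided I know (or can ask the user) which of the two code pages WordPerfect actually used.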
P.S. I know that one answer is to use a third-party utility (the Windows port of the Linux iconv program), but I think there must be a way to accomplish this using the Windows API. I simply don't know what it is.
Edited by Edward Mendelson, 07 November 2010 - 12:08 AM.