Antiec Posted February 29, 2008

Hi there! I have a little script which reads files in hex mode, picks out some data and converts it to text with _HexToString(). It worked fine until I bumped into a file whose text uses some special encoding like ISO-JIS or something (not UTF or ANSI). These are files I can't edit, so I need a way to re-encode the text. I can also read the encoding information from the file itself. So, is it possible to convert some random encoding to UTF or ANSI?

Thx, Lassi
rudi Posted February 29, 2008 (edited)

Hello.

"So, is it possible to convert some random encoding to UTF or ANSI?"

Well, you will need to search for the correct translation tables. And you will need to know what encoding the current file is. Then it's basically a search/replace.

You can try to write a program analysing the relative frequency of chars in your files to guess the encoding...

Regards, Rudi.

Edited February 29, 2008 by rudi

Earth is flat, pigs can fly, and Nuclear Power is SAFE!
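When the encoding is not known, one simpler stand-in for frequency analysis is trial decoding: try a few candidate encodings and keep the ones that decode without errors. This is only a rough sketch in Python rather than AutoIt. The function name `plausible_encodings` and the candidate list are assumptions made for illustration, and a clean decode only means "not ruled out", not "correct".

```python
# Candidate list is an assumption; these are standard Python codec names.
CANDIDATES = ["ascii", "utf-8", "shift_jis", "euc_jp", "iso2022_jp"]


def plausible_encodings(data: bytes, candidates=CANDIDATES):
    """Return the candidate encodings that decode `data` without errors.

    Hypothetical helper: this is trial decoding, not frequency analysis.
    Several encodings may survive, so the result is a shortlist only.
    """
    survivors = []
    for name in candidates:
        try:
            data.decode(name)  # strict mode raises on invalid byte sequences
        except UnicodeDecodeError:
            continue
        survivors.append(name)
    return survivors


# Example input: Japanese text stored as EUC-JP bytes.
sample = "日本語".encode("euc_jp")
shortlist = plausible_encodings(sample)
```

Plain ASCII text will pass every candidate here, so this only helps once the data contains bytes outside the ASCII range.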
Antiec Posted March 3, 2008

Hi, I'm not sure I understood what you said. But I can read the file to get the current encoding, like "ISO_IR 13", which is "JIS_X0201" (yeah, I'm reading DICOM files). So do you mean that if I want to recode the file from JIS_X0201, I have to search for some gibberish symbols and replace them with the corresponding UTF-8 characters? That would mean I would have to search for thousands of characters. I hoped there was some easier way to do it. Maybe we just have to leave it be.

Thx, Lassi.
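For JIS_X0201 specifically, the table is far smaller than thousands of entries. It is a single-byte character set: bytes 0x00-0x7F match ASCII except 0x5C (yen sign) and 0x7E (overline), and bytes 0xA1-0xDF are the halfwidth katakana, which sit in Unicode at U+FF61-U+FF9F in the same order. Here is a minimal sketch in Python (not AutoIt, but the arithmetic should port directly). The name `decode_jis_x0201` is made up for illustration, and the sketch assumes the value uses only ISO_IR 13, with no escape sequences switching to other character sets.

```python
def decode_jis_x0201(data: bytes) -> str:
    """Decode JIS X 0201 bytes to a Unicode string (hypothetical helper).

    Assumes a pure single-byte stream with no ISO 2022 escape sequences.
    """
    out = []
    for b in data:
        if b == 0x5C:
            out.append("\u00a5")        # yen sign replaces the backslash
        elif b == 0x7E:
            out.append("\u203e")        # overline replaces the tilde
        elif b < 0x80:
            out.append(chr(b))          # remaining positions match ASCII
        elif 0xA1 <= b <= 0xDF:
            out.append(chr(b + 0xFEC0))  # 0xA1 -> U+FF61 ... 0xDF -> U+FF9F
        else:
            out.append("\ufffd")        # byte not defined in JIS X 0201
    return "".join(out)


# Halfwidth katakana for "Yamada^Tarou", given as the hex a script might read.
sample = bytes.fromhex("D4CFC0DE5EC0DBB3")
text = decode_jis_x0201(sample)
utf8_bytes = text.encode("utf-8")  # re-encoded as UTF-8
```

So the "search/replace" comes down to two special cases plus one fixed offset for the katakana range.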