"If you don't explicitly know the encoding of text represented as a byte-string, then effectively you have random binary data rather than text. Unfortunately it's an approach that 'just works' most of the time, but causes horrible problems when it doesn't."
I'd say it causes "bad problems" (because you usually sort of know what happened), and reserve "horrible problems" for when a program tries to convert the encoding and fails miserably, leaving the text completely mangled (e.g. try unzipping on Linux a .zip archive with Japanese filenames that was created on Windows).
Also, AFAIK, many (most?) encodings are ASCII-compatible, even the CJK ones, in the sense that plain ASCII text comes out as the same bytes.
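
To make both points concrete, here is a rough Python sketch. The filename is made up, and the choice of cp932 (the Windows flavor of Shift_JIS) for the archive and cp437 for the wrong guess on the reading side are assumptions for illustration, not a claim about what any particular zip tool does.

```python
# Hypothetical filename, chosen only for illustration.
name = "日本語.txt"

# Assumption: the archive stored the name in cp932 (Windows Shift_JIS).
raw = name.encode("cp932")

# A reader that guesses the wrong encoding (cp437 assumed here) does not
# fail; it quietly produces mangled text instead.
mangled = raw.decode("cp437")

# Decoding with the encoding that was actually used recovers the name.
recovered = raw.decode("cp932")

# ASCII-only text encodes to the same bytes under several common
# encodings, including CJK ones.
ascii_part = "readme.txt"
encodings = ("cp932", "euc_jp", "gbk", "big5", "euc_kr", "utf-8")
same_bytes = all(
    ascii_part.encode(enc) == ascii_part.encode("ascii") for enc in encodings
)

print(repr(mangled))
print(recovered == name, same_bytes)
```

The ".txt" suffix survives the wrong decode while the Japanese part does not, which is why this kind of failure is usually recognizable. One caveat to "ASCII-compatible": some multibyte encodings such as Shift_JIS reuse ASCII-range byte values as the second byte of a character, so scanning the raw bytes for an ASCII character can give false hits.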
Anonymous | 06/02/23 - 2:21 am