I'm facing a pile of files with strange characters in their filenames, which stand in for German umlauts (ö, ä, ü, ß) and other special characters.
I've verified that I can create files with umlauts manually:
Code: Select all
touch öäüß.txt
This means the problem is not a display issue: the filenames themselves were altered by some copy step along the chain.
So I'm planning to rename them, but of course not manually.
Here's the list of equivalents I've worked out so far:
Code: Select all
├╢ ö
├╝ ü
├ñ ä
├ä Ä
┬┤ '
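The pairs in this list match what you get when the UTF-8 bytes of a character are read as code page 437 (CP437). That code page is only inferred from the pairs above, not confirmed, and the function name `mangle` is just illustrative. A minimal sketch that generates further entries for the list:

```shell
# Assumption: the damage came from UTF-8 bytes being read as CP437.
# mangle TEXT: print how TEXT looks after that misreading.
mangle() {
    # Treat the UTF-8 bytes of the argument as CP437 and re-encode as UTF-8
    printf '%s' "$1" | iconv -f CP437 -t UTF-8
}

# Print a few candidate entries in the same "garbled original" layout
for ch in ö ü ä Ä Ö Ü ß; do
    printf '%s %s\n' "$(mangle "$ch")" "$ch"
done
```

If the first four lines of the output agree with the list above, the CP437 assumption fits these files.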
I tried setting the language to German Unicode in bash:
Code: Select all
export LANG=de_DE.UTF-8
That did not work at first, because the locale for German UTF-8 had not been built. Build it by reconfiguring the "locales" package:
Code: Select all
sudo dpkg-reconfigure locales
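To check whether a locale has actually been built before exporting it, something like the following may help. This is a sketch: `has_locale` is a made-up helper name, and it relies on `locale -a` listing the installed locales (which spells the name like "de_DE.utf8" on glibc systems, hence the normalization).

```shell
# has_locale NAME: succeed if NAME appears in the list of installed locales.
has_locale() {
    local want
    # Compare case-insensitively and without hyphens, so that
    # "de_DE.UTF-8" matches the "de_DE.utf8" spelling printed by locale -a
    want=$(printf '%s' "$1" | tr -d '-' | tr '[:upper:]' '[:lower:]')
    locale -a 2>/dev/null | tr -d '-' | tr '[:upper:]' '[:lower:]' | grep -qxF -- "$want"
}

if has_locale de_DE.UTF-8; then
    echo "de_DE.UTF-8 is available"
else
    echo "de_DE.UTF-8 has not been built yet"
fi
```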
Now, with "de_DE.UTF-8" as language in the shell, I get the identical characters as with "en_US.UTF-8", so I assume:
The current encoding of the filenames *is* already UTF-8.
Unfortunately, I therefore suspect that the original UTF-8 bytes were at some point interpreted as a single-byte encoding (hence the two characters for one), and that those two characters were then stored again as UTF-8.
So I tried setting the display encoding to a single-byte encoding (ISO-8859-1):
Code: Select all
export LANG=de_DE.ISO-8859-1
*sigh*
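If the suspicion above is right (UTF-8 bytes read as a single-byte encoding and stored again as UTF-8), the damage can be undone by converting each name from UTF-8 back to that single-byte encoding: the resulting bytes are the original UTF-8 name. The sketch below assumes the single-byte encoding was CP437, which is what the equivalents list suggests but is not confirmed. `repair_name` and `fix_names` are illustrative names, and the script only lists the renames unless `--apply` is given.

```shell
#!/usr/bin/env bash
# Assumption: names are UTF-8 bytes that were misread as CP437 and re-saved as UTF-8.

# repair_name MANGLED: print the repaired name; fail if it cannot be converted.
repair_name() {
    printf '%s' "$1" | iconv -f UTF-8 -t CP437 2>/dev/null
}

# fix_names DIR [--apply]: list (default) or perform the renames in DIR.
fix_names() {
    local dir=$1 mode=${2:-} path old new
    for path in "$dir"/*; do
        [ -e "$path" ] || continue
        old=${path##*/}
        # Names containing characters outside CP437 are not ours: skip them
        new=$(repair_name "$old") || continue
        # Plain ASCII names come back unchanged: nothing to do
        [ "$new" != "$old" ] || continue
        # Accept the result only if it is valid UTF-8; this skips names
        # that already contain correct umlauts
        printf '%s' "$new" | iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1 || continue
        if [ "$mode" = "--apply" ]; then
            mv -n -- "$path" "$dir/$new"   # -n: never overwrite an existing file
        else
            printf '%s -> %s\n' "$old" "$new"
        fi
    done
}
```

Running `fix_names .` first shows the planned renames; `fix_names . --apply` performs them. It only handles one directory level, so a recursive run would need a wrapper around it.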
[REFERENCES]
http://www.mastblau.com/2009-01-20/word ... umstellen/ (Other encoding equivalent listings)