author | Joey Hess <joey@kitenet.net> | 2011-02-10 14:21:44 -0400 |
---|---|---|
committer | Joey Hess <joey@kitenet.net> | 2011-02-10 14:21:44 -0400 |
commit | fe55b4644e67bba60b35e07abcdd312b65c9d6f3 (patch) | |
tree | 4631f428f86f72d614f9b5388772b6ec58a3fb8d /doc/bugs/unhappy_without_UTF8_locale.mdwn | |
parent | e7a3475704f5366e89aebe78cefbeb58ff5ab181 (diff) |
Fix display of unicode filenames.
Internally, the filenames are stored as un-decoded unicode.
I tried decoding them, but then haskell tries to access the wrong files.
Hmm.
So, I've unhappily chosen option "B", which is to decode filenames before
they are displayed.
Diffstat (limited to 'doc/bugs/unhappy_without_UTF8_locale.mdwn')
-rw-r--r-- | doc/bugs/unhappy_without_UTF8_locale.mdwn | 33 |
1 files changed, 33 insertions, 0 deletions
diff --git a/doc/bugs/unhappy_without_UTF8_locale.mdwn b/doc/bugs/unhappy_without_UTF8_locale.mdwn
new file mode 100644
index 000000000..6f1df4fab
--- /dev/null
+++ b/doc/bugs/unhappy_without_UTF8_locale.mdwn
@@ -0,0 +1,33 @@
+Try unsetting LANG and passing git-annex unicode filenames.
+
+    joey@gnu:~/tmp/aa>git annex add ./Üa
+    add add add add git-annex: <stdout>: commitAndReleaseBuffer: invalid
+    argument (Invalid or incomplete multibyte or wide character)
+
+The same problem can be seen with a simple haskell program:
+
+    import System.Environment
+    import Codec.Binary.UTF8.String
+    main = do
+        args <- getArgs
+        putStrLn $ decodeString $ args !! 0
+
+    joey@gnu:~/src/git-annex>LANG= runghc ~/foo.hs Ü
+    foo.hs: <stdout>: hPutChar: invalid argument (Invalid or incomplete multibyte or wide character)
+
+(The call to `decodeString` is necessary to make the input
+unicode string be displayed properly in a utf8 locale, but
+does not contribute to this problem.)
+
+I guess that haskell is setting the IO encoding to latin1, which
+is [documented](http://haskell.org/ghc/docs/latest/html/libraries/base/System-IO.html#v:latin1)
+to error out on characters > 255.
+
+So this program doesn't have the problem -- but may output garbage
+on non-utf-8 capable terminals:
+
+    import System.IO
+    main = do
+        hSetEncoding stdout utf8
+        args <- getArgs
+        putStrLn $ decodeString $ args !! 0
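
The commit message describes keeping filenames un-decoded internally and decoding them only at the point of display. The following is a minimal sketch of that idea, not the code from this commit. It assumes, as the bug report suggests for GHC of that era, that each `Char` of a filename holds one raw byte. The name `decodeForDisplay` is hypothetical, and the hand-written decoder stands in for `decodeString` from `Codec.Binary.UTF8.String` so that the example needs only `base`.

```haskell
import Data.Bits (shiftL, (.&.), (.|.))
import Data.Char (chr, ord)
import System.IO (hSetEncoding, stdout, utf8)

-- | Hypothetical helper: treat each Char as one raw byte and decode the
-- sequence as UTF-8 for display only. Malformed input becomes U+FFFD
-- instead of raising an error. Overlong encodings are not rejected.
decodeForDisplay :: String -> String
decodeForDisplay = go . map ord
  where
    go [] = []
    go (b : bs)
      | b < 0x80 = chr b : go bs                      -- plain ASCII
      | b > 0xFF = chr b : go bs                      -- already decoded, pass through
      | b .&. 0xE0 == 0xC0 = multi 1 (b .&. 0x1F) bs  -- 2-byte sequence
      | b .&. 0xF0 == 0xE0 = multi 2 (b .&. 0x0F) bs  -- 3-byte sequence
      | b .&. 0xF8 == 0xF0 = multi 3 (b .&. 0x07) bs  -- 4-byte sequence
      | otherwise = '\xFFFD' : go bs                  -- stray continuation byte
    multi n acc bs =
      let (cont, rest) = splitAt n bs
       in if length cont == n && all isCont cont
            then safeChr (foldl step acc cont) : go rest
            else '\xFFFD' : go bs
    step a c = (a `shiftL` 6) .|. (c .&. 0x3F)
    isCont c = c .&. 0xC0 == 0x80
    -- Reject surrogates and out-of-range code points.
    safeChr n
      | n > 0x10FFFF || (n >= 0xD800 && n <= 0xDFFF) = '\xFFFD'
      | otherwise = chr n

main :: IO ()
main = do
  -- Force UTF-8 output so printing does not depend on LANG.
  hSetEncoding stdout utf8
  -- The raw bytes of the filename "Üa", one byte per Char (assumed input form).
  let rawName = map chr [0xC3, 0x9C, 0x61]
  -- rawName itself would still be passed unchanged to file operations;
  -- only the displayed copy is decoded.
  putStrLn ("add " ++ decodeForDisplay rawName)
```

The design point is that the un-decoded string stays the one used for file access, which avoids the "wrong files" problem noted in the commit message, while the decoded copy exists only for output. As the bug report notes, forcing `utf8` on `stdout` may still produce garbage on terminals that cannot display UTF-8.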