Diffstat (limited to 'doc/bugs/problems_with_utf8_names.mdwn')
-rw-r--r-- | doc/bugs/problems_with_utf8_names.mdwn | 46 |
1 files changed, 8 insertions, 38 deletions
diff --git a/doc/bugs/problems_with_utf8_names.mdwn b/doc/bugs/problems_with_utf8_names.mdwn
index b734ddecf..fbdca41cd 100644
--- a/doc/bugs/problems_with_utf8_names.mdwn
+++ b/doc/bugs/problems_with_utf8_names.mdwn
@@ -1,6 +1,12 @@
 This bug is reopened to track some new UTF-8 filename issues caused by GHC
-7.4. Older versions of GHC, like the 7.0.4 in debian unstable, are not
-affected. See the comments for details about the new bug. --[[Joey]]
+7.4. In this version of GHC, git-annex's hack to support filenames in any
+encoding no longer works. Even unicode filenames fail to work when
+git-annex is built with 7.4. --[[Joey]]
+
+I now have a `ghc7.4` branch in git that seems to solve this,
+for all filename encodings, and all system encodings. It will
+only build with the new GHC. If you have this problem, give it a try!
+--[[Joey]]
 
 ----
 
@@ -74,39 +80,3 @@ It looks like the common latin1-to-UTF8 encoding. Functionality other than otupu
 > > On second thought, I switched to this. Any decoding of a filename
 > > is going to make someone unhappy; the previous approach broke
 > > non-utf8 filenames.
-
-----
-
-Simpler test case:
-
-<pre>
-import Codec.Binary.UTF8.String
-import System.Environment
-
-main = do
-    args <- getArgs
-    let file = decodeString $ head args
-    putStrLn $ "file is: " ++ file
-    putStr =<< readFile file
-</pre>
-
-If I pass this a filename like 'ü', it will fail, and notice
-the bad encoding of the filename in the error message:
-
-<pre>
-$ echo hi > ü; runghc foo.hs ü
-file is: ü
-foo.hs: �: openFile: does not exist (No such file or directory)
-</pre>
-
-On the other hand, if I remove the decodeString, it prints the filename
-wrong, while accessing it right:
-
-<pre>
-$ runghc foo.hs ü
-file is: üa
-hi
-</pre>
-
-The only way that seems to consistently work is to delay decoding the
-filename to places where it's output. But then it's easy to miss some.