Diffstat (limited to 'doc/bugs/problems_with_utf8_names.mdwn')
-rw-r--r-- doc/bugs/problems_with_utf8_names.mdwn | 46
1 files changed, 8 insertions, 38 deletions
diff --git a/doc/bugs/problems_with_utf8_names.mdwn b/doc/bugs/problems_with_utf8_names.mdwn
index b734ddecf..fbdca41cd 100644
--- a/doc/bugs/problems_with_utf8_names.mdwn
+++ b/doc/bugs/problems_with_utf8_names.mdwn
@@ -1,6 +1,12 @@
This bug is reopened to track some new UTF-8 filename issues caused by GHC
-7.4. Older versions of GHC, like the 7.0.4 in debian unstable, are not
-affected. See the comments for details about the new bug. --[[Joey]]
+7.4. In this version of GHC, git-annex's hack to support filenames in any
+encoding no longer works. Even unicode filenames fail to work when
+git-annex is built with 7.4. --[[Joey]]
+
+I now have a `ghc7.4` branch in git that seems to solve this,
+for all filename encodings, and all system encodings. It will
+only build with the new GHC. If you have this problem, give it a try!
+--[[Joey]]
----
@@ -74,39 +80,3 @@ It looks like the common latin1-to-UTF8 encoding. Functionality other than otupu
> > On second thought, I switched to this. Any decoding of a filename
> > is going to make someone unhappy; the previous approach broke
> > non-utf8 filenames.
-
-----
-
-Simpler test case:
-
-<pre>
-import Codec.Binary.UTF8.String
-import System.Environment
-
-main = do
- args <- getArgs
- let file = decodeString $ head args
- putStrLn $ "file is: " ++ file
- putStr =<< readFile file
-</pre>
-
-If I pass this a filename like 'ü', it will fail, and notice
-the bad encoding of the filename in the error message:
-
-<pre>
-$ echo hi > ü; runghc foo.hs ü
-file is: ü
-foo.hs: �: openFile: does not exist (No such file or directory)
-</pre>
-
-On the other hand, if I remove the decodeString, it prints the filename
-wrong, while accessing it right:
-
-<pre>
-$ runghc foo.hs ü
-file is: üa
-hi
-</pre>
-
-The only way that seems to consistently work is to delay decoding the
-filename to places where it's output. But then it's easy to miss some.
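
The removed note's closing idea, delaying decoding until the filename is output, can be sketched as below. This is only an illustration and not git-annex's code: `RawName`, `toFilePath`, and `forDisplay` are hypothetical names, and the sketch assumes the filename arrives as one `Char` per raw byte, which is what the removed test case suggests for older GHC. The small hand-written UTF-8 decoder is a stand-in for `decodeString` so that the example needs only `base`; it does not reject overlong encodings.

```haskell
import Data.Bits (shiftL, (.&.), (.|.))
import Data.Char (chr, ord)

-- | Hypothetical wrapper: a filename kept exactly as received,
-- assumed to hold one Char per raw byte (0-255).
newtype RawName = RawName String

-- | Filesystem access uses the undecoded name.
toFilePath :: RawName -> FilePath
toFilePath (RawName s) = s

-- | Decode UTF-8 only when the name is shown to the user.
-- Invalid or truncated sequences become U+FFFD.
forDisplay :: RawName -> String
forDisplay (RawName s) = go (map ord s)
  where
    go [] = []
    go (b:bs)
      | b < 0x80 || b > 0xFF = chr b : go bs   -- ASCII, or already a non-byte Char
      | b .&. 0xE0 == 0xC0   = multi 1 (b .&. 0x1F) bs
      | b .&. 0xF0 == 0xE0   = multi 2 (b .&. 0x0F) bs
      | b .&. 0xF8 == 0xF0   = multi 3 (b .&. 0x07) bs
      | otherwise            = '\xFFFD' : go bs
    -- Consume n continuation bytes after a lead byte whose payload is acc.
    multi n acc bs =
      let (cont, rest) = splitAt n bs
          v = foldl (\a c -> (a `shiftL` 6) .|. (c .&. 0x3F)) acc cont
      in if length cont == n && all isCont cont && v <= 0x10FFFF
           then chr v : go rest
           else '\xFFFD' : go bs
    isCont c = c <= 0xFF && c .&. 0xC0 == 0x80

main :: IO ()
main = do
  let name = RawName "\xC3\xBC"          -- the two UTF-8 bytes of 'ü'
  print (toFilePath name == "\xC3\xBC")  -- the path handed to the OS is unchanged
  print (forDisplay name == "\xFC")      -- decoded only for display
```

Keeping the raw name behind a separate type means the compiler flags any place that tries to print it without going through `forDisplay`, which addresses the "easy to miss some" concern in the removed text.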