From 9b91db825484f8e16ce5d1bb3daee6e6a8151206 Mon Sep 17 00:00:00 2001
From: "https://www.google.com/accounts/o8/id?id=AItOawk6QAwUsFHpr3Km1yQbg8hf3S7RDYf7hX4"
Date: Thu, 26 Jan 2012 22:13:19 +0000
Subject: Added a comment

---
 .../comment_5_519cda534c7aea7f5ad5acd3f76e21fa._comment | 11 +++++++++++
 1 file changed, 11 insertions(+)
 create mode 100644 doc/bugs/problems_with_utf8_names/comment_5_519cda534c7aea7f5ad5acd3f76e21fa._comment

(limited to 'doc')

diff --git a/doc/bugs/problems_with_utf8_names/comment_5_519cda534c7aea7f5ad5acd3f76e21fa._comment b/doc/bugs/problems_with_utf8_names/comment_5_519cda534c7aea7f5ad5acd3f76e21fa._comment
new file mode 100644
index 000000000..96b0ffed0
--- /dev/null
+++ b/doc/bugs/problems_with_utf8_names/comment_5_519cda534c7aea7f5ad5acd3f76e21fa._comment
@@ -0,0 +1,11 @@
+[[!comment format=mdwn
+ username="https://www.google.com/accounts/o8/id?id=AItOawk6QAwUsFHpr3Km1yQbg8hf3S7RDYf7hX4"
+ nickname="Lauri"
+ subject="comment 5"
+ date="2012-01-26T22:13:18Z"
+ content="""
+I also encountered Adam's bug. The problem seems to be that communication with the git process is done with `Char8`-bytestrings. So, when `L.unpack` is called, all filenames that git outputs (with `ls-files` or `ls-tree`) are interpreted to be in latin-1, which wreaks havoc if they are really in UTF-8.
+
+I suspect that it would be enough to just switch to standard `String`s (or `Data.Text.Text`) instead of bytestrings for textual data, and to `Word8`-bytestrings for pure binary data. GHC should nowadays handle locale-dependent encoding of `String`s transparently.
+
+"""]]
-- 
cgit v1.2.3