diff options
Diffstat (limited to 'doc/bugs/Unicode_characters_lost__47__converted_in_metadata/comment_1_799fd2dc62dc51a6a8fa8a6d15d7a1f9._comment')
-rw-r--r-- | doc/bugs/Unicode_characters_lost__47__converted_in_metadata/comment_1_799fd2dc62dc51a6a8fa8a6d15d7a1f9._comment | 32 |
1 files changed, 0 insertions, 32 deletions
diff --git a/doc/bugs/Unicode_characters_lost__47__converted_in_metadata/comment_1_799fd2dc62dc51a6a8fa8a6d15d7a1f9._comment b/doc/bugs/Unicode_characters_lost__47__converted_in_metadata/comment_1_799fd2dc62dc51a6a8fa8a6d15d7a1f9._comment deleted file mode 100644 index 1ae278289..000000000 --- a/doc/bugs/Unicode_characters_lost__47__converted_in_metadata/comment_1_799fd2dc62dc51a6a8fa8a6d15d7a1f9._comment +++ /dev/null @@ -1,32 +0,0 @@ -[[!comment format=mdwn - username="joey" - subject="""comment 1""" - date="2015-03-04T14:31:21Z" - content=""" -What I'm seeing is the unicode arrow is replaced with 0092 and the elipsis -with &. It's losing the other byte. - -The problem seems to be in the base64 encoding that's done, when the metadata -value contains spaces or a few other problem characters. These same -unicode characters roundtrip through without a problem when not embedded -in a string with spaces. - -<pre> -*Utility.Base64> let s = "…" -*Utility.Base64> (s, fromB64 $ toB64 s) -("\8230","&") -</pre> - -git-annex also uses base64 for encoding some creds -(and for tagged pushes over XMPP, but only the JID is encoded). - -The real culprit is the use of `w82s`, which doesn't handle multi-byte -characters. I can easily fix this by using `encodeW8` instead. -Audited git-annex for other problem w82s uses and don't see any, so will -only need to fix this once. - -Added a quickcheck test for fromB64 . toB64 roundtripping. - -Unfortunately, the entered unicode characters didn't get saved right, -so git-annex can do nothing to fix data that was already entered. -"""]] |