author: Joey Hess <joeyh@joeyh.name> 2015-08-11 18:40:59 -0400
committer: Joey Hess <joeyh@joeyh.name> 2015-08-11 18:40:59 -0400
commit: 88aeb849f620a13da47508045daae461a223c997
tree: a93b1d67d5fe887c7e958d9cffbea3d7014e496a /doc
parent: 96705a943615528f79a121e6e94101d5852ba44f
Fix setting/getting/viewing metadata that contains unicode or other special characters, when in a non-unicode locale.
Oh boy, not again. So, another place that the filesystem encoding needs to be applied. Yay.

In passing, I changed decodeBS so if a NUL is embedded in the input, the resulting FilePath doesn't get truncated at that NUL. This was needed to make prop_b64_roundtrips pass, and on reviewing the callers of decodeBS, I didn't see any where this wouldn't make sense. When a FilePath is used to operate on the filesystem, it'll get truncated at a NUL anyway, whereas if a String is being used for something else, it might conceivably have a NUL in it, and we wouldn't want it to get truncated when going through decodeBS.

(NB: There may be a speed impact from this change.)
Diffstat (limited to 'doc')
-rw-r--r-- doc/bugs/view_fails_with___34__invalid_character__34__.mdwn 26
1 files changed, 26 insertions, 0 deletions
diff --git a/doc/bugs/view_fails_with___34__invalid_character__34__.mdwn b/doc/bugs/view_fails_with___34__invalid_character__34__.mdwn
index 4b6e97764..f77f5013f 100644
--- a/doc/bugs/view_fails_with___34__invalid_character__34__.mdwn
+++ b/doc/bugs/view_fails_with___34__invalid_character__34__.mdwn
@@ -28,3 +28,29 @@ local repository version: 5
supported repository version: 5
upgrade supported from repository versions: 0 1 2 4
"""]]
+
+> I'm assuming the setlocale part of this is a misconfigured system locale;
+> as also seen by an arch linux user in
+> <http://git-annex.branchable.com/bugs/cannot_change_locale___40__en__95__US.UTF-8__41__/>
+>
+> So, disregarding that part of the bug report, we still have the actual
+> failure.
+>
+> With LANG=C, setting and getting metadata like "Rondò Veneziano" fails,
+> as does generating views of that metadata.
+>
+> In all cases, it's an IO encoding failure, "commitBuffer: invalid argument (invalid character)"
+>
+> This only occurs when there's a space in the metadata; in this
+> case the value is base64ed. While the 'ò' comes back out as
+> "\242", which is the right character, it's not encoded using the
+> filesystem encoding. This means that the IO layer can't handle
+> it, when not in a unicode locale. Instead, it needs to come back
+> out as "\56515\56498".
+>
+> Apparently this is a regression; it worked in an earlier version of
+> git-annex. Commits such as 9b93278e8abe1163d53fbf56909d0fe6d7de69e9
+> or the conversion to Sandi may have caused the regression, unsure.
+>
+> Fix is to apply the filesystem encoding when decoding base64ed values.
+> [[done]] --[[Joey]]
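
The two escapes quoted in the bug note can be reproduced with a small sketch. This is only an analogue, written in Python rather than git-annex's Haskell, and it is not the code from this commit. It assumes that the filesystem encoding escapes each undecodable byte `b` as the code point `0xDC00 + b`, which is what Python's `surrogateescape` error handler does and which matches the numbers above. The helper name `ascii_locale_can_write` is made up for the illustration.

```python
# Analogue only: Python's "surrogateescape" handler stands in for the
# filesystem encoding discussed above. Not git-annex's actual code.

# The UTF-8 bytes of 'ò', as they would sit inside the base64ed value.
raw = "ò".encode("utf-8")  # b'\xc3\xb2'

# A plain Unicode decode yields one code point, 242 (the "\242" in the note).
decoded = raw.decode("utf-8")

# Decoding under an ASCII locale with byte escaping yields two escape
# characters, 0xDC00 + 0xC3 and 0xDC00 + 0xB2.
escaped = raw.decode("ascii", errors="surrogateescape")


def ascii_locale_can_write(s: str) -> bool:
    """Hypothetical helper: can s be written out when the locale is ASCII?"""
    try:
        s.encode("ascii", errors="surrogateescape")
        return True
    except UnicodeEncodeError:
        return False


print([ord(c) for c in decoded])        # [242]
print([ord(c) for c in escaped])        # [56515, 56498]
print(ascii_locale_can_write(decoded))  # False: the "invalid character" case
print(ascii_locale_can_write(escaped))  # True: escapes map back to raw bytes
```

The escaped form survives the trip back out because each escape character is turned back into the byte it came from, so the original UTF-8 bytes reach the output unchanged even though the locale itself cannot represent 'ò'.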