author | 2015-08-11 18:40:59 -0400 | |
---|---|---|
committer | 2015-08-11 18:40:59 -0400 | |
commit | 88aeb849f620a13da47508045daae461a223c997 (patch) | |
tree | a93b1d67d5fe887c7e958d9cffbea3d7014e496a /doc | |
parent | 96705a943615528f79a121e6e94101d5852ba44f (diff) |
Fix setting/getting/viewing metadata that contains unicode or other special characters, when in a non-unicode locale.
Oh boy, not again. So, another place that the filesystem encoding needs to
be applied. Yay.
In passing, I changed decodeBS so if a NUL is embedded in the input, the
resulting FilePath doesn't get truncated at that NUL. This was needed to
make prop_b64_roundtrips pass, and on reviewing the callers of decodeBS, I
didn't see any where this wouldn't make sense. When a FilePath is used to
operate on the filesystem, it'll get truncated at a NUL anyway, whereas if
a String is being used for something else, it might conceivably have a NUL
in it, and we wouldn't want it to get truncated when going through
decodeBS.
(NB: There may be a speed impact from this change.)
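
The NUL-truncation behaviour described for decodeBS can be illustrated outside Haskell. The following is only a sketch in Python, not the git-annex code: it assumes the old behaviour was equivalent to reading the bytes as a NUL-terminated C string, and the new behaviour to reading them with an explicit length. The function names `decode_c_string` and `decode_with_length` are hypothetical.

```python
import ctypes


def decode_c_string(data: bytes) -> bytes:
    """Read the bytes back as a NUL-terminated C string.

    Anything after the first embedded NUL is lost.
    """
    buf = ctypes.create_string_buffer(data)  # adds a terminating NUL
    return buf.value                         # stops at the first NUL


def decode_with_length(data: bytes) -> bytes:
    """Read the bytes back using the known length, so embedded NULs survive."""
    buf = ctypes.create_string_buffer(data)
    return buf.raw[:len(data)]               # drop only the added terminator


sample = b"foo\x00bar"
truncated = decode_c_string(sample)
full = decode_with_length(sample)
```

Only the length-aware variant round-trips arbitrary input, which is the property a test like prop_b64_roundtrips would need.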
Diffstat (limited to 'doc')
-rw-r--r-- | doc/bugs/view_fails_with___34__invalid_character__34__.mdwn | 26 |
1 files changed, 26 insertions, 0 deletions
diff --git a/doc/bugs/view_fails_with___34__invalid_character__34__.mdwn b/doc/bugs/view_fails_with___34__invalid_character__34__.mdwn
index 4b6e97764..f77f5013f 100644
--- a/doc/bugs/view_fails_with___34__invalid_character__34__.mdwn
+++ b/doc/bugs/view_fails_with___34__invalid_character__34__.mdwn
@@ -28,3 +28,29 @@ local repository version: 5
 supported repository version: 5
 upgrade supported from repository versions: 0 1 2 4
 """]]
+
+> I'm assuming the setlocale part of this is a misconfigured system locale;
+> as also seen by an arch linux user in
+> <http://git-annex.branchable.com/bugs/cannot_change_locale___40__en__95__US.UTF-8__41__/>
+>
+> So, disregarding that part of the bug report, we still have the actual
+> failure.
+>
+> With LANG=C, setting and getting metadata like "Rondò Veneziano" fails,
+> as does generating views of that metadata.
+>
+> In all cases, it's an IO encoding failure, "commitBuffer: invalid argument (invalid character)"
+>
+> This only occurs when there's a space in the metadata; in this case the
+
+> value is base64ed. While the 'ò' comes back out as "\242", which is the right
+> character, it's not encoded using the filesystem encoding. This means that
+> the IO layer can't handle it, when not in a unicode locale. Instead, it
+> needs to come back out as "\56515\56498".
+>
+> Apparently this is a reversion; it worked in an earlier version of
+> git-annex. Commits such as 9b93278e8abe1163d53fbf56909d0fe6d7de69e9
+> or the conversion to Sandi may have caused the reversion, unsure.
+>
+> Fix is to apply the filesystem encoding when decoding base64ed values.
+> [[done]] --[[Joey]]
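
The difference between "\242" and "\56515\56498" in the bug comment can be reproduced with a small sketch. This is Python, not the git-annex code, and it assumes that Python's `surrogateescape` error handler is a fair stand-in for the filesystem encoding under LANG=C: each undecodable byte becomes the code point 0xDC00 plus the byte value. The helper name `writable_in_c_locale` is hypothetical.

```python
import base64

# "Rond\u00f2 Veneziano" is the metadata value from the bug report.
raw = "Rond\u00f2 Veneziano".encode("utf-8")

# Values containing a space are stored base64ed; decoding yields raw bytes again.
decoded = base64.b64decode(base64.b64encode(raw))

# Decoding as real unicode gives the "right character", code point 242.
as_unicode = decoded.decode("utf-8")

# Decoding with an ASCII locale plus escaping keeps each UTF-8 byte as a
# lone surrogate: 0xC3 -> 56515, 0xB2 -> 56498.
as_escaped = decoded.decode("ascii", errors="surrogateescape")


def writable_in_c_locale(s: str) -> bool:
    """True if the string can be encoded back to bytes in an ASCII locale."""
    try:
        s.encode("ascii", errors="surrogateescape")
        return True
    except UnicodeEncodeError:
        return False


unicode_points = [ord(c) for c in as_unicode if ord(c) > 127]
escaped_points = [ord(c) for c in as_escaped if ord(c) > 127]
```

Under these assumptions only the escaped form can be written back out in a non-unicode locale, and it round-trips to the original bytes, which is in line with the fix of applying the filesystem encoding when decoding base64ed values.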