summaryrefslogtreecommitdiff
path: root/doc/bugs/view_fails_with___34__invalid_character__34__.mdwn
blob: f77f5013fddb9016590b907d06a29a467455cb21 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
### Please describe the problem.
I have a "person" attribute with the name of the people on the picture. I could create views and filter on that some time ago, but it looks like git annex now chokes on some illegal (accented?) characters. I have non-ASCII characters in names for sure.

### What steps will reproduce the problem?
Try to create a view using the "person" attribute.

### What version of git-annex are you using? On what operating system?
5.20150617-gac09445 on ArchLinux

### Please provide any additional information below.

[[!format sh """
% git annex view person="*"
/bin/sh: warning: setlocale: LC_ALL: cannot change locale (en_US.UTF-8)
view  (searching...)

git-annex: fd:14: commitBuffer: invalid argument (invalid character)
failed
git-annex: view: 1 failed

% git annex version
/bin/sh: warning: setlocale: LC_ALL: cannot change locale (en_US.UTF-8)
git-annex version: 5.20150617-gac09445
build flags: Assistant Webapp Webapp-secure Pairing Testsuite S3 WebDAV Inotify DBus DesktopNotify XMPP DNS Feeds Quvi TDFA
key/value backends: SHA256E SHA1E SHA512E SHA224E SHA384E SKEIN256E SKEIN512E MD5E SHA256 SHA1 SHA512 SHA224 SHA384 SKEIN256 SKEIN512 MD5 WORM URL
remote types: git gcrypt S3 bup directory rsync web bittorrent webdav tahoe glacier ddar hook external
local repository version: 5
supported repository version: 5
upgrade supported from repository versions: 0 1 2 4
"""]]

> I'm assuming the setlocale part of this is a misconfigured system locale;
> as also seen by an arch linux user in
> <http://git-annex.branchable.com/bugs/cannot_change_locale___40__en__95__US.UTF-8__41__/>
> 
> So, disregarding that part of the bug report, we still have the actual
> failure. 
> 
> With LANG=C, setting and getting metadata like "Rondò Veneziano" fails,
> as does generating views of that metadata.
> 
> In all cases, it's an IO encoding failure, "commitBuffer: invalid argument (invalid character)"
> 
> This only occurs when there's a space in the metadata; in this case the

> value is base64ed. While the 'ò' comes back out as "\242", which is the right
> character, it's not encoded using the filesystem encoding. This means that
> the IO layer can't handle it, when not in a unicode locale. Instead, it
> needs to come back out as "\56515\56498".
> 
> Apparently this is a reversion; it worked in an earlier version of
> git-annex. Commits such as 9b93278e8abe1163d53fbf56909d0fe6d7de69e9
> or the conversion to Sandi may have caused the reversion, unsure.
> 
> Fix is to apply the filesystem encoding when decoding base64ed values.
> [[done]] --[[Joey]]