### Please describe the problem.

Metadata values are not stored in a consistent charset. It looks as if git-annex picks the "smallest" charset able to hold the data: US-ASCII by default, latin1 if the value contains latin1 characters, and UTF-8 only if the value contains characters that cannot be represented in latin1.

### What steps will reproduce the problem?

    % git init
    Initialized empty Git repository in /home/madduck/.tmp/cdt.GlIevu/.git/
    
    % git annex init
    init  ok
    (recording state in git...)
    
    % date > a
    
    % git annex add a
    add a ok
    (recording state in git...)
    
    % git annex metadata -s one=$(echo US-ASCII | iconv -tus-ascii) a
    metadata a 
      lastchanged=2016-09-25@13-18-57
      one=US-ASCII
      one-lastchanged=2016-09-25@13-18-57
    ok
    (recording state in git...)
    
    % git annex metadata -s two=$(echo lätin1 | iconv -tlatin1) a
    metadata a 
      lastchanged=2016-09-25@13-19-37
      one=US-ASCII
      one-lastchanged=2016-09-25@13-18-57
      two=lätin1
      two-lastchanged=2016-09-25@13-19-37
    ok
    (recording state in git...)
    
    % git annex metadata -s three=$(echo unicode… | iconv -tutf8) a  
    metadata a 
      lastchanged=2016-09-25@13-19-41
      one=US-ASCII
      one-lastchanged=2016-09-25@13-18-57
      three=unicode…
      three-lastchanged=2016-09-25@13-19-41
      two=lätin1
      two-lastchanged=2016-09-25@13-19-37
    ok
    (recording state in git...)
    
    % git annex metadata -g three a | iconv -tutf8                 
    unicode…
    
    % git annex metadata -g two a | iconv -tutf8 
    liconv: illegal input sequence at position 1
    
    % git annex metadata -g one a | iconv -tutf8 
    US-ASCII
    
    % git annex metadata -g two a | iconv -flatin1 -tutf8 
    lätin1
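
The failing `iconv -tutf8` call above is consistent with the value of `two` coming back as latin1 bytes rather than UTF-8. As a minimal sketch of how to check this at the byte level, assuming a POSIX shell with `od`, `tr` and `sed` available, the following compares the two encodings of `lätin1`. The `hex_bytes` helper is hypothetical and not part of git-annex.

```shell
# Hypothetical helper (not part of git-annex):
# print stdin as space-separated hex bytes on a single line.
hex_bytes() {
    od -An -v -tx1 | tr '\n' ' ' | tr -s ' ' | sed 's/^ //; s/ $//'
}

# "lätin1" encoded as latin1: ä is the single byte 0xe4 (octal 344).
printf 'l\344tin1' | hex_bytes
echo
# prints: 6c e4 74 69 6e 31

# "lätin1" encoded as UTF-8: ä is the two bytes 0xc3 0xa4 (octal 303 244).
printf 'l\303\244tin1' | hex_bytes
echo
# prints: 6c c3 a4 74 69 6e 31
```

Piping `git annex metadata -g two a` through the same helper would show which of the two byte sequences is actually returned for the field.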

### What version of git-annex are you using? On what operating system?

6.20160808-1

[[!tag moreinfo]]