Normally, git-annex stores annexed files in the repository, locked down,
which prevents the content of the file from being modified.
That's a good thing, because it might be the only copy, and you wouldn't
want to lose it to a fumble-fingered mistake.

	# git annex add some_file
	add some_file
	# echo oops > some_file
	bash: some_file: Permission denied

Sometimes though you want to modify a file. Maybe once, or maybe
repeatedly. To modify an annexed file, you have to first unlock it,
by running `git annex unlock`.

	# git annex unlock some_file
	# echo "new content" > some_file
	#

Back before git-annex version 6, and its v6 repository mode, unlocking a file
like this was a transient thing. You'd modify it and then `git annex add` the
modified version to the annex, and finally `git commit`. The new version of
the file was then back to being locked.

	# git annex add some_file
	add some_file
	# git commit

But, that had some problems. The main one is that some users want to be able
to edit files repeatedly, without manually having to unlock them every time.
[[direct_mode]] made all files be unlocked all the time, but it
had many problems of its own.

## enter v6 mode

This led to the v6 repository mode, which makes unlocked files remain
unlocked after they're committed, so you can keep changing them and
committing the changes whenever you'd like. It also lets you use more
normal git commands (or even interfaces on top of git) for handling
annexed files.

To get a repository into v6 mode, you can [[upgrade|upgrades]] it.
This will eventually happen automatically, but for now it's a manual process
(be sure to read [[upgrades]] before doing this):

	# git annex upgrade
	
Or, you can init a new repository in v6 mode.

	# git init
	# git annex init --version=6
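
To check which mode an existing repository is in, one option is to read
the `annex.version` setting that git-annex records in the repository's
git config. A minimal sketch; the helper name `annex_repo_version` is
made up for illustration and is not a git-annex command:

```shell
# Hypothetical helper: print the git-annex repository version stored in
# the git config, or "uninitialized" when annex.version is not set.
annex_repo_version() {
	git config --get annex.version || echo "uninitialized"
}
```

Run inside a repository, it should print `6` once the upgrade or
`git annex init --version=6` has been done.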

## using it

Using a v6 repository is easy! Simply use regular git commands to add
and commit files. In a git-annex repository, git will use git-annex
to store the file contents, and the files will be left unlocked.

[[!template id=note text="""
Want `git add` to add some file contents to the annex, but store the contents of
smaller files in git itself? Configure annex.largefiles to match the former.

	git config annex.largefiles \
		"largerthan=100kb and not include=*.c"
"""]]

	# cp ~/my_cool_big_file .
	# git add my_cool_big_file
	# git commit -m "added my_cool_big_file to the annex"
	[master (root-commit) 92f2725] added my_cool_big_file to the annex
	 1 file changed, 1 insertion(+)
	  create mode 100644 my_cool_big_file
	# git annex find
	my_cool_big_file

You can make whatever modifications you want to unlocked files, and commit
your changes.

	# echo more stuff >> my_cool_big_file
	# git mv my_cool_big_file my_cool_bigger_file
	# git commit -a -m "some changes"
	[master 196c0e2] some changes
	 2 files changed, 1 insertion(+), 1 deletion(-)
	 delete mode 100644 my_cool_big_file
	 create mode 100644 my_cool_bigger_file

Under the hood, this uses git's [[todo/smudge]] filter interface, and
git-annex converts between the content of the big file and a pointer file,
which is what gets committed to git. 
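
To see what git actually stores, you can print the committed blob
rather than the work tree file. The sketch below assumes the pointer is a
small text file whose first line starts with `/annex/objects/` followed
by the key; `is_annex_pointer` is a hypothetical helper, not part of
git-annex:

```shell
# Hypothetical helper: succeed if the data on stdin looks like a
# git-annex pointer file (first line begins with /annex/objects/).
is_annex_pointer() {
	head -n 1 | grep -q '^/annex/objects/'
}

# Usage, inside a repository:
#   git show HEAD:my_cool_bigger_file | is_annex_pointer && echo annexed
```

Reading from `git show HEAD:...` matters here, because the file in the
work tree holds the real content, not the pointer.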

A v6 repository can contain both locked and unlocked files. You can switch 
a file back and forth using the `git annex lock` and `git annex unlock`
commands. This changes what's stored in git between a git-annex symlink
(locked) and a git-annex pointer file (unlocked). To add a file to
the repository in locked mode, use `git annex add`; to add a file in
unlocked mode, use `git add`.
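
Since a locked file shows up in the work tree as a symlink into
`.git/annex/objects` and an unlocked file as a regular file, a quick
shell test can tell them apart. A rough sketch; `annex_file_state` is a
hypothetical helper, and it cannot distinguish an unlocked annexed file
from a small file stored directly in git:

```shell
# Hypothetical helper: classify a work tree path by how it appears on disk.
annex_file_state() {
	if [ -L "$1" ]; then
		# Locked annexed files are symlinks pointing into .git/annex/objects
		case "$(readlink "$1")" in
			*.git/annex/objects/*) echo "locked" ;;
			*) echo "other symlink" ;;
		esac
	elif [ -f "$1" ]; then
		# Regular file: an unlocked annexed file, or a file kept in git itself
		echo "unlocked or not annexed"
	else
		echo "missing"
	fi
}
```

For example, `annex_file_state my_cool_bigger_file` should change from
"unlocked or not annexed" to "locked" after `git annex lock`.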

## using less disk space

Unlocked files are handy, but they have one significant disadvantage
compared with locked files: They use more disk space.

While only one copy of a locked file has to be stored, often
two copies of an unlocked file are stored on disk. One copy is in
the git work tree, where you can use and modify it,
and the other is stashed away in `.git/annex/objects` (see [[internals]]).

The reason for that second copy is to preserve the old version of the file,
when you modify the unlocked file in the work tree. Being able to access
old versions of files is an important part of git after all!

That's a good safe default. But there are ways to use git-annex that
make the second copy not be worth keeping:

[[!template id=note text="""
When a [[direct_mode]] repository is upgraded, annex.thin is automatically
set, because direct mode made the same single-copy tradeoff.
"""]]

* When you're using git-annex to sync the current version of files across
  devices, and don't care much about previous versions.
* When you have set up a backup repository, and use git-annex to copy
  your files to the backup.

In situations like these, you may want to avoid the overhead of the second
local copy of unlocked files. There's a config setting for that.

	git config annex.thin true

After changing annex.thin, you'll want to fix up the work tree to
match the new setting:

	git annex fix

Note that setting annex.thin only has an effect on systems that support
hard links. That rules out Windows and FAT filesystems.
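
One way to check whether annex.thin took effect for a file is to look at
its hard link count: when the work tree file is a hard link to the object
in `.git/annex/objects`, the count should be greater than 1. A sketch
using only `ls` and `awk`; `link_count` is a hypothetical helper name:

```shell
# Hypothetical helper: print the hard link count of a file.
# The second field of `ls -ld` output is the number of links.
link_count() {
	ls -ld -- "$1" | awk '{ print $2 }'
}

# Usage, inside a repository with annex.thin set:
#   link_count my_cool_bigger_file
```

A count of 1 after `git annex fix` suggests the file is still a separate
copy, for instance because the filesystem does not support hard links.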

## tradeoffs

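Setting annex.thin can save a lot of disk space, but it's a tradeoff
between disk usage and safety: with only one local copy, modifying an
unlocked file can lose the old version unless another repository has it.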

Keeping files locked is safer and also avoids using unnecessary
disk space, but gives up easy modification of files.

Pick the tradeoff that's right for you.