Scenario
--------

You are a new git-annex user. You already have files spread across several computers and want to migrate them into git-annex without recopying everything between machines.

Let's say, for example, you have a server named `marcos` and a workstation named `angela`. Your audio collection is stored in `/srv/mp3` on `marcos` and in `~/mp3` on `angela`, but only `marcos` has all the files; `angela` has just a subset.

We also assume that `marcos` has an SSH server.

How do you add all this stuff to git-annex?

Create the biggest git-annex repository
---------------------------------------

Start with `marcos`, with the complete directory:

    cd /srv/mp3
    git init
    git annex init
    git annex add .

You may want to use [[direct mode]] if you want to avoid creating a forest of symlinks there, but this is generally error-prone and should be avoided.

`git annex add` checksums every file and records its location on the `git-annex` branch of the git repository. Wait for this process to complete.
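To see why checksumming lets git-annex recognise the same file in two unconnected directories, here is a minimal sketch that uses plain `sha256sum` rather than git-annex itself. The temporary directories and the file name are made up for illustration, and it assumes GNU coreutils is installed. Real git-annex keys also encode details such as the file size, which this sketch leaves out.

```shell
#!/bin/sh
# Illustration only: identical content yields an identical checksum,
# no matter which machine or directory the file lives in.
set -eu

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

# Hypothetical stand-ins for the two machines.
mkdir "$workdir/marcos" "$workdir/angela"

printf 'same audio bytes\n' > "$workdir/marcos/song.wav"
cp "$workdir/marcos/song.wav" "$workdir/angela/song.wav"

# Print only the digest column of sha256sum's output.
digest() { sha256sum "$1" | cut -d' ' -f1; }

sum_marcos=$(digest "$workdir/marcos/song.wav")
sum_angela=$(digest "$workdir/angela/song.wav")

if [ "$sum_marcos" = "$sum_angela" ]; then
    echo "same content"
else
    echo "different content"
fi
```

Because both copies hash to the same value, adding them independently on each machine leads to the same content identifier, which is what allows the two repositories to be matched up later without transferring the files.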

Create the smaller repo and synchronise
---------------------------------------

On `angela`, we want to synchronise the git annex metadata with `marcos`. We need to initialize a git repo with `marcos` as a remote:

    cd ~/mp3
    git init
    git remote add marcos marcos.example.com:/srv/mp3
    git fetch marcos
    git annex info # this should display the two repos (older versions: git annex status)
    git annex add .

This will, again, checksum all files and add them to git-annex. Once that is done, you can verify that the files are really the same as on `marcos` with `whereis`:

    git annex whereis

This should display something like:

    whereis Orange Seeds/I remember.wav (2 copies)
            b7802161-c984-4c9f-8d05-787a29c41cfe -- marcos (anarcat@marcos:/srv/mp3)
            c2ca4a13-9a5f-461b-a44b-53255ed3e2f9 -- here (anarcat@angela)
    ok

Once you are sure everything went well, you can synchronise with `marcos`:

    git annex sync

This pushes the metadata to `marcos`, so it knows which files are available on `angela`. From then on, you can freely get and move files between the two repos!
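As a rough sketch of what getting and moving files can look like, the two helpers below wrap `git annex get` and `git annex move`. The function names are hypothetical, and the sketch assumes you run them on `angela` inside `~/mp3`, that the remote is called `marcos` as set up above, and that the path you pass is already listed in the working tree.

```shell
#!/bin/sh
# Sketch only: the helper names are hypothetical, not part of git-annex.
# Run from inside ~/mp3 on angela, after the sync has completed.

# Download the content of a file or directory from marcos.
fetch_from_marcos() {
    git annex get --from marcos "$1"
}

# Send content to marcos and remove the local copy to free space on angela.
offload_to_marcos() {
    git annex move --to marcos "$1"
}
```

For example, `offload_to_marcos "Orange Seeds"` would move that directory's content to the server, and `fetch_from_marcos "Orange Seeds"` would bring it back later.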