[[!comment format=mdwn
 username="joey"
 subject="""comment 5"""
 date="2017-03-06T16:25:53Z"
 content="""
The difficulty with checking whether the content to be imported is referred to
somewhere in the working tree is that there's no inexpensive way to
determine that. git-annex would have to run `git log -n1 -S$KEY` for each file,
which can take quite a long time in repositories with a lot of history.
I clocked it at 12 seconds per file on an SSD; it will be quite a
lot slower on a spinning disk.
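
For illustration, here is a rough Python sketch of what that per-file check
amounts to. It is not git-annex's code; the helper names `pickaxe_command`
and `key_referenced` are made up for this example, and `--format=%H` is an
addition of mine so that the output is just a commit hash.

```python
import subprocess


def pickaxe_command(key):
    # Hypothetical helper: the `git log -n1 -S$KEY` invocation quoted above,
    # with --format=%H added so a match prints only the commit hash.
    return ["git", "log", "-n1", "--format=%H", "-S" + key]


def key_referenced(key, repo="."):
    # Hypothetical helper: True if some commit reachable from HEAD changed
    # the number of occurrences of `key` (which is what -S looks for).
    # Raises CalledProcessError if git fails, e.g. in a repo with no commits.
    result = subprocess.run(
        pickaxe_command(key),
        cwd=repo,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip() != ""
```

Since this has to be run once per imported file, the cost adds up fast: at
12 seconds per file, 1000 files would take around 3 hours 20 minutes.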

I suppose that check could be added, with --fast as a way to skip it.

PS, mbroadhead's approach is a good one. Note, though, that since git-annex
version 6.20170214, import will consider the dropunused content a
duplicate. Still, --deduplicate and --clean-duplicates won't
delete the files from the import location in this case, since there
are no copies of the content in the annex.
"""]]