path: root/doc/git-annex.mdwn
Commit message (Author, Date)
* add database benchmark (Joey Hess, 2016-01-12)

  The benchmark shows that the database access is quite fast indeed! And, it scales linearly to the number of keys, with one exception, getAssociatedKey.

  Based on this benchmark, I don't think I need worry about optimising for cases where all files are locked and the database is mostly empty. In those cases, database access will be misses, and according to this benchmark, should add only 50 milliseconds to runtime.

  (NB: There may be some overhead to getting the database opened and locking the handle that this benchmark doesn't see.)

    joey@darkstar:~/src/git-annex>./git-annex benchmark
    setting up database with 1000
    setting up database with 10000
    benchmarking keys database/getAssociatedFiles from 1000 (hit)
    time                 62.77 μs   (62.70 μs .. 62.85 μs)
                         1.000 R²   (1.000 R² .. 1.000 R²)
    mean                 62.81 μs   (62.76 μs .. 62.88 μs)
    std dev              201.6 ns   (157.5 ns .. 259.5 ns)

    benchmarking keys database/getAssociatedFiles from 1000 (miss)
    time                 50.02 μs   (49.97 μs .. 50.07 μs)
                         1.000 R²   (1.000 R² .. 1.000 R²)
    mean                 50.09 μs   (50.04 μs .. 50.17 μs)
    std dev              206.7 ns   (133.8 ns .. 295.3 ns)

    benchmarking keys database/getAssociatedKey from 1000 (hit)
    time                 211.2 μs   (210.5 μs .. 212.3 μs)
                         1.000 R²   (0.999 R² .. 1.000 R²)
    mean                 211.0 μs   (210.7 μs .. 212.0 μs)
    std dev              1.685 μs   (334.4 ns .. 3.517 μs)

    benchmarking keys database/getAssociatedKey from 1000 (miss)
    time                 173.5 μs   (172.7 μs .. 174.2 μs)
                         1.000 R²   (0.999 R² .. 1.000 R²)
    mean                 173.7 μs   (173.0 μs .. 175.5 μs)
    std dev              3.833 μs   (1.858 μs .. 6.617 μs)
    variance introduced by outliers: 16% (moderately inflated)

    benchmarking keys database/getAssociatedFiles from 10000 (hit)
    time                 64.01 μs   (63.84 μs .. 64.18 μs)
                         1.000 R²   (1.000 R² .. 1.000 R²)
    mean                 64.85 μs   (64.34 μs .. 66.02 μs)
    std dev              2.433 μs   (547.6 ns .. 4.652 μs)
    variance introduced by outliers: 40% (moderately inflated)

    benchmarking keys database/getAssociatedFiles from 10000 (miss)
    time                 50.33 μs   (50.28 μs .. 50.39 μs)
                         1.000 R²   (1.000 R² .. 1.000 R²)
    mean                 50.32 μs   (50.26 μs .. 50.38 μs)
    std dev              202.7 ns   (167.6 ns .. 252.0 ns)

    benchmarking keys database/getAssociatedKey from 10000 (hit)
    time                 1.142 ms   (1.139 ms .. 1.146 ms)
                         1.000 R²   (1.000 R² .. 1.000 R²)
    mean                 1.142 ms   (1.140 ms .. 1.144 ms)
    std dev              7.142 μs   (4.994 μs .. 10.98 μs)

    benchmarking keys database/getAssociatedKey from 10000 (miss)
    time                 1.094 ms   (1.092 ms .. 1.096 ms)
                         1.000 R²   (1.000 R² .. 1.000 R²)
    mean                 1.095 ms   (1.095 ms .. 1.097 ms)
    std dev              4.277 μs   (2.591 μs .. 7.228 μs)
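
To make the scaling in the benchmark easier to read, here is a small sketch that divides each 10000-key timing by the matching 1000-key timing. The numbers are the `time` estimates copied from the output above; the dictionary layout and the `growth_factor` helper are illustrative names, not part of git-annex.

```python
# "time" estimates copied from the benchmark output, in microseconds.
timings_us = {
    ("getAssociatedFiles", "hit"): {1000: 62.77, 10000: 64.01},
    ("getAssociatedFiles", "miss"): {1000: 50.02, 10000: 50.33},
    ("getAssociatedKey", "hit"): {1000: 211.2, 10000: 1142.0},
    ("getAssociatedKey", "miss"): {1000: 173.5, 10000: 1094.0},
}


def growth_factor(samples):
    """Ratio of the 10000-key timing to the 1000-key timing."""
    return samples[10000] / samples[1000]


# One growth factor per (query, outcome) pair.
factors = {name: growth_factor(samples) for name, samples in timings_us.items()}

for (query, outcome), factor in factors.items():
    print(f"{query} ({outcome}): x{factor:.2f} for 10x the keys")
```

In these measurements the getAssociatedFiles timings barely change between the two database sizes, while the getAssociatedKey timings grow several-fold.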
* annex.thin (Joey Hess, 2015-12-27)

  Decided it's too scary to make v6 unlocked files have 1 copy by default, but that should be available to those who need it. This is consistent with git-annex not dropping unused content without --force, etc.

  * Added annex.thin setting, which makes unlocked files in v6 repositories be hard linked to their content, instead of a copy. This saves disk space but means any modification of an unlocked file will lose the local (and possibly only) copy of the old version.
  * Enable annex.thin by default on upgrade from direct mode to v6, since direct mode made the same tradeoff.
  * fix: Adjusts unlocked files as configured by annex.thin.
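
The tradeoff described for annex.thin follows from how hard links behave on the filesystem. The sketch below is only a filesystem illustration, not git-annex code: the file names `object` and `file.txt` and the `demo` function are hypothetical stand-ins for the annexed content and the unlocked work tree file.

```python
import os
import shutil
import tempfile


def demo(thin):
    """Return what the stored 'object' contains after the work tree file is edited."""
    with tempfile.TemporaryDirectory() as tmp:
        obj = os.path.join(tmp, "object")         # stands in for the annexed content
        worktree = os.path.join(tmp, "file.txt")  # stands in for the unlocked file
        with open(obj, "w") as f:
            f.write("old version")
        if thin:
            os.link(obj, worktree)          # hard link: two names, one inode
        else:
            shutil.copyfile(obj, worktree)  # independent copy
        with open(worktree, "w") as f:      # modify the work tree file in place
            f.write("new version")
        with open(obj) as f:
            return f.read()


print(demo(thin=False))  # the copy keeps the old content safe
print(demo(thin=True))   # the hard link shares the modification
```

With a copy, the old version survives the edit; with a hard link, both names see the new content, which is why the old version can be lost.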
* merge clean into smudge command (Joey Hess, 2015-12-04)

  The git filter config can be used to map the single git-annex command to the 2 actions, and this avoids "git annex clean" being used for this thing, it might have a better use for that name later.

* skeleton smudge/clean filters (Joey Hess, 2015-12-04)
* addurl, importfeed: Changed to honor annex.largefiles settings, when the content of the url is downloaded. (Not when using --fast or --relaxed.) (Joey Hess, 2015-12-02)

  importfeed just calls addurl functions, so inherits this from it.

  Note that addurl still generates a temp file, and uses that key to download the file. It just adds it to the work tree at the end when the file is small.

* import: Changed to honor annex.largefiles settings. (Joey Hess, 2015-12-02)

* improve annex.largefiles documentation (Joey Hess, 2015-12-02)
* more warnings about networked filesystems (Joey Hess, 2015-11-13)

* pid locking configuration and abstraction layer for git-annex (Joey Hess, 2015-11-12)

  (not actually used anywhere yet)

* Do verification of checksums of annex objects downloaded from remotes. (Joey Hess, 2015-10-01)

  * When annex objects are received into git repositories, their checksums are verified then too.
  * To get the old, faster, behavior of not verifying checksums, set annex.verify=false, or remote.<name>.annex-verify=false.
  * setkey, rekey: These commands also now verify that the provided file matches the key, unless annex.verify=false.
  * reinject: Already verified content; this can now be disabled by setting annex.verify=false.

  recvkey and reinject already did verification, so removed now duplicate code from them. fsck still does its own verification, which is ok since it does not use getViaTmp, so verification doesn't happen twice when using fsck --from.
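
The idea of verifying received content against its key can be sketched as follows. This is a simplified illustration under stated assumptions, not git-annex's implementation: it assumes a key of the form `SHA256-s<size>--<hexdigest>`, and `make_key`, `verify`, and the `enabled` parameter (standing in for annex.verify=false) are hypothetical names.

```python
import hashlib


def make_key(content: bytes) -> str:
    """Build a simplified SHA256 key from the content's size and digest."""
    digest = hashlib.sha256(content).hexdigest()
    return f"SHA256-s{len(content)}--{digest}"


def verify(key: str, content: bytes, enabled: bool = True) -> bool:
    """Check that content matches key; skip the check when verification is disabled."""
    if not enabled:
        return True  # faster, but corruption goes unnoticed
    return make_key(content) == key


key = make_key(b"hello")
print(verify(key, b"hello"))                   # content matches the key
print(verify(key, b"hellp"))                   # corrupted download is rejected
print(verify(key, b"hellp", enabled=False))    # check skipped, corruption accepted
```

Skipping the check saves the cost of hashing the file, at the price of accepting whatever bytes arrived.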
* annex.hardlink extended to also try to use hard links when copying from the repository to a remote. (Joey Hess, 2015-09-14)

  Also, it used to only check that one of the repos was not in direct mode; now when either repo is direct mode, annex.hardlink won't have an effect.

* DOC: refer to corresponding manpage not to non-existing PREFERRED CONTENT section (Yaroslav Halchenko, 2015-09-02)

* doc/*.mdwn: Minor fixes (typos, letter case) (Øyvind A. Holm, 2015-07-26)

* got bash completion working for "git annex" not just "git-annex" (Joey Hess, 2015-07-16)

  This needs a patch to git to cause the git-annex completion to be auto-loaded when completing "git annex <tab>". Otherwise, it will only load when "git-annex" is tab completed. Once loaded, it works for both uses.

  I've submitted the git patch to the git mailing list.
* typo (Joey Hess, 2015-07-13)

* doc updates (Joey Hess, 2015-07-10)

* sync: When annex.autocommit=false, avoid making any commit of local changes, while still merging with remote to the extent possible. (Joey Hess, 2015-07-07)

* Brought back the setkey plumbing command that was removed in 2011, since we found a use case for it. (Joey Hess, 2015-07-02)

  Note that the command's syntax was changed for consistency.

* comment and warning (Joey Hess, 2015-07-02)

* explicitly describe exit status in the standard section (anarcat, 2015-06-23)

* Increased the default annex.bloomaccuracy from 1000 to 10000000 (Joey Hess, 2015-06-16)

  This makes git annex unused use around 48 MB more memory than it did before, but the massive increase in accuracy makes this worthwhile for all but the smallest systems.

  Also, I want to use the bloom filter for sync --all --content, to avoid dropping files that the preferred content doesn't want, and 1/1000 false positives would be far too many in that use case, even if it were acceptable for unused.

  Actual memory use numbers:

    1000: 21.06user 3.42system 0:26.40elapsed 92%CPU (0avgtext+0avgdata 501552maxresident)k
    1000000: 21.41user 3.55system 0:26.84elapsed 93%CPU (0avgtext+0avgdata 549496maxresident)k
    10000000: 21.84user 3.52system 0:27.89elapsed 90%CPU (0avgtext+0avgdata 549920maxresident)k

  Based on these numbers, 10 million seemed a better pick than 1 million.
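
The "around 48 MB" figure can be checked directly from the maxresident values reported above. This sketch only does that arithmetic; the variable names are illustrative.

```python
# Peak resident memory in KB for each annex.bloomaccuracy setting,
# copied from the measurements in the commit message.
maxresident_kb = {
    1000: 501552,
    1000000: 549496,
    10000000: 549920,
}

baseline_kb = maxresident_kb[1000]

# Extra memory relative to the old default of 1000.
extra_kb = {acc: kb - baseline_kb for acc, kb in maxresident_kb.items()}

for accuracy, extra in extra_kb.items():
    print(f"bloomaccuracy={accuracy}: +{extra} KB")

# Additional cost of going from 1 million to 10 million.
step_kb = maxresident_kb[10000000] - maxresident_kb[1000000]
print(f"1000000 -> 10000000 costs another {step_kb} KB")
```

Almost all of the extra memory is already spent at 1 million; going to 10 million adds only a few hundred KB more, which is consistent with the conclusion that 10 million was the better pick.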
* dead --key: Can be used to mark a key as dead. (Joey Hess, 2015-06-09)

* add and fix refs in man mainpage (Antoine Beaupré, 2015-05-29)

* add annex.used-refspec (Joey Hess, 2015-05-14)

* required: New command, like wanted, but for required content. (Joey Hess, 2015-04-18)

  Also refactored some code to reduce duplication.

* contentlocation: New plumbing command. (Joey Hess, 2015-04-09)

* rethought distributed fsck; instead add activity.log and expire command (Joey Hess, 2015-04-05)

  This is much more space efficient!
* WIP on making --quiet silence progress, and infra for concurrent progress bars (Joey Hess, 2015-04-03)

* Various typo fixes in doc/*.mdwn (Øyvind A. Holm, 2015-04-02)

* importfeed: Avoid downloading a redundant item from a feed whose guid has been downloaded before, even when the url has changed. (Joey Hess, 2015-03-31)

  To support this, always store itemid in metadata; before this was only done when annex.genmetadata was set.
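
The deduplication rule described for importfeed can be sketched as a filter over feed items. This is a hypothetical model, not the actual implementation: `FeedItem`, `items_to_download`, and the two `seen_*` sets are names invented for illustration, and the example URLs are placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FeedItem:
    itemid: str  # the feed's guid for this item
    url: str     # where the enclosure currently lives


def items_to_download(items, seen_itemids, seen_urls):
    """Return items that are new by both itemid and url, recording them as seen."""
    todo = []
    for item in items:
        # An item counts as already downloaded if either identifier is known,
        # so a changed url with a known itemid is skipped.
        if item.itemid in seen_itemids or item.url in seen_urls:
            continue
        seen_itemids.add(item.itemid)
        seen_urls.add(item.url)
        todo.append(item)
    return todo


seen_ids = {"episode-1"}
seen_urls = {"http://example.com/old/ep1.mp3"}
feed = [
    FeedItem("episode-1", "http://example.com/new/ep1.mp3"),  # url moved, same guid
    FeedItem("episode-2", "http://example.com/new/ep2.mp3"),  # genuinely new
]
for item in items_to_download(feed, seen_ids, seen_urls):
    print(item.itemid)
```

Storing the itemid unconditionally is what makes the first check possible; with only urls recorded, the moved episode would be downloaded again.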
* --auto is no longer a global option; only get, drop, and copy accept it. (Joey Hess, 2015-03-25)

  Not a behavior change unless you were passing it to a command that ignored it.

* finished splitting out man pages for all commands (Joey Hess, 2015-03-25)

* separated man pages for all the maintenance commands (Joey Hess, 2015-03-24)

* separated man pages for all the setup commands while at the gate in ATL (Joey Hess, 2015-03-23)

* Man pages for individual commands now available, and can be opened using "git annex help <command>" (Joey Hess, 2015-03-23)

* splitting up the man page (Joey Hess, 2015-03-23)

  Common command man pages all split out and often expanded. A few sections split out into their own pages.

  Still need to do all the other commands..

* migrate: --force will force migration of keys already using the destination backend. Useful in rare cases. (Joey Hess, 2015-03-23)
* Added a post-update-annex hook, which is run after the git-annex branch is updated. Needed for git update-server-info. (Joey Hess, 2015-03-20)

  See https://github.com/datalad/datalad/issues/1#issuecomment-84094406

* checkpresentkey: New plumbing command to check if a key can be verified to be present on a remote. (Joey Hess, 2015-03-20)

* readpresentkey: New plumbing command for checking location log. (Joey Hess, 2015-03-20)

* registerurl: New plumbing command for mass-adding urls to keys. (Joey Hess, 2015-03-15)

* fromkey: Add stdin mode. (Joey Hess, 2015-03-15)

* fromkey --force: Skip test that the key has its content in the annex. (Joey Hess, 2015-03-15)

* addurl: Added --raw option, which bypasses special handling of quvi, bittorrent etc urls. (Joey Hess, 2015-03-05)
* add a link (Joey Hess, 2015-02-25)

* wording (Joey Hess, 2015-02-25)

* The file matching options are now only accepted by commands that can actually use them. (Joey Hess, 2015-02-06)

* import: Support file matching options such as --exclude, --include, --smallerthan, --largerthan (Joey Hess, 2015-02-06)

* groupwanted: New command to set the groupwanted preferred content expression. (Joey Hess, 2015-02-06)

* rework Differences data type (Joey Hess, 2015-01-28)

  Eliminated complexity and future proofed. The most important change is that all functions over Difference are now total; any Difference that can be expressed should be handled. Avoids needs for sanity checking of inputs, and version skew with the future.

  Also, the difference.log now serializes a [Difference], not a Differences. This saves space and keeps it simpler.

  Note that [Difference] might contain conflicting differences (eg, [Version5, Version6]). In this case, one of them needs to consistently win over the others, probably based on Ord.
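
The conflict rule sketched in the last paragraph (one difference consistently wins, chosen by ordering) could look roughly like this. It is a Python model of the idea rather than the Haskell code: the `Difference` members reuse the `Version5`/`Version6` example from the message, while `CONFLICT_GROUPS`, `resolve`, and the choice that the greatest value wins are assumptions made for illustration.

```python
from enum import IntEnum


class Difference(IntEnum):
    # Example values taken from the commit message; the integers define the order.
    VERSION5 = 5
    VERSION6 = 6


# Hypothetical: sets of differences that cannot be in effect together.
CONFLICT_GROUPS = [frozenset({Difference.VERSION5, Difference.VERSION6})]


def resolve(differences):
    """Total function: accept any list and return a conflict-free, sorted list."""
    result = set(differences)
    for group in CONFLICT_GROUPS:
        present = result & group
        if len(present) > 1:
            # Assumed rule: the greatest member by ordering always wins,
            # so the outcome does not depend on input order.
            result -= present
            result.add(max(present))
    return sorted(result)


print(resolve([Difference.VERSION5, Difference.VERSION6]))
print(resolve([Difference.VERSION6, Difference.VERSION5]))
```

Because the winner depends only on the ordering of the values, every repository reading the same log reaches the same result without any input validation step.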