path: root/Annex/Content.hs
Commit message  Author  Age
* make sure that lockContentShared is always paired with an inAnnex check  Joey Hess  2018-03-07
| lockContentShared had a screwy caveat that it didn't verify that the content was present when locking it, but in the most common case, eg indirect mode, it failed to lock when the content is not present.
|
| That led to a few callers forgetting to check inAnnex when using it, but the potential data loss was unlikely to be noticed because it only affected direct mode I think.
|
| Fix data loss bug when the local repository uses direct mode, and a locally modified file is dropped from a remote repository. The bug caused the modified file to be counted as a copy of the original file. (This is not a severe bug because in such a situation, dropping from the remote and then modifying the file is allowed and has the same end result.)
|
| And, in content locking over tor, when the remote repository is in direct mode, it neglected to check that the content was actually present when locking it. This could cause git annex drop to remove the only copy of a file when it thought the tor remote had a copy.
|
| So, make lockContentShared do its own inAnnex check. This could perhaps be optimised for direct mode, to avoid the check then, since locking the content necessarily verifies it exists there, but I have not bothered with that.
|
| This commit was sponsored by Jeff Goeke-Smith on Patreon.
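The fix described in this commit, locking that refuses to succeed unless the content is present, can be sketched with a toy model. This is not git-annex's code: `lockContentSharedModel`, `inAnnexModel`, and the list-of-keys store are hypothetical stand-ins, and real locking involves file locks rather than a pure function.

```haskell
-- Toy model only; names and types are assumptions, not git-annex's API.
type Key = String

data LockResult = Locked Key | ContentMissing
  deriving (Eq, Show)

-- Hypothetical stand-in for inAnnex: is the key's content present locally?
inAnnexModel :: [Key] -> Key -> Bool
inAnnexModel present k = k `elem` present

-- The lock does its own presence check, so callers cannot forget it.
lockContentSharedModel :: [Key] -> Key -> LockResult
lockContentSharedModel present k
  | inAnnexModel present k = Locked k
  | otherwise = ContentMissing

main :: IO ()
main = do
  print (lockContentSharedModel ["SHA256--abc"] "SHA256--abc")
  print (lockContentSharedModel [] "SHA256--abc")
```

Folding the check into the locking function means a caller that skips its own check still cannot count missing content as a copy.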
* fix windows build  Joey Hess  2017-12-05
|
* honor annex.diskreserve when running youtube-dl  Joey Hess  2017-11-30
| This commit was sponsored by André Pereira on Patreon.
* rethought --relaxed change  Joey Hess  2017-11-30
| Better to make it not be surprising and slow, than surprising and fast. --raw can be used when it needs to be really fast.
|
| Implemented adding a youtube-dl supported url to an existing file.
|
| This commit was sponsored by andrea rota.
* youtube-dl working  Joey Hess  2017-11-29
| Including resuming and cleanup of incomplete downloads.
|
| Still todo: --fast, --relaxed, importfeed, disk reserve checking, quvi code cleanup.
|
| This commit was sponsored by Anthony DeRobertis on Patreon.
* add gitAnnexTmpWorkDir and withTmpWorkDir  Joey Hess  2017-11-29
| Needed to run youtube-dl in, but could also be useful for other stuff.
|
| The tricky part of this was making the workdir be cleaned up whenever the tmp object file is cleaned up.
|
| This commit was sponsored by Ole-Morten Duesund on Patreon.
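One way to tie a work directory's lifetime to a temporary object file is to derive the directory name from the file name and remove both in a single cleanup function. The sketch below makes that assumption explicit: `tmpWorkDirFor`, `withTmpWorkDirModel`, `cleanupTmpObject`, and the `.work` suffix are hypothetical and are not taken from git-annex.

```haskell
import Control.Exception (finally)
import System.Directory
  ( createDirectoryIfMissing
  , doesDirectoryExist
  , getTemporaryDirectory
  , removePathForcibly
  )
import System.FilePath ((</>))

-- Hypothetical naming scheme: the work directory sits beside the
-- temporary object file. The real layout may differ.
tmpWorkDirFor :: FilePath -> FilePath
tmpWorkDirFor tmpObject = tmpObject ++ ".work"

-- Run an action inside the work directory that belongs to a tmp object.
withTmpWorkDirModel :: FilePath -> (FilePath -> IO a) -> IO a
withTmpWorkDirModel tmpObject action = do
  let workDir = tmpWorkDirFor tmpObject
  createDirectoryIfMissing True workDir
  action workDir

-- Cleaning up the tmp object always removes its work directory too.
cleanupTmpObject :: FilePath -> IO ()
cleanupTmpObject tmpObject = do
  removePathForcibly tmpObject
  removePathForcibly (tmpWorkDirFor tmpObject)

main :: IO ()
main = do
  base <- getTemporaryDirectory
  let tmpObject = base </> "annex-sketch-tmp-object"
  let job = do
        writeFile tmpObject "partial download"
        withTmpWorkDirModel tmpObject $ \dir ->
          writeFile (dir </> "scratch") "downloader state"
  job `finally` cleanupTmpObject tmpObject
  -- After cleanup, the work directory should no longer exist.
  doesDirectoryExist (tmpWorkDirFor tmpObject) >>= print
```

Routing every removal of the tmp file through one function is what keeps the two lifetimes from drifting apart in this model.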
* enable LambdaCase and convert around 10% of places that could use it  Joey Hess  2017-11-15
| Needs ghc 7.6.1, so minimum base version increased slightly. All builds are well above this version of ghc, and debian oldstable is as well.
|
| Code that could use lambdacase can be found by running:
|     git grep -B 1 'case ' | less
| and searching in less for "<-"
|
| This commit was sponsored by andrea rota.
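The pattern that the grep above looks for is a monadic bind immediately followed by a `case` on the bound name. A minimal sketch of the conversion, using made-up function names (`describeBefore`, `describeAfter`) rather than anything from git-annex:

```haskell
{-# LANGUAGE LambdaCase #-}

-- Before: bind a result to a name only to scrutinise it on the next line.
describeBefore :: Monad m => m (Maybe String) -> m String
describeBefore getValue = do
  v <- getValue
  case v of
    Nothing -> return "unset"
    Just s -> return ("set to " ++ s)

-- After: \case removes the throwaway name.
describeAfter :: Monad m => m (Maybe String) -> m String
describeAfter getValue = getValue >>= \case
  Nothing -> return "unset"
  Just s -> return ("set to " ++ s)

main :: IO ()
main = do
  a <- describeBefore (return (Just "x"))
  b <- describeAfter (return (Just "x"))
  -- Both forms should behave identically.
  print (a == b)
```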
* use unix-compat 0.5 on windows  Joey Hess  2017-11-14
| Re-applying ac57659e61f9743aebd35258e89752ced0040f9f
* Revert "use unix-compat 0.5 on windows"  Joey Hess  2017-11-09
| This reverts commit ac57659e61f9743aebd35258e89752ced0040f9f.
|
| Too early for this; needs newer Win32 version. Le sigh.
* use unix-compat 0.5 on windows  Joey Hess  2017-11-09
| That version has my patches for the problems that Utility.PosixFiles was working around, so am able to get rid of that module now.
|
| This will later allow bringing back the custom-setup stanza in the cabal file. It will need to depend on unix-compat 0.5 on all OS's, which I'm not ready to do yet.
|
| This commit was sponsored by Nick Daly on Patreon.
* prevent exporttree=yes on remotes that don't support exports  Joey Hess  2017-09-07
| Don't allow "exporttree=yes" to be set when the special remote does not support exports. That would be confusing since the user would set up a special remote for exports, but `git annex export` to it would later fail.
|
| This commit was supported by the NSF-funded DataLad project.
* git annex get from exports  Joey Hess  2017-09-04
| Straightforward enough, except for the needed belt-and-suspenders sanity checks to avoid foot shooting due to exports not being key/value stores.
|
| * Even when annex.verify=false, always verify from exports.
| * Only get files from exports that use a backend that supports checksum verification.
| * Never trust exports, even if the user says to, because then `git annex drop` would drop content if the export seemed to contain a copy.
|
| This commit was supported by the NSF-funded DataLad project.
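The three sanity checks listed above can be written down as small decision functions. This is a sketch under stated assumptions: the `Backend` list, the set of backends treated as checksum-capable, and the choice to model "never trust" as forcing `UnTrusted` are all illustrative, not git-annex's actual types.

```haskell
-- Illustrative types; not git-annex's definitions.
data Backend = SHA256 | SHA1 | MD5 | WORM | URL
  deriving (Eq, Show)

data RemoteKind = KeyValueStore | Export
  deriving (Eq, Show)

data TrustLevel = Trusted | SemiTrusted | UnTrusted
  deriving (Eq, Show)

-- Assumed classification of backends whose keys embed a checksum.
supportsChecksum :: Backend -> Bool
supportsChecksum b = b `elem` [SHA256, SHA1, MD5]

-- Rule 1: annex.verify=false is ignored for exports.
shouldVerify :: RemoteKind -> Bool -> Bool
shouldVerify Export _ = True
shouldVerify KeyValueStore annexVerify = annexVerify

-- Rule 2: only checksum-verifiable keys may be fetched from an export.
canGetFrom :: RemoteKind -> Backend -> Bool
canGetFrom Export b = supportsChecksum b
canGetFrom KeyValueStore _ = True

-- Rule 3: the configured trust level is overridden for exports.
effectiveTrust :: RemoteKind -> TrustLevel -> TrustLevel
effectiveTrust Export _ = UnTrusted
effectiveTrust KeyValueStore t = t

main :: IO ()
main = do
  print (shouldVerify Export False)
  print (canGetFrom Export WORM)
  print (effectiveTrust Export Trusted)
```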
* annex.securehashesonly  Joey Hess  2017-02-27
| Cryptographically secure hashes can be forced to be used in a repository, by setting annex.securehashesonly. This does not prevent the git repository from containing files with insecure hashes, but it does prevent the content of such files from being pulled into .git/annex/objects from another repository.
|
| We want to make sure that at no point does git-annex accept content into .git/annex/objects that is hashed with an insecure key. Here's how it was done:
|
| * .git/annex/objects/xx/yy/KEY/ is kept frozen, so nothing can be written to it normally
| * So every place that writes content must call thawContent or modifyContent. We can audit for these, and be sure we've considered all cases.
| * The main functions are moveAnnex, and linkToAnnex; these were made to check annex.securehashesonly, and are the main security boundary for annex.securehashesonly.
| * Most other calls to modifyContent deal with other files in the KEY directory (inode cache etc). The other ones that mess with the content are:
|   - Annex.Direct.toDirectGen, in which content already in the annex directory is moved to the direct mode file, so not relevant.
|   - fix and lock, which don't add new content
|   - Command.ReKey.linkKey, which manually unlocks it to make a copy.
| * All other calls to thawContent appear safe.
|
| Made moveAnnex return a Bool, so checked all callsites and made them deal with a failure in appropriate ways.
|
| linkToAnnex simply returns LinkAnnexFailed; all callsites already deal with it failing in appropriate ways.
|
| This commit was sponsored by Riku Voipio.
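The idea of a single security boundary that returns a Bool can be sketched as follows. Everything here is a simplified assumption: the `KeyVariety` constructors, the choice of which varieties count as secure, and `moveAnnexModel` itself are illustrative and do not reproduce git-annex's implementation.

```haskell
-- Illustrative key varieties; the real type is richer.
data KeyVariety = SHA2 | SHA3 | SHA1 | MD5 | WORM
  deriving (Eq, Show)

-- Assumed classification for this sketch.
cryptographicallySecure :: KeyVariety -> Bool
cryptographicallySecure v = v `elem` [SHA2, SHA3]

-- Model of the boundary check: the first argument stands for the
-- annex.securehashesonly setting. False means the content was refused
-- and the caller has to handle the failure.
moveAnnexModel :: Bool -> KeyVariety -> Bool
moveAnnexModel secureHashesOnly v =
  not secureHashesOnly || cryptographicallySecure v

main :: IO ()
main = do
  print (moveAnnexModel True SHA2)
  print (moveAnnexModel True MD5)
  print (moveAnnexModel False MD5)
```

Returning a Bool instead of throwing forces each call site to decide what a refusal means there, which matches the audit of call sites described above.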
* add KeyVariety type  Joey Hess  2017-02-24
| Where before the "name" of a key and a backend was a string, this makes it a concrete data type.
|
| This is groundwork for allowing some varieties of keys to be disabled in file2key, so git-annex won't use them at all.
|
| Benchmarks ran in my big repo:
|
| old git-annex info:
|   real 0m3.338s
|   user 0m3.124s
|   sys  0m0.244s
|
| new git-annex info:
|   real 0m3.216s
|   user 0m3.024s
|   sys  0m0.220s
|
| new git-annex find:
|   real 0m7.138s
|   user 0m6.924s
|   sys  0m0.252s
|
| old git-annex find:
|   real 0m7.433s
|   user 0m7.240s
|   sys  0m0.232s
|
| Surprising result; I'd have expected it to be slower since it now parses all the key varieties. But, the parser is very simple and perhaps sharing KeyVarieties uses less memory or something like that.
|
| This commit was supported by the NSF-funded DataLad project.
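Replacing a string name with a concrete data type might look roughly like this. The constructors below are a small hypothetical subset chosen for illustration, and `OtherKey` is an assumed catch-all; the real KeyVariety type has more cases and a different shape.

```haskell
-- Hypothetical, reduced version of a key variety type.
data KeyVariety
  = SHA256Key
  | SHA1Key
  | WORMKey
  | OtherKey String  -- assumed fallback for names the parser does not know
  deriving (Eq, Show)

parseKeyVariety :: String -> KeyVariety
parseKeyVariety "SHA256" = SHA256Key
parseKeyVariety "SHA1" = SHA1Key
parseKeyVariety "WORM" = WORMKey
parseKeyVariety other = OtherKey other

formatKeyVariety :: KeyVariety -> String
formatKeyVariety SHA256Key = "SHA256"
formatKeyVariety SHA1Key = "SHA1"
formatKeyVariety WORMKey = "WORM"
formatKeyVariety (OtherKey s) = s

-- The variety is the part of a serialized key before the first '-'.
keyVarietyOf :: String -> KeyVariety
keyVarietyOf = parseKeyVariety . takeWhile (/= '-')

main :: IO ()
main = do
  print (keyVarietyOf "SHA256-s1048576--deadbeef")
  putStrLn (formatKeyVariety (keyVarietyOf "WORM-s3-m1--foo"))
```

With a closed type, disabling a variety becomes a matter of pattern matching on a constructor rather than comparing strings.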
* Avoid backtraces on expected failures when built with ghc 8; only use backtraces for unexpected errors.  Joey Hess  2016-11-15
| ghc 8 added backtraces on uncaught errors. This is great, but git-annex was using error in many places for an error message targeted at the user, in some known problem case. A backtrace only confuses such a message, so omit it.
|
| Notably, commands like git annex drop that failed due to eg, numcopies, used to use error, so had a backtrace.
|
| This commit was sponsored by Ethan Aubin.
* make --json-progress work for url downloads  Joey Hess  2016-09-09
|
* get, move, copy, mirror: Added --failed switch which retries failed copies/moves  Joey Hess  2016-08-03
| Note that get --from foo --failed will get things that a previous get --from bar tried and failed to get, etc. I considered making --failed only retry transfers from the same remote, but it was easier, and seems more useful, to not have the same remote requirement.
|
| Noisy due to some refactoring into Types/
* reinject: When src file's content cannot be verified, leave it alone, instead of deleting it.  Joey Hess  2016-04-20
|
* Preserve execute bits of unlocked files in v6 mode.  Joey Hess  2016-04-14
| When annex.thin is set, adding an object will add the execute bits to the work tree file, and this does mean that the annex object file ends up executable.
|
| This doesn't add any complexity that wasn't already present, because git annex add of an executable file has always ingested it so that the annex object ends up executable.
|
| But, since an annex object file can be executable or not, when populating an unlocked file from one, the executable bit is always added or removed to match the mode of the pointer file.
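The last rule, adding or removing the executable bit so the populated file matches the pointer file, comes down to a little bit arithmetic on file modes. The sketch below simplifies modes to plain `Int` values and sets or clears all three execute bits at once; both choices are assumptions made for illustration, and `populatedMode` is a hypothetical name.

```haskell
import Data.Bits (complement, (.&.), (.|.))
import Numeric (showOct)

-- Simplified stand-in for a POSIX file mode.
type Mode = Int

executeBits :: Mode
executeBits = 0o111

isExecutable :: Mode -> Bool
isExecutable m = m .&. executeBits /= 0

-- Start from the annex object's mode, then add or remove the execute
-- bits so the result matches the pointer file's executability.
populatedMode :: Mode -> Mode -> Mode
populatedMode pointerMode objectMode
  | isExecutable pointerMode = objectMode .|. executeBits
  | otherwise = objectMode .&. complement executeBits

main :: IO ()
main = do
  -- Executable pointer file, non-executable object.
  putStrLn (showOct (populatedMode 0o755 0o644) "")
  -- Non-executable pointer file, executable object.
  putStrLn (showOct (populatedMode 0o644 0o755) "")
```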
* hard links on windows  Joey Hess  2016-04-08
| * annex.thin and annex.hardlink are now supported on Windows.
| * unannex --fast now makes hard links on Windows.
* refactor  Joey Hess  2016-03-09
|
* Always try to thaw content, even when annex.crippledfilesystem is set.  Joey Hess  2016-03-09
|
* remove 163 lines of code without changing anything except imports  Joey Hess  2016-01-20
|
* migrate and rekey v6 unlocked file support  Joey Hess  2016-01-07
|
* use TopFilePath for associated files  Joey Hess  2016-01-05
| Fixes several bugs with updates of pointer files. When eg, running git annex drop --from localremote it was updating the pointer file in the local repository, not the remote. Also, fixes drop ../foo when run in a subdir, and probably lots of other problems. Test suite drops from ~30 to 11 failures now.
|
| TopFilePath is used to force thinking about what the filepath is relative to.
|
| The data stored in the sqlite db is still just a plain string, and TopFilePath is a newtype, so there's no overhead involved in using it in DataBase.Keys.
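A newtype that records "this path is relative to the top of the repository" can be sketched in a few lines. The constructor function below and its path-collapsing logic are hypothetical; they only illustrate why `drop ../foo` from a subdirectory needs the current directory taken into account, and they assume the resulting path stays inside the repository.

```haskell
import Data.List (intercalate)

-- A path known to be relative to the top of the repository.
-- A newtype has no runtime cost over a plain FilePath.
newtype TopFilePath = TopFilePath FilePath
  deriving (Eq, Show)

-- Split on '/' (simplified; ignores platform-specific separators).
splitOnSlash :: FilePath -> [String]
splitOnSlash p = case break (== '/') p of
  (c, []) -> [c]
  (c, _ : rest) -> c : splitOnSlash rest

-- Hypothetical constructor: combine the current directory (itself
-- relative to the repository top) with a user-supplied relative path,
-- collapsing "." and ".." components.
toTopFilePath :: FilePath -> FilePath -> TopFilePath
toTopFilePath cwdFromTop rel =
  TopFilePath (intercalate "/" (reverse (foldl step [] parts)))
  where
    parts = splitOnSlash cwdFromTop ++ splitOnSlash rel
    step acc "" = acc
    step acc "." = acc
    step (_ : acc) ".." = acc
    step acc c = c : acc

main :: IO ()
main = do
  print (toTopFilePath "sub" "../foo")
  print (toTopFilePath "" "dir/file")
```

Because functions that take a `TopFilePath` cannot be handed a bare `FilePath`, the compiler forces the "relative to what?" question at every call site.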
* convert isPointerFile from Annex to IO  Joey Hess  2016-01-01
|
* fix inode cache consistency bug when a merge unlocks a present file  Joey Hess  2015-12-29
| Since the file was present and locked, its annex object was not in the inode cache. So, despite not needing to update the annex object when the clean filter is run on the content by git merge, it does need to record the inode cache of the annex object.
|
| Otherwise, the annex object will be assumed to be bad, since its inode is not cached.
* fix windows build  Joey Hess  2015-12-28
|
* annex.thin  Joey Hess  2015-12-27
| Decided it's too scary to make v6 unlocked files have 1 copy by default, but that should be available to those who need it. This is consistent with git-annex not dropping unused content without --force, etc.
|
| * Added annex.thin setting, which makes unlocked files in v6 repositories be hard linked to their content, instead of a copy. This saves disk space but means any modification of an unlocked file will lose the local (and possibly only) copy of the old version.
| * Enable annex.thin by default on upgrade from direct mode to v6, since direct mode made the same tradeoff.
| * fix: Adjusts unlocked files as configured by annex.thin.
* populate unlocked files with newly available content when ingesting  Joey Hess  2015-12-22
| This can happen when ingesting a new file in either locked or unlocked mode, when some unlocked files in the repo use the same key, and the content was not locally available before.
* make linkAnnex detect when the file changes as it's being copied/linked in  Joey Hess  2015-12-22
| This fixes a race where the modified file ended up in annex/objects, and the InodeCache stored in the database was for the modified version, so git-annex didn't know it had gotten modified.
|
| The race could occur when the smudge filter was running; now it gets the InodeCache before generating the Key, which avoids the race.
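The ordering described here, take the InodeCache first, generate the Key, then compare, can be modelled with a simple before-and-after check. The record fields and `linkAnnexModel` are assumptions for illustration; the real InodeCache and linkAnnex are more involved.

```haskell
-- Simplified stand-in for an inode cache entry.
data InodeCache = InodeCache
  { inodeNumber :: Integer
  , fileSize :: Integer
  , modTime :: Integer
  }
  deriving (Eq, Show)

data LinkResult = Linked | ChangedDuringLink
  deriving (Eq, Show)

-- The first argument is the snapshot taken before the key was generated,
-- the second is the file's state once it has been copied or linked in.
-- Any difference means the content may not match the key.
linkAnnexModel :: InodeCache -> InodeCache -> LinkResult
linkAnnexModel before after
  | before == after = Linked
  | otherwise = ChangedDuringLink

main :: IO ()
main = do
  let snapshot = InodeCache 42 100 1450000000
  print (linkAnnexModel snapshot snapshot)
  -- The file was rewritten while being ingested.
  print (linkAnnexModel snapshot snapshot { fileSize = 120, modTime = 1450000005 })
```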
* implemented upgrade of direct mode repo to v6  Joey Hess  2015-12-15
|
* update inode cache to cover file even when nothing needs to be done to linkAnnex  Joey Hess  2015-12-15
| This covers the case where multiple files have the same content and are added with git add. Previously only the one that was linked to the annex got its inode cached; now both are.
* checked getKeysPresent; it's ok for v6 unlocked files  Joey Hess  2015-12-11
| When a v6 unlocked file is removed from the work tree, unused doesn't show it. When it gets removed from the index, unused does show it. This is the same as a locked file.
* finish v6 git-annex lock  Joey Hess  2015-12-11
| This was a doozy!
* only make 1 hardlink max between pointer file and annex object  Joey Hess  2015-12-11
| If multiple files point to the same annex object, the user may want to modify them independently, so don't use a hard link.
|
| Also, check diskreserve when copying.
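The two decisions in this commit, whether to hard link or copy and whether a copy fits within the disk reserve, can be sketched as pure functions. The use of the object's link count as the signal, and the names `choosePopulate` and `canCopy`, are assumptions for this sketch and may not match how git-annex decides.

```haskell
data Populate = HardLink | Copy
  deriving (Eq, Show)

-- Assumed signal: a link count above 1 means some pointer file is
-- already hard linked to the annex object, so further files get copies
-- and can be modified independently.
choosePopulate :: Int -> Populate
choosePopulate linkCount
  | linkCount > 1 = Copy
  | otherwise = HardLink

-- A copy is allowed only if the reserve is still free afterwards.
-- Arguments: free bytes, reserved bytes, bytes needed for the copy.
canCopy :: Integer -> Integer -> Integer -> Bool
canCopy free reserve needed = free - needed >= reserve

main :: IO ()
main = do
  print (choosePopulate 1)
  print (choosePopulate 2)
  print (canCopy 1000 100 500)
  print (canCopy 1000 100 950)
```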
* Merge branch 'master' into smudge  Joey Hess  2015-12-11
|\
| * fsck: Failed to honor annex.diskreserve when checking a remote.  Joey Hess  2015-12-11
| |
* | wip  Joey Hess  2015-12-11
| |
* | add generalized linkAnnex'  Joey Hess  2015-12-10
| |
* | check InodeCache in inAnnex et al  Joey Hess  2015-12-10
| | This avoids querying the database when the content file doesn't exist (or otherwise fails the provided check). However, it does add overhead of querying the database, and will certainly impact performance.
* | check inode cache in prepSendAnnex  Joey Hess  2015-12-10
| | This does mean one query of the database every time an object is sent. May impact performance.
* | make clear when code is using deprecated direct mode files  Joey Hess  2015-12-09
| |
* | reorder  Joey Hess  2015-12-09
| |
* | use InodeCache when dropping a key to see if a pointer file can be safely reset  Joey Hess  2015-12-09
| | The Keys database can hold multiple inode caches for a given key. One for the annex object, and one for each pointer file, which may not be hard linked to it.
| |
| | Inode caches for a key are recorded when its content is added to the annex, but only if it has known pointer files. This is to avoid the overhead of maintaining the database when not needed.
| |
| | When the smudge filter outputs a file's content, the inode cache is not updated, because git's smudge interface doesn't let us write the file. So, dropping will fall back to doing an expensive verification then. Ideally, git's interface would be improved, and then the inode cache could be updated then too.
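The drop-time decision described above, trusting a pointer file when its inode cache is known for the key and otherwise falling back to checksum verification, can be modelled like this. The record fields, the list standing in for the database query, and `checkPointerFile` are hypothetical.

```haskell
-- Simplified stand-in for an inode cache entry.
data InodeCache = InodeCache
  { inodeNumber :: Integer
  , fileSize :: Integer
  , modTime :: Integer
  }
  deriving (Eq, Show)

data ResetCheck = SafeByInodeCache | NeedsChecksum
  deriving (Eq, Show)

-- The first argument stands for all inode caches recorded for the key
-- (the annex object plus each known pointer file); the second is the
-- current state of the work tree file.
checkPointerFile :: [InodeCache] -> InodeCache -> ResetCheck
checkPointerFile known current
  | current `elem` known = SafeByInodeCache
  | otherwise = NeedsChecksum

main :: IO ()
main = do
  let object = InodeCache 10 2048 1449600000
      pointer = InodeCache 11 2048 1449600001
  print (checkPointerFile [object, pointer] pointer)
  -- A file written by the smudge filter has no recorded cache, so the
  -- expensive verification is needed.
  print (checkPointerFile [object] (InodeCache 12 2048 1449600002))
```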
* | add inode cache to the db  Joey Hess  2015-12-09
| | Renamed the db to keys, since it is various info about a Keys.
| |
| | Dropping a key will update its pointer files, as long as their content can be verified to be unmodified. This falls back to checksum verification, but I want it to use an InodeCache of the key, for speed. But, I have not made anything populate that cache yet.
* | move InodeSentinal from direct mode code to its own module  Joey Hess  2015-12-09
| | Will be used outside of direct mode for v6 unlocked files, and is already used outside of direct mode when adding files to annex.
* | link/copy pointer files to object content when it's added  Joey Hess  2015-12-09
| |
* | clean filter should update location log when adding new content to annex  Joey Hess  2015-12-04
| |
* | basic clean filter working  Joey Hess  2015-12-04
|/