path: root/Types
* annex.merge-annex-branches (Joey Hess, 2018-02-22)

Added annex.merge-annex-branches config setting which can be used to disable automatic merge of git-annex branches.

I wonder if git-annex merge/sync/assistant should disable this setting? Not sure yet, so have not done so. It may be that users will not set it in git config, but pass it via -c to commands that need it.

Checking the config setting adds a very small overhead, but it's only checked once per command so should be insignificant.

This commit was supported by the NSF-funded DataLad project.
* add --json-error-messages (not yet implemented) (Joey Hess, 2018-02-19)

Added --json-error-messages option, which includes error messages in the json output, rather than outputting them to stderr. The actual redirection of errors is not implemented yet; this is only the docs and option plumbing.

This commit was supported by the NSF-funded DataLad project.
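For illustration, a rough sketch (using aeson) of what a per-file json record carrying error messages could look like; the record and field names here are assumptions, not necessarily git-annex's exact json schema:

    {-# LANGUAGE OverloadedStrings #-}
    import Data.Aeson

    -- Hypothetical per-file result record; the real output has more fields.
    data Result = Result
        { resultFile :: String
        , resultSuccess :: Bool
        , resultErrors :: [String]   -- collected instead of written to stderr
        }

    instance ToJSON Result where
        toJSON r = object
            [ "file" .= resultFile r
            , "success" .= resultSuccess r
            , "error-messages" .= resultErrors r
            ]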
* fix --json-progress --json to be same as --json --json-progress (Joey Hess, 2018-02-19)

Fix behavior of --json-progress followed by --json, in which the latter option disabled the former.

This commit was supported by the NSF-funded DataLad project.
* add remote.<name>.annex-checkuuid (Joey Hess, 2018-01-10)

Added remote.<name>.annex-checkuuid config, which can be set to false to disable the default checking of the uuid of remotes that point to directories. This can be useful to avoid unnecessary drive spin-ups and automounting.

Note that the UUID check is still done before writing to the repository, to avoid writing to the wrong repository if it got relocated. The check is also done before checkPresent to avoid getting confused about what is in which repo. This is effectively the same as the use of git-annex-shell with a uuid to check that the remote repository is the expected one.

Did not bother with the check for retrieveKeyFile because it doesn't matter if the wrong repo is used then.

This commit was sponsored by Trenton Cronholm on Patreon.
* Removed the testsuite build flag (Joey Hess, 2017-12-20)

Test suite is always included. Building with this flag disabled has actually been broken for some time, since Command.TestRemote uses tasty. Fewer build flags are better, so good time to drop it.

This commit was sponsored by Thomas Hochstein on Patreon.
* reorg (Joey Hess, 2017-12-14)
* rethought --relaxed change (Joey Hess, 2017-11-30)

Better to make it not be surprising and slow, than surprising and fast. --raw can be used when it needs to be really fast.

Implemented adding a youtube-dl supported url to an existing file.

This commit was sponsored by andrea rota.
* youtube-dl working (Joey Hess, 2017-11-29)

Including resuming and cleanup of incomplete downloads.

Still todo: --fast, --relaxed, importfeed, disk reserve checking, quvi code cleanup.

This commit was sponsored by Anthony DeRobertis on Patreon.
* better dup key with -J fix (Joey Hess, 2017-10-17)

This avoids all the complication about redundant work discussed in the previous try at fixing this. At the expense of needing each command that could have the problem to be patched to simply wrap the action in onlyActionOn once the key is known. But there do not seem to be many such commands.

onlyActionOn' should not be used with a CommandStart (or CommandPerform), although the types do allow it. onlyActionOn handles running the whole CommandStart chain. I couldn't immediately see a way to avoid mistaken use of onlyActionOn'.

This commit was supported by the NSF-funded DataLad project.
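For a rough feel of what a per-key guard like this involves, a minimal sketch using STM with a placeholder Key type; this is illustrative only, not the actual onlyActionOn implementation:

    import Control.Concurrent.STM
    import Control.Exception (finally)
    import qualified Data.Set as S

    type Key = String  -- placeholder for git-annex's real Key type

    -- Wait while another thread is already working on the same key, then
    -- claim it, run the action, and release the claim. The real code also
    -- re-checks (e.g. inAnnex) whether the work is still needed afterwards.
    onlyKeyAction :: TVar (S.Set Key) -> Key -> IO a -> IO a
    onlyKeyAction active k a = do
        atomically $ do
            s <- readTVar active
            if k `S.member` s
                then retry
                else writeTVar active (S.insert k s)
        a `finally` atomically (modifyTVar' active (S.delete k))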
* Improve behavior when -J transfers multiple files that point to the same key (Joey Hess, 2017-10-17)

After a false start, I found a fairly non-intrusive way to deal with it. Although it only handles transfers -- there may be issues with eg concurrent dropping of the same key, or other operations.

There is no added overhead when -J is not used, other than an added inAnnex check. When -J is used, it has to maintain and check a small Set, which should be negligible overhead.

It could output some message saying that the transfer is being done by another thread. Or it could even display the same progress info for both files that are being downloaded since they have the same content. But I opted to keep it simple, since this is rather an edge case, so it just doesn't say anything about the transfer of the file until the other thread finishes.

Since the deferred transfer action still runs, actions that do more than transfer content will still get a chance to do their other work. (An example of something that needs to do such other work is P2P.Annex, where the download always needs to receive the content from the peer.) And, if the first thread fails to complete a transfer, the second thread can resume it.

But, this unfortunately means that there's a risk of redundant work being done to transfer a key that just got transferred. That's not ideal, but should never cause breakage; the same thing can occur when running two separate git-annex processes.

The get/move/copy/mirror --from commands had extra inAnnex checks added, inside the download actions. Without those checks, the first thread downloaded the content, and then the second thread woke up and downloaded the same content redundantly.

move/copy/mirror --to is left doing redundant uploads for now. It would need a second checkPresent of the remote inside the upload to avoid them, which would be expensive. A better way to avoid redundant work needs to be found.

This commit was supported by the NSF-funded DataLad project.
* metadata: Added --remove-all. (Joey Hess, 2017-09-28)

Motivation is to remove all metadata when it gets copied from a previous version of the file and that is not desirable.

This commit was supported by the NSF-funded DataLad project.
* configuration and docs for tracking exports (Joey Hess, 2017-09-19)

Not yet handled by sync or assistant.

This commit was sponsored by Nick Daly on Patreon.
* add ExportTree table to export db (Joey Hess, 2017-09-18)

New table needed to look up what filenames are used in the currently exported tree, for reasons explained in export.mdwn.

Also, added smart constructors for ExportLocation and ExportDirectory to make sure they contain filepaths with the right direction slashes.

And some code refactoring.

This commit was sponsored by Francois Marier on Patreon.
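A minimal sketch of what such a smart constructor might look like (illustrative only, not the actual Types.Export code), normalizing to unix-style separators so the same locations are used whether the export was made from windows or unix:

    newtype ExportLocation = ExportLocation FilePath
        deriving (Show, Eq)

    -- Normalize separators so an ExportLocation always stores unix-style paths.
    mkExportLocation :: FilePath -> ExportLocation
    mkExportLocation = ExportLocation . map fixslash
      where
        fixslash '\\' = '/'
        fixslash c = c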
* split out Types.Export (Joey Hess, 2017-09-15)
* avoid unnecessary db queries when exported directory can't be empty (Joey Hess, 2017-09-15)

In rename foo/bar to foo/baz, foo can't be empty. In delete zxyyz, there's no exported directory (top doesn't count).
* remove empty directories when removing from export (Joey Hess, 2017-09-15)

The subtle part of this is what happens when the remote fails to remove an empty directory. The removal from the export needs to fail in that case, so the removal will be tried again later. However, removeExportLocation has already been run and changed the export db, so if the next run checks getExportLocation, it might decide nothing remains to be done, leaving the empty directory.

Dealt with that by making removeEmptyDirectories handle a failure by calling addExportLocation, reverting the database changes so the next run will be guaranteed to try deleting the empty directory again.

This commit was sponsored by Thomas Hochstein on Patreon.
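The compensating-update pattern described above, sketched against a toy export database (a set of paths in an IORef); names and types here are placeholders, not the real database code:

    import Control.Monad (unless)
    import Data.IORef
    import qualified Data.Set as S

    type ExportLocation = FilePath
    type ExportDb = IORef (S.Set ExportLocation)

    -- Remove a location from the export; if cleaning up the now-empty
    -- directory fails, put the location back so a later run retries.
    removeFromExport :: ExportDb -> ExportLocation -> (ExportLocation -> IO Bool) -> IO Bool
    removeFromExport db loc removeEmptyDirs = do
        modifyIORef' db (S.delete loc)
        ok <- removeEmptyDirs loc
        unless ok $
            modifyIORef' db (S.insert loc)  -- revert the db change
        return ok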
* implement removeExportDirectory (Joey Hess, 2017-09-15)

Not yet called by Command.Export.

WebDAV needs this to clean up empty collections. Also, example.sh turned out to not be cleaning up directories when removing content from them, so it made sense for it to use this.

Remote.Directory did not need it, and since its cleanup method for empty directories is more efficient than what Command.Export will need to do to find empty directories, it uses Nothing so that extra work can be avoided.

This commit was sponsored by Thom May on Patreon.
* export: cache connections for S3 and webdav (Joey Hess, 2017-09-12)
* External special remote protocol extended to support export. (Joey Hess, 2017-09-08)

Also updated example.sh to support export.

This commit was supported by the NSF-funded DataLad project.
* prevent exporttree=yes on remotes that don't support exports (Joey Hess, 2017-09-07)

Don't allow "exporttree=yes" to be set when the special remote does not support exports. That would be confusing since the user would set up a special remote for exports, but `git annex export` to it would later fail.

This commit was supported by the NSF-funded DataLad project.
* export file renaming (Joey Hess, 2017-09-06)

This is seriously super hairy. It has to handle interrupted exports, which may be resumed with the same or a different tree. It also has to recover from export conflicts, which could cause the wrong content to be renamed to a file.

I think this works, or is close to working. See the update to the design for how it works.

This is definitely not optimal, in that it does more renames than are necessary. It would probably be worth finding the keys that are really renamed and only renaming those. But let's get the "simple" approach to work first.

This commit was supported by the NSF-funded DataLad project.
* git annex get from exports (Joey Hess, 2017-09-04)

Straightforward enough, except for the needed belt-and-suspenders sanity checks to avoid foot shooting due to exports not being key/value stores.

  * Even when annex.verify=false, always verify from exports.
  * Only get files from exports that use a backend that supports checksum verification.
  * Never trust exports, even if the user says to, because then `git annex drop` would drop content if the export seemed to contain a copy.

This commit was supported by the NSF-funded DataLad project.
* use export db to correctly handle duplicate files (Joey Hess, 2017-09-04)

Removed incorrect UniqueKey key in db schema; a key can appear multiple times with different files.

The database has to be flushed after each removal. But when adding files to the export, lots of changes are able to be queued up w/o flushing. So it's still fairly efficient.

If large removals of files from exports are too slow, an alternative would be to make two passes over the diff, one pass queueing deletions from the database, then a flush and a second pass updating the location log. But that would use more memory, and need to look up exportKey twice per removed file, so I've avoided that optimisation for now.

This commit was supported by the NSF-funded DataLad project.
* implement exporttree=yes configuration (Joey Hess, 2017-09-04)

  * Only export to remotes that were initialized to support it.
  * Prevent storing key/value on export remotes.
  * Prevent enabling exporttree=yes and encryption in the same remote.

SetupStage Enable was changed to take the old RemoteConfig. This allowed only setting exporttree when initially setting up a remote, and not configuring it later after stuff might already be stored in the remote.

Went with =yes rather than =true for consistency with other parts of git-annex. Changed docs accordingly.

This commit was supported by the NSF-funded DataLad project.
* refactor ExportActions (Joey Hess, 2017-09-01)

This will allow disabling exports for remotes that are not configured to allow them. Also, exportSupported will be useful for the external special remote to probe.

This commit was supported by the NSF-funded DataLad project.
* make storeExport atomic (Joey Hess, 2017-08-31)

This avoids needing to deal with the complexity of partially transferred files in the export. We'd not be able to resume uploading to such a file anyway, so just avoid them.

The implementation in Remote.Directory is not completely ideal, because it could leave the temp file hanging around in the export directory. This only happens if it's killed with -9, or there's a power failure; normally viaTmp cleans up after itself, even when interrupted. I could not see a better way to do it though, since the export directory might be the root of a filesystem.

Also some design thoughts on resuming, which depend on storeExport being atomic.

This commit was sponsored by Fernando Jimenez on Patreon.
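The write-to-a-temp-file-then-rename approach behind an atomic storeExport, as a minimal sketch; viaTmp in git-annex is more careful (cleanup on exceptions, temp file naming), so this is illustrative only:

    import System.Directory (renameFile)

    -- Write the full content to a temp file, then rename it into place.
    -- rename is atomic on POSIX filesystems, so a partially written file
    -- is never visible at the destination path.
    atomicWrite :: FilePath -> (FilePath -> IO ()) -> IO ()
    atomicWrite dest writer = do
        let tmp = dest ++ ".tmp"
        writer tmp
        renameFile tmp dest

A leftover .tmp file after a kill -9 or power failure corresponds to the temp-file caveat mentioned above.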
* provide file with content to export (Joey Hess, 2017-08-29)

Rather than providing the key to export, provide the file.

When exporting a treeish that contains files that are not annexed, this will let the content of those files also be exported. There's still a Key in the interface; it will be used by the external special remote protocol. A SHA1 key can be used when exporting non-annexed files.

This commit was sponsored by Brock Spratlen on Patreon.
* add API for exporting (Joey Hess, 2017-08-29)

Implemented so far for the directory special remote.

Several remotes don't make sense to export to. Regular Git remotes, obviously, do not. Bup remotes almost certainly do not, since bup would need to be used to extract the export; same story for Ddar. Web and Bittorrent are download-only. GCrypt is always encrypted so exporting to it would be pointless. There's probably no point complicating the Hook remotes with exporting at this point.

External, S3, Glacier, WebDAV, Rsync, and possibly Tahoe should be modified to support export.

Thought about trying to reuse the storeKey/retrieveKeyFile/removeKey interface, rather than adding a new interface. But, it seemed better to keep it separate, to avoid a complicated interface that sometimes encrypts/chunks key/value storage and sometimes uses non-key/value storage. Any common parts can be factored out.

Note that storeExport is not atomic. doc/design/exporting_trees_to_special_remotes.mdwn has some things in the "resuming exports" section that bear on this decision. Basically, I don't think, at this time, that an atomic storeExport would help with resuming, because exports are not key/value storage, and we can't be sure that a partially uploaded file is the same content we're currently trying to export.

Also, note that ExportLocation will always use unix path separators. This is important, because users may export from a mix of windows and unix, and it avoids complicating the API with path conversions, and ensures that in such a mix, they always use the same locations for exports.

This commit was sponsored by Bruno BEAUFILS on Patreon.
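A rough sketch of what a separate export interface can look like as a record of actions, kept apart from the key/value storeKey interface; the field names echo the text above, but the signatures here are simplified assumptions rather than the actual definitions:

    -- Placeholder types; the real Key and ExportLocation live elsewhere in Types/.
    type Key = String
    type ExportLocation = FilePath

    data ExportActions m = ExportActions
        { storeExport :: FilePath -> Key -> ExportLocation -> m Bool
        , retrieveExport :: Key -> ExportLocation -> FilePath -> m Bool
        , removeExport :: Key -> ExportLocation -> m Bool
        , checkPresentExport :: Key -> ExportLocation -> m Bool
        }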
* use DynamicConfig to handle cost-command (Joey Hess, 2017-08-17)

This commit was sponsored by Jake Vosloo on Patreon.
* add annex-ignore-command and annex-sync-command configs (Joey Hess, 2017-08-17)

Added remote configuration settings annex-ignore-command and annex-sync-command, which are dynamic equivalents of the annex-ignore and annex-sync configurations.

For this I needed a new DynamicConfig infrastructure. Its implementation should be as fast as before when there is no dynamic config, and it caches so shell commands are only run once.

Note that annex-ignore-command exits nonzero when the remote should be ignored. While that may seem backwards, it allows using the same command for it as for annex-sync-command when you want to disable both.

This commit was sponsored by Trenton Cronholm on Patreon.
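A minimal sketch of the DynamicConfig idea (illustrative; the real implementation also caches, so the shell command runs at most once):

    import System.Exit (ExitCode(..))
    import System.Process (readProcessWithExitCode)

    -- A config value is either a static setting or a shell command whose
    -- result stands in for it.
    data DynamicConfig a
        = StaticConfig a
        | DynamicConfig String  -- shell command to run

    -- For annex-ignore-command, a nonzero exit status means "ignore the remote".
    checkIgnoreCommand :: DynamicConfig Bool -> IO Bool
    checkIgnoreCommand (StaticConfig b) = return b
    checkIgnoreCommand (DynamicConfig cmd) = do
        (code, _, _) <- readProcessWithExitCode "sh" ["-c", cmd] ""
        return (code /= ExitSuccess)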
* configuration to disable automatic merge conflict resolution (Joey Hess, 2017-06-01)

  * Added annex.resolvemerge configuration, which can be set to false to disable the usual automatic merge conflict resolution done by git-annex sync and the assistant.
  * sync: Added --no-resolvemerge option.

Note that disabling merge conflict resolution is probably not a good idea in a direct mode repo or adjusted branch. Since updates to both are done outside the usual work tree, if it fails the tree is not left in a conflicted state, and it would be hard to manually resolve the conflict. Still, made annex.resolvemerge be supported in those cases for consistency.

This commit was sponsored by Riku Voipio.
* adieu, MissingH (Joey Hess, 2017-05-16)

Removed dependency on MissingH, instead depending on the split library.

After laying groundwork for this since 2015, it was mostly straightforward. Added Utility.Tuple and Utility.Split. Eyeballed System.Path.WildMatch while implementing the same thing.

Since MissingH's progress meter display was being used, I re-implemented my own. Bonus: Now progress is displayed for transfers of files of unknown size.

This commit was sponsored by Shane-o on Patreon.
* Ssh password prompting improved when using -J (Joey Hess, 2017-05-11)

When ssh connection caching is enabled (and when GIT_ANNEX_USE_GIT_SSH is not set), only one ssh password prompt will be made per host, and only one ssh password prompt will be made at a time.

This also fixes a race in prepSocket's stale ssh connection stopping when run with -J. It was possible for one thread to start a cached ssh connection, and another thread to immediately stop it, resulting in excess connections being made.

This commit was supported by the NSF-funded DataLad project.
* de-Maybe remoteGitConfig (Joey Hess, 2017-05-11)

It's always set, so does not need to be a Maybe.
* annex.backend is the new name for what was annex.backends (Joey Hess, 2017-05-09)

It takes a single key-value backend, rather than the unnecessary and confusing list. The old option still works if set.

Simplified some old old code too.

This commit was sponsored by Thomas Hochstein on Patreon.
* Added remote.<name>.annex-push and remote.<name>.annex-pull (Joey Hess, 2017-04-05)

The former can be useful to make remotes that don't get fully synced with local changes, which comes up in a lot of situations. The latter was mostly added for symmetry, but could be useful (though less likely to be).

Implementing `remote.<name>.annex-pull` was a bit tricky, as there's no one place where git-annex pulls/fetches from remotes. I audited all instances of "fetch" and "pull". A few cases were left not checking this config:

  * Git.Repair can try to pull missing refs from a remote, and if the local repo is corrupted, that seems a reasonable thing to do even though the config would normally prevent it.
  * Assistant.WebApp.Gpg and Remote.Gcrypt and Remote.Git do fetches as part of the setup process of a remote. The config would probably not be set then, and having the setup fail seems worse than honoring it if it is already set.

I have not prevented all the code that does a "merge" from merging branches from remotes with remote.<name>.annex-pull=false. That could perhaps be done, but it would need a way to map from branch name to remote name, and the way refspecs work makes that hard to get really correct. So if the user fetches manually, the git-annex branch will get merged, for example. Another way of looking at/justifying this is that the setting is called "annex-pull", not "annex-merge".

This commit was supported by the NSF-funded DataLad project.
* test suite infra for testing mocked ssh remotes (Joey Hess, 2017-03-17)

This commit was supported by the NSF-funded DataLad project.
* AssociatedFile newtype (Joey Hess, 2017-03-10)

To prevent any further mistakes like 1a497cefb47557f0b4788c606f9071be422b2511

This commit was sponsored by Francois Marier on Patreon.
* improve layout (Joey Hess, 2017-03-01)
* Fix reversion in yesterday's release that made SHA1E and MD5E backends not work. (Joey Hess, 2017-03-01)
* fix build with old ghc (Joey Hess, 2017-02-28)
* annex.securehashesonly (Joey Hess, 2017-02-27)

Cryptographically secure hashes can be forced to be used in a repository, by setting annex.securehashesonly. This does not prevent the git repository from containing files with insecure hashes, but it does prevent the content of such files from being pulled into .git/annex/objects from another repository.

We want to make sure that at no point does git-annex accept content into .git/annex/objects that is hashed with an insecure key. Here's how it was done:

  * .git/annex/objects/xx/yy/KEY/ is kept frozen, so nothing can be written to it normally.
  * So every place that writes content must call thawContent or modifyContent. We can audit for these, and be sure we've considered all cases.
  * The main functions are moveAnnex, and linkToAnnex; these were made to check annex.securehashesonly, and are the main security boundary for annex.securehashesonly.
  * Most other calls to modifyContent deal with other files in the KEY directory (inode cache etc). The other ones that mess with the content are:
    - Annex.Direct.toDirectGen, in which content already in the annex directory is moved to the direct mode file, so not relevant.
    - fix and lock, which don't add new content.
    - Command.ReKey.linkKey, which manually unlocks it to make a copy.
  * All other calls to thawContent appear safe.

Made moveAnnex return a Bool, so checked all callsites and made them deal with a failure in appropriate ways.

linkToAnnex simply returns LinkAnnexFailed; all callsites already deal with it failing in appropriate ways.

This commit was sponsored by Riku Voipio.
* add cryptographicallySecure (Joey Hess, 2017-02-27)

Note that GPGHMAC keys are not cryptographically secure, because their content has no relation to the name of the key. So, things that use this function to avoid sending keys to a remote will need to special case in support for those keys.

If GPGHMAC keys were accepted as cryptographically secure, symlinks using them could be committed to a git repo, and their content would be accepted into the repo, with no guarantee that two repos got the same content, which is what we're aiming to prevent.
* fix up Read instance incompatibility caused by recent commit (Joey Hess, 2017-02-24)

2f868db90c7ba16eee901b9b1472b1e1a889dd93 changed the Read instance for Key. I've checked all uses of that instance (by removing it and seeing what breaks), and they're all limited to the webapp, except one.

That is GitAnnexDistribution's Read instance. So, 2f868db90c7ba16eee901b9b1472b1e1a889dd93 would have broken upgrades of git-annex from downloads.kitenet.net. Once the .info files there got updated for a new release, old releases would have failed to parse them and never upgraded.

To fix this, I found a way to make the .info files that contain GitAnnexDistribution values be readable by the old version of git-annex.

This commit was sponsored by Ewen McNeill.
* add KeyVariety type (Joey Hess, 2017-02-24)

Where before the "name" of a key and a backend was a string, this makes it a concrete data type. This is groundwork for allowing some varieties of keys to be disabled in file2key, so git-annex won't use them at all.

Benchmarks ran in my big repo:

old git-annex info:
real    0m3.338s
user    0m3.124s
sys     0m0.244s

new git-annex info:
real    0m3.216s
user    0m3.024s
sys     0m0.220s

new git-annex find:
real    0m7.138s
user    0m6.924s
sys     0m0.252s

old git-annex find:
real    0m7.433s
user    0m7.240s
sys     0m0.232s

Surprising result; I'd have expected it to be slower since it now parses all the key varieties. But, the parser is very simple and perhaps sharing KeyVarieties uses less memory or something like that.

This commit was supported by the NSF-funded DataLad project.
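For illustration, a cut-down sketch of what a concrete key-variety type with a security predicate might look like; the real KeyVariety covers more backends and also tracks details such as whether the extension-preserving E variant is in use:

    data KeyVariety
        = SHA2Key Int      -- e.g. SHA2Key 256 for SHA256
        | SHA1Key
        | MD5Key
        | WORMKey
        | URLKey
        | OtherKey String  -- varieties this sketch does not model
        deriving (Show, Eq)

    -- Only collision-resistant hash backends count as cryptographically
    -- secure; WORM and URL keys carry no content hash at all.
    cryptographicallySecure :: KeyVariety -> Bool
    cryptographicallySecure (SHA2Key _) = True
    cryptographicallySecure _ = False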
* factor non-type stuff out of Key (Joey Hess, 2017-02-24)
* make file2key reject E* backend keys with a long extension (Joey Hess, 2017-02-24)

I am not happy that I had to put backend-specific code in file2key. But it would be very difficult to avoid this layering violation.

Most of the time, when parsing a Key from a symlink target, git-annex never looks up its Backend at all, so adding this check to a method of the Backend object would not work.

The Key could be made to contain the appropriate Backend, but since Backend is parameterized on an "a" that is fixed to the Annex monad later, that would need Key to change to "Key a".

The only way to clean this up that I can see would be to have the Key contain a LowlevelBackend, and put the validation in LowlevelBackend. Perhaps later, but that would be an extensive change, so let's not do it in this commit which may want to cherry-pick to backports.

This commit was sponsored by Ethan Aubin.
* Tighten key parser to not accept keys containing non-numeric fields, which could be used to embed data useful for a SHA1 attack against git. (Joey Hess, 2017-02-24)

Also added a todo about why this is important, and with some further hardening to add.

This commit was sponsored by Ignacio on Patreon.
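The kind of strict field parsing this refers to, as a small sketch with a simplified field layout: a key's size field must be purely numeric, so arbitrary attacker-chosen bytes cannot be smuggled into a key name.

    import Data.Char (isDigit)

    -- Parse a size field of the form "s<bytes>", rejecting anything that is
    -- not purely numeric.
    parseSizeField :: String -> Maybe Integer
    parseSizeField ('s':n) | not (null n) && all isDigit n = Just (read n)
    parseSizeField _ = Nothing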
* post-receive hook to make updateInstead work in direct mode and adjusted branches (Joey Hess, 2017-02-17)

  * Added post-receive hook, which makes updateInstead work with direct mode and adjusted branches.
  * init: Set up the post-receive hook.

This commit was sponsored by Fernando Jimenez on Patreon.
* add SetupStage parameter to RemoteType.setup (Joey Hess, 2017-02-07)

Most remotes have an idempotent setup that can be reused for enableremote, but in a few cases, it needs to tell which stage it is in, and whether a UUID that was provided to setup was used. This is groundwork for making initremote be able to provide a UUID. It should not change any behavior.

Note that it would be nice to make the UUID always be provided to setup, and make setup not need to generate and return a UUID. What prevented this simplification is Remote.Git.gitSetup, which needs to reuse the UUID of the git remote when setting it up, and so has to return that UUID.

This commit was sponsored by Thom May on Patreon.
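A simplified sketch of the SetupStage idea (the RemoteConfig stand-in here is an assumption, not the real type): setup learns whether it is an initial initremote or an enableremote of an existing remote, in the latter case receiving the old config.

    import qualified Data.Map as M

    -- Simplified stand-in for the remote configuration key/value map.
    type RemoteConfig = M.Map String String

    data SetupStage
        = Init                  -- initremote: setting up a new special remote
        | Enable RemoteConfig   -- enableremote: the old config is available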