author Joey Hess <joeyh@joeyh.name> 2017-10-02 11:58:31 -0400
committer Joey Hess <joeyh@joeyh.name> 2017-10-02 11:58:31 -0400
commit 6dc6b4b7d9236a2ecbc3260945e0cd532a157c26
tree f579ab43822f9cd7c7a07d7c0f433eca63777e11 /doc/todo
parent 156adb9af66a17c69e6d402d1e3d3bd477af72be
you requested his old closed bugs not be deleted yet
Diffstat (limited to 'doc/todo')
-rw-r--r-- doc/todo/--batch_for_add.mdwn | 7
-rw-r--r-- doc/todo/--batch_for_find.mdwn | 5
-rw-r--r-- doc/todo/--batch_for_info.mdwn | 5
-rw-r--r-- doc/todo/__39__info_filename__39___to_provide_information_either_content_is_locally_present.mdwn | 6
-rw-r--r-- doc/todo/add_option_to_whereis_to_avoid_network_interactions.mdwn | 21
-rw-r--r-- doc/todo/checkpresentkey_without_explicit_remote.mdwn | 5
-rw-r--r-- doc/todo/could_standalone___39__fixed__39___git-annex_binaries_be_prelinked__63__.mdwn | 7
-rw-r--r-- doc/todo/drop_--batch.mdwn | 6
-rw-r--r-- doc/todo/get_--batch.mdwn | 7
-rw-r--r-- doc/todo/interface_to_the___34__progress__34___of_annex_operations.mdwn | 19
-rw-r--r-- doc/todo/make_addurl_respect_annex.largefiles_option.mdwn | 6
-rw-r--r-- doc/todo/metadata_--batch.mdwn | 3
-rw-r--r-- doc/todo/parallel_get.mdwn | 85
-rw-r--r-- doc/todo/return___34__key__34___entry_in_--json_output_for_addurl___40__and_future_add__41___--batch.mdwn | 5
14 files changed, 187 insertions, 0 deletions
diff --git a/doc/todo/--batch_for_add.mdwn b/doc/todo/--batch_for_add.mdwn
new file mode 100644
index 000000000..c0450c11f
--- /dev/null
+++ b/doc/todo/--batch_for_add.mdwn
@@ -0,0 +1,7 @@
+should be extremely helpful when adding many files one at a time ;)
+
+[[!meta author=yoh]]
+
+> Implemented; made it not recurse into directories and output a blank line
+> if it doesn't add the file, so there's always 1 line of output for each
+> input. [[done]] --[[Joey]]
diff --git a/doc/todo/--batch_for_find.mdwn b/doc/todo/--batch_for_find.mdwn
new file mode 100644
index 000000000..825ca560f
--- /dev/null
+++ b/doc/todo/--batch_for_find.mdwn
@@ -0,0 +1,5 @@
+I am using `annex find filename` after running `annex add` to figure out if file was added to annex or to git.
+
+[[!meta author=yoh]]
+
+> [[done]] --[[Joey]]
diff --git a/doc/todo/--batch_for_info.mdwn b/doc/todo/--batch_for_info.mdwn
new file mode 100644
index 000000000..8f7fba456
--- /dev/null
+++ b/doc/todo/--batch_for_info.mdwn
@@ -0,0 +1,5 @@
+I guess as other commands which take separate files/keys as its argument(s), having --batch for info command would be of benefit
+
+[[!meta author=yoh]]
+
+> [[done]] --[[Joey]]
diff --git a/doc/todo/__39__info_filename__39___to_provide_information_either_content_is_locally_present.mdwn b/doc/todo/__39__info_filename__39___to_provide_information_either_content_is_locally_present.mdwn
new file mode 100644
index 000000000..5e0598816
--- /dev/null
+++ b/doc/todo/__39__info_filename__39___to_provide_information_either_content_is_locally_present.mdwn
@@ -0,0 +1,6 @@
+ATM in DataLad we rely on 'git annex find' to determine either files have content locally. Even though it could be used in a batch mode, I wondered if we could may be just use 'annex info' to obtain information either a file (or a key) has content locally? Another benefit would be is that within single command output we could determine also if a file under annex or not (instead of first doing e.g. 'info' to figure out if under annex and then 'find' again to figure out if content is present locally)
+
+
+[[!meta author=yoh]]
+
+> sure, [[done]] --[[Joey]]
diff --git a/doc/todo/add_option_to_whereis_to_avoid_network_interactions.mdwn b/doc/todo/add_option_to_whereis_to_avoid_network_interactions.mdwn
new file mode 100644
index 000000000..c322660df
--- /dev/null
+++ b/doc/todo/add_option_to_whereis_to_avoid_network_interactions.mdwn
@@ -0,0 +1,21 @@
+I thought that whereis command would report only based on the knowledge annex has locally in git-annex branch, but apparently it is trying to query for information even in --fast mode:
+
+[[!format sh """
+$> git annex whereis --fast bold.nii.gz
+yoh@dat....
+Permission denied (publickey,password).
+fatal: Could not read from remote repository.
+
+Please make sure you have the correct access rights
+and the repository exists.
+whereis bold.nii.gz
+(2 copies)
+ 899f0347-0888-48ef-91b6-bac213ca8cef -- [datalad-archives]
+ c8bd3d05-33d4-4b59-9d53-ca7efbdcdd13 -- yoh@smaug:/mnt/btrfs/datasets/datalad/crawl/openfmri/ds000001 [here]
+
+ datalad-archives: dl+archive:MD5E-s2527262329--bd3ea399057c529b37b09dcecec1ca60.0raw.tgz/ds001_R1.1.0/sub001/BOLD/task001_run001/bold.nii.gz#size=47241449
+
+"""]]
+
+[[!meta author=yoh]]
+Was [[done]] by [[Joey]] as of 6.20160524+gitg2b7b2c4-1~ndall+1
diff --git a/doc/todo/checkpresentkey_without_explicit_remote.mdwn b/doc/todo/checkpresentkey_without_explicit_remote.mdwn
new file mode 100644
index 000000000..7ad667f28
--- /dev/null
+++ b/doc/todo/checkpresentkey_without_explicit_remote.mdwn
@@ -0,0 +1,5 @@
+While being asked to check if file is available from "[datalad-archives]" remote I need to check if the archive's key available. Ideally I wish I could ask through the ongoing interaction protocol, but if not, I could use smth like 'git annex checkpresentkey' but that one demands specification also of a remote which to check. In my case I just want to know if that key is available from any remote, so I could confirm that the file is still present in our archives remote, i.e. that it could be retrieved later on
+
+[[!meta author=yoh]]
+
+> [[done]] --[[Joey]]
diff --git a/doc/todo/could_standalone___39__fixed__39___git-annex_binaries_be_prelinked__63__.mdwn b/doc/todo/could_standalone___39__fixed__39___git-annex_binaries_be_prelinked__63__.mdwn
new file mode 100644
index 000000000..3b1b929ea
--- /dev/null
+++ b/doc/todo/could_standalone___39__fixed__39___git-annex_binaries_be_prelinked__63__.mdwn
@@ -0,0 +1,7 @@
+Since in datalad we are invoking git and git-annex quite frequently, and on debian systems atm relying on git-annex-standalone pkg, I wondered, if there is a possibility to get all 'shimmed' binaries prelinked against shipped core libs to avoid a current bunch of unsuccessful searches for libraries.... I thought it might provide a notable benefit.
+
+just an idea
+
+[[!meta author=yoh]]
+
+> [[fixed|done]], but without prelinking. --[[Joey]]
diff --git a/doc/todo/drop_--batch.mdwn b/doc/todo/drop_--batch.mdwn
new file mode 100644
index 000000000..775798752
--- /dev/null
+++ b/doc/todo/drop_--batch.mdwn
@@ -0,0 +1,6 @@
+There is a dropkey --batch, so I guess I could workaround but probably would be nice for consistency to have --batch mode for drop itself as well
+
+[[!meta author=yoh]]
+
+> [[done]]; went ahead and added drop --batch to be symmetric with get
+> --batch. --[[Joey]]
diff --git a/doc/todo/get_--batch.mdwn b/doc/todo/get_--batch.mdwn
new file mode 100644
index 000000000..a23b36de0
--- /dev/null
+++ b/doc/todo/get_--batch.mdwn
@@ -0,0 +1,7 @@
+It seems that it would be tremendously useful, see e.g. our [datalad install](https://github.com/datalad/datalad/issues/553)
+
+[[!meta author=yoh]]
+
+> [[done]] although the output while getting a file is not
+> machine-parseable. So, I made --json also work for get, but enabling
+> json output disables any progress display. --[[Joey]]
diff --git a/doc/todo/interface_to_the___34__progress__34___of_annex_operations.mdwn b/doc/todo/interface_to_the___34__progress__34___of_annex_operations.mdwn
new file mode 100644
index 000000000..68125c53b
--- /dev/null
+++ b/doc/todo/interface_to_the___34__progress__34___of_annex_operations.mdwn
@@ -0,0 +1,19 @@
+It would be really nice if external tools working with annex could obtain updates on the progress of operations so they could report using their own UI back to the user. Especially relevant for --batch --json modes of operations (e.g. for get or addurl) whenever no progress is reported by annex itself at all.
+
+Related:
+[datalad #478](https://github.com/datalad/datalad/issues/478)
+
+
+[[!meta author=yoh]]
+
+> --status-fd is one way, or the progress could be included as
+> part of the --batch protocol. In either case, it might make sense to
+> reuse part of the external special remote protocol. (Which would let you
+> relay the progress messages when datalad is doing a nested retrieval, I
+> suppose.) --[[Joey]]
+
+>> [[done]]; --json-progress implemented. I limited the frequency of json
+>> progress items to 10 per second max, and it's typically only 1 per
+>> second or less, so didn't implement
+>> --json-progress=N to tune it. Also added --json and --json-progress to
+>> copy, move, mirror commands. --[[Joey]]
diff --git a/doc/todo/make_addurl_respect_annex.largefiles_option.mdwn b/doc/todo/make_addurl_respect_annex.largefiles_option.mdwn
new file mode 100644
index 000000000..b80349a76
--- /dev/null
+++ b/doc/todo/make_addurl_respect_annex.largefiles_option.mdwn
@@ -0,0 +1,6 @@
+ATM git annex addurl ignores annex.largefiles option so to automate annexification or direct add to git for a list of files I need manually to download each one of them into a FILE and then "git annex add -c annex.largefiles='exclude=*.txt' FILE". But it would have been convenient if I could "addurl" some files directly from urls directly into git, as per largefiles settings.
+
+N.B. I do understand that use-case might be somewhat vague, let me know if I should expand reasoning
+[[!meta author=yoh]]
+
+> [[done]] --[[Joey]]
diff --git a/doc/todo/metadata_--batch.mdwn b/doc/todo/metadata_--batch.mdwn
new file mode 100644
index 000000000..b65dc3ef5
--- /dev/null
+++ b/doc/todo/metadata_--batch.mdwn
@@ -0,0 +1,3 @@
+[[!meta author=yoh]]
+
+> [[done]] (using json input) --[[Joey]]
diff --git a/doc/todo/parallel_get.mdwn b/doc/todo/parallel_get.mdwn
new file mode 100644
index 000000000..fb3f8d098
--- /dev/null
+++ b/doc/todo/parallel_get.mdwn
@@ -0,0 +1,85 @@
+Wish: `git annex get [files] -jN` should run up to N downloads of files
+concurrently.
+
+This can already be done by just starting up N separate git-annex
+processes all trying to get the same files. They'll coordinate themselves
+to avoid downloading the same file twice.
+
+But, the output of concurrent git annex get's in a single terminal is a
+mess.
+
+It would be nice to have something similar to docker's output when fetching
+layers of an image. Something like:
+
+ get foo1 ok
+ get foo2 ok
+ get foo3 -> 5% 100 KiB/s
+ get foo4 -> 3% 90 KiB/s
+ get foo5 -> 20% 1 MiB/s
+
+Where the bottom N lines are progress displays for the downloads that are
+currently in progress. When a download finishes, it can scroll up the
+screen with "ok".
+
+ get foo1 ok
+ get foo2 ok
+ get foo5 ok
+ get foo3 -> 5% 100 KiB/s
+ get foo4 -> 3% 90 KiB/s
+ get foo6 -> 0% 110 KiB/s
+
+This display could perhaps be generalized for other concurrent actions.
+For example, drop:
+
+ drop foo1 ok
+ drop foo2 failed
+ Not enough copies ...
+ drop foo3 -> (checking r1...)
+ drop foo4 -> (checking r2...)
+
+But, do get first.
+
+Pain points:
+
+1. Currently, git-annex lets tools like rsync and wget display their own
+ progress. This makes sense for the single-file at a time get, because
+ rsync can display better output than just a percentage. (This is especially
+ the case with aria2c for torrents, which displays seeder/leecher info in
+ addition to percentage.)
+
+ But in multi-get mode, the progress display would be simplified. git-annex
+ can already get percent done information, either as reported by individual
+ backends, or by falling back to polling the file as it's downloaded.
+
+2. The mechanics of updating the screen for a multi-line progress output
+ require some terminal handling code. Using eg, curses, in a mode that
+ doesn't take over the whole screen display, but just moves the cursor
+ up to the line for the progress that needs updating and redraws that
+ line. Doing this portably is probably going to be a pain, especially
+ I have no idea if it can be done on Windows.
+
+ An alternative would be a display more like apt uses for concurrent
+ downloads, all on one line:
+
+ get foo1 ok
+ get foo2 ok
+ get [foo3 -> 5% 100 KiB/s] [foo4 -> 3% 90 KiB/s] [foo5 -> 20% 1 MiB/s]
+
+ The problem with that is it has to avoid scrolling off the right
+ side, so it probably has to truncate the line. Since filenames
+ are often longer than "fooN", it probably has to ellipsize the filename.
+ This approach is just not as flexible or nice in general.
+
+See also: [[parallel_possibilities]]
+
+> I am looking at using the ascii-progress library for this.
+> It has nice support for multiple progress bars, and is portable.
+> I have filed 7 issues on it, around 4 of which need to get fixed before
+> it's suitable for git-annex to use.. --[[Joey]]
+
+>> `git annex get -JN` works now, but lacks any progress display.
+>> Waiting on some updates to ascii-progress. --[[Joey]]
+
+>>> Wrote concurrent-output; [[done]] --[[Joey]]
+
+[[!meta author=yoh]]
diff --git a/doc/todo/return___34__key__34___entry_in_--json_output_for_addurl___40__and_future_add__41___--batch.mdwn b/doc/todo/return___34__key__34___entry_in_--json_output_for_addurl___40__and_future_add__41___--batch.mdwn
new file mode 100644
index 000000000..504af11b8
--- /dev/null
+++ b/doc/todo/return___34__key__34___entry_in_--json_output_for_addurl___40__and_future_add__41___--batch.mdwn
@@ -0,0 +1,5 @@
+You have noted that somewhere (may be in email), that it might help us to pipeline things if 'add' was returning the key if file was added to the annex. I guess the same could apply to 'addurl' so decided to mark this separate todo
+
+[[!meta author=yoh]]
+
+> already done earlier today [[done]] --[[Joey]]