Diffstat (limited to 'doc/todo')
-rw-r--r-- doc/todo/Bittorrent-like_features.mdwn 62
-rw-r--r-- doc/todo/Wishlist:_sanitychecker_fix_wrong_UUID__47__duplicate_remote.mdwn 7
-rw-r--r-- doc/todo/mdwn2man:_make_backticks_bold.mdwn 20
-rw-r--r-- doc/todo/nicer_whereis_output.mdwn 81
4 files changed, 146 insertions, 24 deletions
diff --git a/doc/todo/Bittorrent-like_features.mdwn b/doc/todo/Bittorrent-like_features.mdwn
index 4ec991a65..41988a422 100644
--- a/doc/todo/Bittorrent-like_features.mdwn
+++ b/doc/todo/Bittorrent-like_features.mdwn
@@ -1,31 +1,45 @@
-I made an oops and created a wishlist thread in the forum regarding bittorrent-like behaviour. Sorry, my bad.
+There are two different possible ways git-annex could use bittorrent:
-Here's the original thread:
-http://git-annex.branchable.com/forum/Wishlist:_Bittorrent-like_transfers/
+Let's describe those one by one.
-I think I summed up pretty well what bittorrent-like features could be added to git-annex in one of the posts, so I'll copy and paste some of it here (with slight clarifications added in).
+[[!toc]]
->Disclaimer: I'm thinking out loud of what could make git-annex even more awesome. I don't expect this to be implemented any time soon. Please pardon any dumbassery.
+Downloading files from multiple git-annex sources simultaneously
+================================================================
->Having your remotes (optionally!) act like a swarm would be an awesome feature to have because you bring in a lot of new features that optimize storage, bandwidth, and overall traffic usage. This would be made a lot easier if parts of it were implemented in small steps that added a nifty feature. The best part is, each of these could be implemented by themselves, and they're all features that would be really useful.
->
->Step 1. Concurrent downloads of a file from remotes.
->
->This would make sense to have, it saves upload traffic on your remotes, and you also get faster DL speeds on the receiving end.
->
->Step 2. Implementing part of the super-seeding capabilities.
->
->You upload pieces of a file to different remotes from your laptop, and on your desktop you can download all those pieces and put them together again to get a complete file. If you really wanted to get fancy, you could build in redundancy (ala RAID) so if a remote or two gets lost, you don't lose the entire file. This would be a very efficient use of storage if you have a bunch of free cloud storage accounts (~1GB each) and some big files you want to back up.
->
->Step 3. Setting it up so that those remotes could talk to one another and share those pieces.
->
->This is where it gets more like bittorrent. Useful because you upload 1 copy and in a few hours, have say, 5 complete copies on 5 different remotes. You could add or remove remotes from a swarm locally, and push those changes to those remotes, which then adapt themselves to suit the new rules and share those with other remotes in the swarm (rules should be GPG-signed as a safety precaution). Also, if/when deltas get implemented, you could push that delta to the swarm and have all the remotes adopt it. This is cooler than regular bittorrent because the shared file can be updated. As a safety precaution, the delta could be GPG signed so a corrupt file doesn't contaminate the entire swarm. Each remote could have bandwidth/storage limits set in a dotfile.
->
->This is a high-level idea of how it might work, and it's also a HUGE set of features to add, but if implemented, you'd be saving a ton of resources, adding new use cases, and making git-annex more flexible.
+Having your remotes (optionally!) act like a swarm would be an awesome feature to have because you bring in a lot of new features that optimize storage, bandwidth, and overall traffic usage. This would be made a lot easier if parts of it were implemented in small steps that added a nifty feature. The best part is, each of these could be implemented by themselves, and they're all features that would be really useful.
-And this:
+ 1. Concurrent downloads of a file from remotes.
->Obviously, Step 3 would only work on remotes that you have control of processes on, but if given login credentials to cloud storage remotes (potentially dangerous!) they could read/write to something like dropbox or rsync.
->
->Another thing, this would be completely trackerless. You just use remote groups (or create swarm definitions) and share those with your remotes. **It's completely decentralized!**
+ This would make sense to have, it saves upload traffic on your remotes, and you also get faster DL speeds on the receiving end.
+ 2. Implementing part of the super-seeding capabilities.
+
+ You upload pieces of a file to different remotes from your laptop, and on your desktop you can download all those pieces and put them together again to get a complete file. If you really wanted to get fancy, you could build in redundancy (ala RAID) so if a remote or two gets lost, you don't lose the entire file. This would be a very efficient use of storage if you have a bunch of free cloud storage accounts (~1GB each) and some big files you want to back up.
+
+ 3. Setting it up so that those remotes could talk to one another and share those pieces.
+
+ This is where it gets more like bittorrent. Useful because you upload 1 copy and in a few hours, have say, 5 complete copies on 5 different remotes. You could add or remove remotes from a swarm locally, and push those changes to those remotes, which then adapt themselves to suit the new rules and share those with other remotes in the swarm (rules should be GPG-signed as a safety precaution). Also, if/when deltas get implemented, you could push that delta to the swarm and have all the remotes adopt it. This is cooler than regular bittorrent because the shared file can be updated. As a safety precaution, the delta could be GPG signed so a corrupt file doesn't contaminate the entire swarm. Each remote could have bandwidth/storage limits set in a dotfile.
+
+This is a high-level idea of how it might work, and it's also a HUGE set of features to add, but if implemented, you'd be saving a ton of resources, adding new use cases, and making git-annex more flexible.
+
+Obviously, Step 3 would only work on remotes that you have control of processes on, but if given login credentials to cloud storage remotes (potentially dangerous!) they could read/write to something like dropbox or rsync.
+
+Another thing, this would be completely trackerless. You just use remote groups (or create swarm definitions) and share those with your remotes. **It's completely decentralized!**
+
+This was originally posted [[as a forum post|forum/Wishlist:_Bittorrent-like_transfers]] by [[users/GLITTAH]].
+
+Using an external client (addurl torrent support)
+=================================================
+
+The alternative to this would be to add `addurl` support for bittorrent files. The same way we can now add Youtube videos to a git-annex repository thanks to [[quvi]], we could also simply do:
+
+ git annex addtorrent debian-live-7.0.0-amd64-standard.iso.torrent
+
+or even better:
+
+ git annex addurl http://cdimage.debian.org/debian-cd/current-live/amd64/bt-hybrid/debian-live-7.0.0-amd64-standard.iso.torrent
+
+This way, a torrent would just become another source for a specific file. When we `get` the file, it fires up `$YOUR_FAVORITE_TORRENT_CLIENT` to download the file.
+
+That way we avoid the implementation complexity of shoving a complete bittorrent client within the assistant. The `get` operation would block until the torrent is downloaded, I guess... --[[anarcat]]
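Step 2 of the swarm proposal in the page above (pieces of a file spread over several remotes, with RAID-like redundancy) can be illustrated with a minimal Python sketch. Nothing here is git-annex code: `split_with_parity` and `reassemble` are hypothetical names, the pieces are plain byte strings rather than uploads to real remotes, and a single XOR parity piece is an assumption that only survives the loss of one piece, not the "remote or two" the proposal hopes for.

```python
from functools import reduce
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def split_with_parity(data: bytes, k: int) -> List[bytes]:
    """Split data into k equal-size pieces (zero padded) plus one XOR parity piece.

    Hypothetical helper: each returned piece would go to a different remote.
    """
    size = -(-len(data) // k)  # ceiling division
    pieces = [data[i * size:(i + 1) * size].ljust(size, b"\0") for i in range(k)]
    parity = reduce(xor_bytes, pieces)
    return pieces + [parity]


def reassemble(pieces: List[Optional[bytes]], length: int) -> bytes:
    """Rebuild the original data; at most one piece may be None (lost)."""
    missing = [i for i, p in enumerate(pieces) if p is None]
    if len(missing) > 1:
        raise ValueError("a single parity piece cannot recover more than one loss")
    restored = list(pieces)
    if missing:
        # XOR of every surviving piece (parity included) yields the lost one.
        present = [p for p in pieces if p is not None]
        restored[missing[0]] = reduce(xor_bytes, present)
    # Drop the parity piece and the zero padding.
    return b"".join(restored[:-1])[:length]
```

The original length has to be stored alongside the pieces so the padding can be stripped; surviving two lost remotes would need a stronger erasure code than plain XOR parity.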
diff --git a/doc/todo/Wishlist:_sanitychecker_fix_wrong_UUID__47__duplicate_remote.mdwn b/doc/todo/Wishlist:_sanitychecker_fix_wrong_UUID__47__duplicate_remote.mdwn
new file mode 100644
index 000000000..ff7773d3e
--- /dev/null
+++ b/doc/todo/Wishlist:_sanitychecker_fix_wrong_UUID__47__duplicate_remote.mdwn
@@ -0,0 +1,7 @@
+In certain situations, different client annexes might each add the same remote repository before they have synced with each other.
+
+Once the two clients sync, they will both have two remotes with the same name, but only one UUID will have any content (assuming only one client pushed).
+
+It would be nice to have some (automatic?) way to resolve this conflict.
+
+Not sure if anything sane can be done if both clients have pushed?
diff --git a/doc/todo/mdwn2man:_make_backticks_bold.mdwn b/doc/todo/mdwn2man:_make_backticks_bold.mdwn
new file mode 100644
index 000000000..87c228ab8
--- /dev/null
+++ b/doc/todo/mdwn2man:_make_backticks_bold.mdwn
@@ -0,0 +1,20 @@
+The traditional way of marking command-line flags in a manpage is with a `.B` (for Bold, I guess). It doesn't seem to be used by mdwn2man, which makes the manpage look a little more dull than it could.
+
+The following patch makes those options come out more obviously:
+
+[[!format diff """
+diff --git a/Build/mdwn2man b/Build/mdwn2man
+index ba5919b..7f819ad 100755
+--- a/Build/mdwn2man
++++ b/Build/mdwn2man
+@@ -8,6 +8,7 @@ print ".TH $prog $section\n";
+
+ while (<>) {
+ s{(\\?)\[\[([^\s\|\]]+)(\|[^\s\]]+)?\]\]}{$1 ? "[[$2]]" : $2}eg;
++ s/\`([^\`]*)\`/\\fB$1\\fP/g;
+ s/\`//g;
+ s/^\s*\./\\&./g;
+ if (/^#\s/) {
+"""]]
+
+I tested it against the git-annex manpage and it seems to work well. --[[anarcat]]
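The substitution added by the mdwn2man patch above can be tried outside Perl. The following is a rough Python equivalent of the new line plus the existing `s/\`//g` that follows it; `bold_backticks` is a hypothetical name and is not part of `Build/mdwn2man`.

```python
import re


def bold_backticks(line: str) -> str:
    """Wrap `...` spans in roff bold escapes, then drop any stray backticks."""
    # Same idea as the Perl s/\`([^\`]*)\`/\\fB$1\\fP/g in the patch.
    line = re.sub(r"`([^`]*)`", lambda m: "\\fB" + m.group(1) + "\\fP", line)
    # Mirrors the existing s/\`//g: unmatched backticks are removed.
    return line.replace("`", "")


print(bold_backticks("Use `--force` with `git annex drop`."))
```

A replacement function is used instead of a template string so that the backslashes in `\fB` and `\fP` do not need to be escaped for `re.sub`.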
diff --git a/doc/todo/nicer_whereis_output.mdwn b/doc/todo/nicer_whereis_output.mdwn
new file mode 100644
index 000000000..79c4ba02c
--- /dev/null
+++ b/doc/todo/nicer_whereis_output.mdwn
@@ -0,0 +1,81 @@
+We had some informal discussions on IRC about improving the output of the `whereis` command.
+
+[[!toc levels=2]]
+
+First version: columns
+======================
+
+[[mastensg]] started by implementing a [simple formatter](https://gist.github.com/mastensg/6500982) that would display things in columns ([screenshot](http://www.ping.uio.no/~mastensg/whereis.png)).
+
+Second version: Xs
+==================
+
+After some suggestions from [[joey]], [[mastensg]] changed the format slightly ([screenshot](http://www.ping.uio.no/~mastensg/whereis2.png)):
+
+[[!format txt """
+17:01:34 <joeyh> foo
+17:01:34 <joeyh> |bar
+17:01:34 <joeyh> ||baz (untrusted)
+17:01:34 <joeyh> |||
+17:01:34 <joeyh> XXx 3? img.png
+17:01:36 <joeyh> _X_ 1! bigfile
+17:01:37 <joeyh> XX_ 2 zort
+17:01:39 <joeyh> __x 1?! maybemissing
+17:02:09 * joeyh does a s/\?/+/ in the above
+17:02:24 <joeyh> and decrements the counters for untrusted
+17:03:37 <joeyh> __x 0+! maybemissing
+"""]]
+
+Third version: incremental
+==========================
+
+Finally, [[anarcat]] worked on making it run faster on large repositories, in a [fork](https://gist.github.com/anarcat/6502988) of that first gist. Then paging was added (so headers are repeated).
+
+Fourth version: tuning and blocked
+==================================
+
+[[TobiasTheViking]] provided some bugfixes, and the next step was to implement the trusted/untrusted detection, and have a counter.
+
+This required more advanced parsing of the remotes, and instead of starting to do some JSON parsing, [[anarcat]] figured it was time to learn some Haskell instead.
+
+Current status: needs merge
+===========================
+
+So right now, the most recent version of the python script is in [anarcat's gist](https://gist.github.com/anarcat/6502988) and works reasonably well. However, it doesn't distinguish between trusted and untrusted repos and so on.
+
+Furthermore, we'd like to see this factored into the `whereis` command directly. A [raw.hs](http://codepad.org/miVJb5oK) file has been programmed by `mastensg`, and is now available in the above gist. It fits the desired output and prototypes, and has been `haskellized` thanks to [[guilhem]].
+
+Now we just need to merge those marvelous functions in `Whereis.hs` - but I can't quite figure out where to throw that code, so I'll leave it to someone more familiar with the internals of git-annex. The most recent version is still in [anarcat's gist](https://gist.github.com/anarcat/6502988). --[[anarcat]]
+
+Desired output
+--------------
+
+The output we're aiming for is:
+
+ foo
+ |bar
+ ||baz (untrusted)
+ |||
+ XXx 2+ img.png
+ _X_ 1! bigfile
+ XX_ 2 zort
+ __x 0+! maybemissing
+
+Legend:
+
+ * `_` - file missing from repo
+ * `x` - file may be present in untrusted repo
+ * `X` - file is present in trusted repo
+ * `[0-9]` - number of copies present in trusted repos
+ * `+` - indicates there may be more copies present
+ * `!` - indicates only one copy is left
+
+Implementation notes
+--------------------
+
+[[!format txt """
+20:48:18 <joeyh> if someone writes me a headerWhereis :: [(RemoteName, TrustLevel)] -> String and a formatWhereis :: [(RemoteName, TrustLevel, UUID)] -> [UUD] -> FileName -> String , I can do the rest ;)
+20:49:22 <joeyh> make that second one formatWhereis :: [(RemoteName, TrueLevel, Bool)] -> FileName -> String
+20:49:37 <joeyh> gah, typos
+20:49:45 <joeyh> suppose you don't need the RemoteName either
+"""]]