author Joey Hess <joey@kitenet.net> 2012-06-10 16:13:23 -0400
committer Joey Hess <joey@kitenet.net> 2012-06-10 16:13:23 -0400
commit 7201d7835c72b51c4072729a4010e827ab62b490
tree 3a348a54b5bd7089d7500cedc717a253e7b60497 /doc
parent 2de50f733a01ce5b282ff0f6eb8a1101bd496216
parent 3bb58afd594c7208a34749202a858644498acb6f
Merge branch 'master' into watch
Diffstat (limited to 'doc')
-rw-r--r-- doc/design/assistant/blog/day_4__speed.mdwn | 47
-rw-r--r-- doc/design/assistant/blog/day_4__speed/comment_1_bf3c9c33cc0dea5eaeb6f2af110b924b._comment | 8
-rw-r--r-- doc/design/assistant/blog/day_4__speed/comment_2_33aba4c9abaa3e6a05a2c87ab7df9d0e._comment | 8
-rw-r--r-- doc/design/assistant/comment_3_05223be50c889b2ed6bc4abf74116450._comment | 9
-rw-r--r-- doc/design/assistant/comment_4_fbbd93b55803ae21e6ba4b6568c2fafd._comment | 9
-rw-r--r-- doc/design/assistant/comment_5_f4e9af3fed6c27e8ff39badb9794064d._comment | 12
-rw-r--r-- doc/design/assistant/comment_6_c7ad07cade1f44f9a8b61f92225bb9c5._comment | 10
-rw-r--r-- doc/design/assistant/comment_7_609d38e993267195a80fecd84c93d1e2._comment | 8
-rw-r--r-- doc/design/assistant/inotify.mdwn | 3
-rw-r--r-- doc/forum/autobuilders_for_git-annex_to_aid_development.mdwn | 34
-rw-r--r-- doc/todo/wishlist:_special-case_handling_of_Youtube_URLs_in_Web_special_remote.mdwn | 12
11 files changed, 160 insertions, 0 deletions
diff --git a/doc/design/assistant/blog/day_4__speed.mdwn b/doc/design/assistant/blog/day_4__speed.mdwn
new file mode 100644
index 000000000..badc6b7b1
--- /dev/null
+++ b/doc/design/assistant/blog/day_4__speed.mdwn
@@ -0,0 +1,47 @@
+Only had a few hours to work today, but my current focus is speed, and I
+have indeed sped up parts of `git annex watch`.
+
+One thing folks don't realize about git is that despite a rep for being
+fast, it can be rather slow in one area: Writing the index. You don't
+notice it until you have a lot of files, and the index gets big. So I've
+put a lot of effort into git-annex in the past to avoid writing the index
+repeatedly, and queue up big index changes that can happen all at once. The
+new `git annex watch` was not able to use that queue. Today I reworked the
+queue machinery to support the types of direct index writes it needs, and
+now repeated index writes are eliminated.
+
+... Eliminated too far, it turns out, since it doesn't yet *ever* flush
+that queue until shutdown! So the next step here will be to have a worker
+thread that wakes up periodically, flushes the queue, and autocommits.
+(This will, in fact, be the start of the [[syncing]] phase of my roadmap!)
+There's lots of room here for smart behavior. Like, if a lot of changes are
+being made close together, wait for them to die down before committing. Or,
+if it's been idle and a single file appears, commit it immediately, since
+this is probably something the user wants synced out right away. I'll start
+with something stupid and then add the smarts.
+
+(BTW, in all my years of programming, I have avoided threads like the nasty
+bug-prone plague they are. Here I already have three threads, and am going to
+add probably 4 or 5 more before I'm done with the git annex assistant. So
+far, it's working well -- I give credit to Haskell for making it easy to
+manage state in ways that make it possible to reason about how the threads
+will interact.)
+
+What about the races I've been stressing over? Well, I have an ulterior
+motive in speeding up `git annex watch`, and that's to also be able to
+**slow it down**. Running in slow-mo makes it easy to try things that might
+cause a race and watch how it reacts. I'll be using this technique when
+I circle back around to dealing with the races.
+
+Another tricky speed problem came up today that I also need to fix. On
+startup, `git annex watch` scans the whole tree to find files that have
+been added or moved etc while it was not running, and take care of them.
+Currently, this scan involves re-staging every symlink in the tree. That's
+slow! I need to find a way to avoid re-staging symlinks; I may use `git
+cat-file` to check if the currently staged symlink is correct, or I may
+come up with some better and faster solution. Sleeping on this problem.
+
+----
+
+Oh yeah, I also found one more race bug today. It only happens at startup
+and could only make it miss staging file deletions.
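
The commit-batching idea in the post above (wait for a burst of changes to die down, but commit a lone change right away) can be sketched as a small decision function. This is only an illustrative sketch in Python, not git-annex's Haskell code: the name `should_commit` and the thresholds `QUIET_SECS` and `IDLE_SECS` are assumptions invented for the example.

```python
from typing import Optional, Sequence

QUIET_SECS = 2.0   # hypothetical: how long a burst must be quiet before committing
IDLE_SECS = 60.0   # hypothetical: how long counts as "it's been idle"


def should_commit(pending: Sequence[float], now: float,
                  last_commit: Optional[float] = None) -> bool:
    """Decide whether the queued changes should be flushed and committed now.

    pending     -- timestamps (seconds) of queued, uncommitted changes, oldest first
    now         -- current time, in the same units
    last_commit -- time of the previous commit, or None if there has been none
    """
    if not pending:
        return False  # nothing queued, nothing to commit
    # Was the repository quiet for a while before the first queued change?
    idle_before = last_commit is None or pending[0] - last_commit >= IDLE_SECS
    if len(pending) == 1 and idle_before:
        return True  # a lone file after a quiet period: sync it right away
    # Otherwise wait until the burst of changes has died down.
    return now - pending[-1] >= QUIET_SECS
```

A periodic worker thread could call such a function each time it wakes up and flush the queue only when it returns true; starting "with something stupid" would amount to replacing the body with a fixed timer.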
diff --git a/doc/design/assistant/blog/day_4__speed/comment_1_bf3c9c33cc0dea5eaeb6f2af110b924b._comment b/doc/design/assistant/blog/day_4__speed/comment_1_bf3c9c33cc0dea5eaeb6f2af110b924b._comment
new file mode 100644
index 000000000..fb5b95490
--- /dev/null
+++ b/doc/design/assistant/blog/day_4__speed/comment_1_bf3c9c33cc0dea5eaeb6f2af110b924b._comment
@@ -0,0 +1,8 @@
+[[!comment format=mdwn
+ username="https://www.google.com/accounts/o8/id?id=AItOawldKnauegZulM7X6JoHJs7Gd5PnDjcgx-E"
+ nickname="Matt"
+ subject="open source?"
+ date="2012-06-09T22:34:30Z"
+ content="""
+Are you publishing the source code for git-annex assistant somewhere?
+"""]]
diff --git a/doc/design/assistant/blog/day_4__speed/comment_2_33aba4c9abaa3e6a05a2c87ab7df9d0e._comment b/doc/design/assistant/blog/day_4__speed/comment_2_33aba4c9abaa3e6a05a2c87ab7df9d0e._comment
new file mode 100644
index 000000000..1fcc197ab
--- /dev/null
+++ b/doc/design/assistant/blog/day_4__speed/comment_2_33aba4c9abaa3e6a05a2c87ab7df9d0e._comment
@@ -0,0 +1,8 @@
+[[!comment format=mdwn
+ username="http://joeyh.name/"
+ ip="4.153.8.126"
+ subject="comment 2"
+ date="2012-06-09T23:01:29Z"
+ content="""
+Yes, it's in [[git|download]] with the rest of git-annex. Currently in the `watch` branch.
+"""]]
diff --git a/doc/design/assistant/comment_3_05223be50c889b2ed6bc4abf74116450._comment b/doc/design/assistant/comment_3_05223be50c889b2ed6bc4abf74116450._comment
new file mode 100644
index 000000000..a78fa3343
--- /dev/null
+++ b/doc/design/assistant/comment_3_05223be50c889b2ed6bc4abf74116450._comment
@@ -0,0 +1,9 @@
+[[!comment format=mdwn
+ username="https://www.google.com/accounts/o8/id?id=AItOawkSq2FDpK2n66QRUxtqqdbyDuwgbQmUWus"
+ nickname="Jimmy"
+ subject="comment 3"
+ date="2012-06-07T20:22:55Z"
+ content="""
+I'd agree that getting it into the main distros is the way to go. If you need OSX binaries, I could volunteer to set up an autobuilder to generate binaries for OSX users; however, it would rely on users having macports with the correct ports installed to use it (things like coreutils etc...)
+
+"""]]
diff --git a/doc/design/assistant/comment_4_fbbd93b55803ae21e6ba4b6568c2fafd._comment b/doc/design/assistant/comment_4_fbbd93b55803ae21e6ba4b6568c2fafd._comment
new file mode 100644
index 000000000..cd3b5aaef
--- /dev/null
+++ b/doc/design/assistant/comment_4_fbbd93b55803ae21e6ba4b6568c2fafd._comment
@@ -0,0 +1,9 @@
+[[!comment format=mdwn
+ username="http://joeyh.name/"
+ subject="comment 4"
+ date="2012-06-08T01:56:52Z"
+ content="""
+I always appreciate your OSX work Jimmy...
+
+Could it be put into macports?
+"""]]
diff --git a/doc/design/assistant/comment_5_f4e9af3fed6c27e8ff39badb9794064d._comment b/doc/design/assistant/comment_5_f4e9af3fed6c27e8ff39badb9794064d._comment
new file mode 100644
index 000000000..bf8d9709e
--- /dev/null
+++ b/doc/design/assistant/comment_5_f4e9af3fed6c27e8ff39badb9794064d._comment
@@ -0,0 +1,12 @@
+[[!comment format=mdwn
+ username="https://www.google.com/accounts/o8/id?id=AItOawkSq2FDpK2n66QRUxtqqdbyDuwgbQmUWus"
+ nickname="Jimmy"
+ subject="comment 5"
+ date="2012-06-08T07:22:34Z"
+ content="""
+In relation to macports, I often found that the haskell ports in macports are behind other distros, and I'm not willing to put much effort into maintaining or updating those ports. I found that, to build git-annex, installing macports manually and then installing haskell-platform from upstream is the best way to get the most up to date dependencies for git-annex.
+
+fyi in macports ghc is at version 6.10.4 and haskell platform is at version 2009.2, so there are a significant number of ports to update.
+
+I was thinking about this a bit more and I reckon it might be easier to try and build a self-contained .pkg package and have all the needed binaries in a .app styled package; that would work well when the webapp comes along. I will take a look at it in a week or two (currently moving house so I don't have much time)
+"""]]
diff --git a/doc/design/assistant/comment_6_c7ad07cade1f44f9a8b61f92225bb9c5._comment b/doc/design/assistant/comment_6_c7ad07cade1f44f9a8b61f92225bb9c5._comment
new file mode 100644
index 000000000..9fa66d6d3
--- /dev/null
+++ b/doc/design/assistant/comment_6_c7ad07cade1f44f9a8b61f92225bb9c5._comment
@@ -0,0 +1,10 @@
+[[!comment format=mdwn
+ username="https://www.google.com/accounts/o8/id?id=AItOawkSq2FDpK2n66QRUxtqqdbyDuwgbQmUWus"
+ nickname="Jimmy"
+ subject="comment 6"
+ date="2012-06-08T15:21:18Z"
+ content="""
+It's not much for now... but see <http://www.sgenomics.org/~jtang/gitbuilder-git-annex-x00-x86_64-apple-darwin10.8.0/>. I'm ignoring the debian-stable and pristine-tar branches for now, as I am just building and testing on osx 10.7.
+
+Hope the autobuilder will help you develop the OSX side of things without having direct access to an osx machine! I will try and get gitbuilder to spit out appropriately named tarballs of the compiled binaries in a few days when I have more time.
+"""]]
diff --git a/doc/design/assistant/comment_7_609d38e993267195a80fecd84c93d1e2._comment b/doc/design/assistant/comment_7_609d38e993267195a80fecd84c93d1e2._comment
new file mode 100644
index 000000000..6685c6548
--- /dev/null
+++ b/doc/design/assistant/comment_7_609d38e993267195a80fecd84c93d1e2._comment
@@ -0,0 +1,8 @@
+[[!comment format=mdwn
+ username="http://joeyh.name/"
+ ip="4.153.8.126"
+ subject="comment 7"
+ date="2012-06-09T18:07:51Z"
+ content="""
+Thanks, that's already been useful to me. You might as well skip the debian-specific \"bpo\" tags too.
+"""]]
diff --git a/doc/design/assistant/inotify.mdwn b/doc/design/assistant/inotify.mdwn
index ab88210b2..7cdde33ac 100644
--- a/doc/design/assistant/inotify.mdwn
+++ b/doc/design/assistant/inotify.mdwn
@@ -108,3 +108,6 @@ Many races need to be dealt with by this code. Here are some of them.
Not a problem; The removal event removes the old file from the index, and
the add event adds the new one.
+* At startup, `git add --update` is run, to notice deleted files.
+ Then inotify starts up. Files deleted in between won't have their
+ removals staged.
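
The startup race added to the list above can be illustrated with a tiny simulation. This is a hedged sketch, not git-annex code: `simulate_startup` and its parameters are hypothetical names, the `scan` helper merely stands in for `git add --update`, and starting the watcher before the scan is shown only as one conventional way to close such a gap, not necessarily the fix adopted here.

```python
def simulate_startup(index, on_disk, gap_deletions, watch_first):
    """Return the set of staged paths after a simulated startup.

    index         -- paths currently staged in the index
    on_disk       -- paths present in the working tree when startup begins
    gap_deletions -- paths deleted between the two startup steps
    watch_first   -- if True, the watcher starts before the scan runs
    """
    index, on_disk = set(index), set(on_disk)

    def scan():
        # Stand-in for `git add --update`: unstage files no longer on disk.
        index.intersection_update(on_disk)

    def delete_in_gap(watching):
        for path in gap_deletions:
            on_disk.discard(path)
            if watching:
                index.discard(path)  # the removal event is seen and staged

    if watch_first:
        delete_in_gap(watching=True)   # watcher already running during the gap
        scan()
    else:
        scan()
        delete_in_gap(watching=False)  # nobody is listening yet: removal is missed
    return index
```

With `watch_first=False`, a file deleted in the gap stays in the returned index, which is the missed removal described above; with `watch_first=True` it does not.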
diff --git a/doc/forum/autobuilders_for_git-annex_to_aid_development.mdwn b/doc/forum/autobuilders_for_git-annex_to_aid_development.mdwn
new file mode 100644
index 000000000..8cd370937
--- /dev/null
+++ b/doc/forum/autobuilders_for_git-annex_to_aid_development.mdwn
@@ -0,0 +1,34 @@
+This is a continuation of the conversation from [[the comments|design/assistant/#comment-77e54e7ebfbd944c370173014b535c91]] section in the design of the git-annex assistant. In summary, I've set up an autobuilder which should help [[Joey]] have an easier time developing git-annex on non-linux/debian platforms. This builder is currently running on OSX 10.7 with the 64-bit version of Haskell Platform.
+
+The builder output can be found at <http://www.sgenomics.org/~jtang/gitbuilder-git-annex-x00-x86_64-apple-darwin10.8.0/>. The CGI on this site does not work, as my OSX workstation is pushing the output from another location.
+
+The builder currently tries to build all branches except
+
+* debian-stable
+* pristine-tar
+* setup
+
+It also does not build any of the tags. Joey had suggested ignoring the bpo named tags, but for now it's easier for me to not build any tags. To continue this discussion: if anyone wants to set up a gitbuilder instance, here is the build.sh script that I am using.
+
+<pre>
+#!/bin/bash -x
+
+# Macports
+export PATH=/opt/local/bin:$PATH
+
+# Haskell userland
+export PATH=$PATH:$HOME/.cabal/bin
+
+# Macports gnu
+export PATH=/opt/local/libexec/gnubin:$PATH
+
+make || exit 3
+
+make -q test
+if [ "$?" = 1 ]; then
+ # run "make test", but give it a time limit in case a test gets stuck
+ ../maxtime 1800 make test || exit 4
+fi
+</pre>
+
+It's also using the branches-local script for sorting and prioritising the branches to build; this script can be found in the [autobuild-ceph](https://github.com/ceph/autobuild-ceph/blob/master/branches-local) repository. If other people are interested in setting up their own instances of gitbuilder for git-annex, please let me know and I will set up an aggregator page to collect the status of the builds. The builder runs and updates the webpage every 30 minutes.
diff --git a/doc/todo/wishlist:_special-case_handling_of_Youtube_URLs_in_Web_special_remote.mdwn b/doc/todo/wishlist:_special-case_handling_of_Youtube_URLs_in_Web_special_remote.mdwn
new file mode 100644
index 000000000..e11989e52
--- /dev/null
+++ b/doc/todo/wishlist:_special-case_handling_of_Youtube_URLs_in_Web_special_remote.mdwn
@@ -0,0 +1,12 @@
+The [Web special remote](http://git-annex.branchable.com/special_remotes/web/) could possibly be improved by detecting when URLs reference a Youtube video page and using [youtube-dl](http://rg3.github.com/youtube-dl/) instead of wget to download the page. Youtube-dl can also handle several other video sites such as vimeo.com and blip.tv, so if this idea were to be implemented, it might make sense to borrow the regular expressions that youtube-dl uses to identify video URLs. A quick grep through the youtube-dl source for the identifier _VALID_URL should find those regexes (in Python's regex format).
+
+> This is something I've thought about doing for a while..
+> Two things I have not figured out:
+>
+> * Seems that this should really be user-configurable or a plugin system,
+> to handle more than just this one case.
+> * Youtube-dl breaks from time to time, I really trust these urls a lot
+> less than regular urls. Perhaps per-url trust levels are called for by
+> this.
+>
+> --[[Joey]]
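
The wishlist item above boils down to choosing a downloader per URL. A minimal sketch of that dispatch follows, written in Python purely for illustration. The pattern is a deliberately simplified, hypothetical one and is not youtube-dl's actual `_VALID_URL` regex; `HANDLERS` and `pick_downloader` are made-up names. Keeping the rules in a table is a nod to the reply's point that this should be configurable rather than hard-coded for one site.

```python
import re

# Hypothetical, simplified pattern for Youtube video pages. youtube-dl's real
# _VALID_URL regexes are more thorough and are not reproduced here.
YOUTUBE_PAGE = re.compile(
    r"^https?://(?:www\.)?(?:youtube\.com/watch\?(?:.*&)?v=|youtu\.be/)[\w-]+"
)

# (pattern, tool) pairs checked in order; more entries could be added or
# loaded from configuration.
HANDLERS = [
    (YOUTUBE_PAGE, "youtube-dl"),
]

DEFAULT_TOOL = "wget"


def pick_downloader(url: str) -> str:
    """Return the name of the tool that should fetch this URL."""
    for pattern, tool in HANDLERS:
        if pattern.match(url):
            return tool
    return DEFAULT_TOOL
```

The second concern raised in the reply, that such URLs are less trustworthy than regular ones, is not addressed by this sketch.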