author Joey Hess <joey@kitenet.net> 2012-06-11 12:13:07 -0400
committer Joey Hess <joey@kitenet.net> 2012-06-11 12:13:07 -0400
commit a5a3cd55ac2bab656824e48d29ead8382c583b01
tree 1d5503b35e7fdb27bcd2e1e3f8f1020686882125 /doc/design
parent 433ff41496b073c71e465af8b38b2ecafe27d8dd
parent 5642e189b76070b43a8e24f9f49d36b950f83c8d
Merge branch 'master' into watch
Conflicts: debian/changelog
Diffstat (limited to 'doc/design')
-rw-r--r-- doc/design/assistant/blog/day_5__committing.mdwn | 57
-rw-r--r-- doc/design/assistant/cloud/comment_1_4997778abc171999499487b71b31c9ba._comment | 16
-rw-r--r-- doc/design/assistant/cloud/comment_2_08da8bc74a4845e354dca99184cffd70._comment | 8
-rw-r--r-- doc/design/assistant/inotify.mdwn | 11
-rw-r--r-- doc/design/assistant/windows.mdwn | 12
5 files changed, 97 insertions, 7 deletions
diff --git a/doc/design/assistant/blog/day_5__committing.mdwn b/doc/design/assistant/blog/day_5__committing.mdwn
new file mode 100644
index 000000000..7d6b52199
--- /dev/null
+++ b/doc/design/assistant/blog/day_5__committing.mdwn
@@ -0,0 +1,57 @@
+After a few days otherwise engaged, back to work today.
+
+My focus was on adding the committing thread mentioned in [[day_4__speed]].
+I got rather further than expected!
+
+First, I implemented a really dumb thread, that woke up once per second,
+checked if any changes had been made, and committed them. Of course, this
+rather sucked. In the middle of a large operation like untarring a tarball,
+or `rm -r` of a large directory tree, it made lots of commits and made
+things slow and ugly. This was not unexpected.
+
+So next, I added some smarts to it. First, I wanted to stop it waking up
+every second when there was nothing to do, and instead block waiting for a
+change to occur. Secondly, I wanted it to know when past changes happened,
+so it could detect batch mode scenarios, and avoid committing too
+frequently.
+
+I played around with combinations of various Haskell thread communication
+tools to get that information to the committer thread: `MVar`, `Chan`,
+`QSem`, `QSemN`. Eventually, I realized all I needed was a simple channel
+through which the timestamps of changes could be sent. However, `Chan`
+wasn't quite suitable, and I had to add a dependency on
+[Software Transactional Memory](http://en.wikipedia.org/wiki/Software_Transactional_Memory),
+and use a `TChan`. Now I'm cooking with gas!
+
+With that data channel available to the committer thread, it quickly got
+some very nice smart behavior. Playing around with it, I find it commits
+*instantly* when I'm making some random change that I'd want the
+git-annex assistant to sync out instantly; and that its batch job detection
+works pretty well too.
+
+There's surely room for improvement, and I made this part of the code
+be an entirely pure function, so it's really easy to change the strategy.
+This part of the committer thread is so nice and clean, that here's the
+current code, for your viewing pleasure:
+
+[[!format haskell """
+{- Decide if now is a good time to make a commit.
+ - Note that the list of change times has an undefined order.
+ -
+ - Current strategy: If there have been 10 changes within the past second,
+ - a batch activity is taking place, so wait for later.
+ -}
+shouldCommit :: UTCTime -> [UTCTime] -> Bool
+shouldCommit now changetimes
+    | len == 0 = False
+    | len > 4096 = True -- avoid bloating queue too much
+    | length (filter thisSecond changetimes) < 10 = True
+    | otherwise = False -- batch activity
+    where
+        len = length changetimes
+        thisSecond t = now `diffUTCTime` t <= 1
+"""]]
+
+Still some polishing to do to eliminate minor inefficiencies and deal
+with more races, but this part of the git-annex assistant is now very usable,
+and will be going out to my beta testers soon!
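
Note on the post above: a minimal, self-contained sketch of how timestamps sent through a `TChan` could feed `shouldCommit`. The `shouldCommit` rule is copied from the post; `pendingChanges` and `main` are hypothetical names introduced here for illustration, and the block-then-drain flow is an assumption about how such a thread might collect its input, not the actual committer thread. It assumes the `stm` and `time` packages are available.

```haskell
import Control.Concurrent.STM
  (TChan, atomically, newTChanIO, readTChan, tryReadTChan, writeTChan)
import Data.Time.Clock (UTCTime, addUTCTime, diffUTCTime, getCurrentTime)

-- Decision rule reproduced from the post so this file compiles on its own.
shouldCommit :: UTCTime -> [UTCTime] -> Bool
shouldCommit now changetimes
  | len == 0 = False
  | len > 4096 = True -- avoid bloating queue too much
  | length (filter thisSecond changetimes) < 10 = True
  | otherwise = False -- batch activity
  where
    len = length changetimes
    thisSecond t = now `diffUTCTime` t <= 1

-- Hypothetical helper: block until at least one change timestamp arrives,
-- then drain whatever else is already queued without blocking again.
pendingChanges :: TChan UTCTime -> IO [UTCTime]
pendingChanges chan = atomically $ do
  first <- readTChan chan -- retries (blocks) while the channel is empty
  rest <- drain
  return (first : rest)
  where
    drain = do
      next <- tryReadTChan chan
      case next of
        Nothing -> return []
        Just t -> (t :) <$> drain

main :: IO ()
main = do
  now <- getCurrentTime
  chan <- newTChanIO

  -- Three changes scattered over the last few seconds: not a batch.
  atomically $ mapM_ (writeTChan chan)
    [addUTCTime (-3) now, addUTCTime (-2) now, addUTCTime (-0.5) now]
  scattered <- pendingChanges chan
  print (shouldCommit now scattered)

  -- Twelve changes within the past second: looks like batch activity.
  atomically $ mapM_ (writeTChan chan)
    [addUTCTime (negate (fromIntegral i) / 100) now | i <- [1 .. 12 :: Int]]
  burst <- pendingChanges chan
  print (shouldCommit now burst)
```

Run as-is, this should print `True` for the three scattered changes and `False` for the burst of twelve. Because `readTChan` blocks inside `atomically` until something is written, a thread built this way would not need to wake up once per second.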
diff --git a/doc/design/assistant/cloud/comment_1_4997778abc171999499487b71b31c9ba._comment b/doc/design/assistant/cloud/comment_1_4997778abc171999499487b71b31c9ba._comment
new file mode 100644
index 000000000..1a01afaa3
--- /dev/null
+++ b/doc/design/assistant/cloud/comment_1_4997778abc171999499487b71b31c9ba._comment
@@ -0,0 +1,16 @@
+[[!comment format=mdwn
+ username="https://www.google.com/accounts/o8/id?id=AItOawkq0-zRhubO6kR9f85-5kALszIzxIokTUw"
+ nickname="James"
+ subject="Cloud Service Limitations"
+ date="2012-06-11T02:15:04Z"
+ content="""
+Hey Joey!
+
+I'm not very tech savvy, but here is my question.
+I think for all cloud service providers, there is an upload limitation on how big one file may be.
+For example, I can't upload a file bigger than 100 MB on box.net.
+Does this affect git-annex at all? Will git-annex automatically split the file depending on the cloud provider or will I have to create small RAR archives of one large file to upload them?
+
+Thanks!
+James
+"""]]
diff --git a/doc/design/assistant/cloud/comment_2_08da8bc74a4845e354dca99184cffd70._comment b/doc/design/assistant/cloud/comment_2_08da8bc74a4845e354dca99184cffd70._comment
new file mode 100644
index 000000000..a9b377ea5
--- /dev/null
+++ b/doc/design/assistant/cloud/comment_2_08da8bc74a4845e354dca99184cffd70._comment
@@ -0,0 +1,8 @@
+[[!comment format=mdwn
+ username="http://joeyh.name/"
+ ip="4.153.8.126"
+ subject="re: cloud"
+ date="2012-06-11T04:48:08Z"
+ content="""
+Yes, git-annex has to split files for certain providers. I already added support for this as part of my first pass at supporting box.com, see [[tips/using_box.com_as_a_special_remote]].
+"""]]
diff --git a/doc/design/assistant/inotify.mdwn b/doc/design/assistant/inotify.mdwn
index 7cdde33ac..9fe6938c4 100644
--- a/doc/design/assistant/inotify.mdwn
+++ b/doc/design/assistant/inotify.mdwn
@@ -23,10 +23,11 @@ really useful, it needs to:
is exceeded. This can be tuned by root, so help the user fix it.
**done**
- periodically auto-commit staged changes (avoid autocommitting when
- lots of changes are coming in)
-- tunable delays before adding new files, etc
-- coalesce related add/rm events for speed and less disk IO
+ lots of changes are coming in) **done**
+- coalesce related add/rm events for speed and less disk IO **done**
- don't annex `.gitignore` and `.gitattributes` files **done**
+- run as a daemon **done**
+- tunable delays before adding new files, etc
- configurable option to only annex files meeting certain size or
filename criteria
- option to check files not meeting annex criteria into git directly
@@ -107,7 +108,3 @@ Many races need to be dealt with by this code. Here are some of them.
Not a problem; The removal event removes the old file from the index, and
the add event adds the new one.
-
-* At startup, `git add --update` is run, to notice deleted files.
- Then inotify starts up. Files deleted in between won't have their
- removals staged.
diff --git a/doc/design/assistant/windows.mdwn b/doc/design/assistant/windows.mdwn
index da669ad82..26ff2c1c6 100644
--- a/doc/design/assistant/windows.mdwn
+++ b/doc/design/assistant/windows.mdwn
@@ -22,3 +22,15 @@ Or I could try to use Cygwin.
## Deeper system integration
[NTFS Reparse Points](http://msdn.microsoft.com/en-us/library/aa365503%28v=VS.85%29.aspx) allow a program to define how the OS will interpret a file or directory in arbitrary ways. This requires writing a file system filter.
+
+## Development environment
+
+Someone wrote in to say:
+
+> For Windows Development you can easily qualify
+> for Bizspark - http://www.microsoft.com/bizspark/
+>
+> This will get you 100% free Windows OS licenses and
+> Dev tools, plus a free Azure account for cloud testing.
+> (You can also now deploy Linux VMs to Azure as well)
+> No money required at all.