notmuch - thread-based email index, search and tagging

	Commit message	Author	Age
*	notmuch show: Add a one-line summary of the message before the header.	Carl Worth	2009-10-29
\| \| \| \| \|	The idea here is that a client could usefully display just this one line while optionally hiding the other header fields.
*	Fix add_message and get_filename to strip/re-add the database path.	Carl Worth	2009-10-28
\| \| \| \| \|	We now store only a relative path inside the database so the database is now nicely relocatable.
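The strip/re-add idea can be sketched as follows. This is only an illustration: `relative_path` and `absolute_path` are hypothetical helper names, not functions confirmed by the commit.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return the part of `path` that lies below
 * `db_path`, or `path` unchanged if it is not inside `db_path`. */
static const char *
relative_path (const char *db_path, const char *path)
{
    size_t len = strlen (db_path);

    /* Ignore any trailing slashes on the database path. */
    while (len > 0 && db_path[len - 1] == '/')
        len--;

    /* Only strip when the prefix ends exactly at a path separator. */
    if (strncmp (path, db_path, len) == 0 && path[len] == '/') {
        path += len;
        while (*path == '/')
            path++;
    }

    return path;
}

/* Hypothetical helper: re-add the database path to a stored relative
 * path. Returns a malloc'ed string (caller frees), or NULL on failure. */
static char *
absolute_path (const char *db_path, const char *relative)
{
    size_t size = strlen (db_path) + 1 + strlen (relative) + 1;
    char *result = malloc (size);

    if (result == NULL)
        return NULL;

    snprintf (result, size, "%s/%s", db_path, relative);
    return result;
}
```

Because only the relative part is stored, moving the mail directory just means passing a different `db_path` when the filename is rebuilt.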
*	notmuch_database_add_message: Sanity check the file as the first thing	Carl Worth	2009-10-28
\| \| \| \| \|	This avoids us wasting a bunch of time doing an expensive SHA-1 over a large file only to discover later that it doesn't even look like an email message.
*	Tweak formatting of internal error messages.	Carl Worth	2009-10-28
\| \| \| \| \| \|	Was neglecting to print the phrase "Internal error: " before, and for the duplicate message-ID error it's nice to actually see the duplicate IDs.
*	index: Store "Full Name <user@example.com>" addresses in the database	Carl Worth	2009-10-28
\| \| \| \| \| \| \|	We put these in as a separate term so that they can be extracted. We don't actually need this for searching, since typing an email address in as a search term will already trigger a phrase search that does exactly what's wanted.
*	Add full-text indexing using the GMime library for parsing.	Carl Worth	2009-10-28
\| \| \| \| \| \| \| \| \| \| \|	This is based on the old notmuch-index-message.cc from early in the history of notmuch, but considerably cleaned up now that we have some experience with Xapian and know just what we want to index, (rather than just blindly trying to index exactly what sup does). This does slow down notmuch_database_add_message a lot, but I've got some ideas for getting some time back.
*	Fix segfault in case of the database lock not being available.	Carl Worth	2009-10-27
\| \| \| \| \| \|	We were nicely reporting the lock-acquisition failure, but then marching along trying to use the database object and just crashing badly. So don't do that.
*	Update prefix so that "thread:" can be used in search strings.	Carl Worth	2009-10-27
\| \| \| \| \| \| \| \| \| \| \|	It's convenient to be able to do things like: notmuch tag -inbox thread:<thread-id> (even though this can run into a race condition as noted in TODO--the fix for the race is simply to not run "notmuch new" between reading a thread with the (not yet existent) "notmuch show" and removing its inbox tag with a command like the above). So we now allow such a thing.
*	notmuch_database_add_message: Do not return a message on failure.	Carl Worth	2009-10-27
\| \| \| \| \| \| \|	The recent, disastrous failure of "notmuch new" would have been avoided with this change. The new_command function was basically assuming that it would only get a message object on success so wasn't destroying the message in the other cases.
*	notmuch_database_close: Explicitly flush the Xapian database.	Carl Worth	2009-10-27
\| \| \| \| \| \| \| \| \| \|	This would have helped with the recent bug causing "notmuch new" to not record any results in the database. I'm not sure why the explicit flush would be required, (shouldn't the destructor always ensure that things flush?), but perhaps some outstanding references from the leak prevented that. In any case, an explicit flush on close() seems to make sense.
*	notmuch restore: Fix to remove all tags before adding tags.	Carl Worth	2009-10-26
\| \| \| \| \| \| \| \| \| \| \| \|	This means that the restore operation will now properly pick up the removal of tags indicated by the tag just not being present in the dump file. We added a few new public functions in order to support this: notmuch_message_freeze notmuch_message_remove_all_tags notmuch_message_thaw
*	add_message: Add an optional parameter for getting the just-added message.	Carl Worth	2009-10-26
\| \| \| \| \|	We use this to implement the addition of "inbox" and "unread" tags for all messages added by "notmuch new".
*	Remove all calls to g_strdup_printf	Carl Worth	2009-10-26
\| \| \| \| \| \|	Replacing them with calls to talloc_asprintf if possible, otherwise to asprintf (with its painful error-handling leaving the pointer undefined).
*	Drop dead function add_term.	Carl Worth	2009-10-25
\| \| \| \| \| \| \|	Even with the recent warnings work, gcc didn't tell me about a static function that I'm not calling? Apparently I get "defined but not used" in C files, but not C++ files. That's bogus, and yet one more reason for me to push the C++ to a minimal lower layer.
*	Add -Wswitch-enum and fix warnings.	Carl Worth	2009-10-25
\| \| \| \| \| \|	Having to enumerate all the enum values at every switch is annoying, but this warning actually found a bug, (missing support for NOTMUCH_STATUS_OUT_OF_MEMORY in notmuch_status_to_string).
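A minimal sketch of why the warning helps, using a hypothetical `status_t` enum rather than the real notmuch status type. With `-Wswitch-enum`, gcc warns whenever a switch over an enum fails to name one of its enumerators, so forgetting a case (as happened with the out-of-memory status) is caught at compile time.

```c
/* Hypothetical status type; the enumerator names are illustrative. */
typedef enum {
    STATUS_SUCCESS,
    STATUS_OUT_OF_MEMORY,
    STATUS_LAST
} status_t;

static const char *
status_to_string (status_t status)
{
    /* Every enumerator is listed explicitly. Deleting any one of these
     * cases makes gcc -Wswitch-enum report the missing value. */
    switch (status) {
    case STATUS_SUCCESS:
        return "No error occurred";
    case STATUS_OUT_OF_MEMORY:
        return "Out of memory";
    case STATUS_LAST:
        break;
    }

    return "Unknown error status value";
}
```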
*	Add -Wmissing-declarations and fix warnings.	Carl Worth	2009-10-25
\| \| \| \|	Wow, lots of missing 'static' on internal functions.
*	_notmuch_database_link_message: Fix error-status propagation.	Carl Worth	2009-10-25
\| \| \| \| \| \|	The _notmuch_database_link_message_to_parents function was void in an earlier draft. Now, ensure that we don't miss any error return value from it.
*	Change database to store only a single thread ID per message.	Carl Worth	2009-10-25
\| \| \| \| \| \| \| \| \| \| \| \|	Instead of supporting multiple thread IDs, we now merge together thread IDs if one message is ever found to belong to more than one thread. This allows for constructing complete threads when, for example, a child message doesn't include a complete list of References headers back to the beginning of the thread. It also simplifies dealing with mapping a message ID to a thread ID which is now a simple get_thread_id just like get_message_id, (and no longer an iterator-based thing like get_tags).
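The merging rule can be illustrated with a toy in-memory store. Everything here is an assumption made for the sketch (`store_t`, integer thread IDs, parents given as array indices); the real code works on database documents and message-ID references.

```c
#include <stddef.h>

#define MAX_MESSAGES 16

/* Toy store: exactly one thread ID per message. */
typedef struct {
    int thread_id[MAX_MESSAGES];
    int count;
    int next_thread;
} store_t;

/* Rewrite every message in thread `loser` to belong to `winner`. */
static void
merge_threads (store_t *store, int winner, int loser)
{
    int i;

    for (i = 0; i < store->count; i++)
        if (store->thread_id[i] == loser)
            store->thread_id[i] = winner;
}

/* Add a message whose parents are indices of existing messages.
 * Returns the new message's index, or -1 if the store is full. */
static int
add_message (store_t *store, const int *parents, int n_parents)
{
    int thread = -1;
    int i;

    if (store->count >= MAX_MESSAGES)
        return -1;

    for (i = 0; i < n_parents; i++) {
        int parent_thread;

        if (parents[i] < 0 || parents[i] >= store->count)
            continue; /* unknown parent: ignored in this sketch */

        parent_thread = store->thread_id[parents[i]];
        if (thread == -1)
            thread = parent_thread;
        else if (parent_thread != thread)
            merge_threads (store, thread, parent_thread);
    }

    /* No known parent: start a new thread. */
    if (thread == -1)
        thread = store->next_thread++;

    store->thread_id[store->count] = thread;
    return store->count++;
}
```

A message that references two previously unrelated threads causes them to collapse into one, so looking up a message's thread is always a single-value read.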
*	link_message: Remove dead code.	Carl Worth	2009-10-25
\| \| \| \| \| \|	We dropped the THREAD_ID value from the database a while back, but here is code that's carefully computing that value and then never doing anything with it. Delete, delete, delete.
*	add_message: Pull the thread-stitching portion out into new _notmuch_database_link_message	Carl Worth	2009-10-25
\| \| \| \| \| \| \| \|	The function was getting too long-winded before. And since I'm about to change how we handle the thread linking, it's convenient to have it in an isolated function.
*	Add an INTERNAL_ERROR macro and use it for all internal errors.	Carl Worth	2009-10-25
\| \| \| \| \| \|	We were previously just doing fprintf;exit at each point, but I wanted to add file and line-number details to all messages, so it makes sense to use a single macro for that.
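One way such a macro could look, as a sketch only: the exact message layout and the `format_internal_error` helper are assumptions, not the definition from this commit. The formatting is split into a function so it can be exercised without exiting.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: build "Internal error: <message> (file:line)\n"
 * in buf. Requires size > 0; output is truncated if buf is too small. */
static void
format_internal_error (char *buf, size_t size, const char *file, int line,
                       const char *format, ...)
{
    va_list args;
    size_t len;

    snprintf (buf, size, "Internal error: ");
    len = strlen (buf);

    va_start (args, format);
    vsnprintf (buf + len, size - len, format, args);
    va_end (args);
    len = strlen (buf);

    snprintf (buf + len, size - len, " (%s:%d)\n", file, line);
}

/* A single macro replaces each hand-written fprintf;exit pair and
 * records the call site automatically. */
#define INTERNAL_ERROR(...)                                     \
    do {                                                        \
        char internal_error_buf[512];                           \
        format_internal_error (internal_error_buf,              \
                               sizeof (internal_error_buf),     \
                               __FILE__, __LINE__, __VA_ARGS__);\
        fputs (internal_error_buf, stderr);                     \
        exit (1);                                               \
    } while (0)
```

A call such as `INTERNAL_ERROR ("unexpected status %d", status);` then prints the message with its file and line before exiting.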
*	add_message: Propagate error status from notmuch_message_create_for_message_id	Carl Worth	2009-10-25
\| \| \| \|	What a great feeling to remove an XXX comment.
*	Add comment documenting our current database schema.	Carl Worth	2009-10-25
\| \| \| \| \|	I've got schemes to change this schema somewhat dramatically, so I want a place to be able to record and review those changes.
*	Drop the storage of thread ID(s) in a value.	Carl Worth	2009-10-25
\| \| \| \| \| \|	Now that we are iterating over the thread terms instead, we can drop this redundant storage (which should shrink our database a tiny bit).
*	Shuffle the value numbers around in the database.	Carl Worth	2009-10-24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	First, it's nice that for now we don't have any users yet, so we can make incompatible changes to the database layout like this without causing trouble. ;-) There are a few reasons for this change. First, we now use value 0 uniformly as a timestamp for both mail and timestamp documents, (which lets us clean up an ugly and fragile bare 0 in the add_value and get_value calls in the timestamp code). Second, I want to drop the thread value entirely, so putting it at the end of the list means we can drop it as a compatible change in the future. (I almost want to drop the message-ID value too, but it's nice to be able to sort on it to get diff-able output from "notmuch dump".) But the thread value we never use as a value, (we would never sort on it, for example). And it's totally redundant with the thread terms we store already. So expect it to disappear soon.
*	Invent our own prefix values.	Carl Worth	2009-10-24
\| \| \| \| \| \| \| \| \| \| \|	We're now dropping all pretense of keeping the database directly compatible with sup's current xapian backend. (But perhaps someone might write a new notmuch backend for sup in the future.) In coming up with the prefix values here, I tried to follow the conventions of http://xapian.org/docs/omega/termprefixes.html as closely as makes sense, (with some domain translation from "web" to "email archive").
*	Split BOOLEAN_PREFIX into INTERNAL and EXTERNAL subsets.	Carl Worth	2009-10-24
\| \| \| \| \| \| \| \| \| \| \| \| \|	The idea here is that only some of the prefix names (such as "id" and "tag") actually make sense in external user-supplied query strings. Other things like "type" are internal implementation details of how we store things in the database. So internal machinery will add those terms to the database and we don't need to support them in the string itself. With this, we can now simply loop over the external prefix values to let the query parser know about them. So as we add prefixes in the future, we'll only need to add them to this list.
*	Change all occurrences of "msgid" to "id".	Carl Worth	2009-10-24
\| \| \| \|	What's good for the user is good for the internals.
*	Add the magic to allow searches such as "tag:inbox".	Carl Worth	2009-10-24
\| \| \| \| \| \| \| \| \| \| \|	The key for this is to call add_boolean_prefix on the QueryParser object. That tells the query parser to take something like "tag:inbox" and transform it into the "Linbox" term and do what it needs to do to make this term a requirement of the search. We're starting to have a real system here. Also, I didn't want to expose the ugly name of "msgid" to the user, so we add a prefix name of simply "id" instead.
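The name-to-term translation described above can be mimicked in a few lines of plain C. This is not how Xapian's QueryParser is implemented; it only illustrates the mapping. The "tag" to "L" pair comes from the commit message, while the "Q" value for "id" and the names `prefix_t` and `translate_boolean_term` are assumptions made for the example.

```c
#include <stdio.h>
#include <string.h>

typedef struct {
    const char *name;   /* what the user types, e.g. "tag" */
    const char *prefix; /* what is stored in the term, e.g. "L" */
} prefix_t;

/* User-visible prefixes. "id" -> "Q" is an assumed value. */
static const prefix_t EXTERNAL_PREFIXES[] = {
    { "tag", "L" },
    { "id", "Q" },
};

/* Translate "name:value" into "<prefix>value". Returns 1 and fills
 * `term` when the name is a known prefix, otherwise returns 0. */
static int
translate_boolean_term (const char *word, char *term, size_t size)
{
    const char *colon = strchr (word, ':');
    size_t n = sizeof (EXTERNAL_PREFIXES) / sizeof (EXTERNAL_PREFIXES[0]);
    size_t name_len, i;

    if (colon == NULL)
        return 0;

    name_len = (size_t) (colon - word);

    for (i = 0; i < n; i++) {
        const prefix_t *p = &EXTERNAL_PREFIXES[i];

        if (strlen (p->name) == name_len &&
            strncmp (word, p->name, name_len) == 0) {
            snprintf (term, size, "%s%s", p->prefix, colon + 1);
            return 1;
        }
    }

    return 0;
}
```

Keeping the prefixes in one table means a new user-visible prefix only needs one new row.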
*	Fix timestamp generation to avoid overflowing the term limit	Carl Worth	2009-10-24
\| \| \| \| \| \|	The previous code was only correct as long as the timestamp prefix was only a single character. But with the recent change to a multi-character prefix, this broke. So fix it now.
*	Trim down prefix list to things we are actually using.	Carl Worth	2009-10-24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	I've decided not to try for sup compatibility at the level of the xapian database. There's just too much about sup's usage of the database that I don't like, (beyond the embedded ruby data structures there is redundant storage of message IDs, thread IDs, and dates (in both terms and values)). I'm going to fix that up in the database of notmuch, with some other changes as well. (I plan to drop "reference" terms once linkage to a thread ID through the reference is established. I also plan to add actual documents to represent threads.) So with all that incompatibility, I might as well make my own prefix values. And while doing that, I should try to be as compatible as possible with the conventions described here: http://xapian.org/docs/omega/termprefixes.html
*	Move the prefix-string arrays back into database.cc from message.cc	Carl Worth	2009-10-24
\| \| \| \| \|	Yes, I'm being wishy-washy here, moving code back and forth. But this is where these really do belong.
*	Add NOTMUCH_STATUS_DUPLICATE_MESSAGE_ID	Carl Worth	2009-10-23
\| \| \| \| \| \| \| \| \|	And document that notmuch_database_add_message can return this value. This pushes the hard decision of what to do with duplicate messages out to the user, but that's OK. (We weren't really doing anything with these ourselves, and this way the user is at least informed of the issue, rather than it just getting papered over internally.)
*	Clarify documentation and error string for NOTMUCH_STATUS_TAG_TOO_LONG	Carl Worth	2009-10-23
\| \| \| \|	It's helpful to point out NOTMUCH_STATUS_TAG_MAX for users.
*	Add notmuch_database_set_timestamp and notmuch_database_get_timestamp	Carl Worth	2009-10-23
\| \| \| \| \|	These will be very helpful to implement an efficient "notmuch new" command which imports new mail messages that have appeared.
*	database: Add private find_unique_doc_id and find_unique_document functions	Carl Worth	2009-10-23
\| \| \| \| \| \|	These are a generalization of the uniqueness testing of notmuch_database_find_message. More preparation for directory timestamps.
*	database: Similarly rename find_message_by_docid to find_document_for_doc_id	Carl Worth	2009-10-23
\| \| \| \| \| \| \| \|	Again preferring notmuch_database_t* over Xapian::Database*. Also, we're standardizing on "doc_id" rather than "docid" locally, (as an analogue to "message_id"), in spite of the "Xapian::docid" name, (which, fortunately, we can ignore and just use "unsigned int" instead).
*	database: Rename internal find_messages_by_term to find_doc_ids	Carl Worth	2009-10-23
\| \| \| \| \| \| \| \| \| \|	This name is a more accurate description of what it does, and the more general naming will make sense as we start storing non-message documents in the database (such as directory timestamps). Also, don't pass around a Xapian::Database where it's more our style to pass a notmuch_database_t*.
*	add_message: Fix to not add multiple documents with the same message ID	Carl Worth	2009-10-23
\| \| \| \| \| \| \| \|	Here's the second big fix to message-ID handling, (the first was to generate message IDs when an email contained none). Now, with no document missing a message ID, and no two documents having the same message ID, we have a nice consistent database where the message ID can be used as a unique key.
*	add_message: Re-order the code a bit (find message-id first).	Carl Worth	2009-10-23
\| \| \| \| \| \| \| \|	We're preparing for being able to deal with files with duplicate message IDs here. The plan is to create a notmuch_message_t object in add_message that may or may not reference a document that exists in the database. So to do this, we have to find the message ID before we do any manipulation of the doc.
*	Move thread_id generation code from database.cc to message.cc	Carl Worth	2009-10-23
\| \| \| \|	It's really up to the message to decide how to generate these.
*	add_message: Rename message to message_file	Carl Worth	2009-10-23
\| \| \| \| \| \| \| \| \|	I still don't like the name message_file at all, but we're about to start using a notmuch_message_t in this function so we need to do something to keep the identifiers separate for now. Eventually, it probably makes sense to push the message-parsing code from database.cc to message.cc.
*	Don't forget the "to" header when restricting parsing to certain headers	Carl Worth	2009-10-22
\| \| \| \| \| \| \| \| \| \|	We recently started discarding files as "not email" if they have none of Subject, From, nor To. Apparently, my mail collection contains a number of messages that I sent, that are saved without Subject and From, (perhaps these were drafts?). Anyway, it's fortunate I had those since they alerted me to this bug, where we were not parsing the "To" header in some cases.
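The "not email" rule can be sketched on an in-memory buffer. `looks_like_email` is a hypothetical name, and the real code parses the file with its own header parser rather than scanning a string like this. It relies on POSIX `strncasecmp`.

```c
#include <string.h>
#include <strings.h> /* strncasecmp (POSIX) */

/* Return 1 if the header block of `text` contains at least one of
 * Subject, From, or To; return 0 otherwise. */
static int
looks_like_email (const char *text)
{
    static const char *const headers[] = { "Subject:", "From:", "To:" };
    const char *line = text;

    /* A blank line (or end of input) ends the header block. */
    while (*line != '\0' && *line != '\n' && *line != '\r') {
        size_t i;

        for (i = 0; i < sizeof (headers) / sizeof (headers[0]); i++)
            if (strncasecmp (line, headers[i], strlen (headers[i])) == 0)
                return 1;

        line = strchr (line, '\n');
        if (line == NULL)
            break;
        line++;
    }

    return 0;
}
```

Leaving "To:" out of the `headers` list reproduces the bug described above: a draft saved with only a To header would be rejected.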
*	Fix missing error check.	Carl Worth	2009-10-22
\| \| \| \| \|	The notmuch_message_file_open function is perfectly capable of returning NULL. So check for it.
*	Generate message ID (using SHA1) when a mail message contains none.	Carl Worth	2009-10-22
\| \| \| \| \| \|	This is important as we're using the message ID as the unique key in our database. So previously, all messages with no message ID would be treated as the same message---not good at all.
*-.	Merge branch from fixing up bugs after bisecting.	Carl Worth	2009-10-21
\|\ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	I'm glad that when I implemented "notmuch restore" I went through the extra effort to take the code I had written in one sitting into over a dozen commits. Sure enough, I hadn't tested well enough and had totally broken "notmuch setup", (segfaults and bogus thread_id values). With the little commits I had made, git bisect saved the day, and I went back to make the fixes right on top of the commits that introduced the bugs. So now we octopus merge those in.
\| \| *	Bring back the insert_thread_id function.	Carl Worth	2009-10-21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We deleted this in favor of our fancy new thread_ids iterator from the message object. But one of the previous callers of insert_thread_id isn't using notmuch_message_t yet. I made the mistake of thinking I could just call g_hash_table_insert directly, but the problem was that nobody was splitting up the thread_id string at its commas. So with this, we were inserting bogus comma-separated IDs into the hash table, so thread_id values were ballooning out of control. Should be much better now.
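The missing step was splitting the stored string at its commas before inserting. A minimal sketch, with a fixed-size array standing in for the hash table and with `split_thread_ids` and `THREAD_ID_MAX` as hypothetical names:

```c
#include <string.h>

#define THREAD_ID_MAX 32

/* Split a comma-separated thread_id string into individual IDs.
 * Empty fields and IDs longer than THREAD_ID_MAX are skipped.
 * Returns the number of IDs written to `out` (at most max_out). */
static size_t
split_thread_ids (const char *thread_ids,
                  char out[][THREAD_ID_MAX + 1], size_t max_out)
{
    const char *start = thread_ids;
    size_t count = 0;

    while (*start != '\0' && count < max_out) {
        /* Length of the field up to the next comma or end of string. */
        size_t len = strcspn (start, ",");

        if (len > 0 && len <= THREAD_ID_MAX) {
            memcpy (out[count], start, len);
            out[count][len] = '\0';
            count++;
        }

        start += len;
        if (*start == ',')
            start++;
    }

    return count;
}
```

Inserting the unsplit string "a1,b2" as a single key would create a bogus third ID instead of matching the two existing ones, which is the ballooning described above.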
* \| \|	Add notmuch_status_to_string function.	Carl Worth	2009-10-21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Be kind and let the user print error messages, not just error codes.
* \| \|	Add notmuch_message_add_tag and notmuch_message_remove_tag	Carl Worth	2009-10-21
\| \|/ \|/\| \| \| \| \| \| \|	With these two added, we now have enough functionality in the library to implement "notmuch restore".
* \|	database: Add new notmuch_database_find_message	Carl Worth	2009-10-21
\|/ \| \| \| \| \| \|	With this function, and the recently added support for notmuch_message_get_thread_ids, we now recode the find_thread_ids function to work just the way we expect a user of the public notmuch API to work. Not too bad really.