path: root/notmuch-new.c
Commit message (Author, Age)
...
* notmuch new: Don't prevent database upgrade from being interrupted. (Carl Worth, 2010-01-08)
  Our signal handler is designed to quickly flush out changes and then exit. But if a database upgrade is in progress when the user interrupts, then we just want to immediately abort. We could do something fancy like add a return value to our progress_notify function to allow it to tell the upgrade process to abort. But it's actually much cleaner and more robust to delay the installation of our signal handler so that the default abort happens on SIGINT.
* notmuch new: Automatically upgrade the database if necessary. (Carl Worth, 2010-01-07)
  This takes advantage of the recently added library support to detect if the database needs to be upgraded and then automatically performs that upgrade, (with a nice progress report).
* notmuch new: Fix deletion support to recurse on removed directories. (Carl Worth, 2010-01-07)
  Previously, when notmuch detected that a directory had been deleted it was only removing files immediately in that directory. We now correctly recurse to also remove any directories (and files, etc.) within sub-directories, etc.
* Prefer READ_ONLY consistently over READONLY. (Carl Worth, 2010-01-07)
  Previously we had NOTMUCH_DATABASE_MODE_READ_ONLY but NOTMUCH_STATUS_READONLY_DATABASE which was ugly and confusing. Rename the latter to NOTMUCH_STATUS_READ_ONLY_DATABASE for consistency.
* notmuch new: Never ask the database for any names from a new directory. (Carl Worth, 2010-01-06)
  When we know that we are adding a new directory to the database, (and we therefore are using inode rather than strcmp-based sorting of the filenames), then we *never* want to see any names from the database. If we get any names, that could only make us inadvertently remove files that we just added. It's not obvious from the Xapian documentation whether new terms being added as part of new documents will appear in the in-progress all-terms iteration we are using, (and this might differ based on Xapian backend and also might differ based on how many new directories are added and whether a flush threshold is reached). For all of these reasons, we play it safe and use NULL rather than a real notmuch_filenames_t iterator in this case to avoid any problem.
* notmuch new: Fix bug resulting in file removal on initial build of database. (Carl Worth, 2010-01-06)
  The bug here was that we would see that the database did not know anything about a directory so would get results from the filesystem in inode rather than strcmp order. However, we wouldn't actually ask for the list of files from the database until after recursing into the sub-directories. So by the time we traverse the filenames looking for deletions, the database *does* have entries and we end up detecting erroneous deletions because our filename list from the filesystem isn't in strcmp order. So ask for the list of names from the database before doing any additions to avoid this problem.
* notmuch new: Fix to detect deletions of names at the end of the list. (Carl Worth, 2010-01-06)
  Previously we only scanned the list of filenames in the filesystem and detected a deletion whenever that scan skipped a name that existed in the database. That much was fine, but we *also* need to continue walking the list of names from the database when the filesystem list is exhausted. Without this, removing the last file or directory within any particular directory would go undetected.
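The walk described above can be sketched as a merge over two lists sorted in strcmp order. This is only an illustration under stated assumptions: `count_deleted_names` and its array-based interface are hypothetical and are not taken from notmuch-new.c, which iterates over database filenames rather than plain arrays.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not notmuch's actual code): count the names that
 * appear in db_names but not in fs_names. Both arrays must already be
 * sorted in strcmp order. */
static size_t
count_deleted_names (const char *const *fs_names, size_t num_fs,
                     const char *const *db_names, size_t num_db)
{
    size_t fs = 0, db = 0, deleted = 0;

    assert (fs_names != NULL || num_fs == 0);
    assert (db_names != NULL || num_db == 0);

    while (fs < num_fs) {
        /* Database names that sort before the current filesystem name
         * were skipped by the filesystem scan, so they were deleted. */
        while (db < num_db && strcmp (db_names[db], fs_names[fs]) < 0) {
            deleted++;
            db++;
        }

        /* The same name on both sides is still present. */
        if (db < num_db && strcmp (db_names[db], fs_names[fs]) == 0)
            db++;

        fs++;
    }

    /* The step this fix is about: every database name left over once the
     * filesystem list is exhausted was deleted as well. */
    deleted += num_db - db;

    return deleted;
}
```

Dropping the final `deleted += num_db - db;` line reproduces the behaviour the commit message describes as broken: a name removed from the end of a directory is never counted.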
* notmuch new: Fix regression preventing addition of symlinked mail files. (Carl Worth, 2010-01-06)
  As described in the previous commit message, we introduced multiple symlink-based regressions in commit 3df737bc4addfce71c647792ee668725e5221a98. Here, we fix the case of symlinks to regular files by doing an extra stat of any DT_LNK files to determine if they do, in fact, link to regular files.
* notmuch new: Fix regression preventing recursion through symlinks. (Carl Worth, 2010-01-06)
  In commit 3df737bc4addfce71c647792ee668725e5221a98 we switched from using stat() to using the d_type field in the result of scandir() to determine whether a filename is a regular file or a directory. This change introduced a regression in that the recursion would no longer traverse through a symlink to a directory. (Since stat() would resolve the symlink but with scandir() we see a distinct DT_LNK value in d_type). We fix this for directories by allowing both DT_DIR and DT_LNK values to recurse, and then downgrading the existing not-a-directory check within the recursion to not be an error. We also add a new not-a-directory check outside the recursion that is an error.
* Fix typo in comment. (Carl Worth, 2010-01-06)
  The difference between "now" and "not" ends up being fairly dramatic.
* notmuch new: Print counts of deleted and renamed messages. (Carl Worth, 2010-01-06)
  It's nice to be able to see a report indicating that the recently added support for detecting file rename and deletion is working.
* notmuch new: Proper support for renamed and deleted files. (Carl Worth, 2010-01-06)
  The "notmuch new" command will now efficiently notice if any files or directories have been removed from the mail store and will appropriately update its database.

  Any given mail message (as determined by the message ID) may have multiple corresponding filenames, and notmuch will return one of them. When a file is deleted, the corresponding filename will be removed from the message in the database. When the last filename is removed from a message, that message will be entirely removed from the database.

  All file additions are handled before any file removals so that rename is supported properly.
* notmuch new: Store detected removed filenames for later processing. (Carl Worth, 2010-01-06)
  It is essential to defer the actual removal of any filenames from the database until we are entirely done adding any new files. This is to avoid any information loss from the database in the case of a renamed file or directory.

  Note that we're *still* not actually doing any removal---still just printing messages indicating the filenames that were detected as removed. But we're at least now printing those messages at a time when we actually *can* do the actual removal.
* notmuch new: Detect deleted (renamed) files and directories. (Carl Worth, 2010-01-06)
  This takes advantage of the notmuch_directory_t interfaces added recently (with corresponding storage of directory documents in the database) to detect when files or entire directories are deleted or renamed within the mail store.

  This also fixes the recent regression where *all* files would be processed by every run of "notmuch new", (now only new files are processed once again).

  The deleted files and directories are only detected so far. They aren't properly removed from the database.
* add_files_recursive: Make the maildir detection more efficient. (Carl Worth, 2010-01-06)
  Previously, we were re-scanning the entire list of entries for every directory entry. Instead, we can simply check if the entries look like a maildir once, up-front.
* add_files_recursive: Separate scanning for directories and files for legibility. (Carl Worth, 2010-01-06)
  We now do two scans over the entries returned from scandir. The first scan is looking for directories (and making the recursive call). The second scan is looking for new files to add to the database.

  This is easier to read than the previous code which had a single loop and some if statements with ridiculously long bodies. It also has the advantage that once the directory scan is complete we can do a single comparison of the filesystem and database mtimes and entirely skip the second scan if it's not needed.
* add_files_recursive: Use consistent naming for array and count variables. (Carl Worth, 2010-01-06)
  Previously we had an array named "namelist" and its count named "num_entries". We now use an array name of "fs_entries" and a count named "num_fs_entries" to try to preserve sanity.
* notmuch new: Remove an unnecessary stat of every regular file in the mail store. (Carl Worth, 2010-01-06)
  We were previously using the stat for two reasons. One was to obtain the mtime of the file. This usage was removed in the previous commit, (since the mtime is unreliable in the case of a file being moved into the mail store). The second reason was to identify regular and directory file types. But this information is already available in the result we get from scandir.

  What's left is simply a stat for each directory in the mailstore, (which we are still using to compare filesystem mtime with the mtime stored in the database).
* notmuch new: Eliminate the check on the mtime of regular files before adding. (Carl Worth, 2010-01-06)
  This check was buggy in that moving a pre-existing file into the mail store, (where the file existed before the last run of "notmuch new"), does not update the mtime of the file. So the message would never be added to the database.

  The fix here is not practical in the long run, (since it causes *all* files in the mail store to be processed in every run of "notmuch new" (!)). But this change will let us drop a stat() call that we don't otherwise need and will help move us toward proper database-backed detection of new files, (which will fix the bug without the performance impact of the current fix).
* notmuch new: Fix internal documentation of add_files_recursive. (Carl Worth, 2010-01-06)
  To make it more clear that the mtime of a directory does not affect whether further sub-directories are examined, (they are examined unconditionally).
* notmuch new: Rename the various timestamp variables to be more clear. (Carl Worth, 2010-01-06)
  The previous name of "path_mtime" was very ambiguous. The new names are much more obvious (fs_mtime is the mtime from the filesystem and db_mtime is the mtime from the database).
* notmuch new: Avoid updating directory timestamp if interrupted. (Carl Worth, 2010-01-06)
  This was a very dangerous bug. An interrupted "notmuch new" session would still update the timestamp for the directory in the database. This would result in mail files that were not processed due to the original interruption *never* being picked up by future runs of "notmuch new". Yikes!
* notmuch-new: Remove dead add_files_callback code. (Carl Worth, 2010-01-06)
  Always satisfying to delete code (even if tiny).
* Make the add_files function static within notmuch-new.c. (Carl Worth, 2010-01-06)
  No other files need this function so we don't need it exported in notmuch-client.h.
* lib: Implement new notmuch_directory_t API. (Carl Worth, 2010-01-06)
  This new directory object provides all the infrastructure needed to detect when files or directories are deleted or renamed. There's still code needed on top of this (within "notmuch new") to actually do that detection.
* lib: Rename set/get_timestamp to set/get_directory_mtime. (Carl Worth, 2010-01-06)
  I've been suitably scolded by Keith for doing a premature generalization that ended up just making the documentation more convoluted. Fix that.
* notmuch new: Remove hack to ignore read-only directories in mail store. (Carl Worth, 2010-01-06)
  This was really the last thing keeping the initial run of "notmuch new" different from all other runs. And I'm taking a fresh look at the performance of "notmuch new" anyway, so I think we can safely drop this optimization.
* notmuch new: Restrict the "not much" pun to the first run. (Carl Worth, 2010-01-06)
  Several people complained that the humor wore thin very quickly. The most significant case of "not much mail" is when counting the user's initial mail collection. We've promised on the web page that no matter how much mail the user has, notmuch will consider it to be "not much" so let's say so. (This message was in place very early on, but was inadvertently dropped at some point.)
* Avoid compiler warnings due to ignored write return values (Dirk-Jan C. Binnema, 2009-12-01)
  Glibc (at least) provides the warn_unused_result attribute on write, (if optimizing and _FORTIFY_SOURCE is defined). So we explicitly ignore the return value in our signal handler, where we couldn't do anything anyway.

  Compile with:

      make CFLAGS="-O -D_FORTIFY_SOURCE"

  before this commit to see the warning.
* notmuch-new: Check for non-fatal errors from stat() (Chris Wilson, 2009-11-27)
  Currently we assume that every error from stat() on a dname is fatal (but continue anyway and report the error at the end). However, some errors reported by stat(), such as a missing file or insufficient privilege, can simply be ignored and the file skipped. The others, such as a fault (unlikely!) or out-of-memory, we handle like the other fatal errors by jumping to the end.

  Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
* Fix up whitespace styling from previous commit. (Carl Worth, 2009-11-27)
  Function names in definitions belong left-aligned. Body of if statement cannot be on the same line as the "if".
* notmuch-new: Test if directory looks like Maildir before skipping tmp. (Jan Janak, 2009-11-27)
  'notmuch new' skips directory entries with the name 'tmp'. This is to prevent notmuch from processing possibly incomplete Maildir messages stored in that directory.

  This patch attempts to refine the feature. If "tmp" entry is found, it first checks if the containing directory looks like a Maildir directory. This is done by searching for other common Maildir subdirectories. If they exist and if the entry "tmp" is a directory then it is skipped.

  Files and subdirectories with the name "tmp" that do not look like Maildir will still be processed by 'notmuch new'.

  Signed-off-by: Jan Janak <jan@ryngle.com>
* notmuch-new: Fix notmuch new to look at files within symbolic links (Aneesh Kumar K.V, 2009-11-27)
  We look at the modified time of the database and the directory to decide whether we need to look at only the subdirectories, i.e., if the directory modified time is < the database modified time, then we have already looked at all the files within the directory, so we just need to iterate through the subdirectories.

  But with symlinks we need to make sure we follow them even if the directory modified time is less than the database modified time.

  Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
* Stay out of tmp to respect the Maildir spec. (Jed Brown, 2009-11-23)
* ANSI escapes in "new" only when output is a tty (Adrian Perez, 2009-11-23)
  When running "notmuch new --verbose", ANSI escapes are used. This may not be desirable when the output of the command is *not* being sent to a terminal (e.g. when piping output into another command). In that case each file processed is printed in a new line and ANSI escapes are not used at all.
* Support for printing file paths in new command (Adrian Perez, 2009-11-23)
  For very large mail boxes, it is desirable to know which files are being processed, e.g. when a crash occurs, to know which one was the cause. Also, it may be interesting to have a better idea of how the operation is progressing when processing mailboxes with big messages.

  This patch adds support for printing messages as they are processed by "notmuch new":

  * The "new" command now supports a "--verbose" flag.
  * When running in verbose mode, the file path of the message about to be processed is printed in the following format:

        current/total: /path/to/message/file

    where "current" is the number of messages processed so far and "total" is the total count of files to be processed.

    The status line is erased using an ANSI sequence "\033[K" (erase current line from the cursor to the end of line) each time it is refreshed. This should not pose a problem because nearly every terminal supports it.
  * The signal handler for SIGALRM and the timer are not enabled when running in verbose mode: because we are already printing progress with each file, periodic reports are not necessary.
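The status-line format above can be sketched as a small formatting helper. `format_progress` is a hypothetical name, and the leading "\r" (return to column 0 before erasing) is an assumption; the commit message itself only mentions the "\033[K" erase sequence. The non-tty branch follows the behaviour described in the later "ANSI escapes only when output is a tty" commit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical helper: format one verbose progress line into buf and
 * return the length snprintf reports. On a terminal the line is redrawn
 * in place; otherwise each file gets a plain line of its own. */
static int
format_progress (char *buf, size_t size, bool is_tty,
                 int current, int total, const char *path)
{
    assert (buf != NULL && path != NULL);

    if (is_tty) {
        /* "\r" moves to column 0 (assumed), "\033[K" erases from the
         * cursor to the end of the line. */
        return snprintf (buf, size, "\r\033[K%i/%i: %s",
                         current, total, path);
    }

    return snprintf (buf, size, "%i/%i: %s\n", current, total, path);
}
```

A caller would typically pass `isatty (STDOUT_FILENO)` from `<unistd.h>` as `is_tty` and write the buffer to stdout, flushing after each update in the terminal case.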
* notmuch-new: Only print the regular progress report when on a tty (Chris Wilson, 2009-11-22)
  Check that the stdout is connected to an interactive terminal with isatty() before installing the periodic timer to print progress reports.

  Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
* notmuch-new: Only install SIGALRM if not running under gdb (Chris Wilson, 2009-11-22)
  I felt sorry for Carl trying to step through an exception from xapian and suffering from the SIGALRMs.

  We can detect if the user launched notmuch under a debugger by either checking our cmdline for the presence of the gdb string or querying if valgrind is controlling our process. For the latter we need to add a compile time check for the valgrind development library, and so add the initial support to build Makefile.config from configure.

  Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
  Reviewed-by: Carl Worth <cworth@cworth.org>
  [ickle: And do not install the timer when under the debugger]
* notmuch new: Fix to actually open the database READ_WRITE. (Chris Wilson, 2009-11-22)
  Chris claims he must have been distracted when he wrote this.
* Rename NOTMUCH_DATABASE_MODE_WRITABLE to NOTMUCH_DATABASE_MODE_READ_WRITE (Carl Worth, 2009-11-21)
  And correspondingly, READONLY to READ_ONLY.
* Permit opening the notmuch database in read-only mode. (Chris Wilson, 2009-11-21)
  We only rarely need to actually open the database for writing, but we always create a Xapian::WritableDatabase. This has the effect of preventing searches and the like whilst updating the index.

  Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
  Acked-by: Carl Worth <cworth@cworth.org>
* Revert "notmuch: Add Maildir directory name as tag name for messages" (Carl Worth, 2009-11-21)
  This reverts commit 9794f19017e028b542ed715bef3fd7cf0da5edff.

  The feature makes a lot of sense for the initial import, but it's not as clear whether it makes sense for ongoing "notmuch new" runs. We might need to make this opt-in by configuration.
* notmuch: Add Maildir directory name as tag name for messages (Aneesh Kumar K.V, 2009-11-21)
  This patch adds maildir directory name as the tag name for messages. This helps in adding tags using filtering already provided by procmail.

  Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
* notmuch new: Restore printout of total files counted. (Carl Worth, 2009-11-19)
  This was more fallout from the recent re-shuffling of this code.
* notmuch new: Fix countdown timer on first run. (Carl Worth, 2009-11-19)
  A recent shuffling of this code accidentally disabled the timer, (making the time spent counting the files totally useless).
* count_files: sort directory in inode order before statting (Stewart Smith, 2009-11-18)
  Carl says: This has similar performance benefits as the previous patch, and I fixed similar style issues here as well, (including missing more of a commit message than the one-line summary).
* Minor style fixups for the previous fix. (Carl Worth, 2009-11-18)
  Use consistent whitespace, a slightly less abbreviated identifier, and avoid a C99 declaration after statement.
* Read mail directory in inode number order (Stewart Smith, 2009-11-18)
  This gives a rather decent reduction in number of seeks required when reading a Maildir that isn't in pagecache.

  Most filesystems give some locality on disk based on inode numbers. In ext[234] this is the inode tables, in XFS groups of sequential inode numbers are together on disk and the most significant bits indicate allocation group (i.e., inode 1,000,000 is always after inode 1,000).

  With this patch, we read in the whole directory, sort by inode number before stat()ing the contents.

  Ideally, directory is sequential and then we make one scan through the file system stat()ing.

  Since the universe is not ideal, we'll probably seek during reading the directory and a fair bit while reading the inodes themselves.

  However... with readahead, and stat()ing in inode order, we should be in the best place possible to hit the cache.

  In a (not very good) benchmark of "how long does it take to find the first 15,000 messages in my Maildir after 'echo 3 > /proc/sys/vm/drop_caches'", this patch consistently cut at least 8 seconds off the scan time.

      Without patch: 50 seconds
      With patch:    38-42 seconds

  (I did this in a previous maildir reading project and saw large improvements too)
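Reading a directory in inode order can be sketched with scandir() and a custom comparator, as below. This is an illustration rather than the patch itself: the helper names are hypothetical, and whether the original patch used scandir() or its own readdir()-and-sort loop is not stated in this message.

```c
#define _DEFAULT_SOURCE /* for scandir() under strict -std= modes on glibc */
#include <assert.h>
#include <dirent.h>
#include <stddef.h>
#include <stdlib.h>

/* Comparator for scandir(): order entries by inode number. The values
 * are compared rather than subtracted, so wide ino_t values cannot
 * overflow the int result. */
static int
compare_by_inode (const struct dirent **a, const struct dirent **b)
{
    ino_t ia = (*a)->d_ino;
    ino_t ib = (*b)->d_ino;

    return (ia > ib) - (ia < ib);
}

/* Hypothetical wrapper: read the whole directory and return its entries
 * sorted by inode number. Returns the entry count, or -1 on error. */
static int
scan_in_inode_order (const char *path, struct dirent ***entries)
{
    assert (path != NULL && entries != NULL);

    return scandir (path, entries, NULL, compare_by_inode);
}

/* Release the array that scandir() allocated. */
static void
free_entries (struct dirent **entries, int count)
{
    for (int i = 0; i < count; i++)
        free (entries[i]);
    free (entries);
}
```

The caller would then stat() each entry in the returned order, which is where the seek reduction described above is expected to come from.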
* Typsos (Ingmar Vanhassel, 2009-11-18)
* notmuch new/tag: Flush all changes to database when interrupted. (Keith Packard, 2009-11-13)
  By installing a signal handler for SIGINT we can ensure that no work that is already complete will be lost if the user interrupts a "notmuch new" run with Control-C.
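One common way to get this behaviour is a handler that only records the interrupt in a flag, leaving the main loop to notice it, flush, and exit cleanly. The sketch below assumes that design; the names `interrupted`, `handle_sigint`, and `install_sigint_handler` are hypothetical and the commit's actual handler is not shown in this log.

```c
#define _POSIX_C_SOURCE 200809L /* for sigaction() under strict -std= modes */
#include <assert.h>
#include <signal.h>
#include <string.h>

/* Set from the signal handler, polled by the processing loop. */
static volatile sig_atomic_t interrupted = 0;

/* Do as little as possible in signal context: just record the request. */
static void
handle_sigint (int sig)
{
    (void) sig;
    interrupted = 1;
}

/* Hypothetical installer. Returns 0 on success, -1 on failure, exactly
 * as sigaction() does. */
static int
install_sigint_handler (void)
{
    struct sigaction action;

    memset (&action, 0, sizeof (action));
    action.sa_handler = handle_sigint;
    sigemptyset (&action.sa_mask);
    action.sa_flags = SA_RESTART;

    return sigaction (SIGINT, &action, NULL);
}
```

The processing loop would check `interrupted` between files and, once it is set, stop adding work and let the normal shutdown path flush and close the database.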