path: root/notmuch-new.c
Commit message (Author, Age)
* new: Unify add_files and add_files_recursive (Austin Clements, 2012-05-24)
  Since starting at the top of a directory tree and recursing within that tree are now identical operations, there's no need for both add_files and add_files_recursive. This eliminates add_files (which did nothing more than call add_files_recursive after the previous patch) and renames add_files_recursive to add_files.
* new: Merge error checks from add_files and add_files_recursive (Austin Clements, 2012-05-24)
  Previously, add_files_recursive could have been called on a symlink to a non-directory. Hence, calling it on a non-directory was not an error, so a separate function, add_files, existed to fail loudly in situations where the path had to be a directory. With the new stat-ing logic, add_files_recursive is always called on directories, so the separation of this logic is no longer necessary. Hence, this patch moves the strict error checking previously done by add_files into add_files_recursive.
* new: Centralize file type stat-ing logic (Austin Clements, 2012-05-24)
  This moves our logic to get a file's type into one function. This has several benefits: we can support OSes and file systems that do not provide dirent.d_type or always return DT_UNKNOWN, complex symlink-handling logic has been replaced by a simple stat fall-through in one place, and the error message for an un-stat-able file is more accurate (previously, the error always mentioned directories, even though a broken symlink is not a directory).
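The stat fall-through described above can be sketched roughly as follows. This is a minimal illustration, not the function from notmuch-new.c: the name `entry_file_type` and its return convention are assumptions, and it presumes a libc that exposes `d_type` when `_DIRENT_HAVE_D_TYPE` is defined.

```c
#define _DEFAULT_SOURCE   /* exposes d_type and the DT_* constants on glibc */
#include <dirent.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical helper: return the S_IF* type of `path`, or -1 if the
 * path cannot be stat-ed (for example, a broken symlink). */
int
entry_file_type (const char *path, const struct dirent *entry)
{
    struct stat st;

#ifdef _DIRENT_HAVE_D_TYPE
    /* Fast path: trust d_type when it names a regular file or directory. */
    switch (entry->d_type) {
    case DT_REG:
        return S_IFREG;
    case DT_DIR:
        return S_IFDIR;
    default:
        break;   /* DT_UNKNOWN, DT_LNK, and others fall through to stat */
    }
#else
    (void) entry;   /* no d_type on this platform: always stat */
#endif

    /* stat() follows symlinks, so a link to a directory reports S_IFDIR. */
    if (stat (path, &st) != 0)
        return -1;
    return (int) (st.st_mode & S_IFMT);
}
```

A caller would compare the result against `S_IFDIR` or `S_IFREG` and report a generic "cannot stat" error on -1, which avoids mentioning directories for paths that are not directories.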
* new: Remove workaround for detecting newly created directory objects (Austin Clements, 2012-05-23)
  Previously, notmuch_database_get_directory did not indicate whether or not the returned directory object was newly created, which required a workaround to distinguish newly created directory objects with no child messages from directory objects that had no mtime set but did have child messages. Now that notmuch_database_get_directory distinguishes whether or not the directory object exists in the database, this workaround is no longer necessary.
* lib/cli: Make notmuch_database_get_directory return a status code (Austin Clements, 2012-05-15)
  Previously, notmuch_database_get_directory had no way to indicate how it had failed. This changes its prototype to return a status code and set an out-argument to the retrieved directory, like similar functions in the library API. This does *not* change its currently broken behavior of creating directory objects when they don't exist, but it does document it and paves the way for fixing this. Also, it can now check for a read-only database and return NOTMUCH_STATUS_READ_ONLY_DATABASE instead of crashing.
  In the interest of atomicity, this also updates calls from the CLI so that notmuch still compiles.
* lib/cli: Make notmuch_database_create return a status code (Austin Clements, 2012-05-05)
  This is the notmuch_database_create equivalent of the previous change. In this case, there were places where errors were not being propagated correctly in notmuch_database_create or in calls to it. These have been fixed, using the new status value.
* lib/cli: Make notmuch_database_open return a status code (Austin Clements, 2012-05-05)
  It has been a long-standing issue that notmuch_database_open doesn't return any indication of why it failed. This patch changes its prototype to return a notmuch_status_t and set an out-argument to the database itself, like other functions that return both a status and an object.
  In the interest of atomicity, this also updates every use in the CLI so that notmuch still compiles. Since this patch does not update the bindings, the Python bindings test fails.
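The "return a status, set an out-argument" convention used by these three changes can be sketched as below. The types and the function `database_open` are hypothetical stand-ins invented for this illustration, not the libnotmuch declarations.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for a status enum and an opaque handle. */
typedef enum {
    STATUS_SUCCESS = 0,
    STATUS_NULL_POINTER,
    STATUS_OUT_OF_MEMORY
} status_t;

typedef struct {
    const char *path;
} database_t;

/* Return why the call failed; hand the object back through an out-argument. */
status_t
database_open (const char *path, database_t **database_out)
{
    database_t *db;

    if (database_out == NULL)
        return STATUS_NULL_POINTER;
    *database_out = NULL;   /* the out-argument is well-defined on failure */

    if (path == NULL)
        return STATUS_NULL_POINTER;

    db = malloc (sizeof (*db));
    if (db == NULL)
        return STATUS_OUT_OF_MEMORY;
    db->path = path;

    *database_out = db;
    return STATUS_SUCCESS;
}
```

Compared with returning the object pointer directly, the caller can distinguish failure causes and report them, while a NULL out-argument still signals that nothing was opened.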
* Use notmuch_database_destroy instead of notmuch_database_close (Justus Winter, 2012-04-28)
  Adapt the notmuch binaries source to the notmuch_database_close split.
  Signed-off-by: Justus Winter <4winter@informatik.uni-hamburg.de>
* new: Fix missing end_atomic in remove_filename on error (Austin Clements, 2012-04-24)
  Previously, if we failed to find the message by filename in remove_filename, we would return immediately from the function without ending its atomic block. Now this code follows the usual goto DONE idiom to perform cleanup.
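The goto DONE idiom mentioned here can be sketched as follows. Everything in this block is hypothetical (a counter stands in for the database's atomic section, and an empty path stands in for a failed lookup); it only shows why a single exit point keeps begin/end calls balanced.

```c
#include <stddef.h>

/* Hypothetical database handle that only tracks open atomic sections. */
typedef struct {
    int open_atomic_sections;
} db_t;

void begin_atomic (db_t *db) { db->open_atomic_sections++; }
void end_atomic (db_t *db)   { db->open_atomic_sections--; }

/* Returns 0 on success, -1 if the (simulated) lookup fails. */
int
remove_filename_sketch (db_t *db, const char *path)
{
    int ret = 0;

    begin_atomic (db);

    if (path == NULL || path[0] == '\0') {
        /* Stand-in for "message not found": do not return here. */
        ret = -1;
        goto DONE;
    }

    /* ... the actual removal work would go here ... */

  DONE:
    /* Every path, success or failure, ends the atomic section. */
    end_atomic (db);
    return ret;
}
```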
* new: Print final fatal error message to stderr (Austin Clements, 2012-04-24)
  This was going to stdout. I removed the newline at the beginning of printing the fatal error message because it wouldn't make sense if you were only looking at the stderr stream (e.g., you had redirected stdout to /dev/null).
* new: Handle fatal errors in remove_filename and _remove_directory (Austin Clements, 2012-04-24)
  Previously such errors were simply ignored. Now they cause an immediate cleanup and abort.
* new: Consistently treat fatal errors as fatal (Austin Clements, 2012-04-24)
  Previously, fatal errors in add_files_recursive were not treated as fatal by its callers (including itself!). This makes add_files_recursive errors consistently fatal and updates all callers to treat them as fatal.
* add support for user-specified files & directories to ignore (Tomi Ollila, 2012-02-17)
  A new configuration key 'new.ignore' is used to determine which files and directories the user does not want scanned as new mail. Mark the corresponding test as no longer broken.
  This work merges my previous attempts and Andreas Amann's work in id:"ylp7hi23mw8.fsf@tyndall.ie"
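As a rough illustration of how an ignore list might be consulted while scanning, the sketch below compares an entry name against a list of configured names. The function name and the exact-match rule are assumptions made for this example; the commit message does not say how matching is performed.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical check: is `name` one of the configured ignore entries?
 * Exact string comparison is assumed here. */
bool
name_is_ignored (const char *name,
                 const char *const *ignore_list, size_t ignore_len)
{
    for (size_t i = 0; i < ignore_len; i++) {
        if (strcmp (name, ignore_list[i]) == 0)
            return true;
    }
    return false;
}
```

A directory walker would call this for each entry and skip both files and subdirectories for which it returns true.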
* Free the results of scandir() (Ethan Glasser-Camp, 2012-02-14)
  scandir() returns "strings allocated via malloc(3)" which are then "collected in array namelist which is allocated via malloc(3)". Currently we just free the array namelist. Instead, free all the entries of namelist, and then free namelist.
  entry only points to elements of namelist, so we don't free it separately.
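The cleanup order described above looks like this in a small standalone function. `count_entries` is a hypothetical helper written for this sketch; only `scandir`, `alphasort`, and `free` are real library calls.

```c
#define _DEFAULT_SOURCE   /* for scandir() and alphasort() on glibc */
#include <dirent.h>
#include <stdlib.h>

/* Return the number of entries in `path` (including "." and ".."),
 * or -1 if the directory cannot be scanned. */
int
count_entries (const char *path)
{
    struct dirent **namelist = NULL;
    int n = scandir (path, &namelist, NULL, alphasort);

    if (n < 0)
        return -1;

    /* Each entry was allocated separately, so free each one ... */
    for (int i = 0; i < n; i++)
        free (namelist[i]);
    /* ... and only then free the array that held them. */
    free (namelist);

    return n;
}
```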
* Silence buildbot warnings about unused results (Austin Clements, 2012-01-21)
  This ignores the results of the two writes in sigint handlers even harder than before.
  While my libc lacks the declarations that trigger these warnings, this can be tested by adding the following to notmuch.h:
    __attribute__((warn_unused_result)) ssize_t write(int fd, const void *buf, size_t count);
* notmuch: Quiet buildbot warnings. (David Edmondson, 2011-12-21)
  Cast away the result of various *write functions. Provide a default value for some variables to avoid "use before set" warnings.
* cli: add support for pre and post notmuch new hooks (Jani Nikula, 2011-12-11)
  Run notmuch new pre and post hooks, named "pre-new" and "post-new", if present in the notmuch hooks directory. The hooks will be run before and after incorporating new messages to the database.
  Typical use cases for pre-new and post-new hooks are fetching or delivering new mail to the maildir, and custom tagging of the mail incorporated to the database.
  Also add command line option --no-hooks to notmuch new to bypass the hooks.
  Signed-off-by: Jani Nikula <jani@nikula.org>
* cli: change argument parsing convention for subcommands (David Bremner, 2011-10-22)
  Previously we deleted the subcommand name from argv before passing to the subcommand. In this version, the deletion is done in the actual subcommands. Although this causes some duplication of code, it allows us to be more flexible about how we parse command line arguments in the subcommand, including possibly using off-the-shelf routines like getopt_long that expect the name of the command in argv[0].
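To illustrate why keeping the subcommand name in argv[0] helps, here is a minimal sketch of a subcommand parsing one long option with getopt_long. It is not the parser notmuch uses; `parse_no_hooks` and the choice of the `--no-hooks` flag as the example are assumptions for this illustration.

```c
#include <getopt.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical subcommand parser. argv[0] is the subcommand name
 * (e.g. "new"), which is what getopt_long expects to skip over. */
bool
parse_no_hooks (int argc, char *argv[])
{
    static const struct option options[] = {
        { "no-hooks", no_argument, NULL, 'n' },
        { NULL, 0, NULL, 0 }
    };
    bool no_hooks = false;
    int opt;

    optind = 1;   /* start scanning just after argv[0] */
    while ((opt = getopt_long (argc, argv, "", options, NULL)) != -1) {
        if (opt == 'n')
            no_hooks = true;
    }
    return no_hooks;
}
```

Had the subcommand name already been removed, the first real argument would sit in argv[0] and getopt_long would skip it.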
* lib: make find_message{,by_filename) report errors (Ali Polatel, 2011-10-04)
  Previously, the functions notmuch_database_find_message() and notmuch_database_find_message_by_filename() did not properly report error conditions to the library user.
  For more information, read the thread on the notmuch mailing list starting with my mail "id:871uv2unfd.fsf@gmail.com"
  Make these functions accept a pointer to 'notmuch_message_t' as argument and return notmuch_status_t which may be used to check for any error condition.
  restore: Modify for the new notmuch_database_find_message()
  new: Modify for the new notmuch_database_find_message_by_filename()
* new: Wrap adding and removing messages in atomic sections. (Austin Clements, 2011-09-24)
  This addresses atomicity of tag synchronization, the last atomicity problem in notmuch new. Each message add or remove is wrapped in its own atomic section, so interrupting notmuch new doesn't lose progress.
* new: Synchronize maildir flags eagerly. (Austin Clements, 2011-09-24)
  Because flag synchronization is stateless, it can be performed at any time as long as it's guaranteed to be performed after any change to a message's filename list. Take advantage of this to synchronize tags immediately after a filename is added or removed.
  This does not yet make adding or removing a message atomic, but it is a big step toward atomicity because it reduces the window where the database tags are inconsistent from nearly the entire notmuch-new to just around when the message is added or removed.
* new: Cleanup. De-duplicate file name removal code. (Austin Clements, 2011-09-24)
  Previously, file name removal was implemented identically in two places. Now it's captured in one function.
  This is important because file name removal is about to get slightly more complicated with eager tag synchronization and correct removal atomicity.
* new: Cleanup. Put removed/renamed message count in add_files_state_t. (Austin Clements, 2011-09-24)
  Previously, pointers to these variables were passed around individually. This was okay when only one function needed them, but we're about to need them in a few more places.
* lib: Add support for nested atomic sections. (Austin Clements, 2011-09-23)
  notmuch_database_t now keeps a nesting count and we only start a transaction or commit for the outermost atomic section. Introduces a new error, NOTMUCH_STATUS_UNBALANCED_ATOMIC.
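The nesting-count scheme can be sketched as below. The struct, the status names, and the counters standing in for real transactions are all hypothetical; the sketch only shows that the outermost begin starts the transaction, the outermost end commits, and an unmatched end is reported as an error.

```c
/* Hypothetical status codes and handle; counters stand in for the
 * underlying transaction calls. */
typedef enum {
    STATUS_SUCCESS = 0,
    STATUS_UNBALANCED_ATOMIC
} status_t;

typedef struct {
    int atomic_nesting;       /* how many atomic sections are open */
    int transactions_begun;   /* times a real transaction was started */
    int commits;              /* times a real transaction was committed */
} db_t;

status_t
begin_atomic (db_t *db)
{
    /* Only the outermost section starts a real transaction. */
    if (db->atomic_nesting == 0)
        db->transactions_begun++;
    db->atomic_nesting++;
    return STATUS_SUCCESS;
}

status_t
end_atomic (db_t *db)
{
    /* Ending a section that was never begun is an error. */
    if (db->atomic_nesting == 0)
        return STATUS_UNBALANCED_ATOMIC;

    db->atomic_nesting--;
    /* Only the outermost section commits. */
    if (db->atomic_nesting == 0)
        db->commits++;
    return STATUS_SUCCESS;
}
```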
* new: Defer updating directory mtimes until the end. (Austin Clements, 2011-09-23)
  Previously, if notmuch new were interrupted between updating the directory mtime and handling removals from that directory, a subsequent notmuch new would not handle those removals until something else changed in that directory. This defers recording the updated mtime until after removals are handled to eliminate this problem.
* new: Don't lose messages on SIGINT. (Austin Clements, 2011-09-13)
  Previously, message removals were always performed, even after a SIGINT. As a result, when a message was moved from one folder to another, a SIGINT between processing the directory the message was removed from and processing the directory it was added to would result in notmuch removing that message from the database.
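One common way to make removals conditional on not having been interrupted is a flag set by the signal handler and checked before the removal phase. The sketch below shows that pattern only; the names are hypothetical and it is not taken from notmuch-new.c.

```c
#include <signal.h>
#include <stdbool.h>

/* Set asynchronously by the handler, read by the main loop. */
static volatile sig_atomic_t interrupted = 0;

/* Hypothetical handler: only records that SIGINT arrived. */
void
handle_sigint (int sig)
{
    (void) sig;
    interrupted = 1;
}

/* Removals are only safe if the whole tree was scanned, i.e. the scan
 * was not cut short by an interrupt. */
bool
removals_allowed (void)
{
    return ! interrupted;
}
```

With this guard, a message whose old location was scanned but whose new location was not reached before the interrupt stays in the database until the next complete run.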
* new: Improved workaround for mistaken new directories (Austin Clements, 2011-06-29)
  Currently, notmuch new assumes any directory with a database mtime of 0 is new, but we don't set the mtime until after processing messages and subdirectories in that directory. Hence, anything that prevents the mtime update (such as an interruption or the wall-clock logic introduced in 8c39e8d6) will cause the next notmuch new to think the directory is still new.
  We work around this by setting the new directory's database mtime to -1 before scanning anything in the new directory. This also obviates the need for the workaround used in 8c39e8d6.
* new: Don't update DB mtime if FS mtime equals wall-clock time. (Austin Clements, 2011-06-29)
  This fixes a race where multiple message deliveries in the same second with an intervening notmuch new could result in messages being ignored by notmuch (at least, until a later delivery forced a rescan).
  Because mtimes only have second granularity, later deliveries in the same second won't change the directory mtime, and hence won't trigger notmuch new to rescan the directory. This situation can only occur when notmuch new is being run at the same second as the directory's modification time, so simply don't update the saved mtime in this case.
  This very race happens all over the test suite, and is currently compensated for with increment_mtime (and, occasionally, luck). With this change, increment_mtime becomes unnecessary.
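The rule in this commit reduces to a single comparison, sketched below under the assumption that both times are whole seconds. `should_save_mtime` is a hypothetical name for this illustration.

```c
#include <stdbool.h>
#include <time.h>

/* Hypothetical decision helper: record the directory's mtime in the
 * database only if no further delivery could still land in the same
 * second without changing that mtime. */
bool
should_save_mtime (time_t fs_mtime, time_t now)
{
    return fs_mtime != now;
}
```

When the function returns false, the stored mtime stays stale, so the next run sees a mismatch and rescans the directory, picking up any delivery that arrived later in that same second.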
* fix sum moar typos [comments in source code] (Pieter Praet, 2011-06-23)
  Various typo fixes in comments within the source code.
  Signed-off-by: Pieter Praet <pieter@praet.org>
  Edited-by: Carl Worth <cworth@cworth.org> Restricted to just source-code comments, (and fixed fix of "descriptios" to "descriptors" rather than "descriptions").
* Remove some variables which were set but not used. (Carl Worth, 2011-05-11)
  gcc (at least as of version 4.6.0) is kind enough to point these out to us, (when given -Wunused-but-set-variable explicitly or implicitly via -Wunused or -Wall).
  One of these cases was a legitimately unused variable. Two were simply variables (named ignored) we were assigning only to squelch a warning about unused function return values. I don't seem to be getting those warnings even without setting the ignored variable. And the gcc docs. say that the correct way to squelch that warning is with a cast to (void) anyway.
* new: Update comments for add_files_recursive (Carl Worth, 2011-03-10)
  The most recent commit optimized the implementation of this function. This commit simply updates the relevant comments to match the new implementation.
* new: read db_files and db_subdirs only if mtime changed (Karel Zak, 2011-03-10)
  The db_files and db_subdirs are unnecessary for unchanged directories.
  maildir with 10000 e-mails:
  old version:
    $ time ./notmuch new
    No new mail.
    real 0m0.053s
    user 0m0.028s
    sys 0m0.026s
  new version:
    $ time ./notmuch new
    No new mail.
    real 0m0.032s
    user 0m0.009s
    sys 0m0.023s
  Signed-off-by: Karel Zak <kzak@redhat.com>
  Reviewed-by: Austin Clements <amdragon@mit.edu>
  Looks good (faster than, but provably equivalent to the original code! notmuch_directory_get_child_* are side-effect free, db_files/db_subdirs aren't used between where they were set in the old code and where they are set in the new code, and db_files/db_subdirs are initialized to NULL when declared).
  Another timing data point:
    Old code: ./notmuch new 0.77s user 0.28s system 99% cpu 1.051 total
    New code: ./notmuch new 0.09s user 0.27s system 98% cpu 0.368 total
* new: Print progress estimates only when we have sufficient information (Michal Sojka, 2011-01-26)
  Without this patch, the remaining time or processing rate could be calculated just after start, when nothing had been processed yet. This resulted in division by a very small number (or zero), and the printed information was of little value.
  Instead of printing nonsense, we print only that the operation is in progress. The estimates will be printed later, after there is enough data.
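A guarded estimate of the kind described here might look like the following. The function name and the one-second threshold are assumptions chosen for this sketch; the commit message does not state what counts as "enough data".

```c
#include <stdbool.h>

/* Hypothetical estimator. Writes the estimated remaining seconds to
 * *remaining_secs and returns true only when there is enough data;
 * otherwise returns false and the caller prints a plain
 * "in progress" message instead. */
bool
estimate_remaining (int processed, int total, double elapsed_secs,
                    double *remaining_secs)
{
    /* Assumed threshold: at least one item and one second of data,
     * so the rate below never divides by zero or a tiny number. */
    if (processed <= 0 || elapsed_secs < 1.0)
        return false;

    double rate = processed / elapsed_secs;   /* items per second */
    *remaining_secs = (total - processed) / rate;
    return true;
}
```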
* new: Enhance progress reporting (Michal Sojka, 2011-01-26)
  notmuch new reports progress only during the "first" phase when the files on disk are traversed and indexed. After this phase, other operations like rename detection and maildir flags synchronization are performed, but the user is not informed about them. Since these operations can take significant time, we want to inform the user about them.
  This patch enhances the progress reporting facility that was already present. The timer that triggers reporting is not stopped after the first phase but continues to run until all operations are finished. The rename detection and maildir flag synchronization are enhanced to report their progress.
* new: Add all initial tags at once (Michal Sojka, 2011-01-26)
  If there are several tags applied to the new messages, it is beneficial to store them to the database at once, because it saves some time, especially when notmuch new is run for the first time.
  This patch decreased the time for initial import from 1h 35m to 1h 14m.
* Do not defer maildir flag synchronization for new messages (Austin Clements, 2011-01-26)
  This is a simplified version of a patch originally by Michal Sojka <sojkam1@fel.cvut.cz> which is designed to have the same performance benefits. Michal said the following:
  When notmuch new is run for the first time, it is not necessary to defer maildir flags synchronization to later because we already know that no files will be removed.
  Performing the maildir flag synchronization immediately after the message is added to the database has the advantage that the message is likely hot in the disk cache so the synchronization is faster. Additionally, we also save one database query for each message, which must be performed when the operation is deferred.
  Without this patch, the first notmuch new of 200k messages (3 GB) took 1h and 46m out of which 20m was maildir flags synchronization. With this patch, the whole operation took only 1h and 36m.
  Unlike Michal's patch, this version does the deferral for any new message, rather than doing it only on the first run of "notmuch new".
* notmuch new: Scan directory whenever fs mtime is not equal to db mtime (Carl Worth, 2010-12-05)
  Previously, we would only scan a directory if the filesystem modification time was strictly newer than the database modification time for the directory. This would cause a problem for systems with an unstable clock, (if a new mail was added to the filesystem, then the system clock rolled backward, "notmuch new" would not find the message until the clock caught up and the directory was modified again).
  Now, we always scan the directory if the modification time of the directory is not exactly the same between the filesystem and the database. This avoids the problem described above even with an unstable system clock.
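The difference between the old and new conditions can be seen by putting both side by side. The two function names are hypothetical and exist only for this comparison.

```c
#include <stdbool.h>
#include <time.h>

/* Old rule: scan only if the filesystem mtime is strictly newer. */
bool
needs_scan_old (time_t fs_mtime, time_t db_mtime)
{
    return fs_mtime > db_mtime;
}

/* New rule: scan whenever the two mtimes differ at all. */
bool
needs_scan_new (time_t fs_mtime, time_t db_mtime)
{
    return fs_mtime != db_mtime;
}
```

If the clock rolls back after a scan and mail is then delivered, the directory's mtime can be older than the stored one: the old rule skips the directory, while the new rule still rescans it.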
* notmuch new: Defer maildir_flags synchronization until after removals (Carl Worth, 2010-11-11)
  When a file in the mailstore is renamed, this appears to "notmuch new" as both an added file and a removed file (for the same message). We want the synchronization of the maildir_flags to reflect the final state, (after the rename is complete). Therefore, it's incorrect to perform the synchronization immediately after adding a new file. Instead we queue up these synchronizations (by message ID[*]) and perform them after the removals are complete.
  With this change, the "dump/restore" case of the maildir-sync tests, as well as the recent "remove 'S'" case both now pass where they were failing before. Interestingly, the "remove info" test was passing before, but now fails. This is actually due to a separate bug, (and the bug just fixed was masking it, by preventing the test from performing as desired).
  [*] It's important to queue by message ID---queueing actual message objects does not work since the message objects will retain stale data such as the old filenames.
* lib: Rework interface for maildir_flags synchronization (Carl Worth, 2010-11-11)
  Instead of having an API for setting a library-wide flag for synchronization (notmuch_database_set_maildir_sync) we instead implement maildir synchronization with two new library functions:
    notmuch_message_maildir_flags_to_tags
    and notmuch_message_tags_to_maildir_flags
  These functions are nicely documented here, (though the implementation does not quite match the documentation yet---as plainly evidenced by the current results of the test suite).
* Avoid abbreviation, preferring notmuch_config_get_maildir_synchronize_flags (Carl Worth, 2010-11-11)
  Since the name of the configuration parameter here is:
    maildir.synchronize_flags
  the convention is that the functions to get and set this parameter should match it in name. Hence:
    notmuch_config_get_maildir_synchronize_flags
  etc. (as opposed to notmuch_config_get_maildir_sync).
* Make maildir synchronization configurable (Michal Sojka, 2010-11-10)
  This adds group [maildir] and key 'synchronize_flags' to the configuration file. Its value enables (true) or disables (false) the synchronization between notmuch tags and maildir flags. By default, the synchronization is disabled.
* Maildir synchronization (Michal Sojka, 2010-11-10)
  This patch allows bi-directional synchronization between maildir flags and certain tags. The flag-to-tag mapping is defined by flag2tag array.
  The synchronization works this way:
  1) Whenever notmuch new is executed, the following happens:
     o New messages are tagged with configured new_tags.
     o For new or renamed messages with maildir info present in the file name, the tags defined in flag2tag are either added or removed depending on the flags from the file name.
  2) Whenever notmuch tag (or notmuch restore) is executed, a new set of flags based on the tags is constructed for every message and a new file name is prepared based on the old file name but with the new flags. If the flags differ and the old message was in 'new' directory then this is replaced with 'cur' in the new file name. If the new and old file names differ, the file is renamed and notmuch database is updated accordingly.
     The rename happens before the database is updated. In case of crash between rename and database update, the next run of notmuch new brings the database in sync with the mail store again.
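The flag-to-tag direction of step 1 can be sketched as a table lookup on the flags that follow ":2," in a maildir file name. The table contents below are an assumption for illustration (including the inverted handling of 'S', where the flag's presence means the tag is absent) and should be checked against the actual flag2tag array; `filename_implies_tag` is a hypothetical helper.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Illustrative mapping; `inverse` means the tag is present when the
 * flag is ABSENT. */
struct flag_tag {
    char flag;
    const char *tag;
    bool inverse;
};

static const struct flag_tag flag2tag[] = {
    { 'D', "draft",   false },
    { 'F', "flagged", false },
    { 'P', "passed",  false },
    { 'R', "replied", false },
    { 'S', "unread",  true  },
};

/* Returns 1 if the file name implies `tag` should be added, 0 if it
 * should be removed, and -1 if the name carries no maildir info or the
 * tag is not governed by the table. */
int
filename_implies_tag (const char *filename, const char *tag)
{
    const char *info = strstr (filename, ":2,");
    if (info == NULL)
        return -1;

    const char *flags = info + 3;   /* characters after ":2," */
    for (size_t i = 0; i < sizeof (flag2tag) / sizeof (flag2tag[0]); i++) {
        if (strcmp (flag2tag[i].tag, tag) != 0)
            continue;
        bool flag_present = strchr (flags, flag2tag[i].flag) != NULL;
        return flag_present != flag2tag[i].inverse;
    }
    return -1;
}
```

For a file named `cur/123.host:2,RS`, this sketch reports that "replied" should be added and "unread" removed, while a tag outside the table is left alone.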
* Sprinkle some const-correctness around new_tags. (Carl Worth, 2010-04-23)
  To eliminate a compiler warning.
* notmuch-config: make new message tags configurable (Ben Gamari, 2010-04-23)
  Add a new_tags option in the [messages] section of the configuration file to allow the user to specify which tags should be added to new messages by notmuch new.
* Prevent data loss caused by SIGINT during notmuch new (Michal Sojka, 2010-04-13)
  When Ctrl-C is pressed at the wrong time during notmuch new, it can lead to removal of messages from the database even if the files were not removed. It happened at least once to me.
  Signed-off-by: Michal Sojka <sojkam1@fel.cvut.cz>
* lib: Rename iterator functions to prepare for reverse iteration. (Carl Worth, 2010-03-09)
  We rename 'has_more' to 'valid' so that it can function whether iterating in a forward or reverse direction. We also rename 'advance' to 'move_to_next' to set up parallel naming with the proposed functions 'move_to_first', 'move_to_last', and 'move_to_previous'.
* Fix misspelling of DT_UNKNOWN. (Carl Worth, 2010-01-23)
  How foolish of me to advertise the fact that I pushed a commit without compiling it first...
* Add some comments to document the recently-fixed handling of d_type. (Carl Worth, 2010-01-23)
  The fix was subtle, (requiring less code than originally expected), so it behooves us to document it well.
* notmuch new: Fix to work on filesystems returning DT_UNKNOWN (Geo Carncross, 2010-01-23)
  Such as reiserfs or xfs. This has been broken since the merge of support for rename and deletion of files from the mail store.
  Here's the original justification for the patch:
  A review of notmuch-new.c shows three uses of ->d_type:
  Near line 153, in _entries_resemble_maildir() we can simply allow for DT_UNKNOWN. This would fail if people have MH-style folders which have three folders called "new" "cur" and "tmp", but that seems unlikely, in which case the "tmp" folder would simply not be scanned.
  Near line 273 in add_files_recursive() we have another check. If DT_UNKNOWN, we fall through, then add_files_recursive() does a stat almost immediately, returning with success if the path isn't a directory. Thus, the fallback is already written.
  Finally, near line 343, in add_files_recursive() (a long function) we have another check. Here we can simply treat DT_UNKNOWN as DT_LNK, since the logic for the stat() results are the same.
* notmuch new: Print upgrade progress report as a percentage. (Carl Worth, 2010-01-09)
  Previously we were printing a number of messages upgraded so far. The original motivation for this was to accurately reflect the fact that there are two passes, (so each message is processed twice and it's not accurate to represent with a single count).
  But as it turns out, the second pass takes zero time (relatively speaking) so we're still not accounting for it. If nothing else, the percentage-based reporting makes for a cleaner API for the progress_notify function.