Reimplement AsynchronousFileOutputStream to use a writer thread - bazel - a fast, scalable, multi-language and extensible build system

diff options

author	ulfjack <ulfjack@google.com>	2018-06-15 03:15:03 -0700
committer	Copybara-Service <copybara-piper@google.com>	2018-06-15 03:16:15 -0700
commit	816f3ef7c8811cd1abd9ff807dc8ff1d9f6e10ac (patch)
tree	dff3ce3d36308110c2fcad3911f32d6aee07c389 /src/main/java/com/google/devtools/build/lib/runtime/BlazeCommandDispatcher.java
parent	8e09941b41c48b94b398af1700c4f98bb2ea8596 (diff)

Reimplement AsynchronousFileOutputStream to use a writer thread

Background: the original code is implementing the OutputStream interface. Given a sequence of write calls, the code puts each of these writes into a queue. On the other side of the queue is an unbounded thread pool, which takes the writes off the queue one by one, and then does individual blocking writes with a fixed file offset. There are three problems with the original code: 1. Writes are sent to the Kernel one-by-one. Imagine if the incoming writes a single-byte writes, then we do one kernel call for every single byte. This is the worst case. 2. Due to multithreading, the writes can get reordered. In the worst case, the order is reversed, and the kernel flushes each write to disk individually. Since the writes are not aligned to disk block boundaries, each write has to first *read* the block from disk, overwrite a few bytes, and then flush the block back to disk. On a spinning platter, this is the worst possible sequence of operations: write a block, read the same block back, write the same block again, read the same block back, etc., with each operation having to wait for a full disk rotation. Note that this gets worse if there's high thread and / or disk contention, e.g., when running a build system in the background. 3. Due to the unbounded thread pool, it may end up creating a new thread for every single write (possibly as many as one per byte written). This is also the worst case, although it's probably negligible compared to 1+2. Compared to that, this change uses an in-memory buffer before sending writes to the kernel so non-block sized writes are batched, it writes sequentially, and it uses a single thread created at the start. A single thread should be more than able to fully saturate local disk I/O, so multi-threading should ~never be a improvement here. This might look different if we had perfectly aligned writes of an integer multiple of the disk block size to a distributed network file system or a local SSD raid. If you look at the clients of this class, that's definitely not the case: this code is primarily used for local file BEP transports - we wouldn't use local file BEP transports to write to a network file system, we'd use the BES instead. It's also used to write a couple of log files, also all local - otherwise we'd add the data to BEP. These are all unaligned, ~random-sized writes. If we created a lot of these files, then using a thread pool with fewer threads than files and using non-blocking I/O might be an improvement due to the reduction in thread count, but I think it's very unlikely that we'll ever need that complexity. PiperOrigin-RevId: 200694423

Diffstat (limited to 'src/main/java/com/google/devtools/build/lib/runtime/BlazeCommandDispatcher.java')

0 files changed, 0 insertions, 0 deletions


context:
space:
mode: