aboutsummaryrefslogtreecommitdiffhomepage
path: root/src/main/java/com/google/devtools/build/lib/runtime/BlazeCommandDispatcher.java
diff options
context:
space:
mode:
authorGravatar ulfjack <ulfjack@google.com>2018-06-15 03:15:03 -0700
committerGravatar Copybara-Service <copybara-piper@google.com>2018-06-15 03:16:15 -0700
commit816f3ef7c8811cd1abd9ff807dc8ff1d9f6e10ac (patch)
treedff3ce3d36308110c2fcad3911f32d6aee07c389 /src/main/java/com/google/devtools/build/lib/runtime/BlazeCommandDispatcher.java
parent8e09941b41c48b94b398af1700c4f98bb2ea8596 (diff)
Reimplement AsynchronousFileOutputStream to use a writer thread
Background: the original code is implementing the OutputStream interface. Given a sequence of write calls, the code puts each of these writes into a queue. On the other side of the queue is an unbounded thread pool, which takes the writes off the queue one by one, and then does individual blocking writes with a fixed file offset. There are three problems with the original code: 1. Writes are sent to the Kernel one-by-one. Imagine if the incoming writes a single-byte writes, then we do one kernel call for every single byte. This is the worst case. 2. Due to multithreading, the writes can get reordered. In the worst case, the order is reversed, and the kernel flushes each write to disk individually. Since the writes are not aligned to disk block boundaries, each write has to first *read* the block from disk, overwrite a few bytes, and then flush the block back to disk. On a spinning platter, this is the worst possible sequence of operations: write a block, read the same block back, write the same block again, read the same block back, etc., with each operation having to wait for a full disk rotation. Note that this gets worse if there's high thread and / or disk contention, e.g., when running a build system in the background. 3. Due to the unbounded thread pool, it may end up creating a new thread for every single write (possibly as many as one per byte written). This is also the worst case, although it's probably negligible compared to 1+2. Compared to that, this change uses an in-memory buffer before sending writes to the kernel so non-block sized writes are batched, it writes sequentially, and it uses a single thread created at the start. A single thread should be more than able to fully saturate local disk I/O, so multi-threading should ~never be a improvement here. This might look different if we had perfectly aligned writes of an integer multiple of the disk block size to a distributed network file system or a local SSD raid. If you look at the clients of this class, that's definitely not the case: this code is primarily used for local file BEP transports - we wouldn't use local file BEP transports to write to a network file system, we'd use the BES instead. It's also used to write a couple of log files, also all local - otherwise we'd add the data to BEP. These are all unaligned, ~random-sized writes. If we created a lot of these files, then using a thread pool with fewer threads than files and using non-blocking I/O might be an improvement due to the reduction in thread count, but I think it's very unlikely that we'll ever need that complexity. PiperOrigin-RevId: 200694423
Diffstat (limited to 'src/main/java/com/google/devtools/build/lib/runtime/BlazeCommandDispatcher.java')
0 files changed, 0 insertions, 0 deletions