| Commit message | Author | Age |
|
Refocus the synchronization mechanism that copes with fork-induced file
descriptor set races so that it more tightly constrains concurrent fork/exec
pairs. This problem has been observed repeatedly in Bazel proper, manifesting
as the iconic ETXTBSY (Text file busy) error in wide worker pool builds and
tests.
Evidence that this was discovered by @buchgr is in the comment and change
to the embedded ExecutionService implementation; the description of
the race and of the need for synchronization was lifted from that scope
to the JavaSubprocessFactory. This factory is a singleton and represents
the gateway to all worker process execution, so it serves as the correct
lock primitive to ensure that file descriptor sets are not duplicated
across forks, which is what gave rise to this issue.
To test this, I ran the reproducer presented at
https://bugs.java.com/view_bug.do?bug_id=8068370
and saw 2.4% of invocations in that pathological case exhibit the issue.
With a functionally equivalent change (synchronizing around a
processBuilder.start() call) as the only modification to the reproducer,
no further failures of any kind were observed over several hundred runs.
Closes #5556.
PiperOrigin-RevId: 203947224
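The idea described above, serializing every fork/exec pair through one singleton, can be sketched roughly as follows. This is a minimal illustration, not Bazel's actual JavaSubprocessFactory: the class name SerializedProcessStarter and its start method are hypothetical, and only ProcessBuilder is a real JDK API.

```java
import java.io.IOException;
import java.util.List;

/** Hypothetical illustration; not Bazel's actual JavaSubprocessFactory. */
public final class SerializedProcessStarter {
  // A single JVM-wide instance, so every caller contends on the same monitor.
  public static final SerializedProcessStarter INSTANCE = new SerializedProcessStarter();

  private SerializedProcessStarter() {}

  /**
   * Starts a process while holding the instance lock, so that no two threads
   * are inside ProcessBuilder.start() (the fork/exec pair) at the same time.
   * The lock is released as soon as start() returns; waiting for the process
   * happens outside the lock.
   */
  public synchronized Process start(List<String> argv) throws IOException {
    ProcessBuilder builder = new ProcessBuilder(argv);
    builder.redirectErrorStream(true); // merge stderr into stdout for simplicity
    return builder.start();
  }
}
```

Callers would go through SerializedProcessStarter.INSTANCE.start(...) rather than creating their own ProcessBuilder; the lock only helps if every process creation in the JVM takes this path.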
|
https://github.com/bazelbuild/bazel/commit/656a0bab1e025ff3c27d595284a4bf1c5a8d8028 with test (unknown commit) and fix.
Big round of sandbox fixes / performance improvements.
- The number of stat() syscalls in the SymlinkedSandboxedSpawn was way too high. Do less, feel better.
- When using --experimental_sandbox_base, ensure that symlinks in the path are resolved. Before this, you had to check whether on your system /dev/shm is a symlink to /run/shm and then use that instead. Now it no longer matters, as symlinks are resolved.
- Remove an unnecessary directory creation from each sandboxed invocation. Turns out that the "tmpdir" that we created was no longer used after some changes to Bazel's TMPDIR handling.
- Use simpler sandbox paths, by using the unique ID for each Spawn provided by SpawnExecutionPolicy instead of a randomly generated temp folder name. This also saves a round-trip from our VFS to NIO and back. Clean up the sandbox base before each build to ensure that the unique IDs are actually unique. ;)
- Use Java 8's Process#isAlive to check whether a process is alive instead of trying to get the exitcode and catching an exception.
Closes #4913.
PiperOrigin-RevId: 193031017
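The last item in the list above, replacing the exit-code probe with Process#isAlive, could look roughly like this. It is a sketch for comparison only; the class ProcessLiveness and both method names are hypothetical and do not come from the Bazel sources.

```java
/** Hypothetical helper contrasting the two liveness checks. */
public final class ProcessLiveness {
  private ProcessLiveness() {}

  /**
   * Older idiom: exitValue() throws IllegalThreadStateException while the
   * process is still running, so the exception is used as the "alive" signal.
   */
  public static boolean isAliveLegacy(Process process) {
    try {
      process.exitValue();
      return false; // an exit value exists, so the process has terminated
    } catch (IllegalThreadStateException e) {
      return true; // no exit value yet, so the process is still running
    }
  }

  /** Java 8+ idiom: ask the process directly, without exception control flow. */
  public static boolean isAlive(Process process) {
    return process.isAlive();
  }
}
```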
|
PiperOrigin-RevId: 191642942
|
- The number of stat() syscalls in the SymlinkedSandboxedSpawn was way too high. Do less, feel better.
- When using --experimental_sandbox_base, ensure that symlinks in the path are resolved. Before this, you had to check whether on your system /dev/shm is a symlink to /run/shm and then use that instead. Now it no longer matters, as symlinks are resolved.
- Remove an unnecessary directory creation from each sandboxed invocation. Turns out that the "tmpdir" that we created was no longer used after some changes to Bazel's TMPDIR handling.
- Use simpler sandbox paths, by using the unique ID for each Spawn provided by SpawnExecutionPolicy instead of a randomly generated temp folder name. This also saves a round-trip from our VFS to NIO and back. Clean up the sandbox base before each build to ensure that the unique IDs are actually unique. ;)
- Use Java 8's Process#isAlive to check whether a process is alive instead of trying to get the exitcode and catching an exception.
Closes #4913.
PiperOrigin-RevId: 190472170
|
These two close operations were added to work around #1708, but caused #2675.
We found that the root cause of the hanging problem in #1708 is a race
condition when creating Windows processes:
When Bazel tries to create two processes, one for a local command
execution and one for starting the worker process, the worker process
might accidentally inherit handles opened when creating the local
command process, and it holds those handles as long as it lives.
Therefore, the ReadFile function hangs until the handles for the write end
of the stdout/stderr pipes are released by the worker.
The solution is to make Bazel's native createProcess JNI function
explicitly inherit handles as needed, and to use this function to start
the worker process.
Related: http://support.microsoft.com/kb/315939
Fixed https://github.com/bazelbuild/bazel/issues/2675
Change-Id: I1c9b1ac3c9383ed2fd28ea92f528f19649693275
PiperOrigin-RevId: 173244832
|
Also move the implementation of FutureCommandResult to a top-level class.
This is in preparation for significantly simplifying the shell library. The
plan is to remove the Subprocess abstraction, and have lower-level
implementations implement the much simpler FutureCommandResult interface
instead.
PiperOrigin-RevId: 167844736
|
PiperOrigin-RevId: 164827022
|
Important: the simplified API now defaults to forwarding interrupts to
subprocesses. I did audit all the call sites, and I think this is a safe change
to make.
- Properly support timeouts with all implementations
- Simplify the API
  - only provide two flavours of blocking calls, which require no input and
    forward interrupts; this is the most common usage
  - provide a number of async calls, which optionally take input and a flag
    indicating whether to forward interrupts
  - only support input streams, no byte arrays or other 'convenience features'
    that are rarely needed and unnecessarily increase the surface area
  - use java.time.Duration to specify timeout; for consistency, interpret a
    timeout of <= 0 as no timeout (i.e., including rather than excluding 0)
  - KillableObserver and subclasses are no longer part of the public API, but
    are still used to implement timeouts if the Subprocess.Factory does not
    support them
- Update the documentation for Command
- Update all callers; most callers now use the simplified API
PiperOrigin-RevId: 164716782
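The timeout convention described above (a java.time.Duration where any value <= 0 means "no timeout") might be applied along these lines. This is a sketch under assumptions: the class TimeoutWait and its methods are hypothetical and are not part of Bazel's Command API; only Duration and Process#waitFor are real JDK calls.

```java
import java.time.Duration;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper showing the "<= 0 means no timeout" convention. */
public final class TimeoutWait {
  private TimeoutWait() {}

  /** Returns true only for strictly positive durations; zero counts as "no timeout". */
  public static boolean hasTimeout(Duration timeout) {
    return !timeout.isZero() && !timeout.isNegative();
  }

  /**
   * Waits for the process to exit. With no timeout this blocks indefinitely and
   * returns true; otherwise it returns whether the process exited in time.
   * Note: toMillis() throws ArithmeticException for extremely large durations.
   */
  public static boolean waitFor(Process process, Duration timeout)
      throws InterruptedException {
    if (!hasTimeout(timeout)) {
      process.waitFor();
      return true;
    }
    return process.waitFor(timeout.toMillis(), TimeUnit.MILLISECONDS);
  }
}
```

Treating zero the same as a negative value avoids the ambiguity where a timeout of 0 could be read as "expire immediately".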
|
--
PiperOrigin-RevId: 147716435
MOS_MIGRATED_REVID=147716435
|
Makes #1664 much less acute.
--
MOS_MIGRATED_REVID=130750731
|
Subprocesses now get killed if the Bazel server itself is killed and so do their subprocesses.
Also implemented Subprocess#close() so that we get a little more control over when the native structures are cleaned up.
--
MOS_MIGRATED_REVID=126628000
|
to Windows process management.
With this change, Bazel can build itself using native Windows process management and Ctrl-C works in server mode as expected. Yay!
Flipping the flag will come in a separate change that's easy to roll back if need be.
--
MOS_MIGRATED_REVID=126408264
|
implementation can eventually be plugged in.
--
MOS_MIGRATED_REVID=126404913
|