path: root/src/main/java/com/google/devtools/build/lib/remote/RemoteSpawnRunner.java
Commit message (author, date)
* Allow banning symlink action outputs from being uploaded to a remote cache. (Benjamin Peterson, 2018-05-03)
  This is mostly a roll-forward of 4465dae23de989f1452e93d0a88ac2a289103dd9, which was reverted by fa36d2f48965b127e8fd397348d16e991135bfb6. The main difference is that the new behavior is now gated behind the --noremote_allow_symlink_upload flag.
  https://docs.google.com/document/d/1gnOYszitgrLVet3sQk-TKGqIcpkkDsc6aw-izoo-d64 is a design proposal to support symlinks in the remote cache, which would render this change moot. I'd like to be able to prevent incorrect cache behavior until that change is implemented, though.
  This fixes https://github.com/bazelbuild/bazel/issues/4840 (again).
  Closes #5122.
  Change-Id: I2136cfe82c2e1a8a9f5856e12a37d42cabd0e299
  PiperOrigin-RevId: 195261827
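The gating described in this commit can be pictured with a minimal sketch. It assumes a simplified model in which each output path is already classified by kind; `OutputKind`, `SymlinkUploadGate`, and `checkOutputs` are hypothetical names for illustration, not Bazel's actual classes.

```java
import java.util.Map;

/** Kinds of spawn outputs. Illustrative only; not Bazel's types. */
enum OutputKind { REGULAR_FILE, DIRECTORY, SYMLINK }

/** Hypothetical pre-upload check; not the actual RemoteSpawnRunner code. */
final class SymlinkUploadGate {
  private SymlinkUploadGate() {}

  /**
   * Rejects symlink outputs when symlink upload is disallowed, i.e. when the
   * build was run with --noremote_allow_symlink_upload.
   */
  static void checkOutputs(Map<String, OutputKind> outputs, boolean allowSymlinkUpload) {
    if (allowSymlinkUpload) {
      return; // Symlinks are allowed through to the upload step.
    }
    for (Map.Entry<String, OutputKind> entry : outputs.entrySet()) {
      if (entry.getValue() == OutputKind.SYMLINK) {
        throw new IllegalStateException(
            "output '" + entry.getKey() + "' is a symbolic link and may not be uploaded");
      }
    }
  }
}
```

With the flag left at its permissive setting the check is a no-op, which matches the commit's intent of keeping the old behavior available.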
* Friendlier error messages on remote failures. Moving the error to the top. Removing stack trace unless verbose failures is on. (olaola, 2018-04-24)
  TESTED=unit test
  PiperOrigin-RevId: 194060440
* Rename SpawnExecutionPolicy -> SpawnExecutionContext. (tomlu, 2018-04-19)
  This class will be used to tie a Spawn to a SpawnRunner, and isn't really a policy object. It will carry state such as the expanded inputs and expanded command line.
  Currently a context can be passed between different SpawnRunners. This will be addressed independently, so a context is tied to a particular spawn runner.
  PiperOrigin-RevId: 193501918
* Automated rollback of commit 4465dae23de989f1452e93d0a88ac2a289103dd9. (buchgr, 2018-04-18)
  *** Reason for rollback ***
  The no-cache tag is not respected (see b/77857812) and thus this breaks remote caching for all projects with symlink outputs.
  *** Original change description ***
  Only allow regular files and directories spawn outputs to be uploaded to a remote cache.
  The remote cache protocol only knows about regular files and directories. Currently, during action output upload, symlinks are resolved into regular files. This means cached "executions" of an action may have different output file types than the original execution, which can be a footgun. This CL bans symlinks from cacheable spawn outputs and fixes http... ***
  PiperOrigin-RevId: 193338629
* Remove unused execRoot parameter from buildAction (George Gensure, 2018-03-27)
  Closes #4916.
  PiperOrigin-RevId: 190662077
* Ensure Runner name is always set. (Googler, 2018-03-27)
  RELNOTES: None.
  PiperOrigin-RevId: 190617155
* Only allow regular files and directories spawn outputs to be uploaded to a remote cache. (Benjamin Peterson, 2018-03-27)
  The remote cache protocol only knows about regular files and directories. Currently, during action output upload, symlinks are resolved into regular files. This means cached "executions" of an action may have different output file types than the original execution, which can be a footgun. This CL bans symlinks from cacheable spawn outputs and fixes https://github.com/bazelbuild/bazel/issues/4840.
  The interface of SpawnCache.CacheHandle.store is refactored:
  1. The outputs parameter is removed, since that can be retrieved from the underlying Spawn.
  2. It can now throw ExecException in order to fail actions.
  Closes #4902.
  Change-Id: I0d1d94d48779b970bb5d0840c66a14c189ab0091
  PiperOrigin-RevId: 190608852
* Propagating and printing server logs on demand. (olaola, 2018-03-21)
  WANT_LGTM=all
  TESTED=RBE, unit tests
  RELNOTES: None
  PiperOrigin-RevId: 189938345
* Propagating remote results, including stdout/err, to Bazel on execution timeouts. (olaola, 2018-03-16)
  The refactoring to have an Exception that contains partial results will also be used in the next CL, in order to propagate and save remote server logs.
  RELNOTES: None
  PiperOrigin-RevId: 189344465
* Remove unnecessary I/O operations from handling remotely executed actions (fixes performance regression #4749). (olaola, 2018-03-08)
  Also adding Skylark tests for input/output directories.
  TESTED=locally
  RELNOTES: Fix performance regression
  PiperOrigin-RevId: 188346410
* Don't display remote exec failed message on interrupt. Fixes #4783 (olaola, 2018-03-06)
  TESTED=manually
  RELNOTES: None
  PiperOrigin-RevId: 188079436
* remote: Add interceptor for logging gRPC calls during remote execution/caching (Googler, 2018-03-05)
  This provides an io.grpc.ClientInterceptor implementation that can be used to log gRPC call information. The interceptor can select a logging handler to use based on the gRPC method being called (Watch, Execute, Write, etc.) to build a LogEntry, which can then be logged after the call has finished. Unit tests for the interceptor are included.
  In this change, the interceptor is never invoked, nor are there any handlers implemented for any gRPC methods. The interceptor also never tries to log any entries.
  To avoid circular dependency issues (Remote library will depend on logger which depends on remote library for utils), I've factored out the utility classes from the remote library into their own directory/package as part of this change.
  PiperOrigin-RevId: 187926516
* Adding a property name to the SpawnRunner. Most runners already had it, I just add it to the interface, and include it in the SpawnResult. (olaola, 2018-02-22)
  This will be used to categorize/aggregate spawns executed by various runners. Also, minor refinement to the cacheHit property of the SpawnResult with remote execution.
  RELNOTES: None
  TESTED=presubmit, next cl
  PiperOrigin-RevId: 186637978
* Bugfix: spawn output directories are not passed to remote execution. (olaola, 2018-02-21)
  This bug was invisible to our tests, because our RemoteWorker will work regardless of whether the output was listed as a file or as a directory.
  TESTED=same tests
  RELNOTES: None
  PiperOrigin-RevId: 186518904
* Propagating whether there was a cache hit in the SpawnResult. (olaola, 2018-02-20)
  So far, nobody uses it, but I want to start using this field soon.
  TESTED=unit test
  RELNOTES: None
  PiperOrigin-RevId: 186290375
* Fixing #4585: broken re-execution of orphaned actions. (olaola, 2018-02-08)
  This is an important regression; we will want to patch the fix into 0.10.
  TESTED=fixed unit test, with A/B testing
  RELNOTES: Resolved an issue where a failure in the remote cache would not trigger local re-execution of an action.
  PiperOrigin-RevId: 184991670
* Have the RemoteSpawnRunner use the execution platform present in the Spawn to get the remote execution properties. (John Cater, 2018-01-10)
  Fixes #4128.
  This reverts commit 3ce42ef3074ee6d3ac7d9968381c8c0a51d9d38d.
  Change-Id: I8b9ad5099f6334c2488a22baf05d0b273e10f776
  PiperOrigin-RevId: 181550828
* Use EmptyActionInput instead of null in SpawnInputExpander (ulfjack, 2018-01-08)
  This simplifies some spawn runners, which no longer have to specially handle null; unfortunately, the sandbox runners do not support VirtualActionInput, so they still have to special-case it.
  PiperOrigin-RevId: 181175408
* Automated rollback of commit 43f45b58acf10beadbb1221b71dfa06fa1341510. (jcater, 2017-12-21)
  *** Reason for rollback ***
  Breaks some remote execution clients.
  *** Original change description ***
  Have the RemoteSpawnRunner use the execution platform present in the Spawn to get the remote execution properties.
  Fixes #4128.
  Change-Id: I7e71caef2465204d2dd8225448d54e52366807e6
  PiperOrigin-RevId: 179847553
* remote: add directory support for remote caching and execution (Hadrien Chauvin, 2017-12-20)
  Add support for directory trees as artifacts.
  Closes #4011.
  PiperOrigin-RevId: 179691001
* Have the RemoteSpawnRunner use the execution platform present in the Spawn to get the remote execution properties. (John Cater, 2017-12-19)
  Fixes #4128.
  Change-Id: I7e71caef2465204d2dd8225448d54e52366807e6
  PiperOrigin-RevId: 179595126
* Fix: uploading artifacts of failed actions to remote cache stopped working. (olaola, 2017-12-11)
  To reproduce: run a failing test with --experimental_remote_spawn_cache or with --spawn_strategy=remote and no executor. Expected: test log is uploaded.
  Desired behavior:
  - regardless of whether a spawn is cacheable or not, its artifacts should be uploaded to the remote cache.
  - the spawn result should only be set if the spawn is cacheable *and* the action succeeded.
  - when executing remotely, the do_not_cache field should be set for non-cacheable spawns, and the remote execution engine should respect it.
  This CL contains multiple fixes to ensure the above behaviors, and adds a few tests, both end to end and unit tests.
  Important behavior change: it is no longer assumed that non-cacheable spawns should use a NO_CACHE SpawnCache! The appropriate test case was removed. Instead, an assumption was added that all implementations of SpawnCache should respect the Spawns.mayBeCached(spawn) property. Currently, only NO_CACHE and RemoteSpawnCache exist, and they (now) support it.
  TESTED=remote build execution backend.
  WANT_LGTM: philwo,buchgr
  RELNOTES: None
  PiperOrigin-RevId: 178617937
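The three desired behaviors listed in this commit can be summarized as a small decision function. This is a sketch under stated assumptions: `UploadDecision` and its fields are hypothetical names, and "succeeded" is modeled here simply as a zero exit code.

```java
/** Hypothetical decision record; names are illustrative, not Bazel's API. */
final class UploadDecision {
  final boolean uploadArtifacts;   // Upload output files to the remote cache.
  final boolean storeSpawnResult;  // Record the result so later builds can hit the cache.
  final boolean doNotCache;        // Value sent to the remote execution engine.

  private UploadDecision(boolean uploadArtifacts, boolean storeSpawnResult, boolean doNotCache) {
    this.uploadArtifacts = uploadArtifacts;
    this.storeSpawnResult = storeSpawnResult;
    this.doNotCache = doNotCache;
  }

  /** Applies the three rules from the commit message. */
  static UploadDecision decide(boolean mayBeCached, int exitCode) {
    boolean uploadArtifacts = true;                      // Always, cacheable or not.
    boolean storeSpawnResult = mayBeCached && exitCode == 0;
    boolean doNotCache = !mayBeCached;
    return new UploadDecision(uploadArtifacts, storeSpawnResult, doNotCache);
  }
}
```

For a failing but cacheable test, `decide(true, 1)` uploads the artifacts (so the test log is available) while leaving the spawn result unset.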
* remote: Replace Retrier with Retrier2. (buchgr, 2017-12-04)
  - Replace the existing Retrier with Retrier2.
  - Rename Retrier2 to Retrier and remove the old Retrier + RetryException class.
  RELNOTES: None.
  PiperOrigin-RevId: 177835070
* Refactor the FileSystem API to allow for different hash functions. (buchgr, 2017-11-30)
  Refactor the FileSystem class to include the hash function as an instance field. This allows us to have a different hash function per FileSystem and removes technical debt, as currently that's somewhat accomplished by a horrible hack that has a static method to set the hash function for all FileSystem instances.
  The FileSystem's default hash function remains MD5.
  RELNOTES: None
  PiperOrigin-RevId: 177479772
* remote: don't hide non-test failures behind test failures. Fixes #4082 (buchgr, 2017-11-30)
  Bazel should display the root cause of a test failure to the user. For example, if a test could not be executed on a remote executor due to there being no network connection, then it shouldn't display the test as failed but tell the user about the network error.
  RELNOTES:
  PiperOrigin-RevId: 177439578
* Clean up ExecutionRequirements (ulfjack, 2017-11-29)
  - remove BaseSpawn.Local; instead, all callers pass in the full set of execution requirements they want to set
  - disable caching and sandboxing for the symlink tree action; it does not declare outputs, so it can't be cached or sandboxed (fixes #4041)
  - centralize the existing execution requirements in the ExecutionRequirements class
  - centralize checking for execution requirements in the Spawn class (it's possible that we may need a more decentralized, extensible design in the future, but for now having them in a single place is simple and effective)
  - update the documentation
  - forward the relevant tags to execution requirements in TargetUtils (progress on #3960)
  - this also contributes to #4153
  PiperOrigin-RevId: 177288598
* Simplify SpawnRunner interface (ulfjack, 2017-11-28)
  It turns out that the SUCCESS status is often misunderstood to mean "zero exit", even though this is clearly documented. I've decided to add another status for non-zero exit, and use success only for zero exit to avoid this pitfall.
  Also, many of the status codes are set, but never used. I decided to reduce the number of status codes to only those that are actually relevant, which simplifies further processing. Instead, we should add a string message for the error case when we need one - we're not using it right now, so I decided not to add that yet.
  PiperOrigin-RevId: 177129441
* Fixing #3834, making sure test.log always exists. (olaola, 2017-10-18)
  Even if the test action produced no output, which it really shouldn't, Bazel should create an empty test.log file.
  TESTED=unit tests
  RELNOTES: Fixes #3834
  PiperOrigin-RevId: 172412615
* Removing local fallback on failed action commands. (olaola, 2017-10-18)
  We should only fall back if a remote execution error occurred, not if the command itself failed.
  TESTED=better unit tests
  RELNOTES: None
  PiperOrigin-RevId: 172406687
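The fallback rule stated in this commit reduces to a single predicate. The sketch below assumes a hypothetical three-way classification of remote outcomes and a hypothetical `localFallbackEnabled` switch; neither name is taken from the Bazel source.

```java
/** Hypothetical fallback policy; not the actual RemoteSpawnRunner logic. */
final class FallbackPolicy {
  private FallbackPolicy() {}

  /** Illustrative outcome categories for a remotely executed spawn. */
  enum RemoteOutcome {
    SUCCESS,         // Command ran remotely and exited with zero.
    COMMAND_FAILED,  // Command ran remotely and exited non-zero.
    REMOTE_ERROR     // The remote execution layer itself failed.
  }

  /** Falls back only for errors of the remote layer, never for a failed command. */
  static boolean shouldFallBackToLocal(RemoteOutcome outcome, boolean localFallbackEnabled) {
    return localFallbackEnabled && outcome == RemoteOutcome.REMOTE_ERROR;
  }
}
```

A command that fails remotely would most likely fail locally too, so re-running it only delays the error report.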
* Simplify the SpawnExecException constructor (ulfjack, 2017-10-16)
  Whether or not there was a catastrophic error is stored in the SpawnResult, so we can just use that instead of passing in an additional boolean.
  PiperOrigin-RevId: 172083752
* Move SpawnResult from build.lib.exec into build.lib.actions so that e.g. build.lib.actions.SpawnActionContext can import SpawnResult without creating a cyclic dependency. (ruperts, 2017-09-22)
  RELNOTES: None.
  PiperOrigin-RevId: 169642267
* Removing deprecated field total_input_file_count. (olaola, 2017-09-21)
  See API Revisions design doc: https://docs.google.com/document/d/12c3oAPgedckLpue2yj0xGgJTEOyUm4mXWWBJ157J-8I/edit#heading=h.llz6ymkp07b1
  The BuildInfo is now sent via RequestMetadata.
  TESTED=we never used this field
  PiperOrigin-RevId: 169432884
* Passing Bazel metadata in gRPC headers. (olaola, 2017-09-21)
  TESTED=unit tests
  RELNOTES: none
  PiperOrigin-RevId: 169395919
* Uploading failed action outputs to the remote cache, because even if the test fails, we still want to be able to download the logs and other outputs from CAS. (olaola, 2017-09-19)
  This fixes a bug introduced by https://github.com/bazelbuild/bazel/commit/562fcf9f5dfd14daea718f77da95b43b1400689b.
  To reproduce: run a failing test vs a BES service; the test log would not be uploaded.
  TESTED=unit tests
  PiperOrigin-RevId: 169143428
* Automatic code cleanup. (cushon, 2017-09-11)
  PiperOrigin-RevId: 168141877
* remote: Return exit code 34 for remote caching/execution errors. (buchgr, 2017-09-01)
  For any errors that are due to failures in the remote caching / execution layers, Bazel now returns exit code 34 (ExitCode.REMOTE_ERROR). This includes errors where the remote cache / executor is unreachable or crashes. It does not include test or build failures caused by user errors, i.e. compilation or test failures.
  PiperOrigin-RevId: 167259236
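A minimal sketch of the mapping this commit describes. Only the value 34 comes from the commit message; the `Cause` categories and the `USER_FAILURE` placeholder value are assumptions made for illustration, and `ExitCodeSketch` is not Bazel's `ExitCode` class.

```java
/** Hypothetical exit-code mapping; only the value 34 is taken from the commit message. */
final class ExitCodeSketch {
  private ExitCodeSketch() {}

  static final int REMOTE_ERROR = 34; // Stated in the commit message.
  static final int USER_FAILURE = 1;  // Placeholder for user-caused failures (assumption).

  /** Illustrative failure causes. */
  enum Cause {
    REMOTE_CACHE_UNREACHABLE,
    REMOTE_EXECUTOR_CRASHED,
    COMPILATION_FAILED,
    TEST_FAILED
  }

  /** Remote-layer failures map to REMOTE_ERROR; user errors do not. */
  static int exitCodeFor(Cause cause) {
    switch (cause) {
      case REMOTE_CACHE_UNREACHABLE:
      case REMOTE_EXECUTOR_CRASHED:
        return REMOTE_ERROR;
      default:
        return USER_FAILURE;
    }
  }
}
```

Keeping a distinct code for infrastructure failures lets scripts that wrap the build tell an outage apart from a broken build.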
* remote: support timeouts (buchgr, 2017-08-30)
  PiperOrigin-RevId: 166981977
* remote: don't fail build if upload fails (Benjamin Peterson, 2017-08-22)
  If the upload of local build artifacts fails, the build no longer fails but instead a warning is printed once. If --verbose_failures is specified, a detailed warning is printed for every failure.
  This helps fix #2964; however, it doesn't fully fix it due to timeouts and retries slowing the build significantly.
  Also, add some other tests related to fallback behavior.
  Change-Id: Ief49941f9bc7e0123b5d93456d77428686dd5268
  PiperOrigin-RevId: 165938874
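The warn-once behavior described in this commit could look roughly like the following. `UploadFailureReporter` is a hypothetical class, the warning texts are invented for the example, and warnings are collected in a list rather than printed so the behavior is easy to inspect.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical reporter; message wording and names are assumptions. */
final class UploadFailureReporter {
  private final boolean verboseFailures;
  private final List<String> warnings = new ArrayList<>();
  private boolean warnedOnce = false;

  UploadFailureReporter(boolean verboseFailures) {
    this.verboseFailures = verboseFailures;
  }

  /** Records a warning instead of failing the build. */
  synchronized void onUploadFailure(String output, Exception cause) {
    if (verboseFailures) {
      // Detailed warning for every failure, as with --verbose_failures.
      warnings.add("WARNING: uploading '" + output + "' failed: " + cause.getMessage());
    } else if (!warnedOnce) {
      // Otherwise a single generic warning for the whole build.
      warnedOnce = true;
      warnings.add("WARNING: some artifacts could not be uploaded to the remote cache");
    }
  }

  /** Returns a copy of the warnings recorded so far. */
  synchronized List<String> warnings() {
    return new ArrayList<>(warnings);
  }
}
```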
* Fixing #3552: re-execute cached orphaned Actions. (olaola, 2017-08-17)
  A bug was introduced in patch 9626bb4923c74c6d3c09b7438eb24b32191053df, where a cache miss would not result in action re-execution, making the cache miss non-recoverable.
  RELNOTES: fixes #3552
  PiperOrigin-RevId: 165434579
* Introduce a new SpawnCache API; add a RemoteSpawnCache implementation (ulfjack, 2017-08-14)
  AbstractSpawnRunner now uses a SpawnCache if one is registered; this allows adding caching to any spawn runner without having to be aware of the implementations.
  I will delete the old CachedLocalSpawnRunner in a follow-up CL.
  PiperOrigin-RevId: 165024382
* remote: Provide a clear error message if the remote cache is in an invalid state. (buchgr, 2017-08-14)
  A remote cache must never serve a failed action. However, if it did, Bazel would not detect this and would simply fail and display an error message that's hard to distinguish from a local execution failure. Bazel now displays a clear error message stating what went wrong.
  RELNOTES: None.
  PiperOrigin-RevId: 164975631
* remote: Refactor RemoteSpawnRunner to distinguish between remote executor, remote cache and local executor errors. (buchgr, 2017-08-11)
  This is a no-op refactoring and clears the way to integrate a change that no longer uploads to the remote cache if a previous remote cache interaction failed.
  RELNOTES: None.
  PiperOrigin-RevId: 164966432
* CachedLocalSpawnRunner: reuse existing code from RemoteSpawnRunner (ulfjack, 2017-08-11)
  PiperOrigin-RevId: 164961564
* Use java.time.Duration for timeouts (ulfjack, 2017-08-09)
  PiperOrigin-RevId: 164577062
* remote: Don't upload failed action to cache. Fixes #3452 (Jakob Buchgraber, 2017-07-27)
  Also, restructure the code for better readability and testability.
  Change-Id: Ibdd0413f89e4687b836b768a9e7d6315234cb825
  PiperOrigin-RevId: 163322658
* Fixing #3380: output stack traces and informative messages. (olaola, 2017-07-27)
  Also added tests specifically for the output, to ensure we don't break it again.
  TESTED=remote worker, unit tests
  RELNOTES: fixes #3380
  PiperOrigin-RevId: 163283558
* Extend the SpawnRunner API (ulfjack, 2017-07-17)
  - add an id for logging; this allows us to correlate log entries for the same spawn from multiple spawn runner implementations in the future
  - add a prefetch method to the SpawnExecutionPolicy; better than relying on the ActionInputPrefetcher being injected in the constructor
  - add a name parameter to the report method; this is in preparation for a single unified SpawnStrategy implementation. It's basically the last bit of difference between SandboxStrategy and RemoteSpawnStrategy; they're otherwise equivalent (if not identical)
  PiperOrigin-RevId: 162194684
* remote: Lower the chances of a race condition in the remote upload. (Ola Rozenfeld, 2017-07-17)
  Lower the chances of triggering #3360. Immediately before uploading output artifacts, check if any of the inputs' ctime changed, and if so don't upload the outputs to the remote cache.
  I profiled the runs on //src:bazel before vs. after the change; as expected, the number of VFS_STAT increases by a factor of 180% (!), but, according to the profiler, it adds less than 1ms to the overall build time of 130s.
  PiperOrigin-RevId: 162179308
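The ctime check described in this commit amounts to comparing two snapshots of the inputs. The sketch below assumes ctimes have already been read into maps keyed by input path; `InputChangeCheck` and its method are hypothetical names, and the real change stats files through Bazel's own file system layer.

```java
import java.util.Map;

/** Hypothetical helper comparing input ctimes before execution and before upload. */
final class InputChangeCheck {
  private InputChangeCheck() {}

  /**
   * Returns true only if every input recorded before execution still has the
   * same ctime now. A missing input counts as changed.
   */
  static boolean inputsUnchanged(Map<String, Long> ctimeBefore, Map<String, Long> ctimeNow) {
    for (Map.Entry<String, Long> entry : ctimeBefore.entrySet()) {
      Long now = ctimeNow.get(entry.getKey());
      if (now == null || !now.equals(entry.getValue())) {
        return false; // Input was modified or removed while the action ran.
      }
    }
    return true;
  }
}
```

The upload would then be skipped whenever `inputsUnchanged` returns false. As the commit title says, this only lowers the chance of the race: an input can still change between the check and the upload.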
* remote: Improve error handling for --remote_* cmd line flags. Fixes #3361, #3358 (buchgr, 2017-07-14)
  - Move flag handling into RemoteModule to fail as early as possible.
  - Make error messages from flag handling human readable.
  - Fix a bug where remote execution would only support TLS with a root certificate being specified.
  - If a remote executor without a remote cache is specified, assume the remote cache to be the same as the executor.
  PiperOrigin-RevId: 161946029
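The last rule in the list above (defaulting the cache to the executor) can be sketched as a small resolution function. `RemoteEndpoints`, `effectiveCache`, and the parameter names are hypothetical; the sketch treats a null or empty string as "flag not set", which is an assumption about how the options are represented.

```java
/** Hypothetical endpoint resolution; not the actual RemoteModule code. */
final class RemoteEndpoints {
  private RemoteEndpoints() {}

  /**
   * Returns the cache endpoint to use: the explicit cache if one is given,
   * otherwise the executor endpoint (which may itself be null if unset).
   */
  static String effectiveCache(String remoteExecutor, String remoteCache) {
    boolean hasCache = remoteCache != null && !remoteCache.isEmpty();
    return hasCache ? remoteCache : remoteExecutor;
  }
}
```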
* remote: Don't cache test if marked "external". Fixes #3362 (buchgr, 2017-07-14)
  RELNOTES: None.
  PiperOrigin-RevId: 161937673