aboutsummaryrefslogtreecommitdiffhomepage
path: root/src/main/java/com/google/devtools/build/lib/remote
Commit message (Collapse)AuthorAge
* Automatic code cleanup.Gravatar cushon2017-09-11
| | | | PiperOrigin-RevId: 168141877
* ActionInputFileCache: move getMetadata to a new super-interfaceGravatar ulfjack2017-09-11
| | | | | | Update the callers that only need getMetadata to use the new interface. PiperOrigin-RevId: 167992239
* More BUILD file refactorings.Gravatar philwo2017-09-06
| | | | | | | | | Split collect, concurrent, vfs, windows into package-level BUILD files. Move clock classes out of "util", into their own Java package. Move CompactHashSet into its own Java package to break a dependency cycle. Give nestedset and inmemoryfs their own package-level BUILD files. PiperOrigin-RevId: 167702127
* Extract authandtls, buildeventservice, buildeventstream into package-level ↵Gravatar philwo2017-09-04
| | | | | | | | BUILD files. Replace all ":relative" labels with "//absolute:path" labels for easier search & replace. PiperOrigin-RevId: 167500985
* remote: Return exit code 34 for remote caching/execution errors.Gravatar buchgr2017-09-01
| | | | | | | | | | For any errors that are due to failures in the remote caching / execution layers Bazel now returns exit code 34 (ExitCode.REMOTE_ERROR). This includes errors where the remote cache / executor is unreachable or crashes. It does not include errors if the test / build failure is due to user errors i.e. compilation or test failures. PiperOrigin-RevId: 167259236
* remote: support timeoutsGravatar buchgr2017-08-30
| | | | PiperOrigin-RevId: 166981977
* remote: simplify upload logicGravatar buchgr2017-08-29
| | | | | | Use ByteStream for uploads. PiperOrigin-RevId: 166844266
* remote: close file when uploadingGravatar buchgr2017-08-29
| | | | | RELNOTES: None PiperOrigin-RevId: 166841380
* remote/http: distinquish between action cache and casGravatar buchgr2017-08-29
| | | | | | | | | | | | | | | | | | | | | | | | The remote http caching client now prefixes the URL of an action cache request with 'ac' and cas request with 'cas'. This is useful as servers might want to distinquish between the action cache and the cas. Before this change: (GET|PUT) /f141ae2d23d0f976599e678da1c51d82fedaf8b1 After this change: (GET|PUT) /ac/f141ae2d23d0f976599e678da1c51d82fedaf8b1 (GET|PUT) /cas/f141ae2d23d0f976599e678da1c51d82fedaf8b1 This change will likely break any HTTP caching service that assumed the old URL scheme. TESTED: Manually using https://github.com/buchgr/bazel-remote RELNOTES: The remote HTTP/1.1 caching client (--remote_rest_cache) now distinquishes between action cache and CAS. The request URL for the action cache is prefixed with 'ac' and the URL for the CAS is prefixed with 'cas'. PiperOrigin-RevId: 166831997
* remote: don't fail build if upload failsGravatar Benjamin Peterson2017-08-22
| | | | | | | | | | | | | | If the upload of local build artifacts fails, the build no longer fails but instead a warning is printed once. If --verbose_failures is specified, a detailed warning is printed for every failure. This helps fixing #2964, however it doesn't fully fix it due to timeouts and retries slowing the build significantly. Also, add some other tests related to fallback behavior. Change-Id: Ief49941f9bc7e0123b5d93456d77428686dd5268 PiperOrigin-RevId: 165938874
* Remove ALREADY_EXISTS special treatment from the CAS uploader. This error ↵Gravatar olaola2017-08-22
| | | | | | | | should no longer be thrown by any CAS implementations. TESTED=unit tests RELNOTES: none PiperOrigin-RevId: 165937395
* Fixing #3552: re-execute cached orphaned Actions.Gravatar olaola2017-08-17
| | | | | | | A bug was introduced in patch 9626bb4923c74c6d3c09b7438eb24b32191053df, where a cache miss would not result in action re-execution, making the cache miss non-recoverable. RELNOTES: fixes #3552 PiperOrigin-RevId: 165434579
* Introduce a new SpawnCache API; add a RemoteSpawnCache implementationGravatar ulfjack2017-08-14
| | | | | | | | | | AbstractSpawnRunner now uses a SpawnCache if one is registered, this allows adding caching to any spawn runner without having to be aware of the implementations. I will delete the old CachedLocalSpawnRunner in a follow-up CL. PiperOrigin-RevId: 165024382
* remote: Provide a clear error message if the remote cache is in an invalid ↵Gravatar buchgr2017-08-14
| | | | | | | | | | | | | state. A remote cache must never serve a failed action. However, if it did Bazel would not detect this and simply fail and display an error message that's hard to distinquish from a local execution failure. Bazel now displays a clear error message stating what went wrong. RELNOTES: None. PiperOrigin-RevId: 164975631
* remote: Refactor RemoteSpawnRunner to distinquish between remoteGravatar buchgr2017-08-11
| | | | | | | | | | | executor, remote cache and local executor errors. This is a no-op refactoring and clears the way to integrate a change that no longer uploads to the remote cache if a previous remote cache interaction failed. RELNOTES: None. PiperOrigin-RevId: 164966432
* CachedLocalSpawnRunner: reuse existing code from RemoteSpawnRunnerGravatar ulfjack2017-08-11
| | | | PiperOrigin-RevId: 164961564
* AbstractSpawnStrategy: use ActionExecutionContext.getVerboseFailuresGravatar ulfjack2017-08-11
| | | | | | | ...instead of injecting it through the constructor. Simplify all the callers accordingly. PiperOrigin-RevId: 164955391
* Use java.time.Duration for timeoutsGravatar ulfjack2017-08-09
| | | | PiperOrigin-RevId: 164577062
* grpc: Consolidate gRPC code from BES and Remote Execution. Fixes #3460, #3486Gravatar buchgr2017-08-04
| | | | | | | | | | | BES and Remote Execution have separate implementations of gRPC channel creation, authentication and TLS. We should merge them, to avoid duplication and bugs. One such bug is #3640, where the BES code had a different implementation for Google Application Default Credentials. RELNOTES: The Build Event Service (BES) client now properly supports Google Applicaton Default Credentials. PiperOrigin-RevId: 164253879
* remote: Delete output files created by failed download.Gravatar Jakob Buchgraber2017-07-27
| | | | | | | | | | If a remote download fails, delete any output files that might have already been created. Else, this might intefere with a subsequent locally executed actions that expects none of its output files to exist. See #3452. Change-Id: I467a97d05606c586aa257326213940a37dad9dd5 PiperOrigin-RevId: 163336093
* remote: Don't upload failed action to cache. Fixes #3452Gravatar Jakob Buchgraber2017-07-27
| | | | | | | Also, restructure the code for better read- and testability. Change-Id: Ibdd0413f89e4687b836b768a9e7d6315234cb825 PiperOrigin-RevId: 163322658
* Fixing #3380: output stack traces and informative messages.Gravatar olaola2017-07-27
| | | | | | | | Also added tests specifically for the output, to ensure we don't break it again. TESTED=remote worker, unit tests RELNOTES: fixes #3380 PiperOrigin-RevId: 163283558
* Simplify RemoteActionContextProviderGravatar ulfjack2017-07-24
| | | | PiperOrigin-RevId: 162915070
* Tiny refactoring, functional no-op.Gravatar olaola2017-07-24
| | | | | | | | After the ByteUploader changes, the Retrier no longer needs the onFailure(s) functions. Removing them will simplify both the code and the stack traces used for debugging problems. TESTED=unit tests RELNOTES: none PiperOrigin-RevId: 162913762
* Extract a common AbstractSpawnStrategy parent classGravatar ulfjack2017-07-24
| | | | | | This removes a bunch of code duplication that I previously introduced. PiperOrigin-RevId: 162909430
* Add ActionInputPrefetcher to ActionExecutionContextGravatar ulfjack2017-07-24
| | | | | | | This is more consistent with other values, and removes the need to inject it into the constructor of the various strategy implementations. PiperOrigin-RevId: 162729187
* Move ActionInputPrefetcher to the actions packageGravatar ulfjack2017-07-21
| | | | | | The plan is to add it to ActionExecutionContext, which is also there. PiperOrigin-RevId: 162656835
* Fix #3416: catch the ALREADY_EXISTS status code on upload, and treat it as ↵Gravatar olaola2017-07-21
| | | | | | | | | success. This can happen per spec, if multiple builds try to upload the same blob concurrently. Also, added this to the RemoteWorker, per spec. PiperOrigin-RevId: 162647548
* Make the @Option annotation depend on the java version of the tagging enums.Gravatar ccalvarin2017-07-18
| | | | | | | The option filters proto dependency can be removed from the OptionsParser. This is in response to option parser users that want to avoid the bazel-internal proto file in their dependencies. RELNOTES: None. PiperOrigin-RevId: 162249778
* Extend the SpawnRunner APIGravatar ulfjack2017-07-17
| | | | | | | | | | | | | - add an id for logging; this allows us to correlate log entries for the same spawn from multiple spawn runner implementations in the future - add a prefetch method to the SpawnExecutionPolicy; better than relying on the ActionInputPrefetcher being injected in the constructor - add a name parameter to the report method; this is in preparation for a single unified SpawnStrategy implementation - it's basically the last bit of difference between SandboxStrategy and RemoteSpawnStrategy; they're otherwise equivalent (if not identical) PiperOrigin-RevId: 162194684
* remote: Lower the chances of a race condition in the remote upload.Gravatar Ola Rozenfeld2017-07-17
| | | | | | | | | | | | Lower the chances of triggering #3360. Immediately before uploading output artifacts, check if any of the inputs ctime changed, and if so don't upload the outputs to the remote cache. I profiled the runs on //src:bazel before vs. after the change; as expected, the number of VFS_STAT increases by a factor of 180% (!), but, according to the profiler, it adds less than 1ms to the overall build time of 130s. PiperOrigin-RevId: 162179308
* remote: Failed blob upload should close file handle.Gravatar buchgr2017-07-17
| | | | | | | | | | | Two cases: - The upload succeeds and the Chunker consumes all its data and closes the underlying data source. - The upload fails and the Chunker doesn't close the data source, and so we call reset() manually. RELNOTES: None. PiperOrigin-RevId: 162178032
* Merge handleOptions into beforeCommandGravatar ulfjack2017-07-17
| | | | | | | Now that we have the options before calling beforeCommand, there's no need for a separate handleOptions method in the BlazeModule API. Remove it. PiperOrigin-RevId: 162002300
* remote: Chunker.reset() should release resources.Gravatar buchgr2017-07-17
| | | | | RELNOTES: None. PiperOrigin-RevId: 161970540
* Simplify exception handling in spawn strategiesGravatar ulfjack2017-07-17
| | | | | | | | | | | | | | | | | | | | | | | | | The main change here is to only catch SpawnExecException in StandaloneTestStrategy, so all other exceptions simplify propagate up. As a result, Bazel no longer retries tests that fail with an exception, we only retry tests that actually ran, had a spawn result, and resulted in a UserExecException. That is probably what we want. Also do some cleanup: - Remove ExecException.timedOut; nobody was calling it (but there's still SpawnExecException.timedOut) - Remove SpawnActionContext.shouldPropagateExecException; all exceptions (except SpawnExecException) are now propagated by default - Remote the SandboxOptions from the SandboxStrategies; all sandboxing options are now handled by the underlying SpawnRunner implementations I'll send a followup CL to remove the UserExecException and EnvironmentalExecException types; the types don't do anything special, and there are no catch blocks in production code that catch one of these more specific types. This should fix #3322 by removing a bunch of special handling. PiperOrigin-RevId: 161960919
* Update the gRPC channels for remote caching and execution to use a ↵Gravatar olaola2017-07-17
| | | | | | | | | RoundRobin client side load balancer instead of the default PickFirst. A RoundRobin rotates between all addresses returned by the resolver, while PickFirst sends all traffic to the first address. This might help resolve some of the load problems we encountered (see https://github.com/grpc/grpc/issues/11704 for more details). TESTED=remote worker PiperOrigin-RevId: 161960008
* remote: Fix a bug where local executed results would not be uploaded. Fixes ↵Gravatar Ola Rozenfeld2017-07-14
| | | | | | | | | | | | | #3379 Commit dc24004873c335 broke the upload of locally executed action results. Also, update our unit tests who did not catch this error. P.S.: olaola@ is the author of this change, but due to time constraints we had to merge it while she was asleep. Change-Id: Ib150152c0bddc8311908c105aef208506d3b6a8d PiperOrigin-RevId: 161954553
* remote: Improve error handling for --remote_* cmd line flags. Fixes #3361, #3358Gravatar buchgr2017-07-14
| | | | | | | | | | | - Move flag handling into RemoteModule to fail as early as possible. - Make error messages from flag handling human readable. - Fix a bug where remote execution would only support TLS with a root certificate being specified. - If a remote executor without a remote cache is specified, assume the remote cache to be the same as the executor. PiperOrigin-RevId: 161946029
* remote: Don't cache test if marked "external". Fixes #3362Gravatar buchgr2017-07-14
| | | | | RELNOTES: None. PiperOrigin-RevId: 161937673
* Sorting the Action output files.Gravatar olaola2017-07-14
| | | | PiperOrigin-RevId: 161925075
* remote: Chunker should open files lazily.Gravatar buchgr2017-07-14
| | | | | | | | | | The Chunker should only open a file on the first call to next(). We noticed that when remote executing with hundreds of actions in parallel bazel would sometimes run out of file descriptors. That's because on creating a new Chunker object it would already open a file, eventhough the first call to next would happen at a much later stage. PiperOrigin-RevId: 161923568
* Bring back the very useful stacktrace printouts on --verbose_failures (#3380).Gravatar olaola2017-07-14
| | | | | | TESTED=remote worker RELNOTES: fixes #3380 PiperOrigin-RevId: 161922635
* Fixing the handling of retries for watch and execute calls.Gravatar olaola2017-07-13
| | | | | | TESTED=remote worker, triggered some errors RELNOTES: fixes #3305, helps #3356 PiperOrigin-RevId: 161722997
* remote: Refactor GrpcRemoteExecutor to only take what it needs.Gravatar buchgr2017-07-12
| | | | | | | | | | - Only pass to the GrpcRemoteExecutor constructor the objects it really needs. - Share a Retrier between GrpcRemoteCache and GrpcRemoteExecutor. RELNOTES: None PiperOrigin-RevId: 161660054
* Rewrite all the sandbox strategy implementationsGravatar ulfjack2017-07-12
| | | | | | | | | | | | | | - Make use of existing abstractions like SpawnRunner and SpawnExecutionPolicy. - Instead of having the *Strategy create a *Runner, and then call back into SandboxStrategy, create a single SandboxContainer which contains the full command line, environment, and everything needed to create and delete the sandbox directory. - Do all the work in SandboxStrategy, including creation and deletion of the sandbox directory. - Use SpawnResult instead of throwing, catching, and rethrowing. - Simplify the control flow a bit. PiperOrigin-RevId: 161644979
* remote: Rewrite ChunkerGravatar buchgr2017-07-12
| | | | | | | | | | | | | | | | | - Remove the Chunker.Builder and use the Chunker constructors instead. - Fix a correctness bug, where the chunker would ignore the return value of InputStream.read(byte[]). - Have Chunk.getData() return a ByteString as opposed to a byte[]. All callsides need ByteString objects and this change makes the subsequent change possible. - Have the Chunker use a preallocated byte[] in order to avoid allocating a new one on every call to next(). RELNOTES: None. PiperOrigin-RevId: 161637158
* Reimplement RemoteSpawnStrategy on top of RemoteSpawnRunnerGravatar ulfjack2017-07-10
| | | | | | | | | | | | | | | | | It is intentional that RemoteSpawnStrategy no longer contains any remote execution specific code. My intent is to merge all SpawnStrategy implementations into a single class (similar to the new RemoteSpawnStrategy), and delegate all the specific work to SpawnRunner implementations. However, we're not there yet, and we still need to be able to look up SpawnStrategy implementations by name through the annotations, so we still need separate classes for now. We might also want to have a shared test suite for all SpawnRunner instances that checks for basic compliance with the specification. Progress on #1531. PiperOrigin-RevId: 161377751
* remote: Rewrite the ByteStream upload.Gravatar buchgr2017-07-10
| | | | | | | | | | | | | | The current ByteStream upload implementation has no support for application-level flow control, which resulted in excessive buffering and OOM errors. The new implementation respects gRPCs flow control. Additionally, this code adds support for multiple uploads of the same digest. That is, if a digest (i.e. file) is uploaded several times concurrently, only one upload will be performed. RELNOTES: None. PiperOrigin-RevId: 161287337
* Make sure to check the received digest on downloadGravatar ulfjack2017-07-07
| | | | | | | This helped me find a bug in a rewrite of the remote worker, which is independent of this change. PiperOrigin-RevId: 161204914
* Refactor RemoteSpawn{Strategy,Runner}Gravatar ulfjack2017-07-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | - Make the RemoteSpawnRunner match RemoteSpawnStrategy functionality, including local fallback, remote caching, and execution. This is done so we can actually finish the migration to the SpawnRunner API, for which I've had a pending change for several months now. - Never throw StatusRuntimeException from the GrpcRemoteCache or the GrpcExecutor. We almost never do, since the Retrier catches them implcitly, so a number of catch blocks were already unreachable. Carefully document the cases where we still need to handle it. - RemoteSpawnStrategy / RemoteSpawnRunner no longer catch gRPC-specific exceptions; they should be able to handle any reasonable remote caching / execution implementation (except we don't have a common interface for GrpcRemoteExecutor yet), with no dependency on gRPC as such. Note that the RemoteSpawnStrategy class will actually go away after the SpawnRunner migration (eventually). - However, ensure that we _are_ actually throwing CacheNotFoundException; the retrier implicitly catches that also, so we need to manually unwrap from RetryException. - Don't call into the EventHandler from RemoteSpawnStrategy; instead, throw an exception with the message, and let the higher levels handle the reporting (we only allow this for exception + local fallback, for which there's no good reporting API right now). PiperOrigin-RevId: 161195666