path: root/src/test/java/com/google/devtools/build/lib/remote
Commit message (author, date)
* remote: Add interceptor for logging gRPC calls during remote execution/caching (Googler, 2018-03-05)
  This provides an io.grpc.ClientInterceptor implementation that can be used to log gRPC call information. The interceptor can select a logging handler based on the gRPC method being called (Watch, Execute, Write, etc.) to build a LogEntry, which can then be logged after the call has finished. Unit tests for the interceptor are included.
  In this change, the interceptor is never invoked, nor are any handlers implemented for any gRPC methods. The interceptor also never tries to log any entries.
  To avoid circular dependency issues (the remote library would depend on the logger, which depends on the remote library for utils), I've factored out the utility classes from the remote library into their own directory/package as part of this change.
  PiperOrigin-RevId: 187926516
* Verifying result read with retries in the remote execute unit test. (olaola, 2018-02-28)
  The current behavior is already correct; this just adds a test to make sure we retry reads as we should.
  TESTED=the unit test
  RELNOTES: None
  PiperOrigin-RevId: 187398578
* Propagating whether there was a cache hit in the SpawnResult. (olaola, 2018-02-20)
  So far, nobody uses it, but I want to start using this field soon.
  TESTED=unit test
  RELNOTES: None
  PiperOrigin-RevId: 186290375
* remote: Add support for HTTP Basic Auth (Jakob Buchgraber, 2018-02-08)
  Closes #4609.
  PiperOrigin-RevId: 185032751
* Adding awaitTermination to our tests' teardown. (olaola, 2018-02-08)
  This is to prevent this error:
  SEVERE: *~*~*~ Channel io.grpc.internal.ManagedChannelImpl-56 for target directaddress:///io.grpc.inprocess.InProcessSocketAddress@3ecbfba1 was not shutdown properly!!! ~*~*~* Make sure to call shutdown()/shutdownNow() and awaitTermination().
  TESTED=ran tests
  RELNOTES: None
  PiperOrigin-RevId: 185020683
* User-friendlier representation of a missing digest. (olaola, 2018-02-08)
  I moved it into DigestUtil preemptively in case we switch to binary instead of hex representation.
  TESTED=manually
  RELNOTES: None
  PiperOrigin-RevId: 185007558
* Fixing #4585: broken re-execution of orphaned actions. (olaola, 2018-02-08)
  This is an important regression; we will want to patch the fix into 0.10.
  TESTED=fixed unit test, with A/B testing
  RELNOTES: Resolved an issue where a failure in the remote cache would not trigger local re-execution of an action.
  PiperOrigin-RevId: 184991670
* remote: Rewrite the HTTP caching client in Netty. Fixes #4481 (buchgr, 2018-01-26)
  * This puts in the foundation of HTTP/2 support for remote caching.
  * Allows us to remove the Apache HTTP library as a dependency, reducing the Bazel binary size by 1MiB.
  On fast networks (i.e. GCE to GCS) we can see a >2x speed improvement for TLS throughput. Even from my workstation to GCS I get significant build time improvements when using Netty's TLS (18s vs 12s).
  Closes #4481.
  PiperOrigin-RevId: 183411787
* Prevent broken cache entries on concurrent file changes (ulfjack, 2018-01-19)
  Local execution has an inherent race condition: if a user modifies a file while an action is executed, then it is impossible for Bazel to tell which version of the file was actually read during action execution. The file may have been modified before or after the tool has read it, or, in the worst case, the tool may have read both the original and the modified version. In addition, the file may be changed back to the original state before Bazel can check the file, so computing the digest before / after may not be sufficient.
  This is a concern for both local and remote caches, although the cost of poisoning a shared remote cache is significantly higher, and is what has triggered this work.
  Fixes #3360.
  We solve this by keeping a reference to the FileContentsProxy, and using that to check for modifications before storing the cache entry. We output a warning if this check fails.
  This change does not increase memory consumption; Java objects are always allocated in multiples of 8 bytes, we use compressed oops, and the FileArtifactValue currently has 12 bytes worth of fields (excl. object overhead), so adding another pointer is effectively free.
  As a possible performance optimization on purely local builds, we could also consider not computing digests at all, and only use the FileContentsProxy for caching.
  PiperOrigin-RevId: 182510358
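
The check described above can be sketched roughly as follows. This is a minimal illustration, not Bazel's implementation: `ContentsSnapshot` and `CacheWriteGuard` are hypothetical names, and the sketch assumes that modification time plus size is an adequate stand-in for whatever metadata the real FileContentsProxy records.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.BasicFileAttributes;
import java.nio.file.attribute.FileTime;

/** Hypothetical stand-in for a file-contents proxy: cheap metadata captured when an input is read. */
final class ContentsSnapshot {
  private final FileTime modifiedTime;
  private final long size;

  private ContentsSnapshot(FileTime modifiedTime, long size) {
    this.modifiedTime = modifiedTime;
    this.size = size;
  }

  /** Captures the current metadata of the file (assumption: mtime and size are enough). */
  static ContentsSnapshot of(Path file) throws IOException {
    BasicFileAttributes attrs = Files.readAttributes(file, BasicFileAttributes.class);
    return new ContentsSnapshot(attrs.lastModifiedTime(), attrs.size());
  }

  /** Returns true if the file's metadata is unchanged since this snapshot was taken. */
  boolean stillMatches(Path file) throws IOException {
    ContentsSnapshot now = of(file);
    return modifiedTime.equals(now.modifiedTime) && size == now.size;
  }
}

/** Hypothetical guard that runs right before a cache entry is stored. */
final class CacheWriteGuard {
  /** Returns true if the entry may be stored; otherwise prints a warning and returns false. */
  static boolean mayStore(Path input, ContentsSnapshot before) throws IOException {
    if (before.stillMatches(input)) {
      return true;
    }
    System.err.println(
        "WARNING: " + input + " was modified during execution; not storing the cache entry");
    return false;
  }
}
```

A caller would take the snapshot when the input is first read, execute the action, and call `CacheWriteGuard.mayStore` for each input before writing the result. Comparing metadata instead of digests is what lets the check notice a file that was changed and then changed back.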
* Introduce Root class. (tomlu, 2018-01-17)
  This class represents a root (such as a package path or an output root) used for file lookups and artifacts. It is meant to be as opaque as possible in order to hide the user's environment from sky keys and sky functions. Roots are used by RootedPaths and ArtifactRoots.
  This CL attempts to make the minimum number of modifications necessary to change RootedPath and ArtifactRoot to use these fields. Deprecated methods and invasive accessors are permitted to minimise the risk of any observable changes.
  RELNOTES: None
  PiperOrigin-RevId: 182271759
* Rename Root to ArtifactRoot. (tomlu, 2018-01-16)
  This is slightly more descriptive, and we will potentially want to use the name Root for a broader concept shared between ArtifactRoot and RootedPath.
  PiperOrigin-RevId: 182082367
* Remove use of Root#asDerivedRoot where the derived root == exec root. (tomlu, 2018-01-15)
  This method violates the invariant that derived roots are never equal to the exec root. Only source roots can be equal to the exec root. Note that this method was only used in tests, so this CL should be completely safe as long as its tests pass.
  PiperOrigin-RevId: 181998483
* Plumb exec root through to all spawn runners. (tomlu, 2018-01-11)
  They need this to parse input manifests. Previously we would grab the exec root from the Root, but we wish to stop supporting this.
  PiperOrigin-RevId: 181669143
* Use EmptyActionInput instead of null in SpawnInputExpander (ulfjack, 2018-01-08)
  This simplifies some spawn runners, which no longer have to specially handle null; unfortunately, the sandbox runners do not support VirtualActionInput, so they still have to special-case it.
  PiperOrigin-RevId: 181175408
* remote: fix download of output directories (buchgr, 2017-12-20)
  Fix a bug where Bazel would crash if two Directory protos had the same hash.
  RELNOTES: Remote Caching and Execution support output directories.
  PiperOrigin-RevId: 179731040
* remote: rename auth flags. (Jakob Buchgraber, 2017-12-20)
  --auth_* flags only work with Google Cloud Authentication. That's confusing and restricts the naming of more general purpose authentication flags that we might want to add in the future. So instead of --auth_* let's call them --google_* (the old ones will continue working for a while).
  Also, --auth_enabled (aka --google_default_credentials) is no longer required when specifying --auth_credentials (aka --google_credentials).
  So now there are two simple ways to authenticate with Google Cloud:
  * bazel build --google_default_credentials
  * bazel build --google_credentials=creds.json
  RELNOTES: --auth_* flags were renamed to --google_* flags. The old names will continue to work for this release but will be removed in the next release.
  Change-Id: Ia1736f32e15a37995be3172cd9608d518ddeab44
  PiperOrigin-RevId: 179700832
* remote: add directory support for remote caching and execution (Hadrien Chauvin, 2017-12-20)
  Add support for directory trees as artifacts.
  Closes #4011.
  PiperOrigin-RevId: 179691001
* remote: Allow auth scopes to be a comma-separated list. (Jakob Buchgraber, 2017-12-19)
  --auth_scopes can be passed a comma-separated list of authentication scopes.
  Add "https://www.googleapis.com/auth/devstorage.read_write" to the list of defaults. This scope is used when using Google Cloud Storage (GCS) as a remote caching backend.
  Change-Id: I62e6fed28b28737823ad6c70cbc5048b3a3190b5
  PiperOrigin-RevId: 179548090
* remote: Add support for Google Cloud Storage. (Jakob Buchgraber, 2017-12-18)
  Add support for Google Cloud Storage (GCS) as an HTTP caching backend. This commit mainly adds the infrastructure necessary to authenticate to GCS.
  Using GCS as a caching backend works as follows:
  1) Create a new GCS bucket.
  2) Create a service account that can read and write to the GCS bucket and generate a JSON credentials file for it.
  3) Invoke Bazel as follows:
     bazel build --remote_rest_cache=https://storage.googleapis.com/<bucket> --auth_enabled --auth_scope=https://www.googleapis.com/auth/devstorage.read_write --auth_credentials=</path/to/creds.json>
  I'll add simplifications and docs in a subsequent commit.
  Change-Id: Ie827d7946a2193b97ea7d9aa72eb15f09de2164d
  PiperOrigin-RevId: 179406380
* Fix: uploading artifacts of failed actions to remote cache stopped working. (olaola, 2017-12-11)
  To reproduce: run a failing test with --experimental_remote_spawn_cache or with --spawn_strategy=remote and no executor.
  Expected: test log is uploaded.
  Desired behavior:
  - regardless of whether a spawn is cacheable or not, its artifacts should be uploaded to the remote cache.
  - the spawn result should only be set if the spawn is cacheable *and* the action succeeded.
  - when executing remotely, the do_not_cache field should be set for non-cacheable spawns, and the remote execution engine should respect it.
  This CL contains multiple fixes to ensure the above behaviors, and adds a few tests, both end to end and unit tests.
  Important behavior change: it is no longer assumed that non-cacheable spawns should use a NO_CACHE SpawnCache! The appropriate test case was removed. Instead, an assumption was added that all implementations of SpawnCache should respect the Spawns.mayBeCached(spawn) property. Currently, only NO_CACHE and RemoteSpawnCache exist, and they (now) support it.
  TESTED=remote build execution backend.
  WANT_LGTM: philwo,buchgr
  RELNOTES: None
  PiperOrigin-RevId: 178617937
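
The three desired behaviors listed above amount to a small decision table, sketched here. `UploadDecision` and `forSpawn` are hypothetical names used only for illustration. The `mayBeCached` parameter is assumed to carry the value of Spawns.mayBeCached(spawn), and "succeeded" is assumed to mean exit code zero.

```java
/** Hypothetical summary of what to do with a finished spawn's outputs. */
final class UploadDecision {
  /** Whether output artifacts (for example test logs) go to the remote cache. */
  final boolean uploadArtifacts;
  /** Whether the spawn result itself is stored, so later builds can get a cache hit. */
  final boolean storeSpawnResult;
  /** Whether do_not_cache should be set on a remote execution request. */
  final boolean doNotCache;

  private UploadDecision(boolean uploadArtifacts, boolean storeSpawnResult, boolean doNotCache) {
    this.uploadArtifacts = uploadArtifacts;
    this.storeSpawnResult = storeSpawnResult;
    this.doNotCache = doNotCache;
  }

  /**
   * Applies the rules from the commit message: artifacts are always uploaded, the result is
   * stored only for cacheable spawns that succeeded, and non-cacheable spawns are flagged.
   */
  static UploadDecision forSpawn(boolean mayBeCached, int exitCode) {
    boolean succeeded = exitCode == 0; // assumption: success means a zero exit code
    return new UploadDecision(true, mayBeCached && succeeded, !mayBeCached);
  }
}
```

For a failing but cacheable test, `forSpawn(true, 1)` uploads the artifacts without storing a result, which matches the expected behavior in the reproduction steps.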
* remote: Replace Retrier with Retrier2. (buchgr, 2017-12-04)
  - Replace the existing Retrier with Retrier2.
  - Rename Retrier2 to Retrier and remove the old Retrier + RetryException class.
  RELNOTES: None.
  PiperOrigin-RevId: 177835070
* Refactor the FileSystem API to allow for different hash functions. (buchgr, 2017-11-30)
  Refactor the FileSystem class to include the hash function as an instance field. This allows us to have a different hash function per FileSystem and removes technical debt, as currently that's somewhat accomplished by a horrible hack that has a static method to set the hash function for all FileSystem instances.
  The FileSystem's default hash function remains MD5.
  RELNOTES: None
  PiperOrigin-RevId: 177479772
* Simplify SpawnRunner interface (ulfjack, 2017-11-28)
  It turns out that the SUCCESS status is often misunderstood to mean "zero exit", even though this is clearly documented. I've decided to add another status for non-zero exit, and use success only for zero exit to avoid this pitfall.
  Also, many of the status codes are set, but never used. I decided to reduce the number of status codes to only those that are actually relevant, which simplifies further processing. Instead, we should add a string message for the error case when we need one - we're not using it right now, so I decided not to add that yet.
  PiperOrigin-RevId: 177129441
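
The distinction between "ran and exited with zero" and "ran and exited with non-zero" can be illustrated with a small enum. This is a sketch with assumed names: `SketchStatus` and `EXECUTION_FAILED` are hypothetical, and the real set of status codes is not listed in the commit message.

```java
/** Hypothetical reduced status set in which SUCCESS strictly means "exit code zero". */
enum SketchStatus {
  /** The process ran and exited with code zero. */
  SUCCESS,
  /** The process ran to completion but exited with a non-zero code. */
  NON_ZERO_EXIT,
  /** Assumed catch-all for cases where the process did not run to completion. */
  EXECUTION_FAILED;

  /** Maps a finished (or failed) execution to a status. */
  static SketchStatus fromExecution(boolean ranToCompletion, int exitCode) {
    if (!ranToCompletion) {
      return EXECUTION_FAILED;
    }
    return exitCode == 0 ? SUCCESS : NON_ZERO_EXIT;
  }
}
```

With a dedicated status for non-zero exits, callers that only check for `SUCCESS` can no longer mistake a failing command for a successful one.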
* Properly attaching context for remote uploads in RemoteSpawnCache. (olaola, 2017-10-20)
  Fixes #3930. Also added tests.
  TESTED=added tests
  RELNOTES: Fixing regression to --experimental_remote_spawn_cache
  PiperOrigin-RevId: 172852740
* Removing local fallback on failed action commands. (olaola, 2017-10-18)
  We should only fall back if a remote execution error occurred, not if the command itself failed.
  TESTED=better unit tests
  RELNOTES: None
  PiperOrigin-RevId: 172406687
* Fixing an off-by-one error in the displayed retry attempts when the error is non-retriable. (olaola, 2017-10-06)
  Adding unit tests.
  TESTED=unit tests
  RELNOTES: None
  PiperOrigin-RevId: 170750220
* Checking both old and new error fields on remote execute operation. (olaola, 2017-09-29)
  The old field is the error on the Operation proto. The new field is the ExecuteResponse status field. Note that the new field will also allow us to fetch logs for timing out tests, resolving a TODO, but this is not yet done in this change.
  PiperOrigin-RevId: 170370676
* Move SpawnResult from build.lib.exec into build.lib.actions so that e.g. build.lib.actions.SpawnActionContext can import SpawnResult without creating a cyclic dependency. (ruperts, 2017-09-22)
  RELNOTES: None.
  PiperOrigin-RevId: 169642267
* Passing Bazel metadata in gRPC headers. (olaola, 2017-09-21)
  TESTED=unit tests
  RELNOTES: none
  PiperOrigin-RevId: 169395919
* Uploading failed action outputs to the remote cache, because even if the test fails, we still want to be able to download the logs and other outputs from CAS. (olaola, 2017-09-19)
  This fixes a bug introduced by https://github.com/bazelbuild/bazel/commit/562fcf9f5dfd14daea718f77da95b43b1400689b.
  To reproduce: run a failing test vs a BES service; the test log would not be uploaded.
  TESTED=unit tests
  PiperOrigin-RevId: 169143428
* Delete the unused CachedLocalSpawnRunner, which is superseded by SpawnCache (ulfjack, 2017-09-12)
  PiperOrigin-RevId: 168359681
* remote: Add new retrier with support for circuit breaking (buchgr, 2017-09-12)
  Add a generic retrier implementation (Retrier2) that can be configured by plugging in a backoff strategy, a function to decide on retriable errors and a circuit breaker. A concrete implementation is added via RemoteRetrier that mostly is a copy of the code of the existing Retrier.
  Retrier2 adds support for circuit breaking [1]. It allows the retrier to reject execution when failure rates are high. The remote execution code will use this to gently switch between local and remote execution/caching if the latter experiences lots of failures.
  Retrier2 is also useful when not used with gRPC. We need retriers for the HTTP caching interface too.
  All the code added in this CL is unused, to keep reviews manageable. In a follow-up CL, I will switch the code to use the new Retrier and delete the old retrier.
  [1] https://martinfowler.com/bliki/CircuitBreaker.html
  PiperOrigin-RevId: 168355597
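
The combination of a backoff strategy, a retriable-error predicate and a circuit breaker can be sketched as below. This is not the Retrier2 code: `RetrierSketch` and `CircuitBreakerSketch` are hypothetical names, the breaker trips on a count of consecutive failures (a simplification of "failure rates"), it never closes again by itself (no half-open state), and the backoff is a plain doubling delay.

```java
import java.util.concurrent.Callable;
import java.util.function.Predicate;

/** Hypothetical breaker: opens after a number of consecutive failures, closes on a success. */
final class CircuitBreakerSketch {
  private final int failureThreshold;
  private int consecutiveFailures;

  CircuitBreakerSketch(int failureThreshold) {
    this.failureThreshold = failureThreshold;
  }

  boolean isOpen() {
    return consecutiveFailures >= failureThreshold;
  }

  void recordSuccess() {
    consecutiveFailures = 0;
  }

  void recordFailure() {
    consecutiveFailures++;
  }
}

/** Hypothetical retrier: retries retriable errors with doubling backoff unless the breaker is open. */
final class RetrierSketch {
  private final int maxAttempts;
  private final long initialBackoffMillis;
  private final Predicate<Exception> isRetriable;
  private final CircuitBreakerSketch breaker;

  RetrierSketch(
      int maxAttempts,
      long initialBackoffMillis,
      Predicate<Exception> isRetriable,
      CircuitBreakerSketch breaker) {
    this.maxAttempts = maxAttempts;
    this.initialBackoffMillis = initialBackoffMillis;
    this.isRetriable = isRetriable;
    this.breaker = breaker;
  }

  /** Runs the call, retrying on retriable errors; rejects immediately while the breaker is open. */
  <T> T execute(Callable<T> call) throws Exception {
    if (breaker.isOpen()) {
      throw new IllegalStateException("circuit breaker is open; call rejected");
    }
    long delayMillis = initialBackoffMillis;
    for (int attempt = 1; ; attempt++) {
      try {
        T result = call.call();
        breaker.recordSuccess();
        return result;
      } catch (Exception e) {
        breaker.recordFailure();
        // Give up when attempts are exhausted, the error is not retriable, or the breaker tripped.
        if (attempt >= maxAttempts || !isRetriable.test(e) || breaker.isOpen()) {
          throw e;
        }
        Thread.sleep(delayMillis);
        delayMillis *= 2;
      }
    }
  }
}
```

A caller that catches the `IllegalStateException` rejection could fall back to local execution, which is the kind of switch between remote and local work the commit message describes.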
* remote: Return exit code 34 for remote caching/execution errors. (buchgr, 2017-09-01)
  For any errors that are due to failures in the remote caching / execution layers, Bazel now returns exit code 34 (ExitCode.REMOTE_ERROR). This includes errors where the remote cache / executor is unreachable or crashes. It does not include errors where the test / build failure is due to user errors, i.e. compilation or test failures.
  PiperOrigin-RevId: 167259236
* remote: support timeouts (buchgr, 2017-08-30)
  PiperOrigin-RevId: 166981977
* remote: don't fail build if upload fails (Benjamin Peterson, 2017-08-22)
  If the upload of local build artifacts fails, the build no longer fails but instead a warning is printed once. If --verbose_failures is specified, a detailed warning is printed for every failure.
  This helps with fixing #2964; however, it doesn't fully fix it due to timeouts and retries slowing the build significantly.
  Also, add some other tests related to fallback behavior.
  Change-Id: Ief49941f9bc7e0123b5d93456d77428686dd5268
  PiperOrigin-RevId: 165938874
* Remove ALREADY_EXISTS special treatment from the CAS uploader. (olaola, 2017-08-22)
  This error should no longer be thrown by any CAS implementations.
  TESTED=unit tests
  RELNOTES: none
  PiperOrigin-RevId: 165937395
* Fixing #3552: re-execute cached orphaned Actions. (olaola, 2017-08-17)
  A bug was introduced in patch 9626bb4923c74c6d3c09b7438eb24b32191053df, where a cache miss would not result in action re-execution, making the cache miss non-recoverable.
  RELNOTES: fixes #3552
  PiperOrigin-RevId: 165434579
* Introduce a new SpawnCache API; add a RemoteSpawnCache implementation (ulfjack, 2017-08-14)
  AbstractSpawnRunner now uses a SpawnCache if one is registered; this allows adding caching to any spawn runner without having to be aware of the implementations.
  I will delete the old CachedLocalSpawnRunner in a follow-up CL.
  PiperOrigin-RevId: 165024382
* remote: Provide a clear error message if the remote cache is in an invalid state. (buchgr, 2017-08-14)
  A remote cache must never serve a failed action. However, if it did, Bazel would not detect this and would simply fail and display an error message that's hard to distinguish from a local execution failure. Bazel now displays a clear error message stating what went wrong.
  RELNOTES: None.
  PiperOrigin-RevId: 164975631
* Refactor persistent workers to use SpawnRunner. (Benjamin Peterson, 2017-08-11)
  Change the persistent worker spawn strategy to extend AbstractSpawnStrategy and put the actual logic into WorkerSpawnRunner. WorkerTestStrategy is unaffected.
  I had to extend SpawnPolicy with a speculating() method. Persistent workers need to know if speculation is happening in order to require sandboxing.
  Additionally, I added java_test rules for the local runner tests and worker tests.
  See https://github.com/bazelbuild/bazel/issues/3481.
  NOTE: ulfjack@ made some changes to this change before merging:
  - changed Reporter to EventHandler; added TODO about its usage
  - reverted non-semantic indentation change in AbstractSpawnStrategy
  - reverted a non-semantic indentation change in WorkerSpawnRunner
  - updated some internal classes to match
  - removed catch IOException in WorkerSpawnRunner in some cases, removed verboseFailures flag from WorkerSpawnRunner, updated callers
  - disable some tests on Windows; we were previously not running them, now that we do, they fail :-(
  Change-Id: I207b3938f0dc84d374ab052d5030020886451d47
  PiperOrigin-RevId: 164965398
* Unify input prefetching (ulfjack, 2017-08-10)
  All prefetching now goes through AbstractSpawnStrategy's implementation of SpawnExecutionPolicy. Make sure the sandbox runners also do this consistently.
  PiperOrigin-RevId: 164836877
* Use java.time.Duration for timeouts (ulfjack, 2017-08-09)
  PiperOrigin-RevId: 164577062
* grpc: Consolidate gRPC code from BES and Remote Execution. Fixes #3460, #3486 (buchgr, 2017-08-04)
  BES and Remote Execution have separate implementations of gRPC channel creation, authentication and TLS. We should merge them, to avoid duplication and bugs. One such bug is #3640, where the BES code had a different implementation for Google Application Default Credentials.
  RELNOTES: The Build Event Service (BES) client now properly supports Google Application Default Credentials.
  PiperOrigin-RevId: 164253879
* remote: Don't upload failed action to cache. Fixes #3452 (Jakob Buchgraber, 2017-07-27)
  Also, restructure the code for better readability and testability.
  Change-Id: Ibdd0413f89e4687b836b768a9e7d6315234cb825
  PiperOrigin-RevId: 163322658
* Fixing #3380: output stack traces and informative messages. (olaola, 2017-07-27)
  Also added tests specifically for the output, to ensure we don't break it again.
  TESTED=remote worker, unit tests
  RELNOTES: fixes #3380
  PiperOrigin-RevId: 163283558
* Fix #3416: catch the ALREADY_EXISTS status code on upload, and treat it as success. (olaola, 2017-07-21)
  This can happen per spec, if multiple builds try to upload the same blob concurrently. Also, added this to the RemoteWorker, per spec.
  PiperOrigin-RevId: 162647548
* Extend the SpawnRunner API (ulfjack, 2017-07-17)
  - add an id for logging; this allows us to correlate log entries for the same spawn from multiple spawn runner implementations in the future
  - add a prefetch method to the SpawnExecutionPolicy; better than relying on the ActionInputPrefetcher being injected in the constructor
  - add a name parameter to the report method; this is in preparation for a single unified SpawnStrategy implementation - it's basically the last bit of difference between SandboxStrategy and RemoteSpawnStrategy; they're otherwise equivalent (if not identical)
  PiperOrigin-RevId: 162194684
* remote: Chunker.reset() should release resources. (buchgr, 2017-07-17)
  RELNOTES: None.
  PiperOrigin-RevId: 161970540
* remote: Fix a bug where locally executed results would not be uploaded. Fixes #3379 (Ola Rozenfeld, 2017-07-14)
  Commit dc24004873c335 broke the upload of locally executed action results. Also, update our unit tests, which did not catch this error.
  P.S.: olaola@ is the author of this change, but due to time constraints we had to merge it while she was asleep.
  Change-Id: Ib150152c0bddc8311908c105aef208506d3b6a8d
  PiperOrigin-RevId: 161954553
* remote: Improve error handling for --remote_* cmd line flags. Fixes #3361, #3358 (buchgr, 2017-07-14)
  - Move flag handling into RemoteModule to fail as early as possible.
  - Make error messages from flag handling human readable.
  - Fix a bug where remote execution would only support TLS if a root certificate was specified.
  - If a remote executor without a remote cache is specified, assume the remote cache to be the same as the executor.
  PiperOrigin-RevId: 161946029