| Commit message (Collapse) | Author | Age |
| |
* Refactor Chunker constructor to a builder to reduce constructor overload.
* Pass the digest to the Chunker wherever we already have it.
* Rework ensureInputsPresent so that the missing digests are not lost during processing and can be passed to the Chunker constructor.
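A minimal sketch of the constructor-to-builder refactoring described above. The class and method names (ChunkerSketch, setInput, setChunkSize, setDigest) are hypothetical and are not taken from Bazel's Chunker; the point is only that optional inputs such as an already-known digest become builder calls instead of extra constructor overloads.

```java
import java.util.Arrays;

/** Hypothetical stand-in for a chunker; names and fields are assumptions, not Bazel's API. */
final class ChunkerSketch {
  private final byte[] data;
  private final int chunkSize;
  private final String digest; // null when the caller did not already have a digest

  private ChunkerSketch(Builder builder) {
    this.data = builder.data;
    this.chunkSize = builder.chunkSize;
    this.digest = builder.digest;
  }

  static Builder builder() {
    return new Builder();
  }

  /** Number of chunks needed for the input; empty input still yields one (empty) chunk. */
  int chunkCount() {
    return data.length == 0 ? 1 : (data.length + chunkSize - 1) / chunkSize;
  }

  String digest() {
    return digest;
  }

  static final class Builder {
    private byte[] data = new byte[0];
    private int chunkSize = 16 * 1024; // assumed default, for illustration only
    private String digest;

    Builder setInput(byte[] input) {
      this.data = Arrays.copyOf(input, input.length);
      return this;
    }

    Builder setChunkSize(int chunkSize) {
      if (chunkSize <= 0) {
        throw new IllegalArgumentException("chunkSize must be positive");
      }
      this.chunkSize = chunkSize;
      return this;
    }

    /** Optional: pass the digest when it is already known so it need not be recomputed. */
    Builder setDigest(String digest) {
      this.digest = digest;
      return this;
    }

    ChunkerSketch build() {
      return new ChunkerSketch(this);
    }
  }
}
```

A caller that already holds a digest would write something like `ChunkerSketch.builder().setInput(bytes).setDigest(d).build()`, while a caller without one simply omits setDigest.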
RELNOTES: None
PiperOrigin-RevId: 207297915
| |
The only remaining use was a testing REST backend in the LRE.
I wrote a replacement for that using netty, which we use for our network stuff in Bazel, which means we can now get rid of Hazelcast. :)
I'll remove the Hazelcast files in a separate change when this is merged.
PiperOrigin-RevId: 205985996
| |
Refocus the synchronization mechanism that copes with fork-induced file
descriptor set races so that it more tightly constrains concurrent
fork/exec pairs. This problem has been observed repeatedly in Bazel proper,
showing up as the iconic ETXTBSY (Text file busy) error in builds and tests
with wide worker pools.
@buchgr had already discovered it, as the comment and change in the
embedded ExecutionService implementation show; the description of the race
and of the need for synchronization has been lifted from that scope to the
JavaSubprocessFactory. This factory is a singleton and the gateway to all
worker process execution, so it is the correct lock primitive for ensuring
that file descriptor sets are not duplicated across forks, which is what
gave rise to this issue.
To test this, I ran the reproducer presented at
https://bugs.java.com/view_bug.do?bug_id=8068370
and 2.4% of invocations in that pathological case exhibited the issue.
With a functionally equivalent change - synchronizing around a
processBuilder.start() call - as the only modification to the reproducer,
no further failures of any kind were observed over several hundred runs.
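The synchronization described above can be sketched roughly as follows. This is an illustration only: SerializedProcessStarter and runAndWait are hypothetical names, not the JavaSubprocessFactory code. The lock covers just the start() call, so the fork/exec pairs are serialized while the started processes still run concurrently.

```java
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical helper; serializes process creation behind one JVM-wide lock. */
final class SerializedProcessStarter {
  private static final Object START_LOCK = new Object();

  /** Starts the process while holding the lock, so no two fork/exec pairs overlap. */
  static Process start(ProcessBuilder builder) throws IOException {
    synchronized (START_LOCK) {
      return builder.start();
    }
  }

  /** Convenience wrapper: starts the command, drains its output, and returns the exit code. */
  static int runAndWait(String... command) {
    try {
      Process process = start(new ProcessBuilder(command).redirectErrorStream(true));
      try (InputStream out = process.getInputStream()) {
        byte[] buffer = new byte[8192];
        while (out.read(buffer) != -1) {
          // Discard output; draining keeps the child from blocking on a full pipe.
        }
      }
      return process.waitFor();
    } catch (IOException e) {
      throw new IllegalStateException(e);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
  }
}
```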
Closes #5556.
PiperOrigin-RevId: 203947224
| |
Use try-with-resources to ensure OutputStreams
that we open via FileSystem.OutputStream(path)
are closed.
Eagerly closing OutputStreams avoids hanging on to
file handles until the garbage collector finalizes
the OutputStream, meaning Bazel on Windows (and
other processes) can delete or mutate these files.
Hopefully this avoids intermittent file deletion
errors that sometimes occur on Windows.
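As a rough illustration of the pattern, the sketch below uses java.nio.file.Files rather than Bazel's own FileSystem API, and the class and method names are made up for the example. The stream is closed when the try block exits, so the file can be deleted immediately afterwards.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical example; uses java.nio instead of Bazel's FileSystem abstraction. */
final class EagerCloseSketch {

  /** Writes text to the path; try-with-resources closes the stream even if write() throws. */
  static void writeText(Path path, String text) {
    try (OutputStream out = Files.newOutputStream(path)) {
      out.write(text.getBytes(StandardCharsets.UTF_8));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  /** Writes a temp file and deletes it right away; returns true if the file was deleted. */
  static boolean writeThenDelete(String text) {
    try {
      Path tmp = Files.createTempFile("eager-close", ".txt");
      writeText(tmp, text);
      // No handle is left open at this point, so deletion is not blocked by this process.
      return Files.deleteIfExists(tmp);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```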
See https://github.com/bazelbuild/bazel/issues/5512
RELNOTES: none
PiperOrigin-RevId: 203342889
| |
enum.
Now that we aren't using enum names for the hash functions, we also accept the standard names, such as SHA-256.
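One way to accept both spellings is a small alias table in front of java.security.MessageDigest. This is a sketch under assumptions: the alias set and the class name HashNameSketch are illustrative and do not reflect the names Bazel actually accepts.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Locale;
import java.util.Map;

/** Hypothetical name lookup; the accepted aliases here are assumptions for illustration. */
final class HashNameSketch {
  // Maps enum-style and standard spellings to the JCA standard algorithm name.
  private static final Map<String, String> ALIASES =
      Map.of(
          "MD5", "MD5",
          "SHA1", "SHA-1",
          "SHA-1", "SHA-1",
          "SHA256", "SHA-256",
          "SHA-256", "SHA-256");

  /** Returns the standard algorithm name for a user-supplied value, ignoring case. */
  static String canonicalName(String userValue) {
    String canonical = ALIASES.get(userValue.trim().toUpperCase(Locale.ROOT));
    if (canonical == null) {
      throw new IllegalArgumentException("Unknown hash function: " + userValue);
    }
    return canonical;
  }

  /** Creates a MessageDigest for the user-supplied name. */
  static MessageDigest newDigest(String userValue) {
    try {
      return MessageDigest.getInstance(canonicalName(userValue));
    } catch (NoSuchAlgorithmException e) {
      throw new IllegalStateException(e);
    }
  }
}
```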
RELNOTES: None.
PiperOrigin-RevId: 201624286
| |
This change introduces concurrent downloads of action outputs
for remote caching/execution. So far, for an action we would
download one output after the other, which isn't as bad as it
sounds, as we would typically run dozens or hundreds of actions
in parallel. However, for actions with a lot of outputs, or graphs
that allow limited parallelism, we expect this change to positively
impact performance.
Note that with this change the AbstractRemoteActionCache will
always attempt to download all outputs concurrently. The actual
parallelism is controlled by the underlying network transport.
The gRPC transport currently enforces no limits on the concurrent
calls, which should be fine given that all calls are multiplexed
on a single network connection. The HTTP/1.1 transport also
enforces no limit on parallelism by default, but I have added the
--remote_max_connections=INT flag, which allows specifying an upper
bound on the number of network connections open concurrently.
I have introduced this flag as a defensive mechanism for users
whose environment might enforce an upper bound on the number of open
connections, as with this change it's possible for the number of
concurrently open connections to increase dramatically (from
NumParallelActions to NumParallelActions * SumParallelActionOutputs).
A side effect of this change is that it puts the infrastructure
for retries and circuit breaking for the HttpBlobStore in place.
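The interaction between "download everything concurrently" and an upper bound on open connections can be sketched as below. This is not the AbstractRemoteActionCache or HttpBlobStore code: the class, the fetch function, and the semaphore-based limit are assumptions chosen to illustrate the idea of a connection cap.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/** Hypothetical sketch: all outputs are requested at once, a semaphore caps the in-flight fetches. */
final class BoundedDownloadsSketch {

  static final class Result {
    final List<String> contents;
    final int peakConcurrency;

    Result(List<String> contents, int peakConcurrency) {
      this.contents = contents;
      this.peakConcurrency = peakConcurrency;
    }
  }

  static Result downloadAll(
      List<String> outputs, int maxConnections, Function<String, String> fetch) {
    Semaphore permits = new Semaphore(maxConnections);
    AtomicInteger inFlight = new AtomicInteger();
    AtomicInteger peak = new AtomicInteger();
    ExecutorService pool = Executors.newCachedThreadPool();
    try {
      List<CompletableFuture<String>> futures = new ArrayList<>();
      for (String output : outputs) {
        futures.add(
            CompletableFuture.supplyAsync(
                () -> {
                  permits.acquireUninterruptibly(); // wait for a free "connection"
                  try {
                    peak.accumulateAndGet(inFlight.incrementAndGet(), Math::max);
                    return fetch.apply(output);
                  } finally {
                    inFlight.decrementAndGet();
                    permits.release();
                  }
                },
                pool));
      }
      // Collect results in the original output order.
      List<String> contents = new ArrayList<>();
      for (CompletableFuture<String> future : futures) {
        contents.add(future.join());
      }
      return new Result(contents, peak.get());
    } finally {
      pool.shutdown();
    }
  }
}
```

With maxConnections set to 2 and five outputs, all five downloads are submitted immediately, but at most two fetches run at any moment.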
RELNOTES: None
PiperOrigin-RevId: 199005510
| |
This is mostly a roll-forward of 4465dae23de989f1452e93d0a88ac2a289103dd9, which
was reverted by fa36d2f48965b127e8fd397348d16e991135bfb6. The main difference is
that the new behavior is now gated behind the --noremote_allow_symlink_upload
flag.
https://docs.google.com/document/d/1gnOYszitgrLVet3sQk-TKGqIcpkkDsc6aw-izoo-d64
is a design proposal to support symlinks in the remote cache, which would render
this change moot. I'd like to be able to prevent incorrect cache behavior until
that change is implemented, though.
This fixes https://github.com/bazelbuild/bazel/issues/4840 (again).
Closes #5122.
Change-Id: I2136cfe82c2e1a8a9f5856e12a37d42cabd0e299
PiperOrigin-RevId: 195261827
| |
-u doesn't currently make sense for Windows: https://github.com/docker/for-win/issues/636#issuecomment-293653788
The local remote worker happens to sometimes work on Windows because we would frequently (always?) hit the timeout here (but recently this hasn't been the case for me): https://github.com/bazelbuild/bazel/blob/fa36d2f48965b127e8fd397348d16e991135bfb6/src/tools/remote/src/main/java/com/google/devtools/build/remote/worker/ExecutionServer.java#L323
When we don't hit the timeout (so "id -u" succeeds, and the "-u" flag is passed to docker), we get an error like this:
```
Error response from daemon: container 06851b64e09bab5a930bfb706892785b24c7538c1a7be826fef315ab8e62c117 encountered an error during CreateProcess: failure in a Windows system call: The user name or password is incorrect. (0x52e)
```
The method for detecting Windows is the best I could find, similar to other isWindows functions in Bazel.
RELNOTES: None.
PiperOrigin-RevId: 193666199
| |
*** Reason for rollback ***
The no-cache tag is not respected (see b/77857812) and thus this breaks remote caching for all projects with symlink outputs.
*** Original change description ***
Only allow regular files and directories spawn outputs to be uploaded to a remote cache.
The remote cache protocol only knows about regular files and
directories. Currently, during action output upload, symlinks are
resolved into regular files. This means cached "executions" of an
action may have different output file types than the original
execution, which can be a footgun. This CL bans symlinks from cachable
spawn outputs and fixes http...
***
PiperOrigin-RevId: 193338629
| |
remote cache.
The remote cache protocol only knows about regular files and
directories. Currently, during action output upload, symlinks are
resolved into regular files. This means cached "executions" of an
action may have different output file types than the original
execution, which can be a footgun. This CL bans symlinks from cachable
spawn outputs and fixes https://github.com/bazelbuild/bazel/issues/4840.
The interface of SpawnCache.CacheHandle.store is refactored:
1. The outputs parameter is removed, since that can be retrieved from the underlying Spawn.
2. It can now throw ExecException in order to fail actions.
Closes #4902.
Change-Id: I0d1d94d48779b970bb5d0840c66a14c189ab0091
PiperOrigin-RevId: 190608852
| |
- Remove Optional<> where it's not needed. It's nice for return values, but IMHO it was overused in this code (e.g. Optional<List<X>> is an anti-pattern, as the list itself can already signal that it is empty).
- Use Bazel's own Path class when dealing with paths, not String or java.io.File.
- Move LinuxSandboxUtil into the "sandbox" package.
- Remove dead code and unused fields.
- Migrate deprecated VFS method calls to their replacements.
- Fix a bug in ExecutionStatistics where a FileInputStream was not closed.
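To illustrate the Optional<List<X>> point from the list above, here is a small before/after sketch. The class and method names are invented for the example and do not come from the sandbox code.

```java
import java.util.List;
import java.util.Optional;

/** Hypothetical before/after comparison; not taken from the Bazel sources. */
final class OptionalListSketch {

  // Before: "no arguments" can be expressed as Optional.empty() or as an empty list.
  static Optional<List<String>> wrappedArgs(boolean hasArgs) {
    return hasArgs ? Optional.of(List.of("--flag")) : Optional.empty();
  }

  // After: the empty list alone signals "no arguments".
  static List<String> args(boolean hasArgs) {
    return hasArgs ? List.of("--flag") : List.of();
  }

  // Callers of the wrapped form must unwrap before they can even count.
  static int countWrapped(Optional<List<String>> maybeArgs) {
    return maybeArgs.isPresent() ? maybeArgs.get().size() : 0;
  }

  // Callers of the plain form need no special case.
  static int count(List<String> args) {
    return args.size();
  }
}
```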
Closes #4868.
PiperOrigin-RevId: 190217476
| |
timeouts.
The refactoring to have an Exception that contains partial results will also be used in the next CL, in order to propagate and save remote server logs.
RELNOTES: None
PiperOrigin-RevId: 189344465
| |
This provides a io.grpc.ClientInterceptor implementation that can be used to log gRPC call information. The interceptor can select a logging handler to use based on the gRPC method being called (Watch, Execute, Write, etc) to build a LogEntry, which can then be logged after the call has finished. Unit tests for the interceptor are included.
In this change, the interceptor is never invoked, nor are there any handlers implemented for any gRPC methods. The interceptor also never tries to log any entries.
To avoid circular dependency issues (Remote library will depend on logger which depends on remote library for utils), I've factored out the utility classes from the remote library into their own directory/package as part of this change.
PiperOrigin-RevId: 187926516
| |
* This puts in the foundation of HTTP/2 support for remote caching.
* Allows us to remove the Apache HTTP library as a dependency, reducing
the Bazel binary size by 1MiB.
On fast networks (e.g. GCE to GCS) we can see a >2x speed improvement for TLS
throughput. Even from my workstation to GCS I get significant build time
improvements when using Netty's TLS (18s vs 12s).
Closes #4481.
PiperOrigin-RevId: 183411787
| |
The JNI implementation doesn't work from a deployable jar.
Fixes https://github.com/bazelbuild/bazel/issues/3249
cc @ulfjack
Closes #4438.
PiperOrigin-RevId: 181746081
| |
PiperOrigin-RevId: 181162816
| |
RELNOTES: None.
PiperOrigin-RevId: 179705357
| |
Add support for directory trees as artifacts. Closes #4011.
PiperOrigin-RevId: 179691001
| |
*** Reason for rollback ***
Breaks //src/test/shell/bazel:bazel_sandboxing_test
*** Original change description ***
Use linux-sandbox via the (new) LinuxSandboxUtil.
RELNOTES: None.
PiperOrigin-RevId: 179676894
| |
RELNOTES: None.
PiperOrigin-RevId: 179646155
| |
Add support for Google Cloud Storage (GCS) as an HTTP caching backend.
This commit mainly adds the infrastructure necessary to authenticate
to GCS.
Using GCS as a caching backend works as follows:
1) Create a new GCS bucket.
2) Create a service account that can read and write to the GCS bucket
and generate a JSON credentials file for it.
3) Invoke Bazel as follows:
   bazel build \
     --remote_rest_cache=https://storage.googleapis.com/<bucket> \
     --auth_enabled \
     --auth_scope=https://www.googleapis.com/auth/devstorage.read_write \
     --auth_credentials=</path/to/creds.json>
I'll add simplifications and docs in a subsequent commit.
Change-Id: Ie827d7946a2193b97ea7d9aa72eb15f09de2164d
PiperOrigin-RevId: 179406380
| |
To reproduce: run a failing test with --experimental_remote_spawn_cache or with --spawn_strategy=remote and no executor. Expected: test log is uploaded.
Desired behavior:
- regardless of whether a spawn is cacheable or not, its artifacts should be uploaded to the remote cache.
- the spawn result should only be set if the spawn is cacheable *and* the action succeeded.
- when executing remotely, the do_not_cache field should be set for non-cacheable spawns, and the remote execution engine should respect it.
This CL contains multiple fixes to ensure the above behaviors, and adds a few tests, both end to end and unit tests. Important behavior change: it is no longer assumed that non-cacheable spawns should use a NO_CACHE SpawnCache! The appropriate test case was removed. Instead, an assumption was added that all implementations of SpawnCache should respect the Spawns.mayBeCached(spawn) property. Currently, only NO_CACHE and RemoteSpawnCache exist, and they (now) support it.
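The desired behavior listed above boils down to two independent decisions per spawn. The sketch below states them as a tiny pure function; UploadPolicySketch and its fields are hypothetical names and are not the RemoteSpawnCache implementation.

```java
/** Hypothetical decision helper; mirrors the desired behavior, not Bazel's actual code. */
final class UploadPolicySketch {

  static final class Decision {
    final boolean uploadArtifacts;
    final boolean storeActionResult;

    Decision(boolean uploadArtifacts, boolean storeActionResult) {
      this.uploadArtifacts = uploadArtifacts;
      this.storeActionResult = storeActionResult;
    }
  }

  /**
   * Artifacts (for example a failing test's log) are always uploaded; the result is only
   * recorded when the spawn may be cached and the action succeeded.
   */
  static Decision decide(boolean mayBeCached, int exitCode) {
    boolean succeeded = exitCode == 0;
    return new Decision(true, mayBeCached && succeeded);
  }
}
```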
TESTED=remote build execution backend.
WANT_LGTM: philwo,buchgr
RELNOTES: None
PiperOrigin-RevId: 178617937
|
|
This is because I want to add another remote execution related tool, the remote_client, which will use the Remote Execution API to fetch blobs from a remote cache. I will use this tool as part of end-to-end tests for remote execution.
TESTED=remote integration tests, presubmit
RELNOTES: None
PiperOrigin-RevId: 177995895
|