-rw-r--r-- site/_layouts/documentation.html 1
-rw-r--r-- site/docs/remote-caching.md 355
-rw-r--r-- src/main/java/com/google/devtools/build/lib/remote/README.md 169
3 files changed, 357 insertions, 168 deletions
diff --git a/site/_layouts/documentation.html b/site/_layouts/documentation.html
index 7ef96c5b64..5052095515 100644
--- a/site/_layouts/documentation.html
+++ b/site/_layouts/documentation.html
@@ -109,6 +109,7 @@ nav: docs
<li><a href="/versions/{{ site.version }}/query-how-to.html">Querying Builds</a></li>
<li><a href="/versions/{{ site.version }}/test-encyclopedia.html">Writing Tests</a></li>
<li><a href="/versions/{{ site.version }}/best-practices.html">Best Practices</a></li>
+ <li><a href="/versions/{{ site.version }}/remote-caching.html">Remote Caching</a></li>
</ul>
<h3>Reference</h3>
diff --git a/site/docs/remote-caching.md b/site/docs/remote-caching.md
new file mode 100644
index 0000000000..bfdf4e7b87
--- /dev/null
+++ b/site/docs/remote-caching.md
@@ -0,0 +1,355 @@
+---
+layout: documentation
+title: Remote Caching
+---
+
+# Remote Caching
+
+A remote cache is used by a team of developers and/or a continuous integration
+(CI) system to share build outputs. If your build is reproducible, the
+outputs from one machine can be safely reused on another machine, which can
+make builds significantly faster.
+
+## Contents
+
+* [Remote caching overview](#remote-caching-overview)
+ * [How a build uses remote caching](#how-a-build-uses-remote-caching)
+* [Setting up a server as the cache’s backend](#setting-up-a-server-as-the-caches-backend)
+ * [nginx](#nginx)
+ * [Bazel Remote Cache](#bazel-remote-cache)
+ * [Google Cloud Storage](#google-cloud-storage)
+ * [Other servers](#other-servers)
+* [HTTP Caching Protocol](#http-caching-protocol)
+* [Run Bazel using the remote cache](#run-bazel-using-the-remote-cache)
+ * [Read from and write to the remote cache](#read-from-and-write-to-the-remote-cache)
+ * [Read only from the remote cache](#read-only-from-the-remote-cache)
+ * [Exclude specific targets from using the remote cache](#exclude-specific-targets-from-using-the-remote-cache)
+ * [Delete content from the remote cache](#delete-content-from-the-remote-cache)
+* [Known Issues](#known-issues)
+* [Bazel remote execution (in development)](#bazel-remote-execution-in-development)
+
+## Remote caching overview
+
+Bazel breaks a build into discrete steps, which are called actions. Each action
+has inputs, output names, a command line, and environment variables. Required
+inputs and expected outputs are declared explicitly for each action.
+
+You can set up a server to act as a remote cache for build outputs, which are
+the outputs of these actions. These outputs consist of a list of output file
+names and the hashes of their contents. With a remote cache, you can reuse
+build outputs from another user’s build rather than building each new output
+locally.
+
+To use remote caching:
+
+* Set up a server as the cache’s backend
+* Configure the Bazel build to use the remote cache
+* Use Bazel version 0.10.0 or later
+
+The remote cache stores two types of data:
+
+* The action cache, which is a map of action hashes to action result metadata.
+* A content-addressable store (CAS) of output files.
+
+### How a build uses remote caching
+
+Once a server is set up as the remote cache, you can use the cache in several
+ways:
+
+* Read from and write to the remote cache
+* Read from and/or write to the remote cache except for specific targets
+* Only read from the remote cache
+* Not use the remote cache at all
+
+When you run a Bazel build that can read and write to the remote cache,
+the build follows these steps:
+
+1. Bazel creates the graph of targets that need to be built, and then creates
+a list of required actions. Each of these actions has declared inputs
+and output filenames.
+2. Bazel checks your local machine for existing build outputs and reuses any
+that it finds.
+3. Bazel checks the cache for existing build outputs. If the output is found,
+Bazel retrieves the output. This is a cache hit.
+4. For required actions where the outputs were not found, Bazel executes the
+actions locally and creates the required build outputs.
+5. New build outputs are uploaded to the remote cache.
+
+## Setting up a server as the cache's backend
+
+You need to set up a server to act as the cache's backend. An HTTP/1.1
+server can treat Bazel's data as opaque bytes, so many existing servers
+can be used as a remote caching backend. Bazel communicates with the backend
+using its [HTTP Caching Protocol](#http-caching-protocol).
+
+You are responsible for choosing, setting up, and maintaining the backend
+server that will store the cached outputs. When choosing a server, consider:
+
+* Networking speed. For example, if your team is in the same office, you may
+want to run your own local server.
+* Security. The remote cache will contain your binaries, so it needs to be secured.
+* Ease of management. For example, Google Cloud Storage is a fully managed service.
+
+There are many backends that can be used for a remote cache. Some options
+include:
+
+* [nginx](#nginx)
+* [Bazel Remote Cache](#bazel-remote-cache)
+* [Google Cloud Storage](#google-cloud-storage)
+
+### nginx
+
+nginx is an open source web server. With its [WebDAV module], it can be
+used as a remote cache for Bazel. On Debian and Ubuntu you can install the
+`nginx-extras` package. On macOS nginx is available via Homebrew:
+
+```bash
+$ brew tap denji/nginx
+$ brew install nginx-full --with-webdav
+```
+
+Below is an example configuration for nginx. You will need to change
+`/path/to/cache/dir` to a valid directory that nginx has permission to read
+from and write to. You may need to increase the `client_max_body_size` option
+if you have larger output files. The server will also require other
+configuration, such as authentication.
+
+Example configuration for the `server` section in `nginx.conf`:
+
+```nginx
+location /cache/ {
+ # The path to the directory where nginx should store the cache contents.
+ root /path/to/cache/dir;
+ # Allow PUT
+ dav_methods PUT;
+ # Allow nginx to create the /ac and /cas subdirectories.
+ create_full_put_path on;
+ # The maximum size of a single file.
+ client_max_body_size 1G;
+ allow all;
+}
+```
+
+### Bazel Remote Cache
+
+Bazel Remote Cache is an open source remote build cache that you can use on
+your infrastructure. It is experimental and unsupported.
+
+This cache stores contents on disk and also provides garbage collection
+to enforce an upper storage limit and clean unused artifacts. The cache is
+available as a [docker image] and its code is available on [GitHub].
+
+Please refer to the [GitHub] page for instructions on how to use it.
+
+### Google Cloud Storage
+
+[Google Cloud Storage] is a fully managed object store which provides an
+HTTP API that is compatible with Bazel's remote caching protocol. It requires
+that you have a Google Cloud account with billing enabled.
+
+To use Cloud Storage as the cache:
+
+1. [Create a storage bucket](https://cloud.google.com/storage/docs/creating-buckets).
+Ensure that you select a bucket location that's closest to you, as network bandwidth
+is important for the remote cache.
+
+2. Create a service account for Bazel to authenticate to Cloud Storage. See
+[Creating a service account](https://cloud.google.com/iam/docs/creating-managing-service-accounts#creating_a_service_account).
+
+3. Generate a secret JSON key and then pass it to Bazel for authentication. Store
+the key securely, as anyone with the key can read and write arbitrary data
+to/from your GCS bucket.
+
+4. Connect to Cloud Storage by adding the following flags to your Bazel command:
+ * Pass the following URL to Bazel by using the flag: `--remote_http_cache=https://storage.googleapis.com/bucket-name` where `bucket-name` is the name of your storage bucket.
+ * Pass the authentication key using the flag: `--google_credentials=/path/to/your/secret-key.json`.
+
+5. You can configure Cloud Storage to automatically delete old files. To do so, see
+[Managing Object Lifecycles](https://cloud.google.com/storage/docs/managing-lifecycles).
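+
+The flags from step 4 can also be kept in a `.bazelrc` file. The following is
+a minimal sketch that reuses the placeholder bucket name and key path from
+above; both are assumptions to replace with your own values:
+
+```
+# Placeholder bucket name; replace with the name of your storage bucket.
+build --remote_http_cache=https://storage.googleapis.com/bucket-name
+# Placeholder path to the service account key generated in step 3.
+build --google_credentials=/path/to/your/secret-key.json
+```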
+
+### Other servers
+
+You can set up any HTTP/1.1 server that supports PUT and GET as the cache's
+backend. Users have reported success with caching backends such as [Hazelcast],
+[Apache httpd], and [AWS S3].
+
+## HTTP Caching Protocol
+
+Bazel supports remote caching via HTTP/1.1. The protocol is conceptually simple:
+Binary data (BLOB) is uploaded via PUT requests and downloaded via GET requests.
+Action result metadata is stored under the path `/ac/` and output files are stored
+under the path `/cas/`.
+
+For example, consider a remote cache running under `http://localhost:8080/cache`.
+A Bazel request to download action result metadata for an action with the SHA256
+hash `01ba4719...` will look as follows:
+
+```http
+GET /cache/ac/01ba4719c80b6fe911b091a7c05124b64eeece964e09c058ef8f9805daca546b HTTP/1.1
+Host: localhost:8080
+Accept: */*
+Connection: Keep-Alive
+```
+
+A Bazel request to upload an output file with the SHA256 hash `15e2b0d3...` to
+the CAS will look as follows:
+
+```http
+PUT /cache/cas/15e2b0d3c33891ebb0f1ef609ec419420c20e320ce94c65fbc8c3312448eb225 HTTP/1.1
+Host: localhost:8080
+Accept: */*
+Content-Length: 9
+Connection: Keep-Alive
+
+0x310x320x330x340x350x360x370x380x39
+```
+
+## Run Bazel using the remote cache
+
+Once a server is set up as the remote cache, you need to add flags to your
+Bazel command to use it. See the list of configurations and their flags
+below.
+
+You may also need to configure authentication, which is specific to your
+chosen server.
+
+You may want to add these flags in a `.bazelrc` file so that you don’t
+need to specify them every time you run Bazel. Depending on your project and
+team dynamics, you can add flags to a `.bazelrc` file that is:
+
+* On your local machine
+* In your project’s workspace, shared with the team
+* On the CI system
+
+### Read from and write to the remote cache
+
+Be careful about who is able to write to the remote cache. You may want
+only your CI system to be able to write to the remote cache.
+
+Use the following flags to read from and write to the remote cache with
+sandboxing disabled:
+
+```
+build --spawn_strategy=remote --genrule_strategy=remote
+build --strategy=Javac=remote --strategy=Closure=remote
+build --remote_http_cache=http://replace-with-your.host:port
+```
+
+Using the remote cache with sandboxing enabled is experimental. Use the
+following flags to read from and write to the remote cache with sandboxing
+enabled:
+
+```
+build --experimental_remote_spawn_cache
+build --remote_http_cache=http://replace-with-your.host:port
+```
+
+### Read only from the remote cache
+
+Use the following flags to read from the remote cache with sandboxing
+disabled:
+
+```
+build --spawn_strategy=remote --genrule_strategy=remote
+build --strategy=Javac=remote --strategy=Closure=remote
+build --remote_http_cache=http://replace-with-your.host:port
+build --remote_upload_local_results=false
+```
+
+Using the remote cache with sandboxing enabled is experimental. Use the
+following flags to read from the remote cache with sandboxing enabled:
+
+```
+build --experimental_remote_spawn_cache
+build --remote_http_cache=http://replace-with-your.host:port
+build --remote_upload_local_results=false
+```
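+
+One way to let only the CI system write to the cache is to keep uploads
+disabled by default and enable them through a named configuration. This is a
+sketch only: the configuration name `ci` is hypothetical, and it assumes the
+CI system invokes Bazel with `--config=ci`:
+
+```
+# Default for everyone: read from the remote cache, never upload.
+build --experimental_remote_spawn_cache
+build --remote_http_cache=http://replace-with-your.host:port
+build --remote_upload_local_results=false
+
+# Hypothetical "ci" configuration, selected with --config=ci: upload results.
+build:ci --remote_upload_local_results=true
+```
+
+Disabling uploads in `.bazelrc` does not stop a client from writing to the
+server, so write access should also be restricted on the server itself.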
+
+### Exclude specific targets from using the remote cache
+
+To exclude specific targets from using the remote cache, tag the target with
+`no-cache`. For example:
+
+```
+java_library(
+ name = "target",
+ tags = ["no-cache"],
+)
+```
+
+### Delete content from the remote cache
+
+Deleting content from the remote cache is part of managing your server.
+How you delete content from the remote cache depends on the server you have
+set up as the cache. When deleting outputs, either delete the entire cache,
+or delete old outputs.
+
+The cached outputs are stored as a set of names and hashes. When deleting
+content, there’s no way to distinguish which output belongs to a specific
+build.
+
+You may want to delete content from the cache to:
+
+* Create a clean cache after a cache was poisoned
+* Reduce the amount of storage used by deleting old outputs
+
+## Known issues
+
+**Input file modification during a build**
+
+When an input file is modified during a build, Bazel might upload invalid
+results to the remote cache. We are working on a solution for this problem.
+See [issue #3360] for updates. Avoid this problem by not editing source
+files during a build.
+
+
+**Environment variables leaking into an action**
+
+An action definition contains environment variables. This can prevent remote
+cache hits from being shared across machines. For example, environments
+with different `$PATH` variables won't share cache hits. To avoid this,
+specify `--experimental_strict_action_env` so that only environment variables
+explicitly whitelisted via `--action_env` are included in an action
+definition. Bazel's Debian/Ubuntu package used to install
+`/etc/bazel.bazelrc` with a whitelist of environment variables including
+`$PATH`. If you are getting fewer cache hits than expected, check that your
+environment doesn't have an old `/etc/bazel.bazelrc` file.
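+
+As a sketch, the corresponding `.bazelrc` entries might look as follows. The
+variable name `MY_TOOL_HOME` and its value are hypothetical and only
+illustrate the `--action_env` syntax:
+
+```
+# Only explicitly whitelisted variables become part of the action definition.
+build --experimental_strict_action_env
+# Hypothetical variable, pinned to a fixed value so that all machines agree.
+build --action_env=MY_TOOL_HOME=/opt/my_tool
+```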
+
+
+**Bazel does not track tools outside a workspace**
+
+Bazel currently does not track tools outside a workspace. This can be a
+problem if, for example, an action uses a compiler from `/usr/bin/`. In that
+case, two users with different compilers installed will wrongly share cache
+hits: the outputs differ, but the actions have the same hash. Please
+watch [issue #4558] for updates.
+
+
+## Bazel remote execution (in development)
+
+A [gRPC protocol] that supports both remote caching and remote execution
+is in development. Remote execution allows Bazel to execute actions on a
+separate platform, such as a datacenter. You can try remote execution with
+[Buildfarm], an open source project that aims to provide a distributed remote
+execution platform.
+
+[WebDAV module]: http://nginx.org/en/docs/http/ngx_http_dav_module.html
+[docker image]: https://hub.docker.com/r/buchgr/bazel-remote-cache/
+[GitHub]: https://github.com/buchgr/bazel-remote/
+[GitHub Issue Tracker]: https://github.com/buchgr/bazel-remote/issues
+[Google Cloud Storage]: https://cloud.google.com/storage
+[Google Cloud Console]: https://cloud.google.com/console
+[Dialog to create a new GCS bucket]: /assets/remote-cache-gcs-create-bucket.png
+[bucket location]: https://cloud.google.com/storage/docs/bucket-locations
+[Dialog to create a new GCP Service Account]: /assets/remote-cache-gcp-service-account.png
+[Hazelcast]: https://hazelcast.com
+[Apache httpd]: http://httpd.apache.org
+[AWS S3]: https://aws.amazon.com/s3
+[issue #3360]: https://github.com/bazelbuild/bazel/issues/3360
+[gRPC protocol]: https://github.com/googleapis/googleapis/blob/master/google/devtools/remoteexecution/v1test/remote_execution.proto
+[Buildfarm]: https://github.com/bazelbuild/bazel-buildfarm
+[issue #4558]: https://github.com/bazelbuild/bazel/issues/4558
+
diff --git a/src/main/java/com/google/devtools/build/lib/remote/README.md b/src/main/java/com/google/devtools/build/lib/remote/README.md
index 74769401cb..1b368d661f 100644
--- a/src/main/java/com/google/devtools/build/lib/remote/README.md
+++ b/src/main/java/com/google/devtools/build/lib/remote/README.md
@@ -1,170 +1,3 @@
# Remote caching and execution with Bazel
-Bazel can be configured to use a remote cache and to execute build and test actions remotely.
-
-# Remote Caching
-
-## Overview
-
-A Bazel build consists of actions. One can think of an action as i.e. a compiler invocation. An action is defined by its command line, environment variables, its input files, and its output filenames. The result of an action is a complete list of the output filenames and hashes of their contents. Bazel can use a remote cache to store and lookup said action results and the outputs it references. Conceptually, the remote cache consists of two parts: (1) a map of action hashes to action results, and (2) a [content-addressable store](https://en.wikipedia.org/wiki/Content-addressable_storage) (CAS) of output files.
-
-Remote caching works by Bazel looking up the hash of an action in the remote cache, and if successful retrieving the action result and the output files it references. If the lookup fails Bazel executes the action locally, uploads the output files to the CAS, and stores a list of output files keyed by the hash of the action in the action cache.
-
-Bazel supports two caching protocols:
-
-1. A HTTP-based REST protocol
-2. [A gRPC-based protocol](https://github.com/googleapis/googleapis/blob/master/google/devtools/remoteexecution/v1test/remote_execution.proto)
-
-## Remote caching using the HTTP REST protocol
-
-The HTTP-based caching protocol is the recommended protocol to use for remote caching. The protocol uses HTTP PUT for uploads and HTTP GET for downloads. The action cache is expected under `/ac` and the CAS is expected under `/cas`.
-
-For example, consider a remote cache running under `localhost:8080`. A request to fetch an action result from the action cache might look like below.
-
-```
-GET /ac/01ba4719c80b6fe911b091a7c05124b64eeece964e09c058ef8f9805daca546b HTTP/1.1
-Host: localhost
-```
-
-An upload to the CAS might look as follows.
-
-```
-PUT /ac/01ba4719c80b6fe911b091a7c05124b64eeece964e09c058ef8f9805daca546b HTTP/1.1
-Host: localhost
-Content-Length: 10
-Content-Type: application/octet-stream
-```
-
-Users have had success using a diverse set of caching backends including Hazelcast and NGINX (with WebDAV).
-
-### Known Issues
-
-When an input file is modified during a build, Bazel might upload invalid results to the remote cache. We are working on a solution for this problem. Please watch [#3360](https://github.com/bazelbuild/bazel/issues/3360) for updates. One can avoid this problem by not editing source files during a build.
-
-### Bazel Setup
-
-In order to enable remote caching in Bazel you'll need to specify some flags. We recommend adding them to your `~/.bazelrc` file for ease of use.
-
-```
-build --spawn_strategy=remote --genrule_strategy=remote --strategy=Javac=remote --strategy=Closure=remote
-build --remote_http_cache=http://replace-with-your.host:port
-```
-
-The above will enable remote caching but with sandboxing disabled. The support for sandboxing with remote caching is currently (as of 0.9.0) experimental, but works well in our experience.
-
-```
-build --experimental_remote_spawn_cache
-build --remote_http_cache=http://replace-with-your.host:port
-```
-
-#### Customizing the Hash Function
-
-Bazel computes hashes for action cache and CAS entries using SHA256 by default.
-This default can be changed to MD5 or SHA1 by specifying the
-`--host_jvm_args=-Dbazel.DigestFunction=###` startup option. Note that the hash
-function used by Bazel and the remote cache need to match when using the gRPC
-protocol.
-
-
-### Bazel Remote Cache
-
-An open source remote build cache that stores contents on disk and also provides garbage collection to enforce an upper storage limit and clean unused artifacts.
-
-The cache is available as a [docker image](https://hub.docker.com/r/buchgr/bazel-remote-cache).
-
-### Hazelcast with REST interface
-
-[Hazelcast](https://hazelcast.org/) is a distributed in-memory cache which can be used by Bazel as a remote cache. You can download the standalone Hazelcast server [here](https://hazelcast.org/download/).
-
-A simple single-machine setup is to run a single Hazelcast server with REST enabled. The REST endpoint will be `http://localhost:5701/hazelcast/rest/maps/`. Run the Hazelcast server with REST using this command:
-
-```
-java -cp hazelcast-all-3.8.5.jar -Dhazelcast.rest.enabled=true com.hazelcast.core.server.StartServer
-```
-
-You can also use Bazel with a Hazelcast cluster - as long as REST is enabled -, and also customize the configuration. Please see the Hazelcast [documentation](http://docs.hazelcast.org/docs/3.6/manual/html-single/index.html) for more details.
-
-### NGINX with WebDAV
-
-First you need to set up NGINX with WebDAV support. On Debian or Ubuntu Linux, you can install the `nginx-extras` package. On OSX you can install the [`nginx-full`](https://github.com/Homebrew/homebrew-nginx) package from homebrew with `brew install nginx-full --with-webdav`.
-
-Once installed, edit nginx.conf with a section for uploading and serving cache objects.
-
-```
-location /cache/ {
- root /some/document/root;
- dav_methods PUT;
- autoindex on;
- allow all;
- client_max_body_size 256M;
-}
-```
-
-You will need to change `/some/document/root` to a valid directory where NGINX can write to and
-read from. You may need to change `client_max_body_size` option to a larger value in case the cache
-object is too large.
-
-### Apache HTTP Server with WebDAV module
-
-Assuming Apache HTTP Server is installed with DAV modules installed. You need to edit `httpd.conf` to enable the following modules:
-
-```
-LoadModule dav_module libexec/apache2/mod_dav.so
-LoadModule dav_fs_module libexec/apache2/mod_dav_fs.so
-```
-
-Edit `httpd.conf` to use a directory for uploading and serving cache objects. You may want to edit
-this directory to include security control.
-
-```
-<Directory "/some/directory/for/cache">
- AllowOverride None
- Require all granted
- Options +Indexes
-
- Dav on
- <Limit HEAD OPTIONS GET POST PUT DELETE>
- Order Allow,Deny
- Allow from all
- </Limit>
- <LimitExcept HEAD OPTIONS GET POST PUT DELETE>
- Order Deny,Allow
- Deny from all
- </LimitExcept>
-</Directory>
-```
-
-## Remote caching using the gRPC protocol
-
-We're working on a [gRPC protocol](https://github.com/googleapis/googleapis/blob/master/google/devtools/remoteexecution/v1test/remote_execution.proto)
-that supports both remote caching and remote execution. Bazel ships with a server-side implementation that's useful for testing and not intended for production use. [Buildfarm](https://github.com/bazelbuild/bazel-buildfarm) is an open source project that aims to provide a distributed remote execution platform.
-
-### Bazel Setup
-
-In order to enable remote caching in Bazel you'll need to specify some flags. We recommend adding them to your `~/.bazelrc` file for ease of use.
-
-```
-build --spawn_strategy=remote --genrule_strategy=remote --strategy=Javac=remote --strategy=Closure=remote
-build --remote_cache=replace-with-your.host:port
-```
-
-The above will enable remote caching but with sandboxing disabled. The support for sandboxing with remote caching is currently (as of 0.9.0) experimental (but works well in our experience).
-
-```
-build --experimental_remote_spawn_cache
-build --remote_cache=replace-with-your.host:port
-```
-
-Remote execution can be enabled by specifying the `--remote_executor=replace-with-your.host:port` flag.
-
-### Running the Remote Worker
-
-Bazel currently provides a sample gRPC caching backend.
-
-```
-$ git clone https://github.com/bazelbuild/bazel.git
-$ cd bazel
-$ bazel build //src/tools/remote:worker
-$ bazel-bin/src/tools/remote/worker --listen_port=8080
-```
-
+The documentation has been moved to https://docs.bazel.build/versions/master/remote-caching.html.