author    Alpha Lam <alpha.lam.ts@gmail.com>  2017-02-25 11:51:55 +0000
committer Yue Gan <yueg@google.com>  2017-02-27 15:06:38 +0000
commit    707e72a5883071fb1d9c10db7c6a2127795af0ef (patch)
tree      ba1de19929d0bc618518890f304a013b25dd976c /src/main/java/com/google
parent    0dcdb06fb5dce3425209de9bb66fc76e34279fb7 (diff)
More documentation for distributed caching and remote execution
Adds step-by-step documentation for the remote caching feature and documents how to run the
remote execution demo. Fixes a small issue with interacting with the WebDAV module of NGINX
and Apache using the --rest_cache_url feature.

Change-Id: I5e98aa6707bf502eab0801ba637eae575ba26d42
Reviewed-on: https://cr.bazel.build/9031
PiperOrigin-RevId: 148546592
MOS_MIGRATED_REVID=148546592
Diffstat (limited to 'src/main/java/com/google')
-rw-r--r--  src/main/java/com/google/devtools/build/lib/remote/ConcurrentMapFactory.java |  10
-rw-r--r--  src/main/java/com/google/devtools/build/lib/remote/README.md                 | 233
2 files changed, 201 insertions, 42 deletions
diff --git a/src/main/java/com/google/devtools/build/lib/remote/ConcurrentMapFactory.java b/src/main/java/com/google/devtools/build/lib/remote/ConcurrentMapFactory.java
index 3461673011..6867c222e5 100644
--- a/src/main/java/com/google/devtools/build/lib/remote/ConcurrentMapFactory.java
+++ b/src/main/java/com/google/devtools/build/lib/remote/ConcurrentMapFactory.java
@@ -39,8 +39,8 @@ import org.apache.http.impl.client.DefaultHttpClient;
import org.apache.http.util.EntityUtils;
/**
- * A factory class for providing a {@link ConcurrentMap} objects to be used with
- * {@link ConcurrentMapActionCache} objects. The underlying maps can be Hazelcast or RestUrl based.
+ * A factory class for providing a {@link ConcurrentMap} objects to be used with {@link
+ * ConcurrentMapActionCache} objects. The underlying maps can be Hazelcast or RestUrl based.
*/
public final class ConcurrentMapFactory {
@@ -136,7 +136,11 @@ public final class ConcurrentMapFactory {
HttpResponse response = client.execute(put);
int statusCode = response.getStatusLine().getStatusCode();
- if (HttpStatus.SC_OK != statusCode) {
+ // Accept more than SC_OK to be compatible with Nginx WebDav module.
+ if (HttpStatus.SC_OK != statusCode
+ && HttpStatus.SC_ACCEPTED != statusCode
+ && HttpStatus.SC_CREATED != statusCode
+ && HttpStatus.SC_NO_CONTENT != statusCode) {
throw new RuntimeException("PUT failed with status code " + statusCode);
}
} catch (IOException e) {
diff --git a/src/main/java/com/google/devtools/build/lib/remote/README.md b/src/main/java/com/google/devtools/build/lib/remote/README.md
index b913ff4813..a62fc38df2 100644
--- a/src/main/java/com/google/devtools/build/lib/remote/README.md
+++ b/src/main/java/com/google/devtools/build/lib/remote/README.md
@@ -1,74 +1,229 @@
-# How to run a standalone Hazelcast server for testing distributed cache.
+# Introduction
-- First you need to run a standalone Hazelcast server with default
-configuration. If you already have a separate Hazelcast cluster you can skip
-this step.
+This directory contains the implementation of distributed caching support in Bazel, as well as
+support for remote execution.
+
+# Design
+
+The detailed design document and discussion can be found in this forum thread.
+
+https://groups.google.com/forum/#!msg/bazel-discuss/7JSbF6DT6OU/ewuXO6ydBAAJ
+
+# Distributed Caching
+
+## Overview
+
+Distributed caching support in Bazel depends heavily on [content-addressable storage](https://en.wikipedia.org/wiki/Content-addressable_storage).
+
+A Bazel build consists of many actions. An action is defined by the command to execute, its
+arguments and a list of input files. Before executing an action, Bazel computes a hash code from
+these properties. This hash code is used to look up and index the outputs of the action.
+Bazel looks up the hash code in the content-addressable storage (CAS) backend. If there is a
+match, the output files are downloaded. If there is no match, the action is executed and
+the output files are uploaded to the CAS backend.
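The lookup flow above can be sketched as follows. This is a hedged illustration, not Bazel's actual code: the real action key covers more fields (environment, platform, declared outputs), and the `execute` callback and dict-backed cache stand in for a real spawn and CAS backend.

```python
import hashlib

def action_key(command, args, input_digests):
    # Illustrative only: Bazel's real action key covers more fields
    # (environment, platform, declared outputs, ...).
    h = hashlib.sha1()
    for part in [command, *args, *input_digests]:
        h.update(part.encode())
    return h.hexdigest()

def run_action(action, execute, cache):
    # `cache` is any dict-like CAS backend mapping key -> outputs.
    key = action_key(*action)
    if key in cache:
        return cache[key]      # cache hit: reuse the stored outputs
    outputs = execute(action)  # cache miss: actually run the command
    cache[key] = outputs       # store so other builds can reuse them
    return outputs
```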
+
+Bazel implements support for two kinds of CAS backends:
+
+* A REST endpoint that supports PUT, GET and HEAD.
+* A gRPC endpoint that implements the [distributed caching and remote execution protocol](https://github.com/bazelbuild/bazel/blob/master/src/main/protobuf/remote_protocol.proto).
+
+### Distributed caching with REST endpoint
+
+If all you need is distributed caching, this is probably the most reliable path, as the REST
+API is simple and will remain stable.
+
+For a quick setup you can use NGINX with the WebDAV module or Apache HTTP Server with WebDAV
+enabled. This enables simple remote caching that can be shared between users.
+
+#### Initial setup
+
+You should enable the SHA1 digest function for Bazel when using distributed caching. Edit
+`~/.bazelrc` and add the following line:
+```
+startup --host_jvm_args=-Dbazel.DigestFunction=SHA1
+```
+
+#### NGINX with WebDav module
+
+First you need to set up NGINX with WebDAV support. On Debian or Ubuntu Linux you can install
+the `nginx-extras` package. On OSX you can install the [`nginx-full`](https://github.com/Homebrew/homebrew-nginx) package from
+Homebrew with `brew install nginx-full --with-webdav`.
+
+Once installed, edit nginx.conf with a section for uploading and serving cache objects.
+
+```
+location /cache/ {
+ root /some/document/root;
+ dav_methods PUT;
+ autoindex on;
+ allow all;
+ client_max_body_size 256M;
+}
+```
+
+You will need to change `/some/document/root` to a valid directory that NGINX can read from
+and write to. You may need to increase the `client_max_body_size` option if your cache
+objects are larger than the default limit.
+
+#### Apache HTTP Server with WebDav module
+
+This assumes Apache HTTP Server is installed with the DAV modules available. Edit `httpd.conf`
+to enable the following modules:
+```
+LoadModule dav_module libexec/apache2/mod_dav.so
+LoadModule dav_fs_module libexec/apache2/mod_dav_fs.so
+```
+
+Edit `httpd.conf` to use a directory for uploading and serving cache objects. You may want to
+add access control for this directory.
+```
+<Directory "/some/directory/for/cache">
+ AllowOverride None
+ Require all granted
+ Options +Indexes
+
+ Dav on
+ <Limit HEAD OPTIONS GET POST PUT DELETE>
+ Order Allow,Deny
+ Allow from all
+ </Limit>
+ <LimitExcept HEAD OPTIONS GET POST PUT DELETE>
+ Order Deny,Allow
+ Deny from all
+ </LimitExcept>
+</Directory>
+```
+
+#### Providing your own REST endpoint
+
+Any REST endpoint with GET, PUT and HEAD support will be sufficient. GET is used to fetch a
+cache object, PUT is used to upload a cache object, and HEAD is used to check whether a cache
+object exists.
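As a concrete illustration of these three verbs, here is a minimal in-memory sketch of such an endpoint using Python's standard library. This is not Bazel's code; the 201 status for PUT is one of the codes the cache client accepts (along with 200, 202 and 204), and a real deployment would use NGINX or Apache WebDAV rather than this toy server.

```python
# Minimal sketch of a REST CAS endpoint (GET/PUT/HEAD) that stores blobs
# in memory, keyed by request path. For illustration only.
from http.server import BaseHTTPRequestHandler, HTTPServer

BLOBS = {}

class CacheHandler(BaseHTTPRequestHandler):
    def do_PUT(self):
        # Upload a cache object under its content-hash path.
        length = int(self.headers["Content-Length"])
        BLOBS[self.path] = self.rfile.read(length)
        self.send_response(201)  # Created; 200/202/204 are also accepted
        self.end_headers()

    def do_HEAD(self):
        # Check whether a cache object exists.
        self.send_response(200 if self.path in BLOBS else 404)
        self.end_headers()

    def do_GET(self):
        # Fetch a cache object.
        blob = BLOBS.get(self.path)
        if blob is None:
            self.send_response(404)
            self.end_headers()
            return
        self.send_response(200)
        self.send_header("Content-Length", str(len(blob)))
        self.end_headers()
        self.wfile.write(blob)
```

To serve it you could run `HTTPServer(("localhost", 8081), CacheHandler).serve_forever()` and point Bazel at it with `--rest_cache_url=http://localhost:8081/cache`.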
+
+#### Running Bazel with REST CAS endpoint
+
+Once you have a REST endpoint that supports GET, PUT and HEAD, you can run Bazel with the
+following options to enable distributed caching. Change `http://server-address:port/cache` to
+the URL of your endpoint. You may also put the options in `~/.bazelrc`.
+
+```
+build --spawn_strategy=remote --rest_cache_url=http://server-address:port/cache
+```
+
+### Distributed caching with gRPC CAS endpoint
+
+A gRPC CAS endpoint that implements the [distributed caching and remote execution protocol](https://github.com/bazelbuild/bazel/blob/master/src/main/protobuf/remote_protocol.proto) will
+give the best performance and is the most actively developed distributed caching solution.
+
+#### Initial setup
+
+You should enable the SHA1 digest function for Bazel when using distributed caching. Edit
+`~/.bazelrc` and add the following line:
+```
+startup --host_jvm_args=-Dbazel.DigestFunction=SHA1
+```
+
+#### Running the sample gRPC cache server
+
+Bazel currently provides a sample gRPC CAS implementation with Hazelcast as the caching backend.
+To use it, clone the [Bazel](https://github.com/bazelbuild/bazel) repository and build it:
```
- java -cp third_party/hazelcast/hazelcast-3.6.4.jar \
- com.hazelcast.core.server.StartServer
+bazel build //src/tools/remote_worker:remote_cache
```
-- Then you run Bazel pointing to the Hazelcast server.
+The following command will then start the cache server listening on port 8081 with the default
+Hazelcast settings.
+```
+bazel-bin/src/tools/remote_worker/remote_cache --listen_port 8081
+```
+To run everything in a single command:
```
- bazel --host_jvm_args=-Dbazel.DigestFunction=SHA1 build \
- --hazelcast_node=localhost:5701 --spawn_strategy=remote \
- src/tools/generate_workspace:all
+bazel run //src/tools/remote_worker:remote_cache -- --listen_port 8081
```
-Above command will build generate_workspace with remote spawn strategy that uses
-Hazelcast as the distributed caching backend.
+If you want to change the Hazelcast settings to enable a distributed memory cache, you can
+provide your own `hazelcast.xml` with the following command:
+```
+bazel-bin/src/tools/remote_worker/remote_cache --jvm_flags=-Dhazelcast.config=/path/to/hz.xml --listen_port 8081
+```
+You can copy and edit the [default](https://github.com/hazelcast/hazelcast/blob/master/hazelcast/src/main/resources/hazelcast-default.xml) Hazelcast configuration. Refer to Hazelcast [manual](http://docs.hazelcast.org/docs/3.6/manual/html-single/index.html#checking-configuration)
+for more details.
-# How to run a remote worker for testing remote execution.
+#### Using the gRPC CAS endpoint
-- First run the remote worker. This will start a standalone Hazelcast server
-with default configuration.
+Add the following build options to use the gRPC CAS endpoint for sharing build artifacts.
+Change `address:8081` to the correct server address and port number.
```
- bazel-bin/src/tools/remote_worker/remote_worker \
- --work_path=/tmp/remote --listen_port 8080
+build --spawn_strategy=remote --remote_cache=address:8081
```
-- Then run Bazel pointing to the Hazelcast server and remote worker.
+### Distributed caching with Hazelcast (TO BE REMOVED)
+
+Bazel can connect to a Hazelcast distributed memory cluster directly for sharing build artifacts.
+This feature will be removed in the future in favor of the gRPC protocol for distributed caching.
+Hazelcast may still be used as a distributed caching backend but Bazel will connect to it through
+a gRPC CAS endpoint.
+
+#### Starting a Hazelcast server
+
+If you do not already have a Hazelcast memory cluster you can clone [Bazel](https://github.com/bazelbuild/bazel) and run this command:
```
- bazel --host_jvm_args=-Dbazel.DigestFunction=SHA1 build \
- --hazelcast_node=localhost:5701 \
- --remote_worker=localhost:8080 \
- --spawn_strategy=remote src/tools/generate_workspace:all
+java -cp third_party/hazelcast/hazelcast-3.6.4.jar com.hazelcast.core.server.StartServer
```
-# How to run a remote worker with a remote cache server.
+#### Using Hazelcast as distributed cache
-- First you need to run a standalone Hazelcast server with default
-configuration. If you already have a separate Hazelcast cluster you can skip
-this step.
+You will need to put the following line in `~/.bazelrc`.
+```
+startup --host_jvm_args=-Dbazel.DigestFunction=SHA1
+```
+The following build options will use Hazelcast as a distributed cache during the build. Change
+`address:5701` to the actual server address, assuming Hazelcast listens on port 5701.
```
- java -cp third_party/hazelcast/hazelcast-3.6.4.jar \
- com.hazelcast.core.server.StartServer
+build --hazelcast_node=address:5701 --spawn_strategy=remote
```
-- Then run the remote cache server:
+# Remote Execution (For Demonstration Only)
+
+The remote execution worker in Bazel serves only as a demonstration. The client-side
+implementation is being actively developed, but there is no fully functional remote worker
+implementation yet.
+
+## Initial setup
+
+You should enable the SHA1 digest function for Bazel when using distributed caching. Edit
+`~/.bazelrc` and add the following line:
```
- bazel-bin/src/tools/remote_worker/remote_cache --listen_port 8081
+startup --host_jvm_args=-Dbazel.DigestFunction=SHA1
```
-- The run the remote worker:
+## Running the sample gRPC cache server
+```
+bazel build //src/tools/remote_worker:remote_cache
+bazel-bin/src/tools/remote_worker/remote_cache --listen_port 8081
+```
+## Running the sample gRPC remote worker
```
- bazel-bin/src/tools/remote_worker/remote_worker \
- --work_path=/tmp/remote --listen_port 8080
+bazel build //src/tools/remote_worker:remote_worker
+bazel-bin/src/tools/remote_worker/remote_worker --work_path=/tmp --listen_port 8080
```
-- Then run Bazel pointing to the cache server and remote worker.
+The sample gRPC cache server and the gRPC remote worker both use Hazelcast and share the **same
+distributed memory cluster** for storing and accessing CAS objects. It is important that the CAS
+objects are shared between the two server processes.
+
+You can modify the Hazelcast configuration by providing a `hazelcast.xml`. Please refer to the
+Hazelcast manual for details. Make sure the cache server and the remote worker server share the
+same memory cluster.
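For example, a hypothetical `hazelcast.xml` fragment that joins both server processes into one cluster over TCP/IP might look like the following; the host names are placeholders, and the exact schema depends on your Hazelcast version.

```xml
<!-- Hypothetical fragment: list the same members in the configuration of
     both the cache server and the remote worker so they join one cluster. -->
<hazelcast>
  <network>
    <join>
      <multicast enabled="false"/>
      <tcp-ip enabled="true">
        <member>cache-host:5701</member>
        <member>worker-host:5701</member>
      </tcp-ip>
    </join>
  </network>
</hazelcast>
```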
+
+## Running Bazel using gRPC for caching and remote execution
+
+Use the following build options:
```
- bazel --host_jvm_args=-Dbazel.DigestFunction=SHA1 build \
- --hazelcast_node=localhost:5701 \
- --remote_worker=localhost:8080 \
- --remote_cache=localhost:8081 \
- --spawn_strategy=remote src/tools/generate_workspace:all
+build --spawn_strategy=remote --remote_worker=localhost:8080 --remote_cache=localhost:8081
```