aboutsummaryrefslogtreecommitdiffhomepage
path: root/site/blog
diff options
context:
space:
mode:
authorGravatar Carmi Grushko <carmi@google.com>2017-02-27 18:33:58 +0000
committerGravatar Yue Gan <yueg@google.com>2017-02-28 11:32:12 +0000
commita6090926b7ae504d82f78e6fe705f454b3216cb5 (patch)
treeedf592efcd9e031fe236204bb4a9841c94e63c15 /site/blog
parent738e5727bab23961c7cc423606f38f8b97a8150e (diff)
A blog post about proto_library et al.
proto_library-dep-graph.png was generated from: digraph mygraph { node [shape=box]; "//src:zip_code.proto" [style="dashed"] "//src:address.proto" [style="dashed"] "//src:person.proto" [style="dashed"] "//src:person_java_proto" [shape="ellipse"] "//src:person_cc_proto" [shape="ellipse"] "//src:person_java_proto" "//src:person_java_proto" -> "//src:person_proto" "//src:person_cc_proto" "//src:person_cc_proto" -> "//src:person_proto" "//src:person_proto" "//src:person_proto" -> "//src:address_proto" "//src:person_proto" -> "//src:person.proto" "//src:person.proto" "//src:address_proto" "//src:address_proto" -> "//src:address.proto" "//src:address_proto" -> "//src:zip_code_proto" "//src:zip_code_proto" "//src:zip_code_proto" -> "//src:zip_code.proto" "//src:zip_code.proto" "//src:address.proto" } -- PiperOrigin-RevId: 148664921 MOS_MIGRATED_REVID=148664921
Diffstat (limited to 'site/blog')
-rw-r--r--site/blog/_posts/2017-02-27-protocol-buffers.md211
1 files changed, 211 insertions, 0 deletions
diff --git a/site/blog/_posts/2017-02-27-protocol-buffers.md b/site/blog/_posts/2017-02-27-protocol-buffers.md
new file mode 100644
index 0000000000..f7074029d1
--- /dev/null
+++ b/site/blog/_posts/2017-02-27-protocol-buffers.md
@@ -0,0 +1,211 @@
+---
+layout: posts
+title: Protocol Buffers in Bazel
+---
+
+Bazel currently provides built-in rules for Java, JavaLite and C++.
+
+`proto_library` is a language-agnostic rule that describes relations between
+`.proto` files.
+
+`java_proto_library`, `java_lite_proto_library` and `cc_proto_library` are rules
+that "attach" to `proto_library` and generate language-specific bindings.
+
+By making a `java_library` (resp. `cc_library`) depend on `java_proto_library`
+(resp. `cc_proto_library`) your code gains access to the generated code.
+
+## TL;DR - Usage example
+
+> TIP: https://github.com/cgrushko/proto_library contains a buildable example.
+
+> NOTE: Bazel 0.4.4 lacks some features the example uses - you'll need to build
+> Bazel from head. The easiest is to install Bazel, download Bazel's source
+> code, build it (`bazel build //src:bazel`) and copy it somewhere (e.g., `cp
+> bazel-bin/src/bazel ~/bazel`)
+
+### WORKSPACE file
+
+Bazel's proto rules implicitly depend on the https://github.com/google/protobuf
+distribution (described below, in "Implicit Dependencies and Proto Toolchains").
+The following satisfies these dependencies:
+
+> TIP: This is a shortened version of https://github.com/cgrushko/proto_library/blob/master/WORKSPACE
+
+```python
+# proto_library rules implicitly depend on @com_google_protobuf//:protoc,
+# which is the proto-compiler.
+# This statement defines the @com_google_protobuf repo.
+http_archive(
+ name = "com_google_protobuf",
+ urls = ["https://github.com/google/protobuf/archive/b4b0e304be5a68de3d0ee1af9b286f958750f5e4.zip"],
+)
+
+# cc_proto_library rules implicitly depend on @com_google_protobuf_cc//:cc_toolchain,
+# which is the C++ proto runtime (base classes and common utilities).
+http_archive(
+ name = "com_google_protobuf_cc",
+ urls = ["https://github.com/google/protobuf/archive/b4b0e304be5a68de3d0ee1af9b286f958750f5e4.zip"],
+)
+
+# java_proto_library rules implicitly depend on @com_google_protobuf_java//:java_toolchain,
+# which is the Java proto runtime (base classes and common utilities).
+http_archive(
+ name = "com_google_protobuf_java",
+ urls = ["https://github.com/google/protobuf/archive/b4b0e304be5a68de3d0ee1af9b286f958750f5e4.zip"],
+)
+```
+
+### BUILD files
+
+> TIP: This is a shortened version of https://github.com/cgrushko/proto_library/blob/master/src/BUILD
+
+```python
+java_proto_library(
+ name = "person_java_proto",
+ deps = [":person_proto"],
+)
+
+cc_proto_library(
+ name = "person_cc_proto",
+ deps = [":person_proto"],
+)
+proto_library(
+ name = "person_proto",
+ srcs = ["person.proto"],
+ deps = [":address_proto"],
+)
+
+proto_library(
+ name = "address_proto",
+ srcs = ["address.proto"],
+ deps = [":zip_code_proto"],
+)
+
+proto_library(
+ name = "zip_code_proto",
+ srcs = ["zip_code.proto"],
+)
+```
+
+This file yields the following dependency graph:
+
+![proto_library dependency graph](/assets/proto_library-dep-graph.png)
+
+Notice how the `proto_library` provide structure for both Java and C++ code
+generators, and how there's only one `java_proto_library` even though there
+multiple `.proto` files.
+
+## Benefits
+
+... in comparison with a macro that's responsible for compiling all `.proto`
+files in a project.
+
+1. Caching + incrementality: changing a single `.proto` only causes the
+ rebuilding of dependant `.proto` files. This includes not only regenerating
+ code, but also recompiling it. For large proto graphs this could be
+ significant.
+2. Depend on pieces of a proto graph from multiple places: in the example
+ above, one can add a `cc_proto_library` that `deps` on `zip_code_proto`, and
+ including it together with `//src:person_cc_proto` in the same project.
+ Though they both transitively depend on `zip_code_proto`, there won't be a
+ linking error.
+
+## Recommended Code Organization
+
+1. One proto_library rule per `.proto` file.
+2. A file named `foo.proto` will be in a rule named `foo_proto`, which is
+ located in the same package.
+3. A `X_proto_library` that wraps a `proto_library` named `foo_proto` should be
+ called `foo_X_proto`, and be located in the same package.
+
+## FAQ
+
+**Q:** I already have rules named `java_proto_library` and `cc_proto_library`.
+Will there be a problem? \
+**A:** No. Since Skylark extensions imported through `load` statements take
+precedence over native rules with the same name, the new rule should not affect
+existing usage of the `java_proto_library` macro.
+
+**Q:** How do I use gRPC with these rules? \
+**A:** The Bazel rules do not generate RPC code since `protobuf` is independent
+of any RPC system. We will work with the gRPC team to create Skylark extensions
+to do so. ([C++ Issue](https://github.com/grpc/grpc/issues/9873), [Java
+Issue](https://github.com/grpc/grpc-java/issues/2756))
+
+**Q:** Do you plan to release additional languages? \
+**A:** We can relatively easily create `py_proto_library`. Our end goal is to
+improve Skylark to the point where these rules can be written in Skylark, making
+them independent of Bazel.
+
+**Q:** How does one use well-known types? (e.g., `any.proto`,
+`descriptor.proto`) \
+**A:** Once https://github.com/google/protobuf/issues/2763 is resolved, the
+following should be added to a `.proto` file: `import google/protobuf/any.proto`
+and the following: `@com_google_protobuf//:well_known_types_protos` to one's
+`proto_library` rule.
+
+**Q:** Any tips for writing my own such rules? \
+**A:** First, make sure you're able to register actions that compile your target
+language. (as far as I know, Bazel Python actions are not exposed to Skylark,
+for example). \
+Second, take extra care to generate unique symbol names and unique filenames.
+There's an implicit assumption that different proto rules with different
+options, generate different symbols. For example, if you write a new rule
+`foo_java_proto_library`, it must not generate symbols that `java_proto_library`
+might. The risk is that a binary will contain both, leading to a one-definition
+rule violation (e.g., linking errors). The downside is that the binary might
+bloat, as it must contain multiple generated code for the same proto. We're
+working on a Skylark version of `java_lite_proto_library` which should provide a
+good example.
+
+## Implementation Details
+
+### Implicit Dependencies and Proto Toolchains
+
+The `proto_library` rule implicitly depends on `@com_google_protobuf//:protoc`,
+which is the protocol buffer compiler. It must be a binary rule (in protobuf,
+it's a `cc_binary`). The rule can be overridden using the `--proto_compiler`
+command-line flag.
+
+`X_proto_library` rules implicitly depend on
+`@com_google_protobuf_X//:X_toolchain`, which is a `proto_lang_toolchain` rule.
+These rules can be overridden using the `--proto_toolchain_for_X` command-line
+flags.
+
+A `proto_lang_toolchain` rule describes how to call the protocol compiler, and
+what is the library (if any) that the resulting generated code needs to compile
+against. See an [example in the protobuf
+repository](https://github.com/google/protobuf/blob/b4b0e304be5a68de3d0ee1af9b286f958750f5e4/BUILD#L773).
+
+### Bazel Aspects
+
+The `X_proto_library` rules are implemented using [Bazel
+Aspects](https://bazel.build/versions/master/docs/skylark/aspects.html) to have
+the best of two worlds -
+
+1. Only need a single `X_proto_library` rule for an arbitrarily-large proto
+ graph.
+2. Incrementality, caching and no linking errors.
+
+Conceptually, an `X_proto_library` rule creates a shadow graph of the
+`proto_library` it depends on, and each shadow node calls protocol-compiler and
+then compiles the generated code. This way, if there are multiple paths from a
+rule to a `proto_library` through `X_proto_library`, they all share the same
+node.
+
+### Descriptor Sets
+
+When compiled on the command-line, a `proto_library` creates a descriptor set
+for the messages it `srcs`. The file is a serialized `FileDescriptorSet`, which
+is described in
+https://developers.google.com/protocol-buffers/docs/techniques#self-description.
+
+One use case for the descriptor set is generating code without having to parse
+`.proto` files. (https://github.com/google/protobuf/issues/2725 tracks this
+ability in the protobuf compiler)
+
+The aforementioned file only contains information about the `.proto` files
+directly mentioned by a `proto_library` rule; the collection of transitive
+descriptor sets is available through the 'proto.transitive_descriptor_sets'
+Skylark provider. See [documentation in
+ProtoSourcesProvider](https://github.com/bazelbuild/bazel/blob/5dbb23ba44ec0037cf0944b17716ea3f08a69c27/src/main/java/com/google/devtools/build/lib/rules/proto/ProtoSourcesProvider.java#L121).