aboutsummaryrefslogtreecommitdiffhomepage
diff options
context:
space:
mode:
authorGravatar Jon Brandvein <brandjon@google.com>2017-02-24 16:25:15 +0000
committerGravatar Yue Gan <yueg@google.com>2017-02-27 15:04:54 +0000
commit25aa033ad5657a5cfa16e8307464648b9374be2d (patch)
tree87be5359781fc494a7b92721f557a7c18434ef1d
parentdacc16dde31498f35d124ff632f2d710a0de1111 (diff)
Add more depset documentation
Updated class doc, added examples to depsets.md. Constructor/method doc updates will be a follow-up CL. -- PiperOrigin-RevId: 148463032 MOS_MIGRATED_REVID=148463032
-rw-r--r--site/versions/master/docs/skylark/depsets.md125
-rw-r--r--src/main/java/com/google/devtools/build/lib/syntax/SkylarkNestedSet.java94
2 files changed, 156 insertions, 63 deletions
diff --git a/site/versions/master/docs/skylark/depsets.md b/site/versions/master/docs/skylark/depsets.md
index a29043e431..6c56a3454f 100644
--- a/site/versions/master/docs/skylark/depsets.md
+++ b/site/versions/master/docs/skylark/depsets.md
@@ -172,7 +172,16 @@ depset, the result is a copy of `a` where:
In all cases, the original depset is left unmodified because depsets are
immutable. The returned value shares most of its internal structure with the old
-depset.
+depset. As with other immutable types, `s += t` is shorthand for `s = s + t`.
+
+```python
+s = depset(["a", "b", "c"])
+t = s
+s += depset(["d", "e"])
+
+print(s) # depset(["d", "e", "a", "b", "c"])
+print(t) # depset(["a", "b", "c"])
+```
To retrieve the contents of a depset, use the
[to_list()](lib/depset.html#to_list) method. It returns a list of all transitive
@@ -180,6 +189,63 @@ elements, not including duplicates. There is no way to directly inspect the
precise structure of the DAG, although this structure does affect the order in
which the elements are returned.
+```python
+s = depset(["a", "b", "c"])
+
+print("c" in s.to_list()) # True
+print(s.to_list() == ["a", "b", "c"]) # True
+```
+
+The allowed items in a depset are restricted, just as the allowed keys in
+dictionaries are restricted. In particular, depset contents may not be mutable.
+
+Depsets use reference equality: a depset is equal to itself, but unequal to any
+other depset, even if they have the same contents and same internal structure.
+
+```python
+s = depset(["a", "b", "c"])
+t = s
+print(s == t) # True
+
+t = depset(["a", "b", "c"])
+print(s == t) # False
+
+t = s
+# Trivial modification that adds no elements
+t += []
+print(s == t) # False
+
+d = {}
+d[s] = None
+d[t] = None
+print(len(d)) # 2
+```
+
+To compare depsets by their contents, convert them to sorted lists.
+
+```python
+s = depset(["a", "b", "c"])
+t = depset(["c", "b", "a"])
+print(sorted(s.to_list()) == sorted(t.to_list())) # True
+```
+
+There is no ability to remove elements from a depset. If this is needed, you
+must read out the entire contents of the depset, filter the elements you want to
+remove, and reconstruct a new depset. This is not particularly efficient.
+
+```python
+s = depset(["a", "b", "c"])
+t = depset(["b", "c"])
+
+# Compute set difference s - t. Precompute t.to_list() so it's not done
+# in a loop, and convert it to a dictionary for fast membership tests.
+t_items = {e: None for e in t.to_list()}
+diff_items = [x for x in s.to_list() if x not in t_items]
+# Convert back to depset if it's still going to be used for merge operations.
+s = depset(diff_items)
+print(s) # depset(["a"])
+```
+
## Order
The `to_list` operation performs a traversal over the DAG. The kind of traversal
@@ -195,20 +261,60 @@ traversals](https://en.wikipedia.org/wiki/Tree_traversal#Depth-first_search)
except that they operate on DAGs and skip already visited nodes. The third order
works as a topological sort from root to leaves, essentially the same as
preorder except that shared children are listed only after all of their parents.
-Preorder and postorder operate as left-to-right traversals. For topological
+Preorder and postorder operate as left-to-right traversals, but note that within
+each node direct elements have no order relative to children. For topological
order, there is no left-to-right guarantee, and even the
all-parents-before-child guarantee does not apply in the case that there are
duplicate elements in different nodes of the DAG.
+```python
+# This demonstrates how the + operator interacts with traversal orders.
+
+def create(order):
+ # Create s with "a" and "b" as direct elements.
+ s = depset(["a", "b"], order=order)
+ # Add a new child with contents "c" and "d".
+ s += depset(["c", "d"], order=order)
+ # Append "e" and "f" as direct elements.
+ s += ["e", "f"]
+ # Add a new child with contents "g" and "h"
+ s += depset(["g", "h"], order=order)
+ # During postorder traversal, all contents of children are emitted first,
+ # then the direct contents.
+ return s
+
+print(create("postorder").to_list()) # ["c", "d", "g", "h", "a", "b", "e", "f"]
+print(create("preorder").to_list()) # ["a", "b", "e", "f", "c", "d", "g", "h"]
+```
+
+```python
+# This demonstrates different orders on a diamond graph.
+
+def create(order):
+ a = depset(["a"], order=order)
+ b = depset(["b"], order=order)
+ b += a
+ c = depset(["c"], order=order)
+ c += a
+ d = depset(["d"], order=order)
+ d = d + b + c
+ return d
+
+print(create("postorder").to_list()) # ["a", "b", "c", "d"]
+print(create("preorder").to_list()) # ["d", "b", "a", "c"]
+print(create("topological").to_list()) # ["d", "b", "c", "a"]
+```
+
Due to how traversals are implemented, the order must be specified at the time
the depset is created with the constructor’s `order` keyword argument. If this
argument is omitted, the depset has the special `default` order, in which case
-there are no guarantees about the order of any of its elements. For safety,
-depsets with different orders cannot be merged with the `+` operator unless one
-of them uses the default order; the resulting depset’s order is the same as the
-left operand. Note that when two depsets of different order are merged in this
-way, the child may appear to have had its elements rearranged when it is
-traversed via the parent.
+there are no guarantees about the order of any of its elements.
+
+For safety, depsets with different orders cannot be merged with the `+` operator
+unless one of them uses the default order; the resulting depset’s order is the
+same as the left operand. Note that when two depsets of different order are
+merged in this way, the child may appear to have had its elements rearranged
+when it is traversed via the parent.
## Performance
@@ -284,6 +390,9 @@ and/or upcoming changes.
the depset itself. Direct iteration over depsets is deprecated and will be
removed.
+* Depset elements currently must have the same type, e.g. all ints or all
+ strings. This restriction will be lifted.
+
* The `|` operator is defined for depsets as a synonym for `+`. This will be
going away; use `+` instead.
diff --git a/src/main/java/com/google/devtools/build/lib/syntax/SkylarkNestedSet.java b/src/main/java/com/google/devtools/build/lib/syntax/SkylarkNestedSet.java
index e7efcd301c..902d4a9c1d 100644
--- a/src/main/java/com/google/devtools/build/lib/syntax/SkylarkNestedSet.java
+++ b/src/main/java/com/google/devtools/build/lib/syntax/SkylarkNestedSet.java
@@ -44,62 +44,46 @@ import javax.annotation.Nullable;
@SkylarkModule(
name = "depset",
category = SkylarkModuleCategory.BUILTIN,
- // TODO(bazel-team): Move this documentation to a dedicated page and link to it from here.
doc =
- "A language built-in type that supports efficiently accumulating data from transitive "
- + "dependencies. Note that depsets are not hash sets: they don't support fast membership "
- + "tests, but on the contrary they support fast union. Depsets are designed to be used as "
- + "a collection of items (such as file names) generated by Bazel targets. "
- + "Depsets can be created using the <a href=\"globals.html#depset\">depset</a> function, "
- + "and they support the <code>+</code> operator to extend the depset with more elements or "
- + "to nest other depsets inside of it (the <code>|</code> is equivalent). Examples:<br>"
-
- + "<pre class=language-python>s = depset([1, 2])\n"
- + "s = s + [3] # s == {1, 2, 3}\n"
- + "s = s + depset([4, 5]) # s == {1, 2, 3, {4, 5}}\n"
- + "other = depset([\"a\", \"b\", \"c\"], order=\"postorder\")</pre>"
-
- + "Note that in these examples <code>{..}</code> is not a valid literal to create depsets. "
- + "Depsets have a fixed generic type, so <code>depset([1]) + [\"a\"]</code> or "
- + "<code>depset([1]) + depset([\"a\"])</code> results in an error.<br>"
- + "Elements in a depset can neither be mutable nor be of type <code>list</code>, "
- + "<code>struct</code> or <code>dict</code>.<br>"
- + "When aggregating data from providers, depsets can take significantly less memory than "
- + "other types as they support nesting, that is, their subsets are shared in memory.<br>"
- + "Every depset has an <code>order</code> parameter which determines the iteration order. "
- + "There are four possible values:"
-
- + "<ul><li><code>postorder</code> (formerly <code>compile</code>): Defines a left-to-right "
- + "post-ordering where child elements come after those of nested depsets (parent-last). "
- + "For example, <code>{1, 2, 3, {4, 5}}</code> leads to <code>4 5 1 2 3</code>. "
- + "Left-to-right order is preserved for both the child elements and the references to "
- + "nested depsets.</li>"
-
- + "<li><code>default</code> (formerly <code>stable</code>): Same behavior as "
- + "<code>postorder</code>.</li>"
-
- + "<li><code>topological</code> (formerly <code>link</code>): Defines a variation of "
- + "left-to-right pre-ordering, i.e. <code>{1, 2, 3, {4, 5}}</code> leads to "
- + "<code>1 2 3 4 5</code>. This ordering enforces that elements of the depset always come "
- + "before elements of nested depsets (parent-first), which may lead to situations where "
- + "left-to-right order cannot be preserved (<a href=\"https://github.com/bazelbuild/bazel/blob/master/src/main/java/com/google/devtools/build/lib/collect/nestedset/LinkOrderExpander.java#L56\">Example</a>)."
- + "</li>"
-
- + "<li><code>preorder</code> (formerly <code>naive_link</code>): Defines \"naive\" "
- + "left-to-right pre-ordering (parent-first), i.e. <code>{1, 2, 3, {4, 5}}</code> leads to "
- + "<code>1 2 3 4 5</code>. Unlike <code>topological</code> ordering, it will sacrifice the "
- + "parent-first property in order to uphold left-to-right order in cases where both "
- + "properties cannot be guaranteed (<a href=\"https://github.com/bazelbuild/bazel/blob/master/src/main/java/com/google/devtools/build/lib/collect/nestedset/NaiveLinkOrderExpander.java#L26\">Example</a>)."
- + "</li></ul>"
-
- + "Except for <code>default</code>, the above values are incompatible with each other. "
- + "Consequently, two depsets can only be merged via the <code>+</code> operator or via "
- + "<code>union()</code> if either both depsets have the same <code>order</code> or one of "
- + "the depsets has <code>stable</code> order. In the latter case the iteration order will "
- + "be determined by the outer depset, thus ignoring the <code>order</code> parameter of "
- + "nested depsets.<br>"
- + "The function <code>set</code> is a deprecated alias for <code>depset</code>. Please "
- + "update legacy code and use only <code>depset</code>."
+ "<p>A specialized data structure that supports efficient merge operations and has a "
+ + "defined traversal order. Commonly used for accumulating data from transitive "
+ + "dependencies in rules and aspects. For more information see "
+ + "<a href=\"../depsets.html\">here</a>."
+
+ + "<p>Depsets are not implemented as hash sets and do not support fast membership tests. "
+ + "If you need a general set datatype, you can simulate one using a dictionary where all "
+ + "keys map to <code>None</code>."
+
+ + "<p>Depsets are immutable. They can be created using their "
+ + "<a href=\"globals.html#depset\">constructor function</a> and merged or augmented using "
+ + "the <code>+</code> operator."
+
+ + "<p>The <code>order</code> parameter determines the kind of traversal that is done to "
+ + "convert the depset to an iterable. There are four possible values:"
+
+ + "<ul>"
+ + "<li><code>default</code> (formerly <code>stable</code>): Order is unspecified (but "
+ + "deterministic).</li>"
+
+ + "<li><code>postorder</code> (formerly <code>compile</code>): A left-to-right post-"
+ + "ordering. Precisely, this recursively traverses all children leftmost-first, then the "
+ + "direct elements leftmost-first.</li>"
+
+ + "<li><code>preorder</code> (formerly <code>naive_link</code>): A left-to-right pre-"
+ + "ordering. Precisely, this traverses the direct elements leftmost-first, then "
+ + "recursively traverses the children leftmost-first.</li>"
+
+ + "<li><code>topological</code> (formerly <code>link</code>): A topological ordering from "
+ + "the root down to the leaves. There is no left-to-right guarantee.</li>"
+ + "</ul>"
+
+ + "<p>Two depsets may only be merged (via <code>+</code> or the <code>union()</code> "
+ + "method) if either both depsets have the same order, or one of them has <code>stable"
+ + "</code> order. In the latter case the resulting depset's order will be the same as the "
+ + "left operand's."
+
+ + "<p>The function <code>set()</code> is a deprecated alias for <code>depset()</code>. "
+ + "Please update legacy code and use only <code>depset()</code>."
)
@Immutable
public final class SkylarkNestedSet implements SkylarkValue, SkylarkQueryable {