author     A. Unique TensorFlower <gardener@tensorflow.org>   2018-10-05 08:46:54 -0700
committer  TensorFlower Gardener <gardener@tensorflow.org>    2018-10-05 08:53:12 -0700
commit     53faa313b7628cd8c9fbb836544cc6482cafb7a4 (patch)
tree       89f113eb0e9239f0f9ce4eba0ffdc1eff16b58d0 /third_party/nccl
parent     cea6b4959152981ab778001f30ff9ad87bb4fc9e (diff)
Switch NCCL to build from open source (version 2.3.5-5) by default.
Note to users manually patching ptxas from a later toolkit version: Building NCCL requires the same version of ptxas and nvlink.

PiperOrigin-RevId: 215911973
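
As a quick check related to the note above, the toolkit versions of ptxas and nvlink can be compared before a build. A minimal sketch in Python, assuming both tools are on the PATH and print a "release X.Y" line from --version as current CUDA toolkits do; the script is illustrative and not part of this change:

    # Hedged example: compare the CUDA release reported by ptxas and nvlink.
    import re
    import subprocess

    def cuda_release(tool):
        """Returns the 'release X.Y' string printed by `tool --version`, or None."""
        output = subprocess.run([tool, "--version"], capture_output=True, text=True).stdout
        match = re.search(r"release (\d+\.\d+)", output)
        return match.group(1) if match else None

    ptxas_release = cuda_release("ptxas")
    nvlink_release = cuda_release("nvlink")
    if ptxas_release != nvlink_release:
        raise SystemExit("ptxas (%s) and nvlink (%s) come from different toolkits"
                         % (ptxas_release, nvlink_release))
    print("ptxas and nvlink both report CUDA release %s" % ptxas_release)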
Diffstat (limited to 'third_party/nccl')
-rw-r--r--  third_party/nccl/LICENSE               231
-rw-r--r--  third_party/nccl/archive.BUILD         179
-rw-r--r--  third_party/nccl/build_defs.bzl.tpl    210
-rw-r--r--  third_party/nccl/nccl_archive.BUILD     68
-rw-r--r--  third_party/nccl/nccl_configure.bzl    214
5 files changed, 536 insertions(+), 366 deletions(-)
diff --git a/third_party/nccl/LICENSE b/third_party/nccl/LICENSE
index 146d9b765c..b958518186 100644
--- a/third_party/nccl/LICENSE
+++ b/third_party/nccl/LICENSE
@@ -1,203 +1,30 @@
-Copyright 2018 The TensorFlow Authors. All rights reserved.
- Apache License
- Version 2.0, January 2004
- http://www.apache.org/licenses/
-
- TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
-
- 1. Definitions.
-
- "License" shall mean the terms and conditions for use, reproduction,
- and distribution as defined by Sections 1 through 9 of this document.
-
- "Licensor" shall mean the copyright owner or entity authorized by
- the copyright owner that is granting the License.
-
- "Legal Entity" shall mean the union of the acting entity and all
- other entities that control, are controlled by, or are under common
- control with that entity. For the purposes of this definition,
- "control" means (i) the power, direct or indirect, to cause the
- direction or management of such entity, whether by contract or
- otherwise, or (ii) ownership of fifty percent (50%) or more of the
- outstanding shares, or (iii) beneficial ownership of such entity.
-
- "You" (or "Your") shall mean an individual or Legal Entity
- exercising permissions granted by this License.
-
- "Source" form shall mean the preferred form for making modifications,
- including but not limited to software source code, documentation
- source, and configuration files.
-
- "Object" form shall mean any form resulting from mechanical
- transformation or translation of a Source form, including but
- not limited to compiled object code, generated documentation,
- and conversions to other media types.
-
- "Work" shall mean the work of authorship, whether in Source or
- Object form, made available under the License, as indicated by a
- copyright notice that is included in or attached to the work
- (an example is provided in the Appendix below).
-
- "Derivative Works" shall mean any work, whether in Source or Object
- form, that is based on (or derived from) the Work and for which the
- editorial revisions, annotations, elaborations, or other modifications
- represent, as a whole, an original work of authorship. For the purposes
- of this License, Derivative Works shall not include works that remain
- separable from, or merely link (or bind by name) to the interfaces of,
- the Work and Derivative Works thereof.
-
- "Contribution" shall mean any work of authorship, including
- the original version of the Work and any modifications or additions
- to that Work or Derivative Works thereof, that is intentionally
- submitted to Licensor for inclusion in the Work by the copyright owner
- or by an individual or Legal Entity authorized to submit on behalf of
- the copyright owner. For the purposes of this definition, "submitted"
- means any form of electronic, verbal, or written communication sent
- to the Licensor or its representatives, including but not limited to
- communication on electronic mailing lists, source code control systems,
- and issue tracking systems that are managed by, or on behalf of, the
- Licensor for the purpose of discussing and improving the Work, but
- excluding communication that is conspicuously marked or otherwise
- designated in writing by the copyright owner as "Not a Contribution."
-
- "Contributor" shall mean Licensor and any individual or Legal Entity
- on behalf of whom a Contribution has been received by Licensor and
- subsequently incorporated within the Work.
-
- 2. Grant of Copyright License. Subject to the terms and conditions of
- this License, each Contributor hereby grants to You a perpetual,
- worldwide, non-exclusive, no-charge, royalty-free, irrevocable
- copyright license to reproduce, prepare Derivative Works of,
- publicly display, publicly perform, sublicense, and distribute the
- Work and such Derivative Works in Source or Object form.
-
- 3. Grant of Patent License. Subject to the terms and conditions of
- this License, each Contributor hereby grants to You a perpetual,
- worldwide, non-exclusive, no-charge, royalty-free, irrevocable
- (except as stated in this section) patent license to make, have made,
- use, offer to sell, sell, import, and otherwise transfer the Work,
- where such license applies only to those patent claims licensable
- by such Contributor that are necessarily infringed by their
- Contribution(s) alone or by combination of their Contribution(s)
- with the Work to which such Contribution(s) was submitted. If You
- institute patent litigation against any entity (including a
- cross-claim or counterclaim in a lawsuit) alleging that the Work
- or a Contribution incorporated within the Work constitutes direct
- or contributory patent infringement, then any patent licenses
- granted to You under this License for that Work shall terminate
- as of the date such litigation is filed.
-
- 4. Redistribution. You may reproduce and distribute copies of the
- Work or Derivative Works thereof in any medium, with or without
- modifications, and in Source or Object form, provided that You
- meet the following conditions:
-
- (a) You must give any other recipients of the Work or
- Derivative Works a copy of this License; and
-
- (b) You must cause any modified files to carry prominent notices
- stating that You changed the files; and
-
- (c) You must retain, in the Source form of any Derivative Works
- that You distribute, all copyright, patent, trademark, and
- attribution notices from the Source form of the Work,
- excluding those notices that do not pertain to any part of
- the Derivative Works; and
-
- (d) If the Work includes a "NOTICE" text file as part of its
- distribution, then any Derivative Works that You distribute must
- include a readable copy of the attribution notices contained
- within such NOTICE file, excluding those notices that do not
- pertain to any part of the Derivative Works, in at least one
- of the following places: within a NOTICE text file distributed
- as part of the Derivative Works; within the Source form or
- documentation, if provided along with the Derivative Works; or,
- within a display generated by the Derivative Works, if and
- wherever such third-party notices normally appear. The contents
- of the NOTICE file are for informational purposes only and
- do not modify the License. You may add Your own attribution
- notices within Derivative Works that You distribute, alongside
- or as an addendum to the NOTICE text from the Work, provided
- that such additional attribution notices cannot be construed
- as modifying the License.
-
- You may add Your own copyright statement to Your modifications and
- may provide additional or different license terms and conditions
- for use, reproduction, or distribution of Your modifications, or
- for any such Derivative Works as a whole, provided Your use,
- reproduction, and distribution of the Work otherwise complies with
- the conditions stated in this License.
-
- 5. Submission of Contributions. Unless You explicitly state otherwise,
- any Contribution intentionally submitted for inclusion in the Work
- by You to the Licensor shall be under the terms and conditions of
- this License, without any additional terms or conditions.
- Notwithstanding the above, nothing herein shall supersede or modify
- the terms of any separate license agreement you may have executed
- with Licensor regarding such Contributions.
-
- 6. Trademarks. This License does not grant permission to use the trade
- names, trademarks, service marks, or product names of the Licensor,
- except as required for reasonable and customary use in describing the
- origin of the Work and reproducing the content of the NOTICE file.
-
- 7. Disclaimer of Warranty. Unless required by applicable law or
- agreed to in writing, Licensor provides the Work (and each
- Contributor provides its Contributions) on an "AS IS" BASIS,
- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
- implied, including, without limitation, any warranties or conditions
- of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
- PARTICULAR PURPOSE. You are solely responsible for determining the
- appropriateness of using or redistributing the Work and assume any
- risks associated with Your exercise of permissions under this License.
-
- 8. Limitation of Liability. In no event and under no legal theory,
- whether in tort (including negligence), contract, or otherwise,
- unless required by applicable law (such as deliberate and grossly
- negligent acts) or agreed to in writing, shall any Contributor be
- liable to You for damages, including any direct, indirect, special,
- incidental, or consequential damages of any character arising as a
- result of this License or out of the use or inability to use the
- Work (including but not limited to damages for loss of goodwill,
- work stoppage, computer failure or malfunction, or any and all
- other commercial damages or losses), even if such Contributor
- has been advised of the possibility of such damages.
-
- 9. Accepting Warranty or Additional Liability. While redistributing
- the Work or Derivative Works thereof, You may choose to offer,
- and charge a fee for, acceptance of support, warranty, indemnity,
- or other liability obligations and/or rights consistent with this
- License. However, in accepting such obligations, You may act only
- on Your own behalf and on Your sole responsibility, not on behalf
- of any other Contributor, and only if You agree to indemnify,
- defend, and hold each Contributor harmless for any liability
- incurred by, or claims asserted against, such Contributor by reason
- of your accepting any such warranty or additional liability.
-
- END OF TERMS AND CONDITIONS
-
- APPENDIX: How to apply the Apache License to your work.
-
- To apply the Apache License to your work, attach the following
- boilerplate notice, with the fields enclosed by brackets "[]"
- replaced with your own identifying information. (Don't include
- the brackets!) The text should be enclosed in the appropriate
- comment syntax for the file format. We also recommend that a
- file or class name and description of purpose be included on the
- same "printed page" as the copyright notice for easier
- identification within third-party archives.
-
- Copyright 2018, The TensorFlow Authors.
-
- Licensed under the Apache License, Version 2.0 (the "License");
- you may not use this file except in compliance with the License.
- You may obtain a copy of the License at
-
- http://www.apache.org/licenses/LICENSE-2.0
-
- Unless required by applicable law or agreed to in writing, software
- distributed under the License is distributed on an "AS IS" BASIS,
- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- See the License for the specific language governing permissions and
- limitations under the License.
+ Copyright (c) 2015-2018, NVIDIA CORPORATION. All rights reserved.
+
+ Redistribution and use in source and binary forms, with or without
+ modification, are permitted provided that the following conditions
+ are met:
+ * Redistributions of source code must retain the above copyright
+ notice, this list of conditions and the following disclaimer.
+ * Redistributions in binary form must reproduce the above copyright
+ notice, this list of conditions and the following disclaimer in the
+ documentation and/or other materials provided with the distribution.
+ * Neither the name of NVIDIA CORPORATION, Lawrence Berkeley National
+ Laboratory, the U.S. Department of Energy, nor the names of their
+ contributors may be used to endorse or promote products derived
+ from this software without specific prior written permission.
+
+ THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS ``AS IS'' AND ANY
+ EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
+ PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR
+ CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
+ EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
+ PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
+ PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY
+ OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+ The U.S. Department of Energy funded the development of this software
+ under subcontract 7078610 with Lawrence Berkeley National Laboratory.
diff --git a/third_party/nccl/archive.BUILD b/third_party/nccl/archive.BUILD
new file mode 100644
index 0000000000..f57f04c75e
--- /dev/null
+++ b/third_party/nccl/archive.BUILD
@@ -0,0 +1,179 @@
+# NVIDIA NCCL 2
+# A package of optimized primitives for collective multi-GPU communication.
+
+licenses(["restricted"])
+
+exports_files(["LICENSE.txt"])
+
+load(
+ "@local_config_nccl//:build_defs.bzl",
+ "device_link",
+ "gen_nccl_h",
+ "nccl_library",
+ "rdc_copts",
+)
+load(
+ "@local_config_cuda//cuda:build_defs.bzl",
+ "cuda_default_copts",
+)
+
+# Generate the nccl.h header file.
+gen_nccl_h(
+ name = "nccl_h",
+ output = "src/nccl.h",
+ template = "src/nccl.h.in",
+)
+
+nccl_library(
+ name = "src_hdrs",
+ hdrs = [
+ "src/nccl.h",
+ # src/include/common_coll.h #includes "collectives/collectives.h".
+ # All other #includes of collectives.h are patched in process_srcs.
+ "src/collectives/collectives.h",
+ ],
+ strip_include_prefix = "src",
+)
+
+nccl_library(
+ name = "include_hdrs",
+ hdrs = glob(["src/include/*.h"]),
+ strip_include_prefix = "src/include",
+)
+
+filegroup(
+ name = "device_hdrs",
+ srcs = glob(["src/collectives/device/*.h"]),
+)
+
+filegroup(
+ name = "device_srcs",
+ srcs = [
+ "src/collectives/device/all_gather.cu",
+ "src/collectives/device/all_reduce.cu",
+ "src/collectives/device/broadcast.cu",
+ "src/collectives/device/reduce.cu",
+ "src/collectives/device/reduce_scatter.cu",
+ ],
+)
+
+nccl_library(
+ name = "sum",
+ srcs = [
+ ":device_hdrs",
+ ":device_srcs",
+ ],
+ copts = ["-DNCCL_OP=0"] + rdc_copts(),
+ prefix = "sum_",
+ deps = [
+ ":src_hdrs",
+ ":include_hdrs",
+ "@local_config_cuda//cuda:cuda_headers",
+ ],
+ linkstatic = True,
+)
+
+nccl_library(
+ name = "prod",
+ srcs = [
+ ":device_hdrs",
+ ":device_srcs",
+ ],
+ copts = ["-DNCCL_OP=1"] + rdc_copts(),
+ prefix = "prod_",
+ deps = [
+ ":src_hdrs",
+ ":include_hdrs",
+ "@local_config_cuda//cuda:cuda_headers",
+ ],
+ linkstatic = True,
+)
+
+nccl_library(
+ name = "min",
+ srcs = [
+ ":device_hdrs",
+ ":device_srcs",
+ ],
+ copts = ["-DNCCL_OP=2"] + rdc_copts(),
+ prefix = "min_",
+ deps = [
+ ":src_hdrs",
+ ":include_hdrs",
+ "@local_config_cuda//cuda:cuda_headers",
+ ],
+ linkstatic = True,
+)
+
+nccl_library(
+ name = "max",
+ srcs = [
+ ":device_hdrs",
+ ":device_srcs",
+ ],
+ copts = ["-DNCCL_OP=3"] + rdc_copts(),
+ prefix = "max_",
+ deps = [
+ ":src_hdrs",
+ ":include_hdrs",
+ "@local_config_cuda//cuda:cuda_headers",
+ ],
+ linkstatic = True,
+)
+
+nccl_library(
+ name = "functions",
+ srcs = [
+ ":device_hdrs",
+ "src/collectives/device/functions.cu",
+ ],
+ copts = rdc_copts(),
+ deps = [
+ ":src_hdrs",
+ ":include_hdrs",
+ "@local_config_cuda//cuda:cuda_headers",
+ ],
+ linkstatic = True,
+)
+
+device_link(
+ name = "device_code",
+ srcs = [
+ ":functions",
+ ":max",
+ ":min",
+ ":prod",
+ ":sum",
+ ],
+)
+
+# Primary NCCL target.
+nccl_library(
+ name = "nccl",
+ srcs = glob(
+ include = ["src/**/*.cu"],
+ # Exclude device-library code.
+ exclude = ["src/collectives/device/**"],
+ ) + [
+ # Required for header inclusion checking (see
+ # http://docs.bazel.build/versions/master/be/c-cpp.html#hdrs).
+ # Files in src/ which #include "nccl.h" load it from there rather than
+ # from the virtual includes directory.
+ "src/nccl.h",
+ ],
+ hdrs = ["src/nccl.h"],
+ include_prefix = "third_party/nccl",
+ strip_include_prefix = "src",
+ copts = cuda_default_copts(),
+ deps = [
+ ":device_code",
+ ":functions",
+ ":include_hdrs",
+ ":max",
+ ":min",
+ ":prod",
+ ":src_hdrs",
+ ":sum",
+ ],
+ visibility = ["//visibility:public"],
+)
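
For reference, downstream targets are expected to consume this library through the configured @local_config_nccl repository, which aliases either this archive build or a locally installed NCCL (see nccl_configure.bzl below). A minimal consumer sketch; the target and source file names are hypothetical:

    # Hypothetical consumer; only the "@local_config_nccl//:nccl" dependency
    # is taken from this change.
    cc_library(
        name = "my_collective_op",
        srcs = ["my_collective_op.cu.cc"],
        deps = [
            "@local_config_nccl//:nccl",
        ],
    )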
diff --git a/third_party/nccl/build_defs.bzl.tpl b/third_party/nccl/build_defs.bzl.tpl
new file mode 100644
index 0000000000..ede1d3dad5
--- /dev/null
+++ b/third_party/nccl/build_defs.bzl.tpl
@@ -0,0 +1,210 @@
+"""Repository rule for NCCL."""
+
+load("@local_config_cuda//cuda:build_defs.bzl", "cuda_default_copts")
+
+def _gen_nccl_h_impl(ctx):
+ """Creates nccl.h from a template."""
+ ctx.actions.expand_template(
+ output = ctx.outputs.output,
+ template = ctx.file.template,
+ substitutions = {
+ "${nccl:Major}": "2",
+ "${nccl:Minor}": "3",
+ "${nccl:Patch}": "5",
+ "${nccl:Suffix}": "",
+ "${nccl:Version}": "2305",
+ },
+ )
+gen_nccl_h = rule(
+ implementation = _gen_nccl_h_impl,
+ attrs = {
+ "template": attr.label(allow_single_file = True),
+ "output": attr.output(),
+ },
+)
+"""Creates the NCCL header file."""
+
+
+def _process_srcs_impl(ctx):
+ """Appends .cc to .cu files, patches include directives."""
+ files = []
+ for src in ctx.files.srcs:
+ if not src.is_source:
+ # Process only once, specifically "src/nccl.h".
+ files.append(src)
+ continue
+ name = src.basename
+ if src.extension == "cu":
+ name = ctx.attr.prefix + name + ".cc"
+ file = ctx.actions.declare_file(name, sibling = src)
+ ctx.actions.expand_template(
+ output = file,
+ template = src,
+ substitutions = {
+ "\"collectives.h": "\"collectives/collectives.h",
+ "\"../collectives.h": "\"collectives/collectives.h",
+ "#if __CUDACC_VER_MAJOR__":
+ "#if defined __CUDACC_VER_MAJOR__ && __CUDACC_VER_MAJOR__",
+ # Substitutions are applied in order.
+ "std::nullptr_t": "nullptr_t",
+ "nullptr_t": "std::nullptr_t",
+ },
+ )
+ files.append(file)
+ return [DefaultInfo(files = depset(files))]
+_process_srcs = rule(
+ implementation = _process_srcs_impl,
+ attrs = {
+ "srcs": attr.label_list(allow_files = True),
+ "prefix": attr.string(default = ""),
+ },
+)
+"""Processes the NCCL srcs so they can be compiled with bazel and clang."""
+
+
+def nccl_library(name, srcs=None, hdrs=None, prefix=None, **kwargs):
+ """Processes the srcs and hdrs and creates a cc_library."""
+
+ _process_srcs(
+ name = name + "_srcs",
+ srcs = srcs,
+ prefix = prefix,
+ )
+ _process_srcs(
+ name = name + "_hdrs",
+ srcs = hdrs,
+ )
+
+ native.cc_library(
+ name = name,
+ srcs = [name + "_srcs"] if srcs else [],
+ hdrs = [name + "_hdrs"] if hdrs else [],
+ **kwargs
+ )
+
+
+def rdc_copts():
+ """Returns copts for compiling relocatable device code."""
+
+ # The global functions can not have a lower register count than the
+ # device functions. This is enforced by setting a fixed register count.
+ # https://github.com/NVIDIA/nccl/blob/f93fe9bfd94884cec2ba711897222e0df5569a53/makefiles/common.mk#L48
+ maxrregcount = "-maxrregcount=96"
+
+ return cuda_default_copts() + select({
+ "@local_config_cuda//cuda:using_nvcc": [
+ "-nvcc_options",
+ "relocatable-device-code=true",
+ "-nvcc_options",
+ "ptxas-options=" + maxrregcount,
+ ],
+ "@local_config_cuda//cuda:using_clang": [
+ "-fcuda-rdc",
+ "-Xcuda-ptxas",
+ maxrregcount,
+ ],
+ "//conditions:default": [],
+ }) + ["-fvisibility=hidden"]
+
+
+def _filter_impl(ctx):
+ suffix = ctx.attr.suffix
+ files = [src for src in ctx.files.srcs if src.path.endswith(suffix)]
+ return [DefaultInfo(files = depset(files))]
+_filter = rule(
+ implementation = _filter_impl,
+ attrs = {
+ "srcs": attr.label_list(allow_files = True),
+ "suffix": attr.string(),
+ },
+)
+"""Filters the srcs to the ones ending with suffix."""
+
+
+def _gen_link_src_impl(ctx):
+ ctx.actions.expand_template(
+ output = ctx.outputs.output,
+ template = ctx.file.template,
+ substitutions = {
+ "REGISTERLINKBINARYFILE": '"%s"' % ctx.file.register_hdr.short_path,
+ "FATBINFILE": '"%s"' % ctx.file.fatbin_hdr.short_path,
+ },
+ )
+_gen_link_src = rule(
+ implementation = _gen_link_src_impl,
+ attrs = {
+ "register_hdr": attr.label(allow_single_file = True),
+ "fatbin_hdr": attr.label(allow_single_file = True),
+ "template": attr.label(allow_single_file = True),
+ "output": attr.output(),
+ },
+)
+"""Patches the include directives for the link.stub file."""
+
+
+def device_link(name, srcs):
+ """Links seperately compiled relocatable device code into a cc_library."""
+
+ # From .a and .pic.a archives, just use the latter.
+ _filter(
+ name = name + "_pic_a",
+ srcs = srcs,
+ suffix = ".pic.a",
+ )
+
+ # Device-link to cubins for each architecture.
+ images = []
+ cubins = []
+ for arch in %{gpu_architectures}:
+ cubin = "%s_%s.cubin" % (name, arch)
+ register_hdr = "%s_%s.h" % (name, arch)
+ nvlink = "@local_config_nccl//:nvlink"
+ cmd = ("$(location %s) --cpu-arch=X86_64 " % nvlink +
+ "--arch=%s $(SRCS) " % arch +
+ "--register-link-binaries=$(location %s) " % register_hdr +
+ "--output-file=$(location %s)" % cubin)
+ native.genrule(
+ name = "%s_%s" % (name, arch),
+ outs = [register_hdr, cubin],
+ srcs = [name + "_pic_a"],
+ cmd = cmd,
+ tools = [nvlink],
+ )
+ images.append("--image=profile=%s,file=$(location %s)" % (arch, cubin))
+ cubins.append(cubin)
+
+ # Generate fatbin header from all cubins.
+ fatbin_hdr = name + ".fatbin.h"
+ fatbinary = "@local_config_nccl//:cuda/bin/fatbinary"
+ cmd = ("PATH=$$CUDA_TOOLKIT_PATH/bin:$$PATH " + # for bin2c
+ "$(location %s) -64 --cmdline=--compile-only --link " % fatbinary +
+ "--compress-all %s --create=%%{name}.fatbin " % " ".join(images) +
+ "--embedded-fatbin=$@")
+ native.genrule(
+ name = name + "_fatbin_h",
+ outs = [fatbin_hdr],
+ srcs = cubins,
+ cmd = cmd,
+ tools = [fatbinary],
+ )
+
+ # Generate the source file #including the headers generated above.
+ _gen_link_src(
+ name = name + "_cc",
+ # Include just the last one, they are equivalent.
+ register_hdr = register_hdr,
+ fatbin_hdr = fatbin_hdr,
+ template = "@local_config_nccl//:cuda/bin/crt/link.stub",
+ output = name + ".cc",
+ )
+
+ # Compile the source file into the cc_library.
+ native.cc_library(
+ name = name,
+ srcs = [name + "_cc"],
+ textual_hdrs = [register_hdr, fatbin_hdr],
+ deps = [
+ "@local_config_cuda//cuda:cuda_headers",
+ "@local_config_cuda//cuda:cudart_static",
+ ],
+ )
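
To make the template concrete: %{gpu_architectures} is substituted by nccl_configure (below) from TF_CUDA_COMPUTE_CAPABILITIES, so for a single assumed architecture such as sm_70 the per-architecture loop in device_link declares a genrule roughly equivalent to the following expansion (illustrative only):

    # Illustrative expansion of the loop above for arch = "sm_70".
    native.genrule(
        name = "device_code_sm_70",
        srcs = ["device_code_pic_a"],
        outs = [
            "device_code_sm_70.h",
            "device_code_sm_70.cubin",
        ],
        cmd = "$(location @local_config_nccl//:nvlink) --cpu-arch=X86_64 " +
              "--arch=sm_70 $(SRCS) " +
              "--register-link-binaries=$(location device_code_sm_70.h) " +
              "--output-file=$(location device_code_sm_70.cubin)",
        tools = ["@local_config_nccl//:nvlink"],
    )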
diff --git a/third_party/nccl/nccl_archive.BUILD b/third_party/nccl/nccl_archive.BUILD
deleted file mode 100644
index a05899e38d..0000000000
--- a/third_party/nccl/nccl_archive.BUILD
+++ /dev/null
@@ -1,68 +0,0 @@
-# NVIDIA nccl
-# A package of optimized primitives for collective multi-GPU communication.
-
-licenses(["notice"]) # BSD
-
-exports_files(["LICENSE.txt"])
-
-load("@local_config_cuda//cuda:build_defs.bzl", "cuda_default_copts", "if_cuda")
-
-SRCS = [
- "src/all_gather.cu",
- "src/all_reduce.cu",
- "src/broadcast.cu",
- "src/core.cu",
- "src/libwrap.cu",
- "src/reduce.cu",
- "src/reduce_scatter.cu",
-]
-
-# Copy .cu to .cu.cc so they can be in srcs of cc_library.
-[
- genrule(
- name = "gen_" + src,
- srcs = [src],
- outs = [src + ".cc"],
- cmd = "cp $(location " + src + ") $(location " + src + ".cc)",
- )
- for src in SRCS
-]
-
-SRCS_CU_CC = [src + ".cc" for src in SRCS]
-
-cc_library(
- name = "nccl",
- srcs = if_cuda(SRCS_CU_CC + glob(["src/*.h"])),
- hdrs = if_cuda(["src/nccl.h"]),
- copts = [
- "-DCUDA_MAJOR=0",
- "-DCUDA_MINOR=0",
- "-DNCCL_MAJOR=0",
- "-DNCCL_MINOR=0",
- "-DNCCL_PATCH=0",
- "-Iexternal/nccl_archive/src",
- "-O3",
- ] + cuda_default_copts(),
- include_prefix = "third_party/nccl",
- linkopts = select({
- "@org_tensorflow//tensorflow:android": [
- "-pie",
- ],
- "@org_tensorflow//tensorflow:darwin": [
- "-Wl,-framework",
- "-Wl,CoreFoundation",
- "-Wl,-framework",
- "-Wl,Security",
- ],
- "@org_tensorflow//tensorflow:ios": [],
- "@org_tensorflow//tensorflow:windows": [
- "-DEFAULTLIB:ws2_32.lib",
- ],
- "//conditions:default": [
- "-lrt",
- ],
- }),
- strip_include_prefix = "src",
- visibility = ["//visibility:public"],
- deps = ["@local_config_cuda//cuda:cuda_headers"],
-)
diff --git a/third_party/nccl/nccl_configure.bzl b/third_party/nccl/nccl_configure.bzl
index d78fe8f3aa..7f00df0962 100644
--- a/third_party/nccl/nccl_configure.bzl
+++ b/third_party/nccl/nccl_configure.bzl
@@ -11,12 +11,16 @@
load(
"//third_party/gpus:cuda_configure.bzl",
"auto_configure_fail",
+ "compute_capabilities",
+ "cuda_toolkit_path",
"find_cuda_define",
"matches_version",
)
-_NCCL_INSTALL_PATH = "NCCL_INSTALL_PATH"
+_CUDA_TOOLKIT_PATH = "CUDA_TOOLKIT_PATH"
_NCCL_HDR_PATH = "NCCL_HDR_PATH"
+_NCCL_INSTALL_PATH = "NCCL_INSTALL_PATH"
+_TF_CUDA_COMPUTE_CAPABILITIES = "TF_CUDA_COMPUTE_CAPABILITIES"
_TF_NCCL_VERSION = "TF_NCCL_VERSION"
_TF_NCCL_CONFIG_REPO = "TF_NCCL_CONFIG_REPO"
@@ -37,6 +41,12 @@ cc_library(
"""
_NCCL_ARCHIVE_BUILD_CONTENT = """
+exports_files([
+ "cuda/bin/crt/link.stub",
+ "cuda/bin/fatbinary",
+ "nvlink",
+])
+
filegroup(
name = "LICENSE",
data = ["@nccl_archive//:LICENSE.txt"],
@@ -50,113 +60,125 @@ alias(
)
"""
-# Local build results in dynamic link and the license should not be included.
-_NCCL_REMOTE_BUILD_TEMPLATE = Label("//third_party/nccl:remote.BUILD.tpl")
-_NCCL_LOCAL_BUILD_TEMPLATE = Label("//third_party/nccl:system.BUILD.tpl")
+def _label(file):
+ return Label("//third_party/nccl:{}".format(file))
def _find_nccl_header(repository_ctx, nccl_install_path):
- """Finds the NCCL header on the system.
-
- Args:
- repository_ctx: The repository context.
- nccl_install_path: The NCCL library install directory.
+ """Finds the NCCL header on the system.
- Returns:
- The path to the NCCL header.
- """
- header_path = repository_ctx.path("%s/include/nccl.h" % nccl_install_path)
- if not header_path.exists:
- auto_configure_fail("Cannot find %s" % str(header_path))
- return header_path
+ Args:
+ repository_ctx: The repository context.
+ nccl_install_path: The NCCL library install directory.
+ Returns:
+ The path to the NCCL header.
+ """
+ header_path = repository_ctx.path("%s/include/nccl.h" % nccl_install_path)
+ if not header_path.exists:
+ auto_configure_fail("Cannot find %s" % str(header_path))
+ return header_path
def _check_nccl_version(repository_ctx, nccl_install_path, nccl_hdr_path, nccl_version):
- """Checks whether the header file matches the specified version of NCCL.
-
- Args:
- repository_ctx: The repository context.
- nccl_install_path: The NCCL library install directory.
- nccl_version: The expected NCCL version.
-
- Returns:
- A string containing the library version of NCCL.
- """
- header_path = repository_ctx.path("%s/nccl.h" % nccl_hdr_path)
- if not header_path.exists:
- header_path = _find_nccl_header(repository_ctx, nccl_install_path)
- header_dir = str(header_path.realpath.dirname)
- major_version = find_cuda_define(repository_ctx, header_dir, "nccl.h",
- _DEFINE_NCCL_MAJOR)
- minor_version = find_cuda_define(repository_ctx, header_dir, "nccl.h",
- _DEFINE_NCCL_MINOR)
- patch_version = find_cuda_define(repository_ctx, header_dir, "nccl.h",
- _DEFINE_NCCL_PATCH)
- header_version = "%s.%s.%s" % (major_version, minor_version, patch_version)
- if not matches_version(nccl_version, header_version):
- auto_configure_fail(
- ("NCCL library version detected from %s/nccl.h (%s) does not match " +
- "TF_NCCL_VERSION (%s). To fix this rerun configure again.") %
- (header_dir, header_version, nccl_version))
-
-
-def _find_nccl_lib(repository_ctx, nccl_install_path, nccl_version):
- """Finds the given NCCL library on the system.
-
- Args:
- repository_ctx: The repository context.
- nccl_install_path: The NCCL library installation directory.
- nccl_version: The version of NCCL library files as returned
- by _nccl_version.
-
- Returns:
- The path to the NCCL library.
- """
- lib_path = repository_ctx.path("%s/lib/libnccl.so.%s" % (nccl_install_path,
- nccl_version))
- if not lib_path.exists:
- auto_configure_fail("Cannot find NCCL library %s" % str(lib_path))
- return lib_path
-
+ """Checks whether the header file matches the specified version of NCCL.
+
+ Args:
+ repository_ctx: The repository context.
+ nccl_install_path: The NCCL library install directory.
+ nccl_hdr_path: The NCCL header path.
+ nccl_version: The expected NCCL version.
+
+ Returns:
+ A string containing the library version of NCCL.
+ """
+ header_path = repository_ctx.path("%s/nccl.h" % nccl_hdr_path)
+ if not header_path.exists:
+ header_path = _find_nccl_header(repository_ctx, nccl_install_path)
+ header_dir = str(header_path.realpath.dirname)
+ major_version = find_cuda_define(
+ repository_ctx,
+ header_dir,
+ "nccl.h",
+ _DEFINE_NCCL_MAJOR,
+ )
+ minor_version = find_cuda_define(
+ repository_ctx,
+ header_dir,
+ "nccl.h",
+ _DEFINE_NCCL_MINOR,
+ )
+ patch_version = find_cuda_define(
+ repository_ctx,
+ header_dir,
+ "nccl.h",
+ _DEFINE_NCCL_PATCH,
+ )
+ header_version = "%s.%s.%s" % (major_version, minor_version, patch_version)
+ if not matches_version(nccl_version, header_version):
+ auto_configure_fail(
+ ("NCCL library version detected from %s/nccl.h (%s) does not match " +
+ "TF_NCCL_VERSION (%s). To fix this rerun configure again.") %
+ (header_dir, header_version, nccl_version),
+ )
def _nccl_configure_impl(repository_ctx):
- """Implementation of the nccl_configure repository rule."""
- if _TF_NCCL_VERSION not in repository_ctx.os.environ:
- # Add a dummy build file to make bazel query happy.
- repository_ctx.file("BUILD", _NCCL_DUMMY_BUILD_CONTENT)
- return
-
- if _TF_NCCL_CONFIG_REPO in repository_ctx.os.environ:
- # Forward to the pre-configured remote repository.
- repository_ctx.template("BUILD", _NCCL_REMOTE_BUILD_TEMPLATE, {
- "%{target}": repository_ctx.os.environ[_TF_NCCL_CONFIG_REPO],
- })
- return
-
- nccl_version = repository_ctx.os.environ[_TF_NCCL_VERSION].strip()
- if matches_version("1", nccl_version):
- # Alias to GitHub target from @nccl_archive.
- if not matches_version(nccl_version, "1.3"):
- auto_configure_fail(
- "NCCL from GitHub must use version 1.3 (got %s)" % nccl_version)
- repository_ctx.file("BUILD", _NCCL_ARCHIVE_BUILD_CONTENT)
- else:
- # Create target for locally installed NCCL.
- nccl_install_path = repository_ctx.os.environ[_NCCL_INSTALL_PATH].strip()
- nccl_hdr_path = repository_ctx.os.environ[_NCCL_HDR_PATH].strip()
- _check_nccl_version(repository_ctx, nccl_install_path, nccl_hdr_path, nccl_version)
- repository_ctx.template("BUILD", _NCCL_LOCAL_BUILD_TEMPLATE, {
- "%{version}": nccl_version,
- "%{install_path}": nccl_install_path,
- "%{hdr_path}": nccl_hdr_path,
- })
-
+ """Implementation of the nccl_configure repository rule."""
+ if _TF_NCCL_VERSION not in repository_ctx.os.environ:
+ # Add a dummy build file to make bazel query happy.
+ repository_ctx.file("BUILD", _NCCL_DUMMY_BUILD_CONTENT)
+ return
+
+ if _TF_NCCL_CONFIG_REPO in repository_ctx.os.environ:
+ # Forward to the pre-configured remote repository.
+ repository_ctx.template("BUILD", _label("remote.BUILD.tpl"), {
+ "%{target}": repository_ctx.os.environ[_TF_NCCL_CONFIG_REPO],
+ })
+ return
+
+ nccl_version = repository_ctx.os.environ[_TF_NCCL_VERSION].strip()
+ if nccl_version == "":
+ # Alias to open source build from @nccl_archive.
+ repository_ctx.file("BUILD", _NCCL_ARCHIVE_BUILD_CONTENT)
+
+ # TODO(csigg): implement and reuse in cuda_configure.bzl.
+ gpu_architectures = [
+ "sm_" + capability.replace(".", "")
+ for capability in compute_capabilities(repository_ctx)
+ ]
+
+ # Round-about way to make the list unique.
+ gpu_architectures = dict(zip(gpu_architectures, gpu_architectures)).keys()
+ repository_ctx.template("build_defs.bzl", _label("build_defs.bzl.tpl"), {
+ "%{gpu_architectures}": str(gpu_architectures),
+ })
+
+ repository_ctx.symlink(cuda_toolkit_path(repository_ctx), "cuda")
+
+ # Temporary work-around for setups which symlink ptxas to a newer
+ # version. The versions of nvlink and ptxas need to agree, so we find
+ # nvlink next to the real location of ptxas. This is only temporary and
+ # will be removed again soon.
+ nvlink_dir = repository_ctx.path("cuda/bin/ptxas").realpath.dirname
+ repository_ctx.symlink(nvlink_dir.get_child("nvlink"), "nvlink")
+ else:
+ # Create target for locally installed NCCL.
+ nccl_install_path = repository_ctx.os.environ[_NCCL_INSTALL_PATH].strip()
+ nccl_hdr_path = repository_ctx.os.environ[_NCCL_HDR_PATH].strip()
+ _check_nccl_version(repository_ctx, nccl_install_path, nccl_hdr_path, nccl_version)
+ repository_ctx.template("BUILD", _label("system.BUILD.tpl"), {
+ "%{version}": nccl_version,
+ "%{install_path}": nccl_install_path,
+ "%{hdr_path}": nccl_hdr_path,
+ })
nccl_configure = repository_rule(
- implementation=_nccl_configure_impl,
- environ=[
- _NCCL_INSTALL_PATH,
+ implementation = _nccl_configure_impl,
+ environ = [
+ _CUDA_TOOLKIT_PATH,
_NCCL_HDR_PATH,
+ _NCCL_INSTALL_PATH,
_TF_NCCL_VERSION,
+ _TF_CUDA_COMPUTE_CAPABILITIES,
+ _TF_NCCL_CONFIG_REPO,
],
)
"""Detects and configures the NCCL configuration.