author | A. Unique TensorFlower <gardener@tensorflow.org> | 2018-10-05 08:46:54 -0700 |
---|---|---|
committer | TensorFlower Gardener <gardener@tensorflow.org> | 2018-10-05 08:53:12 -0700 |
commit | 53faa313b7628cd8c9fbb836544cc6482cafb7a4 | |
tree | 89f113eb0e9239f0f9ce4eba0ffdc1eff16b58d0 /third_party/nccl | |
parent | cea6b4959152981ab778001f30ff9ad87bb4fc9e | |
Switch NCCL to build from open source (version 2.3.5-5) by default.
Note to users manually patching ptxas from a later toolkit version:
Building NCCL requires the same version of ptxas and nvlink.
PiperOrigin-RevId: 215911973
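For readers who patch ptxas as described in the note above, the nccl_configure.bzl change in this commit resolves nvlink from the directory that holds the real ptxas binary. The following is only a rough Python sketch of that lookup, not the Starlark code itself; the function name `find_matching_nvlink` is hypothetical, and the sketch assumes the toolkit keeps both tools under `bin/`.

```python
import os


def find_matching_nvlink(cuda_toolkit_path):
    """Returns the nvlink path located next to the real ptxas binary.

    Hypothetical helper for illustration; it mirrors the idea of the
    work-around in nccl_configure.bzl rather than reproducing it.
    """
    ptxas = os.path.join(cuda_toolkit_path, "bin", "ptxas")
    # Follow symlinks so that a ptxas patched in from a later toolkit
    # points us at that toolkit's bin directory.
    real_ptxas = os.path.realpath(ptxas)
    return os.path.join(os.path.dirname(real_ptxas), "nvlink")
```

If ptxas is not a symlink, the result is simply `bin/nvlink` inside the given toolkit path, so unpatched setups are unaffected.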
Diffstat (limited to 'third_party/nccl')
-rw-r--r-- | third_party/nccl/LICENSE | 231 |
-rw-r--r-- | third_party/nccl/archive.BUILD | 179 |
-rw-r--r-- | third_party/nccl/build_defs.bzl.tpl | 210 |
-rw-r--r-- | third_party/nccl/nccl_archive.BUILD | 68 |
-rw-r--r-- | third_party/nccl/nccl_configure.bzl | 214 |
5 files changed, 536 insertions, 366 deletions
diff --git a/third_party/nccl/LICENSE b/third_party/nccl/LICENSE index 146d9b765c..b958518186 100644 --- a/third_party/nccl/LICENSE +++ b/third_party/nccl/LICENSE @@ -1,203 +1,30 @@ -Copyright 2018 The TensorFlow Authors. All rights reserved. - Apache License - Version 2.0, January 2004 - http://www.apache.org/licenses/ - - TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION - - 1. Definitions. - - "License" shall mean the terms and conditions for use, reproduction, - and distribution as defined by Sections 1 through 9 of this document. - - "Licensor" shall mean the copyright owner or entity authorized by - the copyright owner that is granting the License. - - "Legal Entity" shall mean the union of the acting entity and all - other entities that control, are controlled by, or are under common - control with that entity. For the purposes of this definition, - "control" means (i) the power, direct or indirect, to cause the - direction or management of such entity, whether by contract or - otherwise, or (ii) ownership of fifty percent (50%) or more of the - outstanding shares, or (iii) beneficial ownership of such entity. - - "You" (or "Your") shall mean an individual or Legal Entity - exercising permissions granted by this License. - - "Source" form shall mean the preferred form for making modifications, - including but not limited to software source code, documentation - source, and configuration files. - - "Object" form shall mean any form resulting from mechanical - transformation or translation of a Source form, including but - not limited to compiled object code, generated documentation, - and conversions to other media types. - - "Work" shall mean the work of authorship, whether in Source or - Object form, made available under the License, as indicated by a - copyright notice that is included in or attached to the work - (an example is provided in the Appendix below). 
- - "Derivative Works" shall mean any work, whether in Source or Object - form, that is based on (or derived from) the Work and for which the - editorial revisions, annotations, elaborations, or other modifications - represent, as a whole, an original work of authorship. For the purposes - of this License, Derivative Works shall not include works that remain - separable from, or merely link (or bind by name) to the interfaces of, - the Work and Derivative Works thereof. - - "Contribution" shall mean any work of authorship, including - the original version of the Work and any modifications or additions - to that Work or Derivative Works thereof, that is intentionally - submitted to Licensor for inclusion in the Work by the copyright owner - or by an individual or Legal Entity authorized to submit on behalf of - the copyright owner. For the purposes of this definition, "submitted" - means any form of electronic, verbal, or written communication sent - to the Licensor or its representatives, including but not limited to - communication on electronic mailing lists, source code control systems, - and issue tracking systems that are managed by, or on behalf of, the - Licensor for the purpose of discussing and improving the Work, but - excluding communication that is conspicuously marked or otherwise - designated in writing by the copyright owner as "Not a Contribution." - - "Contributor" shall mean Licensor and any individual or Legal Entity - on behalf of whom a Contribution has been received by Licensor and - subsequently incorporated within the Work. - - 2. Grant of Copyright License. Subject to the terms and conditions of - this License, each Contributor hereby grants to You a perpetual, - worldwide, non-exclusive, no-charge, royalty-free, irrevocable - copyright license to reproduce, prepare Derivative Works of, - publicly display, publicly perform, sublicense, and distribute the - Work and such Derivative Works in Source or Object form. - - 3. 
Grant of Patent License. Subject to the terms and conditions of - this License, each Contributor hereby grants to You a perpetual, - worldwide, non-exclusive, no-charge, royalty-free, irrevocable - (except as stated in this section) patent license to make, have made, - use, offer to sell, sell, import, and otherwise transfer the Work, - where such license applies only to those patent claims licensable - by such Contributor that are necessarily infringed by their - Contribution(s) alone or by combination of their Contribution(s) - with the Work to which such Contribution(s) was submitted. If You - institute patent litigation against any entity (including a - cross-claim or counterclaim in a lawsuit) alleging that the Work - or a Contribution incorporated within the Work constitutes direct - or contributory patent infringement, then any patent licenses - granted to You under this License for that Work shall terminate - as of the date such litigation is filed. - - 4. Redistribution. You may reproduce and distribute copies of the - Work or Derivative Works thereof in any medium, with or without - modifications, and in Source or Object form, provided that You - meet the following conditions: - - (a) You must give any other recipients of the Work or - Derivative Works a copy of this License; and - - (b) You must cause any modified files to carry prominent notices - stating that You changed the files; and - - (c) You must retain, in the Source form of any Derivative Works - that You distribute, all copyright, patent, trademark, and - attribution notices from the Source form of the Work, - excluding those notices that do not pertain to any part of - the Derivative Works; and - - (d) If the Work includes a "NOTICE" text file as part of its - distribution, then any Derivative Works that You distribute must - include a readable copy of the attribution notices contained - within such NOTICE file, excluding those notices that do not - pertain to any part of the Derivative 
Works, in at least one - of the following places: within a NOTICE text file distributed - as part of the Derivative Works; within the Source form or - documentation, if provided along with the Derivative Works; or, - within a display generated by the Derivative Works, if and - wherever such third-party notices normally appear. The contents - of the NOTICE file are for informational purposes only and - do not modify the License. You may add Your own attribution - notices within Derivative Works that You distribute, alongside - or as an addendum to the NOTICE text from the Work, provided - that such additional attribution notices cannot be construed - as modifying the License. - - You may add Your own copyright statement to Your modifications and - may provide additional or different license terms and conditions - for use, reproduction, or distribution of Your modifications, or - for any such Derivative Works as a whole, provided Your use, - reproduction, and distribution of the Work otherwise complies with - the conditions stated in this License. - - 5. Submission of Contributions. Unless You explicitly state otherwise, - any Contribution intentionally submitted for inclusion in the Work - by You to the Licensor shall be under the terms and conditions of - this License, without any additional terms or conditions. - Notwithstanding the above, nothing herein shall supersede or modify - the terms of any separate license agreement you may have executed - with Licensor regarding such Contributions. - - 6. Trademarks. This License does not grant permission to use the trade - names, trademarks, service marks, or product names of the Licensor, - except as required for reasonable and customary use in describing the - origin of the Work and reproducing the content of the NOTICE file. - - 7. Disclaimer of Warranty. 
Unless required by applicable law or - agreed to in writing, Licensor provides the Work (and each - Contributor provides its Contributions) on an "AS IS" BASIS, - WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or - implied, including, without limitation, any warranties or conditions - of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A - PARTICULAR PURPOSE. You are solely responsible for determining the - appropriateness of using or redistributing the Work and assume any - risks associated with Your exercise of permissions under this License. - - 8. Limitation of Liability. In no event and under no legal theory, - whether in tort (including negligence), contract, or otherwise, - unless required by applicable law (such as deliberate and grossly - negligent acts) or agreed to in writing, shall any Contributor be - liable to You for damages, including any direct, indirect, special, - incidental, or consequential damages of any character arising as a - result of this License or out of the use or inability to use the - Work (including but not limited to damages for loss of goodwill, - work stoppage, computer failure or malfunction, or any and all - other commercial damages or losses), even if such Contributor - has been advised of the possibility of such damages. - - 9. Accepting Warranty or Additional Liability. While redistributing - the Work or Derivative Works thereof, You may choose to offer, - and charge a fee for, acceptance of support, warranty, indemnity, - or other liability obligations and/or rights consistent with this - License. However, in accepting such obligations, You may act only - on Your own behalf and on Your sole responsibility, not on behalf - of any other Contributor, and only if You agree to indemnify, - defend, and hold each Contributor harmless for any liability - incurred by, or claims asserted against, such Contributor by reason - of your accepting any such warranty or additional liability. 
- - END OF TERMS AND CONDITIONS - - APPENDIX: How to apply the Apache License to your work. - - To apply the Apache License to your work, attach the following - boilerplate notice, with the fields enclosed by brackets "[]" - replaced with your own identifying information. (Don't include - the brackets!) The text should be enclosed in the appropriate - comment syntax for the file format. We also recommend that a - file or class name and description of purpose be included on the - same "printed page" as the copyright notice for easier - identification within third-party archives. - - Copyright 2018, The TensorFlow Authors. - - Licensed under the Apache License, Version 2.0 (the "License"); - you may not use this file except in compliance with the License. - You may obtain a copy of the License at - - http://www.apache.org/licenses/LICENSE-2.0 - - Unless required by applicable law or agreed to in writing, software - distributed under the License is distributed on an "AS IS" BASIS, - WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. - See the License for the specific language governing permissions and - limitations under the License. + Copyright (c) 2015-2018, NVIDIA CORPORATION. All rights reserved. + + Redistribution and use in source and binary forms, with or without + modification, are permitted provided that the following conditions + are met: + * Redistributions of source code must retain the above copyright + notice, this list of conditions and the following disclaimer. + * Redistributions in binary form must reproduce the above copyright + notice, this list of conditions and the following disclaimer in the + documentation and/or other materials provided with the distribution. + * Neither the name of NVIDIA CORPORATION, Lawrence Berkeley National + Laboratory, the U.S. 
Department of Energy, nor the names of their + contributors may be used to endorse or promote products derived + from this software without specific prior written permission. + + THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS ``AS IS'' AND ANY + EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR + PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR + CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, + EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, + PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR + PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY + OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE + OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + The U.S. Department of Energy funded the development of this software + under subcontract 7078610 with Lawrence Berkeley National Laboratory. diff --git a/third_party/nccl/archive.BUILD b/third_party/nccl/archive.BUILD new file mode 100644 index 0000000000..f57f04c75e --- /dev/null +++ b/third_party/nccl/archive.BUILD @@ -0,0 +1,179 @@ +# NVIDIA NCCL 2 +# A package of optimized primitives for collective multi-GPU communication. + +licenses(["restricted"]) + +exports_files(["LICENSE.txt"]) + +load( + "@local_config_nccl//:build_defs.bzl", + "device_link", + "gen_nccl_h", + "nccl_library", + "rdc_copts", +) +load( + "@local_config_cuda//cuda:build_defs.bzl", + "cuda_default_copts", +) + +# Generate the nccl.h header file. +gen_nccl_h( + name = "nccl_h", + output = "src/nccl.h", + template = "src/nccl.h.in", +) + +nccl_library( + name = "src_hdrs", + hdrs = [ + "src/nccl.h", + # src/include/common_coll.h #includes "collectives/collectives.h". + # All other #includes of collectives.h are patched in process_srcs. 
+ "src/collectives/collectives.h", + ], + strip_include_prefix = "src", +) + +nccl_library( + name = "include_hdrs", + hdrs = glob(["src/include/*.h"]), + strip_include_prefix = "src/include", +) + +filegroup( + name = "device_hdrs", + srcs = glob(["src/collectives/device/*.h"]), +) + +filegroup( + name = "device_srcs", + srcs = [ + "src/collectives/device/all_gather.cu", + "src/collectives/device/all_reduce.cu", + "src/collectives/device/broadcast.cu", + "src/collectives/device/reduce.cu", + "src/collectives/device/reduce_scatter.cu", + ], +) + +nccl_library( + name = "sum", + srcs = [ + ":device_hdrs", + ":device_srcs", + ], + copts = ["-DNCCL_OP=0"] + rdc_copts(), + prefix = "sum_", + deps = [ + ":src_hdrs", + ":include_hdrs", + "@local_config_cuda//cuda:cuda_headers", + ], + linkstatic = True, +) + +nccl_library( + name = "prod", + srcs = [ + ":device_hdrs", + ":device_srcs", + ], + copts = ["-DNCCL_OP=1"] + rdc_copts(), + prefix = "_prod", + deps = [ + ":src_hdrs", + ":include_hdrs", + "@local_config_cuda//cuda:cuda_headers", + ], + linkstatic = True, +) + +nccl_library( + name = "min", + srcs = [ + ":device_hdrs", + ":device_srcs", + ], + copts = ["-DNCCL_OP=2"] + rdc_copts(), + prefix = "min_", + deps = [ + ":src_hdrs", + ":include_hdrs", + "@local_config_cuda//cuda:cuda_headers", + ], + linkstatic = True, +) + +nccl_library( + name = "max", + srcs = [ + ":device_hdrs", + ":device_srcs", + ], + copts = ["-DNCCL_OP=3"] + rdc_copts(), + prefix = "max_", + deps = [ + ":src_hdrs", + ":include_hdrs", + "@local_config_cuda//cuda:cuda_headers", + ], + linkstatic = True, +) + +nccl_library( + name = "functions", + srcs = [ + ":device_hdrs", + "src/collectives/device/functions.cu", + ], + copts = rdc_copts(), + deps = [ + ":src_hdrs", + ":include_hdrs", + "@local_config_cuda//cuda:cuda_headers", + ], + linkstatic = True, +) + +device_link( + name = "device_code", + srcs = [ + ":functions", + ":max", + ":min", + ":prod", + ":sum", + ], +) + +# Primary NCCL target. 
+nccl_library( + name = "nccl", + srcs = glob( + include = ["src/**/*.cu"], + # Exclude device-library code. + exclude = ["src/collectives/device/**"], + ) + [ + # Required for header inclusion checking (see + # http://docs.bazel.build/versions/master/be/c-cpp.html#hdrs). + # Files in src/ which #include "nccl.h" load it from there rather than + # from the virtual includes directory. + "src/nccl.h", + ], + hdrs = ["src/nccl.h"], + include_prefix = "third_party/nccl", + strip_include_prefix = "src", + copts = cuda_default_copts(), + deps = [ + ":device_code", + ":functions", + ":include_hdrs", + ":max", + ":min", + ":prod", + ":src_hdrs", + ":sum", + ], + visibility = ["//visibility:public"], +) diff --git a/third_party/nccl/build_defs.bzl.tpl b/third_party/nccl/build_defs.bzl.tpl new file mode 100644 index 0000000000..ede1d3dad5 --- /dev/null +++ b/third_party/nccl/build_defs.bzl.tpl @@ -0,0 +1,210 @@ +"""Repository rule for NCCL.""" + +load("@local_config_cuda//cuda:build_defs.bzl", "cuda_default_copts") + +def _gen_nccl_h_impl(ctx): + """Creates nccl.h from a template.""" + ctx.actions.expand_template( + output = ctx.outputs.output, + template = ctx.file.template, + substitutions = { + "${nccl:Major}": "2", + "${nccl:Minor}": "3", + "${nccl:Patch}": "5", + "${nccl:Suffix}": "", + "${nccl:Version}": "2305", + }, + ) +gen_nccl_h = rule( + implementation = _gen_nccl_h_impl, + attrs = { + "template": attr.label(allow_single_file = True), + "output": attr.output(), + }, +) +"""Creates the NCCL header file.""" + + +def _process_srcs_impl(ctx): + """Appends .cc to .cu files, patches include directives.""" + files = [] + for src in ctx.files.srcs: + if not src.is_source: + # Process only once, specifically "src/nccl.h". 
+ files.append(src) + continue + name = src.basename + if src.extension == "cu": + name = ctx.attr.prefix + name + ".cc" + file = ctx.actions.declare_file(name, sibling = src) + ctx.actions.expand_template( + output = file, + template = src, + substitutions = { + "\"collectives.h": "\"collectives/collectives.h", + "\"../collectives.h": "\"collectives/collectives.h", + "#if __CUDACC_VER_MAJOR__": + "#if defined __CUDACC_VER_MAJOR__ && __CUDACC_VER_MAJOR__", + # Substitutions are applied in order. + "std::nullptr_t": "nullptr_t", + "nullptr_t": "std::nullptr_t", + }, + ) + files.append(file) + return [DefaultInfo(files = depset(files))] +_process_srcs = rule( + implementation = _process_srcs_impl, + attrs = { + "srcs": attr.label_list(allow_files = True), + "prefix": attr.string(default = ""), + }, +) +"""Processes the NCCL srcs so they can be compiled with bazel and clang.""" + + +def nccl_library(name, srcs=None, hdrs=None, prefix=None, **kwargs): + """Processes the srcs and hdrs and creates a cc_library.""" + + _process_srcs( + name = name + "_srcs", + srcs = srcs, + prefix = prefix, + ) + _process_srcs( + name = name + "_hdrs", + srcs = hdrs, + ) + + native.cc_library( + name = name, + srcs = [name + "_srcs"] if srcs else [], + hdrs = [name + "_hdrs"] if hdrs else [], + **kwargs + ) + + +def rdc_copts(): + """Returns copts for compiling relocatable device code.""" + + # The global functions can not have a lower register count than the + # device functions. This is enforced by setting a fixed register count. 
+ # https://github.com/NVIDIA/nccl/blob/f93fe9bfd94884cec2ba711897222e0df5569a53/makefiles/common.mk#L48 + maxrregcount = "-maxrregcount=96" + + return cuda_default_copts() + select({ + "@local_config_cuda//cuda:using_nvcc": [ + "-nvcc_options", + "relocatable-device-code=true", + "-nvcc_options", + "ptxas-options=" + maxrregcount, + ], + "@local_config_cuda//cuda:using_clang": [ + "-fcuda-rdc", + "-Xcuda-ptxas", + maxrregcount, + ], + "//conditions:default": [], + }) + ["-fvisibility=hidden"] + + +def _filter_impl(ctx): + suffix = ctx.attr.suffix + files = [src for src in ctx.files.srcs if src.path.endswith(suffix)] + return [DefaultInfo(files = depset(files))] +_filter = rule( + implementation = _filter_impl, + attrs = { + "srcs": attr.label_list(allow_files = True), + "suffix": attr.string(), + }, +) +"""Filters the srcs to the ones ending with suffix.""" + + +def _gen_link_src_impl(ctx): + ctx.actions.expand_template( + output = ctx.outputs.output, + template = ctx.file.template, + substitutions = { + "REGISTERLINKBINARYFILE": '"%s"' % ctx.file.register_hdr.short_path, + "FATBINFILE": '"%s"' % ctx.file.fatbin_hdr.short_path, + }, + ) +_gen_link_src = rule( + implementation = _gen_link_src_impl, + attrs = { + "register_hdr": attr.label(allow_single_file = True), + "fatbin_hdr": attr.label(allow_single_file = True), + "template": attr.label(allow_single_file = True), + "output": attr.output(), + }, +) +"""Patches the include directives for the link.stub file.""" + + +def device_link(name, srcs): + """Links seperately compiled relocatable device code into a cc_library.""" + + # From .a and .pic.a archives, just use the latter. + _filter( + name = name + "_pic_a", + srcs = srcs, + suffix = ".pic.a", + ) + + # Device-link to cubins for each architecture. 
+ images = [] + cubins = [] + for arch in %{gpu_architectures}: + cubin = "%s_%s.cubin" % (name, arch) + register_hdr = "%s_%s.h" % (name, arch) + nvlink = "@local_config_nccl//:nvlink" + cmd = ("$(location %s) --cpu-arch=X86_64 " % nvlink + + "--arch=%s $(SRCS) " % arch + + "--register-link-binaries=$(location %s) " % register_hdr + + "--output-file=$(location %s)" % cubin) + native.genrule( + name = "%s_%s" % (name, arch), + outs = [register_hdr, cubin], + srcs = [name + "_pic_a"], + cmd = cmd, + tools = [nvlink], + ) + images.append("--image=profile=%s,file=$(location %s)" % (arch, cubin)) + cubins.append(cubin) + + # Generate fatbin header from all cubins. + fatbin_hdr = name + ".fatbin.h" + fatbinary = "@local_config_nccl//:cuda/bin/fatbinary" + cmd = ("PATH=$$CUDA_TOOLKIT_PATH/bin:$$PATH " + # for bin2c + "$(location %s) -64 --cmdline=--compile-only --link " % fatbinary + + "--compress-all %s --create=%%{name}.fatbin " % " ".join(images) + + "--embedded-fatbin=$@") + native.genrule( + name = name + "_fatbin_h", + outs = [fatbin_hdr], + srcs = cubins, + cmd = cmd, + tools = [fatbinary], + ) + + # Generate the source file #including the headers generated above. + _gen_link_src( + name = name + "_cc", + # Include just the last one, they are equivalent. + register_hdr = register_hdr, + fatbin_hdr = fatbin_hdr, + template = "@local_config_nccl//:cuda/bin/crt/link.stub", + output = name + ".cc", + ) + + # Compile the source file into the cc_library. + native.cc_library( + name = name, + srcs = [name + "_cc"], + textual_hdrs = [register_hdr, fatbin_hdr], + deps = [ + "@local_config_cuda//cuda:cuda_headers", + "@local_config_cuda//cuda:cudart_static", + ], + ) diff --git a/third_party/nccl/nccl_archive.BUILD b/third_party/nccl/nccl_archive.BUILD deleted file mode 100644 index a05899e38d..0000000000 --- a/third_party/nccl/nccl_archive.BUILD +++ /dev/null @@ -1,68 +0,0 @@ -# NVIDIA nccl -# A package of optimized primitives for collective multi-GPU communication. 
- -licenses(["notice"]) # BSD - -exports_files(["LICENSE.txt"]) - -load("@local_config_cuda//cuda:build_defs.bzl", "cuda_default_copts", "if_cuda") - -SRCS = [ - "src/all_gather.cu", - "src/all_reduce.cu", - "src/broadcast.cu", - "src/core.cu", - "src/libwrap.cu", - "src/reduce.cu", - "src/reduce_scatter.cu", -] - -# Copy .cu to .cu.cc so they can be in srcs of cc_library. -[ - genrule( - name = "gen_" + src, - srcs = [src], - outs = [src + ".cc"], - cmd = "cp $(location " + src + ") $(location " + src + ".cc)", - ) - for src in SRCS -] - -SRCS_CU_CC = [src + ".cc" for src in SRCS] - -cc_library( - name = "nccl", - srcs = if_cuda(SRCS_CU_CC + glob(["src/*.h"])), - hdrs = if_cuda(["src/nccl.h"]), - copts = [ - "-DCUDA_MAJOR=0", - "-DCUDA_MINOR=0", - "-DNCCL_MAJOR=0", - "-DNCCL_MINOR=0", - "-DNCCL_PATCH=0", - "-Iexternal/nccl_archive/src", - "-O3", - ] + cuda_default_copts(), - include_prefix = "third_party/nccl", - linkopts = select({ - "@org_tensorflow//tensorflow:android": [ - "-pie", - ], - "@org_tensorflow//tensorflow:darwin": [ - "-Wl,-framework", - "-Wl,CoreFoundation", - "-Wl,-framework", - "-Wl,Security", - ], - "@org_tensorflow//tensorflow:ios": [], - "@org_tensorflow//tensorflow:windows": [ - "-DEFAULTLIB:ws2_32.lib", - ], - "//conditions:default": [ - "-lrt", - ], - }), - strip_include_prefix = "src", - visibility = ["//visibility:public"], - deps = ["@local_config_cuda//cuda:cuda_headers"], -) diff --git a/third_party/nccl/nccl_configure.bzl b/third_party/nccl/nccl_configure.bzl index d78fe8f3aa..7f00df0962 100644 --- a/third_party/nccl/nccl_configure.bzl +++ b/third_party/nccl/nccl_configure.bzl @@ -11,12 +11,16 @@ load( "//third_party/gpus:cuda_configure.bzl", "auto_configure_fail", + "compute_capabilities", + "cuda_toolkit_path", "find_cuda_define", "matches_version", ) -_NCCL_INSTALL_PATH = "NCCL_INSTALL_PATH" +_CUDA_TOOLKIT_PATH = "CUDA_TOOLKIT_PATH" _NCCL_HDR_PATH = "NCCL_HDR_PATH" +_NCCL_INSTALL_PATH = "NCCL_INSTALL_PATH" 
+_TF_CUDA_COMPUTE_CAPABILITIES = "TF_CUDA_COMPUTE_CAPABILITIES" _TF_NCCL_VERSION = "TF_NCCL_VERSION" _TF_NCCL_CONFIG_REPO = "TF_NCCL_CONFIG_REPO" @@ -37,6 +41,12 @@ cc_library( """ _NCCL_ARCHIVE_BUILD_CONTENT = """ +exports_files([ + "cuda/bin/crt/link.stub", + "cuda/bin/fatbinary", + "nvlink", +]) + filegroup( name = "LICENSE", data = ["@nccl_archive//:LICENSE.txt"], @@ -50,113 +60,125 @@ alias( ) """ -# Local build results in dynamic link and the license should not be included. -_NCCL_REMOTE_BUILD_TEMPLATE = Label("//third_party/nccl:remote.BUILD.tpl") -_NCCL_LOCAL_BUILD_TEMPLATE = Label("//third_party/nccl:system.BUILD.tpl") +def _label(file): + return Label("//third_party/nccl:{}".format(file)) def _find_nccl_header(repository_ctx, nccl_install_path): - """Finds the NCCL header on the system. - - Args: - repository_ctx: The repository context. - nccl_install_path: The NCCL library install directory. + """Finds the NCCL header on the system. - Returns: - The path to the NCCL header. - """ - header_path = repository_ctx.path("%s/include/nccl.h" % nccl_install_path) - if not header_path.exists: - auto_configure_fail("Cannot find %s" % str(header_path)) - return header_path + Args: + repository_ctx: The repository context. + nccl_install_path: The NCCL library install directory. + Returns: + The path to the NCCL header. + """ + header_path = repository_ctx.path("%s/include/nccl.h" % nccl_install_path) + if not header_path.exists: + auto_configure_fail("Cannot find %s" % str(header_path)) + return header_path def _check_nccl_version(repository_ctx, nccl_install_path, nccl_hdr_path, nccl_version): - """Checks whether the header file matches the specified version of NCCL. - - Args: - repository_ctx: The repository context. - nccl_install_path: The NCCL library install directory. - nccl_version: The expected NCCL version. - - Returns: - A string containing the library version of NCCL. 
- """ - header_path = repository_ctx.path("%s/nccl.h" % nccl_hdr_path) - if not header_path.exists: - header_path = _find_nccl_header(repository_ctx, nccl_install_path) - header_dir = str(header_path.realpath.dirname) - major_version = find_cuda_define(repository_ctx, header_dir, "nccl.h", - _DEFINE_NCCL_MAJOR) - minor_version = find_cuda_define(repository_ctx, header_dir, "nccl.h", - _DEFINE_NCCL_MINOR) - patch_version = find_cuda_define(repository_ctx, header_dir, "nccl.h", - _DEFINE_NCCL_PATCH) - header_version = "%s.%s.%s" % (major_version, minor_version, patch_version) - if not matches_version(nccl_version, header_version): - auto_configure_fail( - ("NCCL library version detected from %s/nccl.h (%s) does not match " + - "TF_NCCL_VERSION (%s). To fix this rerun configure again.") % - (header_dir, header_version, nccl_version)) - - -def _find_nccl_lib(repository_ctx, nccl_install_path, nccl_version): - """Finds the given NCCL library on the system. - - Args: - repository_ctx: The repository context. - nccl_install_path: The NCCL library installation directory. - nccl_version: The version of NCCL library files as returned - by _nccl_version. - - Returns: - The path to the NCCL library. - """ - lib_path = repository_ctx.path("%s/lib/libnccl.so.%s" % (nccl_install_path, - nccl_version)) - if not lib_path.exists: - auto_configure_fail("Cannot find NCCL library %s" % str(lib_path)) - return lib_path - + """Checks whether the header file matches the specified version of NCCL. + + Args: + repository_ctx: The repository context. + nccl_install_path: The NCCL library install directory. + nccl_hdr_path: The NCCL header path. + nccl_version: The expected NCCL version. + + Returns: + A string containing the library version of NCCL. 
+ """ + header_path = repository_ctx.path("%s/nccl.h" % nccl_hdr_path) + if not header_path.exists: + header_path = _find_nccl_header(repository_ctx, nccl_install_path) + header_dir = str(header_path.realpath.dirname) + major_version = find_cuda_define( + repository_ctx, + header_dir, + "nccl.h", + _DEFINE_NCCL_MAJOR, + ) + minor_version = find_cuda_define( + repository_ctx, + header_dir, + "nccl.h", + _DEFINE_NCCL_MINOR, + ) + patch_version = find_cuda_define( + repository_ctx, + header_dir, + "nccl.h", + _DEFINE_NCCL_PATCH, + ) + header_version = "%s.%s.%s" % (major_version, minor_version, patch_version) + if not matches_version(nccl_version, header_version): + auto_configure_fail( + ("NCCL library version detected from %s/nccl.h (%s) does not match " + + "TF_NCCL_VERSION (%s). To fix this rerun configure again.") % + (header_dir, header_version, nccl_version), + ) def _nccl_configure_impl(repository_ctx): - """Implementation of the nccl_configure repository rule.""" - if _TF_NCCL_VERSION not in repository_ctx.os.environ: - # Add a dummy build file to make bazel query happy. - repository_ctx.file("BUILD", _NCCL_DUMMY_BUILD_CONTENT) - return - - if _TF_NCCL_CONFIG_REPO in repository_ctx.os.environ: - # Forward to the pre-configured remote repository. - repository_ctx.template("BUILD", _NCCL_REMOTE_BUILD_TEMPLATE, { - "%{target}": repository_ctx.os.environ[_TF_NCCL_CONFIG_REPO], - }) - return - - nccl_version = repository_ctx.os.environ[_TF_NCCL_VERSION].strip() - if matches_version("1", nccl_version): - # Alias to GitHub target from @nccl_archive. - if not matches_version(nccl_version, "1.3"): - auto_configure_fail( - "NCCL from GitHub must use version 1.3 (got %s)" % nccl_version) - repository_ctx.file("BUILD", _NCCL_ARCHIVE_BUILD_CONTENT) - else: - # Create target for locally installed NCCL. 
- nccl_install_path = repository_ctx.os.environ[_NCCL_INSTALL_PATH].strip() - nccl_hdr_path = repository_ctx.os.environ[_NCCL_HDR_PATH].strip() - _check_nccl_version(repository_ctx, nccl_install_path, nccl_hdr_path, nccl_version) - repository_ctx.template("BUILD", _NCCL_LOCAL_BUILD_TEMPLATE, { - "%{version}": nccl_version, - "%{install_path}": nccl_install_path, - "%{hdr_path}": nccl_hdr_path, - }) - + """Implementation of the nccl_configure repository rule.""" + if _TF_NCCL_VERSION not in repository_ctx.os.environ: + # Add a dummy build file to make bazel query happy. + repository_ctx.file("BUILD", _NCCL_DUMMY_BUILD_CONTENT) + return + + if _TF_NCCL_CONFIG_REPO in repository_ctx.os.environ: + # Forward to the pre-configured remote repository. + repository_ctx.template("BUILD", _label("remote.BUILD.tpl"), { + "%{target}": repository_ctx.os.environ[_TF_NCCL_CONFIG_REPO], + }) + return + + nccl_version = repository_ctx.os.environ[_TF_NCCL_VERSION].strip() + if nccl_version == "": + # Alias to open source build from @nccl_archive. + repository_ctx.file("BUILD", _NCCL_ARCHIVE_BUILD_CONTENT) + + # TODO(csigg): implement and reuse in cuda_configure.bzl. + gpu_architectures = [ + "sm_" + capability.replace(".", "") + for capability in compute_capabilities(repository_ctx) + ] + + # Round-about way to make the list unique. + gpu_architectures = dict(zip(gpu_architectures, gpu_architectures)).keys() + repository_ctx.template("build_defs.bzl", _label("build_defs.bzl.tpl"), { + "%{gpu_architectures}": str(gpu_architectures), + }) + + repository_ctx.symlink(cuda_toolkit_path(repository_ctx), "cuda") + + # Temporary work-around for setups which symlink ptxas to a newer + # version. The versions of nvlink and ptxas need to agree, so we find + # nvlink next to the real location of ptxas. This is only temporary and + # will be removed again soon. 
+ nvlink_dir = repository_ctx.path("cuda/bin/ptxas").realpath.dirname + repository_ctx.symlink(nvlink_dir.get_child("nvlink"), "nvlink") + else: + # Create target for locally installed NCCL. + nccl_install_path = repository_ctx.os.environ[_NCCL_INSTALL_PATH].strip() + nccl_hdr_path = repository_ctx.os.environ[_NCCL_HDR_PATH].strip() + _check_nccl_version(repository_ctx, nccl_install_path, nccl_hdr_path, nccl_version) + repository_ctx.template("BUILD", _label("system.BUILD.tpl"), { + "%{version}": nccl_version, + "%{install_path}": nccl_install_path, + "%{hdr_path}": nccl_hdr_path, + }) nccl_configure = repository_rule( - implementation=_nccl_configure_impl, - environ=[ - _NCCL_INSTALL_PATH, + implementation = _nccl_configure_impl, + environ = [ + _CUDA_TOOLKIT_PATH, _NCCL_HDR_PATH, + _NCCL_INSTALL_PATH, _TF_NCCL_VERSION, + _TF_CUDA_COMPUTE_CAPABILITIES, + _TF_NCCL_CONFIG_REPO, ], ) """Detects and configures the NCCL configuration. |
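The open source branch of `_nccl_configure_impl` above turns each compute capability into an `sm_` architecture name and then removes duplicates before substituting the list into build_defs.bzl.tpl. A minimal Python sketch of that transformation follows. The function name `gpu_architectures` is hypothetical, and `dict.fromkeys` is swapped in for the diff's `dict(zip(...)).keys()` idiom; it relies on Python 3.7+ dictionaries preserving insertion order.

```python
def gpu_architectures(compute_capabilities):
    """Maps capabilities such as "6.1" to unique names such as "sm_61".

    Hypothetical helper for illustration; the input stands in for the
    result of compute_capabilities(repository_ctx) in the diff.
    """
    architectures = [
        "sm_" + capability.replace(".", "")
        for capability in compute_capabilities
    ]
    # dict.fromkeys keeps the first occurrence of each name, in input order.
    return list(dict.fromkeys(architectures))


print(gpu_architectures(["3.5", "5.2", "3.5", "7.0"]))
```

Each unique architecture then gets its own nvlink genrule in `device_link`, so deduplicating here avoids declaring the same cubin output twice.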