Diffstat (limited to 'tensorflow/docs_src/deploy/hadoop.md')
-rw-r--r-- | tensorflow/docs_src/deploy/hadoop.md | 65
1 files changed, 0 insertions, 65 deletions
diff --git a/tensorflow/docs_src/deploy/hadoop.md b/tensorflow/docs_src/deploy/hadoop.md
deleted file mode 100644
index b0d416df2e..0000000000
--- a/tensorflow/docs_src/deploy/hadoop.md
+++ /dev/null
@@ -1,65 +0,0 @@
-# How to run TensorFlow on Hadoop
-
-This document describes how to run TensorFlow on Hadoop. It will be expanded to
-describe running on various cluster managers, but only describes running on HDFS
-at the moment.
-
-## HDFS
-
-We assume that you are familiar with [reading data](../api_guides/python/reading_data.md).
-
-To use HDFS with TensorFlow, change the file paths you use to read and write
-data to an HDFS path. For example:
-
-```python
-filename_queue = tf.train.string_input_producer([
-    "hdfs://namenode:8020/path/to/file1.csv",
-    "hdfs://namenode:8020/path/to/file2.csv",
-])
-```
-
-If you want to use the namenode specified in your HDFS configuration files, then
-change the file prefix to `hdfs://default/`.
-
-When launching your TensorFlow program, the following environment variables must
-be set:
-
-* **JAVA_HOME**: The location of your Java installation.
-* **HADOOP_HDFS_HOME**: The location of your HDFS installation. You can also
-  set this environment variable by running:
-
-  ```shell
-  source ${HADOOP_HOME}/libexec/hadoop-config.sh
-  ```
-
-* **LD_LIBRARY_PATH**: To include the path to libjvm.so, and optionally the path
-  to libhdfs.so if your Hadoop distribution does not install libhdfs.so in
-  `$HADOOP_HDFS_HOME/lib/native`. On Linux:
-
-  ```shell
-  export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:${JAVA_HOME}/jre/lib/amd64/server
-  ```
-
-* **CLASSPATH**: The Hadoop jars must be added prior to running your
-  TensorFlow program. The CLASSPATH set by
-  `${HADOOP_HOME}/libexec/hadoop-config.sh` is insufficient. Globs must be
-  expanded as described in the libhdfs documentation:
-
-  ```shell
-  CLASSPATH=$(${HADOOP_HDFS_HOME}/bin/hadoop classpath --glob) python your_script.py
-  ```
-  For versions of Hadoop/libhdfs older than 2.6.0, you must expand the
-  classpath wildcard manually. For more details, see
-  [HADOOP-10903](https://issues.apache.org/jira/browse/HADOOP-10903).
-
-If the Hadoop cluster is in secure mode, the following environment variable must
-be set:
-
-* **KRB5CCNAME**: The path of the Kerberos ticket cache file. For example:
-
-  ```shell
-  export KRB5CCNAME=/tmp/krb5cc_10002
-  ```
-
-If you are running [Distributed TensorFlow](../deploy/distributed.md), then all
-workers must have the environment variables set and Hadoop installed.