Difference between revisions of "Infrastructure/software/tensorflow"
(→Installation on Abel) |
(→Installation on Abel) |
||
(24 intermediate revisions by the same user not shown) | |||
Line 1: | Line 1: | ||
= Background = | = Background = | ||
− | TensorFlow is one of the most widely used Deep Learning frameworks in NLP (in mid-2018, at least), with corporate support from Google. | + | [https://www.tensorflow.org/ TensorFlow] is one of the most widely used Deep Learning frameworks in NLP (in mid-2018, at least), with corporate support from Google. |
+ | = Usage = | ||
+ | |||
+ | The module <tt>nlpl-tensorflow</tt> provides a TensorFlow installation | ||
+ | in a Python 3.5 virtual environment. | ||
+ | Some [http://wiki.nlpl.eu/index.php/Infrastructure/software/python general background] | ||
+ | on ‘mixing and matching’ of NLPL Python modules is discussed as a | ||
+ | [http://wiki.nlpl.eu/index.php/Infrastructure/software/python separate page]. | ||
+ | Besides TensorFlow and its dependencies (e.g. NumPy), the virtual | ||
+ | environment includes a selection of popular add-on packages, e.g. | ||
+ | [http://scikit-learn.org/stable/documentation.html SciKit-Learn], | ||
+ | the [https://pandas.pydata.org/ Python Data Analysis Library] (Pandas), | ||
+ | and [https://keras.io/ Keras]. | ||
+ | This installation should support both cpu and gpu nodes on Abel and Taito. | ||
+ | |||
+ | <pre> | ||
+ | module purge | ||
+ | module use -a /proj*/nlpl/software/modulefiles | ||
+ | module load nlpl-tensorflow | ||
+ | </pre> | ||
+ | |||
+ | There is a short sample program that tests the availability of cpu vs. | ||
+ | gpu computing devices. | ||
+ | |||
+ | <pre> | ||
+ | python /proj*/nlpl/software/tensorflow/1.11/test.py | ||
+ | </pre> | ||
+ | |||
+ | <pre> | ||
+ | qlogin --account=nn9447k --time=1:00:00 --partition=accel --gres=gpu:1 | ||
+ | module load nlpl-tensorflow | ||
+ | python /projects/nlpl/software/tensorflow/1.11/test.py | ||
+ | </pre> | ||
+ | |||
+ | = Available Versions = | ||
+ | |||
+ | As of September 2018, TensorFlow 1.11 is available (and, thus, the default version for this module). | ||
+ | On Abel, an older installation of | ||
+ | [https://www.uio.no/english/services/it/research/hpc/abel/help/software/tensorflow.html TensorFlow 1.0.1] is provided by USIT (the system operators); | ||
+ | this installation is ‘containerized’, however, i.e. is not easily interoperable with other software modules, and it does not work transparently on both cpu and gpu nodes. | ||
+ | On Taito, [https://research.csc.fi/-/tensorflow various versions] of TensorFlow are available system-wide, | ||
+ | albeit primarily only for gpu nodes. | ||
= Installation on Abel = | = Installation on Abel = | ||
− | + | ||
− | |||
<pre> | <pre> | ||
module purge | module purge | ||
− | module load gcc/4.9.2 cuda/ | + | module load gcc/4.9.2 cuda/9.0 |
module load python3/3.5.0 | module load python3/3.5.0 | ||
</pre> | </pre> | ||
Line 21: | Line 61: | ||
First things first: Enable use of our custom (more modern) GNU C Library | First things first: Enable use of our custom (more modern) GNU C Library | ||
− | installation, by wrapping the basic <tt>python</tt> binary: | + | installation, by [http://wiki.nlpl.eu/index.php/Infrastructure/software/glibc wrapping the basic <tt>python</tt> binary]: |
<pre> | <pre> | ||
mv /projects/nlpl/software/tensorflow/1.11/bin/{,.}python3.5 | mv /projects/nlpl/software/tensorflow/1.11/bin/{,.}python3.5 | ||
− | + | cp /projects/nlpl/software/glibc/2.18/wrapper \ | |
− | /projects/nlpl/software/ | + | /projects/nlpl/software/tensorflow/1.11/bin/python3.5 |
− | + | </pre> | |
+ | |||
+ | Next, create a module definition, in this case | ||
+ | <tt>/projects/nlpl/software/modulefiles/nlpl-tensorflow/1.11</tt>, | ||
+ | following the ‘standard’ template for NLPL virtual environments. | ||
+ | |||
+ | <pre> | ||
+ | module load nlpl-tensorflow/1.11 | ||
+ | pip install --upgrade pip | ||
+ | pip install --upgrade $(pip list | tail -n +3 | awk '{print $1}') | ||
+ | pip install --upgrade -r /projects/nlpl/software/tensorflow/1.11/modules.txt | ||
+ | </pre> | ||
+ | |||
+ | <pre> | ||
+ | qlogin --account=nn9447k --time=00:30:00 \ | ||
+ | --mem-per-cpu=2048M --partition=accel --gres=gpu:1 | ||
+ | cp -av /usr/lib64/libcuda.so* /usr/lib64/libnvidia* \ | ||
+ | /projects/nlpl/software/tensorflow/1.11/lib | ||
+ | module purge | ||
+ | module use -a /projects/nlpl/software/modulefiles | ||
+ | module load nlpl-tensorflow | ||
+ | pip install --upgrade tensorflow-gpu | ||
+ | </pre> | ||
+ | |||
+ | = Installation on Taito = | ||
+ | |||
+ | <pre> | ||
+ | module purge | ||
+ | module load cuda-env/9.0 | ||
+ | module load python-env/3.5.3 | ||
+ | </pre> | ||
+ | |||
+ | <pre> | ||
+ | cd /proj/nlpl/software | ||
+ | svn co http://svn.nlpl.eu/software/tensorflow | ||
+ | virtualenv /proj/nlpl/software/tensorflow/1.11 | ||
+ | </pre> | ||
+ | |||
+ | First things first: Enable use of our custom (more modern) GNU C Library | ||
+ | installation, by [http://wiki.nlpl.eu/index.php/Infrastructure/software/glibc wrapping the basic <tt>python</tt> binary]: | ||
+ | <pre> | ||
+ | mv /proj/nlpl/software/tensorflow/1.11/bin/{,.}python3.5 | ||
+ | cp /proj/nlpl/software/glibc/2.18/wrapper \ | ||
+ | /proj/nlpl/software/tensorflow/1.11/bin/python3.5 | ||
</pre> | </pre> | ||
Next, create a module definition, in this case | Next, create a module definition, in this case | ||
− | <tt>/ | + | <tt>/proj/nlpl/software/modulefiles/nlpl-tensorflow/1.11.lua</tt>. |
<pre> | <pre> | ||
− | module load nlpl- | + | module load nlpl-tensorflow/1.11 |
pip install --upgrade pip | pip install --upgrade pip | ||
− | pip install --upgrade | + | pip install --upgrade $(pip list | tail -n +3 | awk '{print $1}') |
− | pip install | + | pip install --upgrade -r /proj/nlpl/software/tensorflow/1.11/modules.txt |
</pre> | </pre> | ||
<pre> | <pre> | ||
− | + | ssh taito-gpu | |
− | + | cp -av /usr/lib64/libcuda.so* /usr/lib64/libnvidia* \ | |
− | pip install - | + | /proj/nlpl/software/tensorflow/1.11/lib |
+ | module purge | ||
+ | module use -a /proj/nlpl/software/modulefiles | ||
+ | module load nlpl-tensorflow | ||
+ | pip install --upgrade tensorflow-gpu | ||
+ | srun -n 1 -p gputest --gres=gpu:k80:1 --mem 1G -t 15 \ | ||
+ | python /proj/nlpl/software/tensorflow/1.11/test.py | ||
</pre> | </pre> |
Latest revision as of 21:22, 24 October 2018
Background
TensorFlow is one of the most widely used Deep Learning frameworks in NLP (in mid-2018, at least), with corporate support from Google.
Usage
The module nlpl-tensorflow provides a TensorFlow installation in a Python 3.5 virtual environment. Some general background on ‘mixing and matching’ of NLPL Python modules is available on a separate page (http://wiki.nlpl.eu/index.php/Infrastructure/software/python). Besides TensorFlow and its dependencies (e.g. NumPy), the virtual environment includes a selection of popular add-on packages, e.g. SciKit-Learn, the Python Data Analysis Library (Pandas), and Keras. This installation should support both cpu and gpu nodes on Abel and Taito.
<pre>
module purge
module use -a /proj*/nlpl/software/modulefiles
module load nlpl-tensorflow
</pre>
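The <tt>/proj*/</tt> glob lets the same commands work on both systems, since the NLPL tree lives under <tt>/projects/nlpl</tt> on Abel and under <tt>/proj/nlpl</tt> on Taito (as the installation sections show). A minimal sketch that makes this choice explicit follows; the helper name <tt>nlpl_root</tt> is hypothetical and not part of the NLPL installation.

```shell
# Hypothetical helper: print the first candidate directory that exists.
nlpl_root() {
  for dir in "$@"; do
    if [ -d "$dir" ]; then
      printf '%s\n' "$dir"
      return 0
    fi
  done
  return 1
}

# /projects/nlpl is the Abel location, /proj/nlpl the Taito one.
if root=$(nlpl_root /projects/nlpl /proj/nlpl); then
  echo "NLPL software tree: $root/software"
else
  echo "no NLPL software tree found on this host" >&2
fi
```

On a host with neither directory, the sketch only prints a message and loads nothing.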
There is a short sample program that tests the availability of cpu vs. gpu computing devices.
<pre>
python /proj*/nlpl/software/tensorflow/1.11/test.py
</pre>
On Abel, the same test can be run in an interactive session on a gpu node:
<pre>
qlogin --account=nn9447k --time=1:00:00 --partition=accel --gres=gpu:1
module load nlpl-tensorflow
python /projects/nlpl/software/tensorflow/1.11/test.py
</pre>
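Before running the test program, it can help to confirm that the session actually sees a GPU. The following is only a sketch: it assumes that <tt>nvidia-smi</tt> is on the <tt>PATH</tt> of gpu nodes, which this page does not state, and the function name <tt>gpu_visible</tt> is hypothetical.

```shell
# Hypothetical check: succeed only if nvidia-smi exists and lists a GPU.
# Assumption: gpu nodes provide nvidia-smi; cpu nodes typically do not.
gpu_visible() {
  command -v nvidia-smi >/dev/null 2>&1 || return 1
  nvidia-smi -L 2>/dev/null | grep -q '^GPU '
}

if gpu_visible; then
  echo "gpu device visible"
else
  echo "no gpu device visible (cpu only)"
fi
```

This only checks the driver tooling; whether TensorFlow itself can use the device is what <tt>test.py</tt> is for.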
Available Versions
As of September 2018, TensorFlow 1.11 is the version available through this module and thus its default. On Abel, an older installation of TensorFlow 1.0.1 is provided by USIT (the system operators); however, this installation is ‘containerized’, i.e. not easily interoperable with other software modules, and it does not work transparently on both cpu and gpu nodes. On Taito, various versions of TensorFlow are available system-wide, albeit primarily for gpu nodes.
Installation on Abel
<pre>
module purge
module load gcc/4.9.2 cuda/9.0
module load python3/3.5.0
</pre>
<pre>
cd /projects/nlpl/software
mkdir tensorflow
virtualenv tensorflow/1.11
</pre>
First things first: Enable use of our custom (more modern) GNU C Library installation, by wrapping the basic python binary:
<pre>
mv /projects/nlpl/software/tensorflow/1.11/bin/{,.}python3.5
cp /projects/nlpl/software/glibc/2.18/wrapper \
  /projects/nlpl/software/tensorflow/1.11/bin/python3.5
</pre>
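The <tt>{,.}python3.5</tt> argument above relies on bash brace expansion: it expands to the two names <tt>python3.5</tt> and <tt>.python3.5</tt>, so the <tt>mv</tt> renames the original binary to a hidden file before the wrapper is copied to the original name. A small illustration, using a relative <tt>bin/</tt> path purely as an example and only echoing the command rather than running it:

```shell
# Let bash expand the braces and show the resulting command line.
# The path bin/ is illustrative; the real commands use the full path.
expanded=$(bash -c 'echo mv bin/{,.}python3.5')
echo "$expanded"
# prints: mv bin/python3.5 bin/.python3.5
```

Brace expansion is a bash feature, so the original command should be run in bash rather than a plain POSIX shell.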
Next, create a module definition, in this case /projects/nlpl/software/modulefiles/nlpl-tensorflow/1.11, following the ‘standard’ template for NLPL virtual environments.
<pre>
module load nlpl-tensorflow/1.11
pip install --upgrade pip
pip install --upgrade $(pip list | tail -n +3 | awk '{print $1}')
pip install --upgrade -r /projects/nlpl/software/tensorflow/1.11/modules.txt
</pre>
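The command substitution in the third line above builds the list of installed package names: <tt>tail -n +3</tt> drops the two header lines of the <tt>pip list</tt> output, and <tt>awk</tt> keeps the first column. The sketch below runs the same pipeline on sample text; the sample imitates the column layout of <tt>pip list</tt>, and its package versions are made up for illustration.

```shell
# Sample text imitating the column layout of `pip list` (an assumption;
# the versions are invented for the example).
sample='Package    Version
---------- -------
numpy      1.15.2
pandas     0.23.4
tensorflow 1.11.0'

# tail -n +3 starts output at line 3, skipping the two header lines;
# awk prints the first whitespace-separated field, i.e. the package name.
names=$(printf '%s\n' "$sample" | tail -n +3 | awk '{print $1}')
echo "$names"
```

If <tt>pip list</tt> is configured to print a different format, the <tt>tail</tt> offset would no longer match the header, so the pipeline depends on the column layout.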
<pre>
qlogin --account=nn9447k --time=00:30:00 \
  --mem-per-cpu=2048M --partition=accel --gres=gpu:1
cp -av /usr/lib64/libcuda.so* /usr/lib64/libnvidia* \
  /projects/nlpl/software/tensorflow/1.11/lib
module purge
module use -a /projects/nlpl/software/modulefiles
module load nlpl-tensorflow
pip install --upgrade tensorflow-gpu
</pre>
Installation on Taito
<pre>
module purge
module load cuda-env/9.0
module load python-env/3.5.3
</pre>
<pre>
cd /proj/nlpl/software
svn co http://svn.nlpl.eu/software/tensorflow
virtualenv /proj/nlpl/software/tensorflow/1.11
</pre>
First things first: Enable use of our custom (more modern) GNU C Library installation, by wrapping the basic python binary:
<pre>
mv /proj/nlpl/software/tensorflow/1.11/bin/{,.}python3.5
cp /proj/nlpl/software/glibc/2.18/wrapper \
  /proj/nlpl/software/tensorflow/1.11/bin/python3.5
</pre>
Next, create a module definition, in this case /proj/nlpl/software/modulefiles/nlpl-tensorflow/1.11.lua.
<pre>
module load nlpl-tensorflow/1.11
pip install --upgrade pip
pip install --upgrade $(pip list | tail -n +3 | awk '{print $1}')
pip install --upgrade -r /proj/nlpl/software/tensorflow/1.11/modules.txt
</pre>
<pre>
ssh taito-gpu
cp -av /usr/lib64/libcuda.so* /usr/lib64/libnvidia* \
  /proj/nlpl/software/tensorflow/1.11/lib
module purge
module use -a /proj/nlpl/software/modulefiles
module load nlpl-tensorflow
pip install --upgrade tensorflow-gpu
srun -n 1 -p gputest --gres=gpu:k80:1 --mem 1G -t 15 \
  python /proj/nlpl/software/tensorflow/1.11/test.py
</pre>