Difference between revisions of "Infrastructure/software/tensorflow"

From Nordic Language Processing Laboratory

Revision as of 14:22, 2 October 2018

Background

TensorFlow is one of the most widely used Deep Learning frameworks in NLP (in mid-2018, at least), with corporate support from Google.

Usage on Abel

The module nlpl-tensorflow provides a TensorFlow installation in a Python 3.5 virtual environment. General background on ‘mixing and matching’ of NLPL Python modules is available on a separate page (http://wiki.nlpl.eu/index.php/Infrastructure/software/python). Besides TensorFlow and its dependencies (e.g. NumPy), the virtual environment includes a selection of popular add-on packages, e.g. SciKit-Learn, the Python Data Analysis Library (Pandas), and Keras. This installation should support both cpu and gpu nodes on Abel.

module purge
module use -a /proj*/nlpl/software/modulefiles
module load nlpl-tensorflow
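
The wildcard in /proj*/ presumably allows the same command to work on both systems, since the installation sections of this page use /projects/nlpl on Abel and /proj/nlpl on Taito. A minimal sketch of how the shell resolves such a pattern, using a scratch directory (the demo paths are made up for illustration and are not part of the NLPL setup):

```shell
# Illustration only: mimic an Abel-style layout in a throwaway directory.
demo=$(mktemp -d)
mkdir -p "$demo/projects/nlpl/software/modulefiles"

# The shell expands the pattern before the command sees it, so whichever
# of proj/ or projects/ exists on the current system is what gets used.
resolved=$(echo "$demo"/proj*/nlpl/software/modulefiles)
echo "$resolved"
```

If neither directory exists, the pattern is passed through unexpanded, and module use would then receive a literal path containing the asterisk.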

There is a short sample program that tests the availability of cpu vs. gpu computing devices.

python /proj*/nlpl/software/tensorflow/1.11/test.py
qlogin --account=nn9447k --time=1:00:00 --partition=accel --gres=gpu:1
module load nlpl-tensorflow
python /projects/nlpl/software/tensorflow/1.11/test.py
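
Before running the sample program, it can help to confirm what kind of node the current shell is on. The following is only a rough heuristic and an assumption on our part, not something the NLPL installation provides: it treats the presence of the nvidia-smi driver tool on the PATH as a sign of a gpu node. The variable name node_type is chosen for illustration.

```shell
# Rough heuristic (an assumption, not part of the NLPL setup):
# treat the node as a gpu node if the NVIDIA driver tool is on the PATH.
if command -v nvidia-smi >/dev/null 2>&1; then
  node_type=gpu
  # -L lists the GPUs the driver can see; ignore failures on driverless hosts.
  nvidia-smi -L || true
else
  node_type=cpu
fi
echo "this looks like a $node_type node"
```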

Available Versions

As of September 2018, TensorFlow 1.11 is available (and, thus, the default version for this module). On Abel, an older installation of TensorFlow 1.0.1 is provided by USIT (the system operators); this installation is ‘containerized’, however, i.e. it is not easily interoperable with other software modules, and it does not work transparently on both cpu and gpu nodes. On Taito, various versions of TensorFlow are available system-wide, albeit primarily only for gpu nodes.

Installation on Abel

module purge
module load gcc/4.9.2 cuda/9.0
module load python3/3.5.0
cd /projects/nlpl/software
mkdir tensorflow
virtualenv tensorflow/1.11

First things first: Enable use of our custom (more modern) GNU C Library installation, by wrapping the basic python binary:

mv /projects/nlpl/software/tensorflow/1.11/bin/{,.}python3.5
sed 's@pytorch/0.4.1@tensorflow/1.11@' \
  /projects/nlpl/software/pytorch/0.4.1/bin/python3.5 \
  > /projects/nlpl/software/tensorflow/1.11/bin/python3.5
chmod 755 /projects/nlpl/software/tensorflow/1.11/bin/python3.5
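
The wrapping step above can be tried out safely in a scratch directory. The sketch below is an illustration under stated assumptions: the wrapper contents are hypothetical (the actual NLPL wrapper script is not shown on this page), and the bash brace expansion bin/{,.}python3.5 is written out in full so that the example also runs in a plain POSIX shell.

```shell
demo=$(mktemp -d)
mkdir -p "$demo/pytorch/0.4.1/bin" "$demo/tensorflow/1.11/bin"

# Stand-in for the real interpreter that virtualenv creates.
printf '#!/bin/sh\necho real python\n' > "$demo/tensorflow/1.11/bin/python3.5"

# Hypothetical wrapper contents, for illustration only.
printf '#!/bin/sh\nexec /software/pytorch/0.4.1/bin/.python3.5 "$@"\n' \
  > "$demo/pytorch/0.4.1/bin/python3.5"

# Equivalent to: mv bin/{,.}python3.5 (hides the real binary as .python3.5).
mv "$demo/tensorflow/1.11/bin/python3.5" "$demo/tensorflow/1.11/bin/.python3.5"

# '@' serves as the sed delimiter, so the slashes in the paths need no escaping.
sed 's@pytorch/0.4.1@tensorflow/1.11@' "$demo/pytorch/0.4.1/bin/python3.5" \
  > "$demo/tensorflow/1.11/bin/python3.5"
chmod 755 "$demo/tensorflow/1.11/bin/python3.5"

cat "$demo/tensorflow/1.11/bin/python3.5"
```

The result is a python3.5 wrapper whose internal paths point at the TensorFlow tree, with the original interpreter preserved alongside it under a dot-prefixed name.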

Next, create a module definition, in this case /projects/nlpl/software/modulefiles/nlpl-tensorflow/1.11.

module load nlpl-tensorflow/1.11
pip install --upgrade pip
pip install --upgrade $(pip list | tail -n +3 | awk '{print $1}')
pip install --upgrade -r /projects/nlpl/software/tensorflow/1.11/modules.txt
qlogin --account=nn9447k --time=1:00:00 --partition=accel --gres=gpu:1
cp -av /usr/lib64/libcuda.so* /usr/lib64/libnvidia* \
  /projects/nlpl/software/tensorflow/1.11/lib
module purge
module use -a /projects/nlpl/software/modulefiles
module load nlpl-tensorflow
pip install --upgrade tensorflow-gpu
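
The pip install --upgrade $(pip list | ...) line in the sequence above upgrades every package already present in the virtual environment. A small sketch of what the pipeline extracts, assuming the two-line column header that recent pip versions print; the package names and versions in the sample are made up:

```shell
# Sample of the header plus rows that `pip list` prints (values are made up).
sample='Package    Version
---------- -------
numpy      1.15.2
pip        18.0
setuptools 40.4.3'

# tail -n +3 starts output at line 3, dropping the header; awk keeps column 1.
names=$(printf '%s\n' "$sample" | tail -n +3 | awk '{print $1}')
echo "$names"
```

The resulting list of bare package names is what gets substituted into the pip install --upgrade command line.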

Installation on Taito

module purge
module load cuda-env/9.0
module load python-env/3.5.3
cd /proj/nlpl/software
svn co http://svn.nlpl.eu/software/tensorflow
virtualenv /proj/nlpl/software/tensorflow/1.11

First things first: Enable use of our custom (more modern) GNU C Library installation, by wrapping the basic python binary:

mv /proj/nlpl/software/tensorflow/1.11/bin/{,.}python3.5
cp /proj/nlpl/software/glibc/2.18/wrapper \
  /proj/nlpl/software/tensorflow/1.11/bin/python3.5

Next, create a module definition, in this case /proj/nlpl/software/modulefiles/nlpl-tensorflow/1.11.lua.

module load nlpl-tensorflow/1.11
pip install --upgrade pip
pip install --upgrade $(pip list | tail -n +3 | awk '{print $1}')
pip install --upgrade -r /proj/nlpl/software/tensorflow/1.11/modules.txt
ssh taito-gpu
cp -av /usr/lib64/libcuda.so* /usr/lib64/libnvidia* \
  /proj/nlpl/software/tensorflow/1.11/lib
module purge
module use -a /proj/nlpl/software/modulefiles
module load nlpl-tensorflow
pip install --upgrade tensorflow-gpu
srun -n 1 -p gputest --gres=gpu:k80:1 --mem 1G -t 15 \
  python /proj/nlpl/software/tensorflow/1.11/test.py