Infrastructure/software/eosc/singularity

From Nordic Language Processing Laboratory
Revision as of 16:48, 10 December 2020 by Raganato

Background

A Singularity container keeps an entire software stack in one place (a single ".sif" file), abstracting it from the underlying environment. This makes systems such as custom software applications reproducible, with the advantage that exactly the same software stack can be used on different HPC systems, e.g., Puhti in Finland and Saga in Norway.

In short, we can build an application on top of a Singularity container (by simply loading the container on an HPC system), share the same container with another HPC system or with other users on the same system, and be confident that the application remains reproducible, without having to worry about loading the right versions of the required libraries.


GPU use case

In this section, we walk through a use case: creating a Singularity container from an existing Docker image, extending it with additional Python libraries, and finally producing a single container file that can be shared across HPC systems. We aim to create the container without requiring sudo access, so we build it directly on an HPC system. The running example was tested on Puhti.

The running example uses the PyTorch Docker image from NVIDIA: https://ngc.nvidia.com/catalog/containers/nvidia:pytorch More details about each PyTorch Docker release are available here: https://docs.nvidia.com/deeplearning/frameworks/pytorch-release-notes/index.html In this example, we use the 20.01 release (3.36 GB), which includes Ubuntu 18.04, Python 3.6, NVIDIA CUDA 10.2.89, cuBLAS 10.2.2.89, and NVIDIA cuDNN 7.6.5, among others. The full content list of the 20.01 PyTorch container is available here: https://docs.nvidia.com/deeplearning/frameworks/pytorch-release-notes/rel_20-01.html#rel_20-01
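
The release number maps directly onto the image tag passed to Singularity. As a minimal sketch, the hypothetical helper below (ngc_pytorch_uri is not part of any tool) builds the Docker URI for a given release, assuming other releases follow the same "<release>-py3" tag pattern as 20.01:

```shell
#!/bin/bash
# Hypothetical helper: map an NGC PyTorch release (e.g. 20.01) to its Docker URI.
# Assumes the tag pattern "<release>-py3" used by the 20.01 image in this example.
ngc_pytorch_uri() {
    printf 'docker://nvcr.io/nvidia/pytorch:%s-py3\n' "$1"
}

# Show the URI for the release used in this example.
ngc_pytorch_uri 20.01
```

The tag should still be checked against the NGC catalog page before building, since this sketch does not verify that a release exists.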

First, we need to log in to an HPC system and change to a project folder as the working directory. The following batch script creates a Singularity container folder (a sandbox):

#!/bin/bash
#SBATCH -J job
#SBATCH -o job.%J.txt
#SBATCH -e job.%J.txt
#SBATCH -p small
#SBATCH -n 1
#SBATCH -N 1
#SBATCH -t 01:00:00
#SBATCH --mem-per-cpu=8G
#SBATCH --account=project_number
#SBATCH --mail-type=ALL
#SBATCH --mail-user=EMAIL@EMAIL.CODE

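LOCAL_SCRATCH=/path/to/project_folder/local/
LOCAL_HOME=/path/to/project_folder/home/

# Make sure both folders exist before Singularity tries to use them
mkdir -p "$LOCAL_SCRATCH" "$LOCAL_HOME"

# Keep Singularity's temporary files and cache in the project folder
export SINGULARITY_TMPDIR=$LOCAL_SCRATCH
export SINGULARITY_CACHEDIR=$LOCAL_SCRATCH
# Point HOME at the project folder as well
export HOME=$LOCAL_HOME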

# This is just to avoid some annoying warnings
unset XDG_RUNTIME_DIR

# Do the actual conversion
singularity build --sandbox sandBoxPytorch2001/ docker://nvcr.io/nvidia/pytorch:20.01-py3

We end up with the "sandBoxPytorch2001" folder, which contains the Singularity container, in our working directory. Because the container is a plain folder at this stage, we can install additional libraries in it, e.g., the transformers package. To do so, we open a shell inside the container and follow the usual pip procedure.

For example (this can be run directly from bash, without submitting a job):

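# Open a writable (-w) shell in the sandbox with a clean environment (-e)
# and without mounting the user's home directory (--no-home)
singularity shell --no-home -e -w sandBoxPytorch2001/
# The next two commands are typed inside the container shell
pip install transformers
exit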
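
Before converting the sandbox, it can be worth checking that the new package is importable from inside it. The following is a minimal sketch: check_import and the DRY_RUN switch are hypothetical names introduced here for illustration, and the flags mirror the ones used in the install step above.

```shell
#!/bin/bash
# Hypothetical helper: test-import a Python module inside a container.
# With DRY_RUN=1 the command is only printed, not executed.
check_import() {
    local container=$1 module=$2
    local cmd=(singularity exec --no-home -e "$container" python -c "import ${module}")
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "${cmd[*]}"      # only show what would be run
    else
        "${cmd[@]}"           # a non-zero exit status means the import failed
    fi
}

# Preview the command; drop DRY_RUN=1 to execute it on the HPC system.
DRY_RUN=1 check_import sandBoxPytorch2001/ transformers
```

Note that the preview prints the arguments joined by spaces, so the quotes around the Python statement are not shown.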

Now that all external packages are installed, we can convert the Singularity folder into a single file to be used across HPC systems.

LOCAL_SCRATCH=/scratch/project_2002007/EOSC2/local/
LOCAL_HOME=/scratch/project_2002007/EOSC2/home/

# Keep Singularity's temporary files and cache in the project folder
export SINGULARITY_TMPDIR=$LOCAL_SCRATCH
export SINGULARITY_CACHEDIR=$LOCAL_SCRATCH
export HOME=$LOCAL_HOME

# This is just to avoid some annoying warnings
unset XDG_RUNTIME_DIR

# Create the pytorch2001Transformers.sif container from the sandBoxPytorch2001 folder
singularity build pytorch2001Transformers.sif sandBoxPytorch2001/

Finally, we can use the Singularity container to run Python programs without loading any other modules.

# Run myprog.py using the Singularity container created above
srun singularity_wrapper exec --nv pytorch2001Transformers.sif python myprog.py

The --nv option is needed to run on GPUs.
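
The srun line has to be executed inside a job that has a GPU allocated. As a sketch under stated assumptions, the snippet below writes a small test job that only prints whether PyTorch can see a GPU from inside the container. The partition name "gpu", the "--gres=gpu:v100:1" request, and the file name gpu_job.sh are assumptions based on Puhti's conventions, not part of the original example; check the documentation of the target system.

```shell
#!/bin/bash
# Write a small GPU test job to gpu_job.sh (the file name is arbitrary).
cat > gpu_job.sh <<'EOF'
#!/bin/bash
#SBATCH -J gpu_job
#SBATCH -o gpu_job.%J.txt
#SBATCH -e gpu_job.%J.txt
# Assumed GPU partition and GPU request; check the system's documentation.
#SBATCH -p gpu
#SBATCH --gres=gpu:v100:1
#SBATCH -n 1
#SBATCH -t 00:15:00
#SBATCH --mem-per-cpu=8G
#SBATCH --account=project_number

# Print whether PyTorch can see a GPU from inside the container.
srun singularity_wrapper exec --nv pytorch2001Transformers.sif \
    python -c "import torch; print(torch.cuda.is_available())"
EOF

echo "Submit with: sbatch gpu_job.sh"
```

If the job output shows False, the usual suspects are a missing --nv option or a job submitted to a partition without GPUs.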

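singularity_wrapper automatically sets up all the bind mounts needed in CSC's environment. When calling singularity directly, or on a system without this wrapper, the required directories must be bound explicitly with the -B option.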
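
As an illustration of the -B option, the sketch below assembles one bind per directory and prints the resulting command line without running it. The helper bind_flags is hypothetical, and /scratch and /projappl are only example directories; the file systems that need to be visible inside the container depend on the HPC system.

```shell
#!/bin/bash
# Hypothetical helper: emit one "-B dir:dir" pair per directory argument,
# so each directory appears at the same path inside the container.
bind_flags() {
    local dir
    for dir in "$@"; do
        printf -- '-B %s:%s ' "$dir" "$dir"
    done
}

# Example directories (assumption); adjust to the target system.
flags=$(bind_flags /scratch /projappl)

# Print the full command line instead of running it.
echo "srun singularity exec --nv ${flags}pytorch2001Transformers.sif python myprog.py"
```

Binding a directory to the same path inside the container keeps absolute paths in scripts such as myprog.py valid both inside and outside the container.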


References

Running singularity containers in Puhti: https://docs.csc.fi/computing/containers/run-existing/

Webinar: Running Singularity containers in Puhti: https://www.youtube.com/watch?v=tXM38BkC2WU

Running singularity containers in Saga: https://documentation.sigma2.no/software/containers.html