Difference between revisions of "Eosc/easybuild"

= Important stuff to remember =

'''export EB_PYTHON=python3'''

'''module load EasyBuild/4.3.0'''

Playground on Saga: /cluster/shared/nlpl/software/easybuild4

'''export EASYBUILD_ROBOT_PATHS=/cluster/software/EasyBuild/4.3.0/easybuild/easyconfigs:/cluster/shared/nlpl/software/easybuild4'''

Repository: https://source.coderefinery.org/nlpl/easybuild

Revision as of 10:21, 21 October 2020

Background

The goal is to organize provisioning of software (for NLP research) in a manner that makes it possible and cost-efficient to maintain the exact same software stack on multiple systems. Here, systems initially means different superclusters, e.g. Puhti in Finland and Saga in Norway; sometime in 2021, we anticipate additionally supporting the LUMI environment. As part of the NLPL use case in EOSC-Nordic, we are evaluating EasyBuild for this purpose.

Desk Pilot

To get started, we set out to re-create one common stack of NLPL modules in a fully automated EasyBuild configuration, viz. Python 3.7.4, NumPy 1.18.1, the SciPy Bundle (SciPy 1.4.1, scikit-learn 0.22.1, IPython 7.11.1, Matplotlib 3.1.2, Pandas 0.23.1), and TensorFlow 1.15.2. For an added thrill, there should be two versions of NumPy, one installed with the MKL backend, the other without (using the default, which we believe is OpenBLAS). All modules should be maximally optimized for the available hardware, and TensorFlow should be built on top of CUDA/10.0 and cuDNN/7.6.4.

Ideally, this choice near the bottom of the dependency tree should not propagate into the higher-level modules, i.e. we would hope to have only one instance of the SciPy bundle or TensorFlow, and they would interoperate seamlessly with either choice for NumPy. Furthermore, we are interested in re-using system-wide modules on Saga, i.e. preferably the NLPL add-on module stack should not include its own version of the core Python interpreter, nor of the MKL, CUDA, or cuDNN libraries. To distinguish system-wide from NLPL-specific modules, we would want to prefix the names of our own modules with 'nlpl-'. At the same time, module identities should not be unnecessarily specific: for example, CUDA versions are independent of toolchains, so their modules should be toolchain-agnostic.
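
The naming goals above can be illustrated with a small, plain-Python sketch. It is not an EasyBuild API and not the configuration used on Saga. The function name module_name, the two sets, and the toolchain labels in the usage example are hypothetical and chosen only for illustration.

```python
# Hypothetical helper illustrating the intended naming rules; not part of EasyBuild.

# Assumption: software re-used from the system-wide modules (no 'nlpl-' prefix).
SYSTEM_WIDE = {"Python", "MKL", "CUDA", "cuDNN"}

# Assumption: software whose module identity should not mention a toolchain.
TOOLCHAIN_AGNOSTIC = {"CUDA"}


def module_name(software, version, toolchain=None):
    """Compose a module name following the conventions sketched in the text."""
    # NLPL-specific modules get the 'nlpl-' prefix; system-wide ones do not.
    prefix = "" if software in SYSTEM_WIDE else "nlpl-"
    name = f"{prefix}{software}/{version}"
    # Only add a toolchain label where the module actually depends on it.
    if toolchain and software not in TOOLCHAIN_AGNOSTIC:
        name += f"-{toolchain}"
    return name


if __name__ == "__main__":
    # The toolchain labels below are placeholders, not a statement about Saga.
    print(module_name("NumPy", "1.18.1", "foss-2019b"))
    print(module_name("CUDA", "10.0", "foss-2019b"))
```

Under these assumptions, an NLPL-built NumPy carries both the prefix and a toolchain label, while CUDA keeps the plain CUDA/10.0 identity regardless of which toolchain is requested.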

References

Jülich & Ghent: http://easybuilders.github.io/easybuild/files/eb-jsc-hust16.pdf

Compute Canada: https://www.youtube.com/watch?v=_0j5Shuf2uE

EESSI: https://eessi.github.io/docs/