= Background =
General programming environments for so-called neural or deep learning (DL)
are a prerequisite to much current NLP research, somewhat analogous to
compilers or core libraries (like Boost or ICU).
These frameworks are not in principle discipline-specific and will sometimes
be difficult to install in user space (i.e. without administrator privileges).
Thus, general DL support should arguably be provided at the system level (not
on a per-project basis), and NLPL seeks to work with the system administrators
to make sure that (a) relevant frameworks are available (in all relevant
versions, both for CPU and GPU usage) and (b) system-wide installations
provide at least some degree of parallelism (i.e. uniformity) between the
systems available to NLPL users.
These are no small ambitions :-).
In mid-2018, the NLPL infrastructure task force at least temporarily
deviated from some of the above and created project-owned installations
of PyTorch and TensorFlow (on both systems).
There are certain benefits to this approach too, notably increased
parallelism across the two systems and tighter integration with other
elements of the project-specific NLPL installations, for example
‘stacking’ of Python virtual environments that are mutually interoperable
(because they all derive from the same basic Python installation and the
same choice of compiler and dynamic library environment).
The following session illustrates such stacking, loading three NLPL modules
at once and confirming the versions they provide:
+ | |||
+ | <pre> | ||
+ | module purge | ||
+ | module use -a /proj*/nlpl/software/modulefiles | ||
+ | module load nlpl-nltk/3.4/3.5 nlpl-pytorch/0.4.1 nlpl-tensorflow/1.11 | ||
+ | cat << EOF | python3 | ||
+ | import nltk; import torch; import tensorflow; | ||
+ | print(nltk.__version__); | ||
+ | print("%s: %s" % (torch.__version__, torch.cuda.is_available())); | ||
+ | print(tensorflow.__version__); | ||
+ | EOF | ||
+ | </pre> | ||
= Taito =
+ | |||
+ | It appears that CSC has started to actively support deep learning, possibly at | ||
+ | least in part in response to requests from NLPL users. | ||
+ | In early 2018, the following packages are available on Taito (although some do | ||
+ | not seem easily discoverable via the <tt>module</tt> command): | ||

[https://research.csc.fi/software/-/asset_publisher/wfvLxzjnZlJx/content/dynet2 DyNet] 2.0 and 2.0.1, apparently both for CPU and GPU nodes

[https://research.csc.fi/-/tensorflow TensorFlow] many versions, from 0.11 through 1.5.0; seemingly only installed for GPU nodes, though the page provides instructions for user-level installation of a CPU-only version.

[https://research.csc.fi/-/mlpython ML Bundle] Python 2.x and 3.x installations with multiple DL frameworks included, notably MXNet, PyTorch, Theano, and TensorFlow. No version information is available for the individual DL frameworks, and the version scheme for the bundle at large seems to follow Python versions. Seemingly only available on GPU nodes. Potentially troubling is the announcement that “packages in the Mlpython environments are updated periodically.”
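
The following is a minimal sketch of how one might probe for these packages
with the standard <tt>module</tt> tooling; the module names and versions
below are assumptions for illustration and may differ on Taito:

<pre>
# search for modules whose names mention the frameworks
module avail dynet
module avail tensorflow
# load a candidate (name and version illustrative) and confirm what it provides
module load dynet/2.0
python3 -c "import dynet; print(dynet.__version__)"
</pre>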
+ | |||
+ | = Abel = | ||
+ | |||
+ | In early 2018, the two system-wide installations of DL frameworks | ||
+ | on Abel appear to reflect requests from NLPL users. | ||
+ | |||
+ | [http://www.uio.no/english/services/it/research/hpc/abel/help/software/dynet.html DyNet] no version number specified, but apparently 2.0 (<tt>dynet.__version__</tt>); may be CPU-only; Singularity container | ||
+ | [http://www.uio.no/english/services/it/research/hpc/abel/help/software/tensorflow.html TensorFlow] no version number specified, but apparently 1.0.1 (<tt>tensorflow.__version__</tt>); supports both CPU and GPU nodes; Singularity container. | ||

Installation inside Singularity containers on Abel limits
interoperability, as there is no straightforward way of ‘mixing and
matching’ with other modules; the sketch below illustrates the problem.
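
As a minimal sketch (the image paths are assumptions, not the actual Abel
layout): each framework is baked into its own container image, so no single
Python process can import, say, TensorFlow and DyNet together.

<pre>
# each framework lives in its own container image (paths illustrative)
singularity exec /cluster/containers/tensorflow.img \
  python3 -c "import tensorflow; print(tensorflow.__version__)"
singularity exec /cluster/containers/dynet.img \
  python3 -c "import dynet; print(dynet.__version__)"
# no image contains both frameworks, so they cannot be combined in one
# process the way stacked environment modules over a shared Python can
</pre>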
+ | |||
+ | = Vision = | ||
+ | |||
The ideal situation, from the NLPL point of view, would be
(a) availability and support for all relevant DL frameworks,
in (b) multiple, clearly discernible release versions as the
frameworks evolve.
It should be possible to (c) combine multiple frameworks with
each other (the Uppsala parser, for example, internally
requires both DyNet and TensorFlow, each in a particular
version).
Installations and usage patterns on Abel and Taito should
be (d) uniform to the highest degree possible across the
two systems.
Even though there appears to be a currently dominant choice
of ‘glue’ programming language, viz. Python, at least some
DL frameworks also provide APIs (and relevant usage patterns)
in other programming languages, i.e. installations should
(e) not be too closely tied to Python.
+ | |||
+ | How to best accomplish (something resembling) the above is | ||
+ | no trivial question. | ||
+ | Possibly the [https://research.csc.fi/-/mlpython ‘bundle’] | ||
+ | approach pursued at CSC is a good way to go, but then one | ||
+ | would need a relatively large number of bundles, i.e. one | ||
+ | version for each distinct combination of DL framework | ||
+ | releases in the bundle (cross-multiplied with relevant | ||
+ | Python versions). | ||
+ | At the cost of having to rebuild (as a new version) a fresh | ||
+ | bundle whenever a new release of any of its components | ||
+ | becomes available, this strategy would seem to address | ||
+ | desiderata (a) through (d), though not obviously (e). |
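
To make the combinatorics concrete, here is a small worked example (the
version inventories are hypothetical, not actual CSC offerings):

<pre>
cat << EOF | python3
from itertools import product
# hypothetical inventories of supported releases
pythons = ["2.7", "3.5"]
dynet = ["2.0", "2.0.1"]
tensorflow = ["1.0.1", "1.5.0"]
pytorch = ["0.4.1"]
# one bundle per distinct combination: 2 * 2 * 2 * 1 = 8
print(len(list(product(pythons, dynet, tensorflow, pytorch))))
EOF
</pre>

Every new release of any single component then multiplies into a fresh
bundle for each combination of the remaining components.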