Difference between revisions of "Infrastructure/software/catalogue"
Revision as of 20:53, 11 May 2019
Background
This page provides a high-level summary of the NLPL-specific software installed on either of our two systems, Abel and Taito. As a rule of thumb, NLPL aims to build on the generic software installations provided by the system maintainers (e.g. development tools and libraries that are not discipline-specific) and to maintain only discipline-specific software itself, using the modules infrastructure. For example, an environment like OpenNMT is unlikely to be used by other disciplines, and NLPL stands to gain from the in-house, shared expertise that comes with maintaining a project-specific installation. On the other hand, the CUDA libraries are general extensions to the operating system that most users of deep learning frameworks on GPUs will want to use; hence, CUDA is most appropriately installed by the core system maintainers. Frameworks like PyTorch and TensorFlow arguably present a middle ground under this rule of thumb: in principle, they are not discipline-specific, but as of mid-2018 at least, demand for installations of these frameworks is strong within NLPL, and the project will likely benefit from growing its competencies in this area.
Module Catalogue
The discipline-specific modules maintained by NLPL are not activated by default. To make the NLPL directory of module configurations available on top of the pre-configured, system-wide modules, run:
module use -a /proj*/nlpl/software/modulefiles/
We will at times assume a shell variable $NLPLROOT that points to the top-level project directory, i.e. /projects/nlpl/ (on Abel) or /proj/nlpl/ (on Taito). For NLPL users, we recommend that one adds the above module use command to the shell start-up script, e.g. .bashrc in the user home directory.
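A start-up snippet along the following lines could set $NLPLROOT and register the NLPL modules on either system. This is a minimal sketch rather than an official NLPL script: the helper name nlpl_find_root is hypothetical, and the guard around module is an assumption that the modules environment may not be initialised in every shell that reads the start-up file.

```shell
# Hypothetical helper (not part of NLPL): print the first argument that is
# an existing directory, with any trailing slash removed.
nlpl_find_root() {
  for candidate in "$@"; do
    if [ -d "$candidate" ]; then
      printf '%s\n' "${candidate%/}"
      return 0
    fi
  done
  return 1
}

# Project directories as named on this page: Abel first, then Taito.
if NLPLROOT=$(nlpl_find_root /projects/nlpl/ /proj/nlpl/); then
  export NLPLROOT
  # Register the NLPL module files only where 'module' is defined.
  if command -v module >/dev/null 2>&1; then
    module use -a "$NLPLROOT/software/modulefiles/"
  fi
else
  # Neither directory exists on this machine; leave the variable unset.
  unset NLPLROOT
fi
```

Compared with the wildcard form /proj*/nlpl/, resolving the directory explicitly also leaves $NLPLROOT available for later commands.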
To inspect what is available, one can use the avail sub-command (on Abel), e.g.
module avail 2>&1 | grep nlpl
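The redirection 2>&1 in the command above is there because older releases of the modules software print the avail listing on standard error, so that a plain pipe would pass nothing to grep. Since that listing is laid out in columns, a small filter can make it easier to read. The following is a sketch under assumptions: the function name nlpl_only is hypothetical, and it assumes that every NLPL module name starts with nlpl-, as all but one of the entries in the tables below do.

```shell
# Hypothetical filter (not part of NLPL): read a module listing on standard
# input and print each distinct name starting with 'nlpl-' on its own line.
nlpl_only() {
  tr -s ' \t' '\n\n' | grep '^nlpl-' | sort -u
}

# Apply the filter to the live listing where 'module' is defined.
if command -v module >/dev/null 2>&1; then
  module avail 2>&1 | nlpl_only
fi
```

A name from the resulting list can then be passed to module load, the standard sub-command for activating a module.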
Activity A: Basic Infrastructure
Interoperability of NLPL installations with each other, as well as with system-wide software that is maintained by the core operations teams for Abel and Taito, is no small challenge; neither is parallelism across the two systems, for example in available software (and versions) and techniques for ‘mixing and matching’. These challenges are discussed in some more detail with regard to the Python programming environment and with regard to common Deep Learning frameworks.
Module Name/Version | Description | System | Install Date | Maintainer |
---|---|---|---|---|
nlpl-cupy/5.4.0 | Matrix Library Accelerated by CUDA | Abel (3.7) | May 2019 | Stephan Oepen |
nlpl-cython/0.29.3 | C Extensions for Python | Abel (3.5, 3.7) | December 2018 | Stephan Oepen |
nlpl-dynet/2.1 | DyNet Dynamic Neural Network Toolkit (CPU) | Abel (3.5, 3.7) | February 2019 | Stephan Oepen |
nlpl-gensim/3.7.0 | GenSim: Topic Modeling for Humans | Abel (3.5, 3.7) | December 2018 | Stephan Oepen |
nlpl-gensim/3.7.3 | GenSim: Topic Modeling for Humans | Abel (3.5, 3.7) | May 2019 | Stephan Oepen |
nlpl-nltk/3.3 | Natural Language Toolkit (NLTK) | Abel, Taito | September 2018 | Stephan Oepen |
nlpl-pytorch/0.4.1 | PyTorch Deep Learning Framework (CPU and GPU) | Abel, Taito | September 2018 | Stephan Oepen |
nlpl-pytorch/1.0.0 | PyTorch Deep Learning Framework (CPU and GPU) | Abel (3.5, 3.7) | January 2019 | Stephan Oepen |
nlpl-pytorch/1.1.0 | PyTorch Deep Learning Framework (CPU and GPU) | Abel (3.5, 3.7) | May 2019 | Stephan Oepen |
nlpl-spacy/2.0.12 | spaCy: Natural Language Processing in Python | Abel, Taito | October 2018 | Stephan Oepen |
nlpl-scipy/201901 | SciPy Ecosystem of Python Add-Ons | Abel (3.5, 3.7) | January 2019 | Stephan Oepen |
nlpl-tensorflow/1.11 | TensorFlow Deep Learning Framework (CPU and GPU) | Abel, Taito | September 2018 | Stephan Oepen |
Activity B: Statistical and Neural Machine Translation
Module Name/Version | Description | System | Install Date | Maintainer |
---|---|---|---|---|
nlpl-moses/mmt-mvp-v0.12.1-2739-gdc42bcb | Moses SMT system, including GIZA++, MGIZA, fast_align | Taito | July 2017 | Yves Scherrer |
nlpl-moses/4.0-65c75ff | Moses SMT System Release 4.0, including GIZA++, MGIZA, fast_align, SALM. Some minor fixes were added to the existing install in February 2018; these should not break compatibility except when using tokenizer.perl for Finnish or Swedish. | Taito, Abel | November 2017 | Yves Scherrer |
nlpl-efmaral/0.1_2017_07_20 | efmaral and eflomal word alignment tools | Taito | July 2017 | Yves Scherrer |
nlpl-efmaral/0.1_2017_11_24 | efmaral and eflomal word alignment tools | Taito, Abel | November 2017 | Yves Scherrer |
nlpl-efmaral/0.1_2018_12_13/17 | efmaral and eflomal word alignment tools | Taito, Abel | December 2018 | Yves Scherrer |
nlpl-hnmt/1.0.1 | HNMT neural machine translation system | Taito | March 2018 | Yves Scherrer |
nlpl-opennmt-py/0.2.1 | OpenNMT Python Library | Abel, Taito | September 2018 | Stephan Oepen |
nlpl-marian/1.2.0 | Marian neural machine translation system | Taito | March 2018 | Yves Scherrer |
marian/1.5 | Marian neural machine translation system | Taito | June 2018 | CSC staff |
nlpl-mttools/2018_12_23 | A collection of preprocessing and evaluation scripts for machine translation | Taito, Abel | December 2018 | Yves Scherrer |
Activity C: Data-Driven Parsing
Module Name/Version | Description | System | Install Date | Maintainer |
---|---|---|---|---|
nlpl-uuparser | Uppsala Parser | Abel | December 2018 | |
nlpl-udpipe/1.2.1-devel | UDPipe 1.2 with Pre-Trained Models | Taito, Abel | November 2017 | Jörg Tiedemann |
nlpl-dozat/201812 | Stanford Graph-Based Parser by Tim Dozat (v3) | Abel | December 2018 | Stephan Oepen |
nlpl-repp/201812 | REPP Tokenizer (and Sentence Splitter) | Abel | December 2018 | Stephan Oepen |
nlpl-stanfordnlp/0.1.1 | Stanford NLP Neural Pipeline | Abel | February 2019 | Stephan Oepen |
Activity E: Pre-Trained Word Embeddings
Module Name/Version | Description | System | Install Date | Maintainer |
---|---|---|---|---|
nlpl-gensim/3.6.0 | GenSim: Topic Modeling for Humans | Taito, Abel | October 2018 | Stephan Oepen |
Activity G: OPUS Parallel Corpus
Module Name/Version | Description | System | Install Date | Maintainer |
---|---|---|---|---|
nlpl-cwb/3.4.12 | Corpus Work Bench (CWB) | Taito, Abel | November 2017 | Jörg Tiedemann |
nlpl-opus/0.1 | Various OPUS Tools | Taito, Abel | November 2017 | Jörg Tiedemann |
nlpl-opus/0.2 | Various OPUS Tools | Taito, Abel | 2018 | Jörg Tiedemann |
nlpl-opus/201901 | Various OPUS Tools | Taito, Abel | January 2019 | Jörg Tiedemann |
nlpl-uplug/0.3.8dev | UPlug Parallel Corpus Tools | Taito, Abel | November 2017 | Jörg Tiedemann |