= Background =
This page provides an informal, technically oriented survey of available (and commonly used) architectures and implementations for large-scale pre-training (and fine-tuning) of contextualized neural language models.
The NLPL use case will install, validate, and maintain a selection of these implementations, in an automated and uniform manner, on multiple HPC systems.
= ELMo =
Embeddings from Language Models (ELMo) uses bidirectional LSTM language models to produce contextualized word token representations (Peters et al., 2018).
== Available implementations ==
- [https://github.com/allenai/bilm-tf Reference TensorFlow implementation]. Requirements: Python >= 3.5, 1.2 < TensorFlow < 1.13 (later versions produce too many deprecation warnings), h5py.

- [https://github.com/ltgoslo/simple_elmo_training LTG implementation]. Based on the reference implementation, but with improved data loading and hyperparameter handling, and the code updated to more recent versions of TensorFlow. Requirements: Python >= 3.5, 1.15 <= TensorFlow < 2.0 (a 2.0 version is planned), h5py, smart_open.

- [https://docs.allennlp.org/master/api/data/token_indexers/elmo_indexer/ PyTorch implementation in AllenNLP]. Of limited interest to us, since it supports only inference, not training (see the usage sketch after this list). Requirements: Python >= 3.6, 1.6 <= PyTorch < 1.7.
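
To make the inference-only AllenNLP route concrete, below is a minimal sketch of embedding a small batch of tokenized sentences with the <code>Elmo</code> module. It assumes a pre-trained ELMo model has already been downloaded; the <code>elmo_options.json</code> and <code>elmo_weights.hdf5</code> file names are placeholders for those local files, not fixed names.

<syntaxhighlight lang="python">
# Minimal inference sketch for the AllenNLP ELMo module (AllenNLP 1.x, PyTorch 1.6).
from allennlp.modules.elmo import Elmo, batch_to_ids

# Placeholder paths: substitute the option/weight files of a downloaded model.
OPTIONS_FILE = "elmo_options.json"
WEIGHT_FILE = "elmo_weights.hdf5"

# Request one output representation (a learned scalar mix of the biLM layers);
# dropout is disabled since we only run inference here.
elmo = Elmo(OPTIONS_FILE, WEIGHT_FILE, num_output_representations=1, dropout=0.0)

# Input is a batch of pre-tokenized sentences; batch_to_ids converts them to the
# padded character-id tensors expected by ELMo's character CNN.
sentences = [
    ["Pre-training", "is", "expensive", "."],
    ["ELMo", "is", "contextual", "."],
]
character_ids = batch_to_ids(sentences)

output = elmo(character_ids)
# output["elmo_representations"] is a list (one entry per requested
# representation) of tensors shaped (batch_size, max_timesteps, embedding_dim).
embeddings = output["elmo_representations"][0]
print(embeddings.shape)  # e.g. torch.Size([2, 4, 1024]) for the standard model
</syntaxhighlight>

Note that the same tokens receive different vectors in different sentences, which is the defining property of contextualized representations; for training new models, one of the two TensorFlow implementations above is needed.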