Eosc/pretraining

Background

This page provides an informal, technically oriented survey of available (and commonly used) architectures and implementations for large-scale pre-training (and fine-tuning) of contextualized neural language models.

The NLPL use case will install, validate, and maintain a selection of these implementations, in an automated and uniform manner, on multiple HPC systems.

ELMo

Embeddings from Language Models (ELMo) use bidirectional LSTM language models to produce contextualized word token representations (Peters et al. 2018, https://www.aclweb.org/anthology/N18-1202/).

Available implementations

- Reference TensorFlow implementation: https://github.com/allenai/bilm-tf. Requirements: Python >= 3.5, 1.2 < TensorFlow < 1.13 (later versions produce too many deprecation warnings), h5py. A usage sketch is given below.
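As a minimal sketch of how contextualized token vectors are obtained with the reference implementation: the repository ships a bilm Python package whose documented pattern combines a Batcher (token-to-character-id conversion), a BidirectionalLanguageModel graph, and weight_layers (a scalar mix over the biLM layers). The three file paths below are placeholders; a pre-trained vocabulary, options, and weights file must be downloaded separately.

<syntaxhighlight lang="python">
# Sketch: computing ELMo vectors with the bilm-tf reference implementation.
# Assumes the bilm package and TensorFlow 1.x are installed; the three file
# paths are placeholders for a downloaded pre-trained model.
import tensorflow as tf
from bilm import Batcher, BidirectionalLanguageModel, weight_layers

vocab_file = 'vocab.txt'        # placeholder: model vocabulary
options_file = 'options.json'   # placeholder: architecture options
weight_file = 'weights.hdf5'    # placeholder: pre-trained biLM weights

# Converts tokenized sentences into padded character-id batches
# (50 characters per token, as in the pre-trained models).
batcher = Batcher(vocab_file, 50)

# Input placeholder and the bidirectional language model graph.
character_ids = tf.placeholder('int32', shape=(None, None, 50))
bilm = BidirectionalLanguageModel(options_file, weight_file)
embeddings_op = bilm(character_ids)

# Weighted combination of the biLM layers (the "ELMo" representation).
elmo_input = weight_layers('input', embeddings_op, l2_coef=0.0)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    sentences = [['Pre-training', 'is', 'expensive', '.']]
    char_ids = batcher.batch_sentences(sentences)
    # Shape: (batch_size, max_sentence_length, embedding_dim)
    elmo_vectors = sess.run(elmo_input['weighted_op'],
                            feed_dict={character_ids: char_ids})
</syntaxhighlight>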

BERT

RoBERTa

ELECTRA