Difference between revisions of "Lumi/pilot"

From Nordic Language Processing Laboratory

Revision as of 13:45, 1 July 2022

Very Large Language Models in the Nordics (VLLMN)

In the summer of 2022, the shared LUMI supercomputer will (likely) open for trial usage of its vast GPU partition. NLPL partners in Finland (Turku and Helsinki) and Norway (Oslo) are coordinating their efforts towards the creation of very large-scale (neural) language models for multiple Nordic languages. This work is part of the Nordic Language Modeling (NorLM) initiative.

Model Architectures

  • T5
  • Ablations with BERT
  • ELECTRA
  • BERT (separate Bokmål and Nynorsk models)
  • RoBERTa
  • GPT
  • Large language models with linguistically motivated inductive biases (linked to the dScience PhD position); one example is Google's ETC.

Software Support

See the links above for each model's particular requirements.

In general, we rely on Python (>=3.8) and its SciPy stack.

We will definitely need fully functional GPU-enabled installations of PyTorch (1.11) and TensorFlow (preferably, both 1.15.5 and 2.8.2).
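To make these requirements easy to verify on a given installation, here is a minimal sketch of a version check. The names REQUIRED_SERIES, parse_version, matches_series and check_environment are our own (hypothetical), the distribution names "torch" and "tensorflow" are assumptions about how the packages are installed, and "matches" is taken to mean "same release series" (so 1.11.0 satisfies 1.11). Only TensorFlow 2.8.2 is checked here, on the assumption that 1.15.5 would live in a separate environment.

```python
import re
import sys
from importlib import metadata

# Versions named on this page; the dictionary layout is an assumption.
REQUIRED_SERIES = {"torch": "1.11", "tensorflow": "2.8.2"}
MIN_PYTHON = (3, 8)


def parse_version(text):
    """Turn '1.11.0+rocm5.1' into (1, 11, 0); local build tags are dropped."""
    parts = []
    for chunk in text.split("+")[0].split("."):
        match = re.match(r"\d+", chunk)
        if match is None:
            break  # stop at the first non-numeric component
        parts.append(int(match.group()))
    return tuple(parts)


def matches_series(installed, wanted):
    """True if the installed version starts with the wanted release series."""
    wanted_parts = parse_version(wanted)
    return parse_version(installed)[:len(wanted_parts)] == wanted_parts


def check_environment(requirements):
    """Map each requirement to True/False, or None if it is not installed."""
    report = {"python": sys.version_info[:2] >= MIN_PYTHON}
    for name, wanted in requirements.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = None
            continue
        report[name] = matches_series(installed, wanted)
    return report


if __name__ == "__main__":
    print(check_environment(REQUIRED_SERIES))
```

A value of None in the report means the package was not found under the assumed distribution name, which may simply mean it is installed under a different name.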

Multi-GPU and multi-node training must be possible. In the NVIDIA world, NCCL and Horovod are used for this. In the AMD world? No idea.
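One way to start answering the AMD question on an actual installation is to ask PyTorch which GPU stack it was built for. The sketch below is an assumption-laden probe, not a tested LUMI recipe: the function name probe_gpu_stack is hypothetical, and it relies on our understanding that ROCm builds of PyTorch reuse the torch.cuda namespace, set torch.version.hip, and provide collective communication under the "nccl" backend name. All of this should be verified on LUMI itself.

```python
def probe_gpu_stack():
    """Report which GPU stack, if any, the local PyTorch build targets."""
    try:
        import torch
        import torch.distributed as dist
    except ImportError:
        return {"torch_installed": False}

    return {
        "torch_installed": True,
        "torch_version": torch.__version__,
        # Expected to be a version string on CUDA builds, None otherwise.
        "cuda": getattr(torch.version, "cuda", None),
        # Expected to be a version string on ROCm builds, None otherwise.
        "hip": getattr(torch.version, "hip", None),
        # Number of GPUs visible to this process (0 on CPU-only nodes).
        "gpus_visible": torch.cuda.device_count(),
        # Whether the "nccl" collective backend was compiled in.
        "nccl_backend": dist.is_available() and dist.is_nccl_available(),
    }


if __name__ == "__main__":
    for key, value in probe_gpu_stack().items():
        print(f"{key}: {value}")
```

Running this once on a login node and once inside a GPU job should show whether the GPUs are visible only under the scheduler, which matters before attempting any multi-node run.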

Data: Norwegian