Difference between revisions of "Vectors/elmo/tutorial"
Revision as of 19:22, 29 September 2019
Background
ELMo is a family of contextualized word embeddings first introduced in [Peters et al. 2018].
Training ELMo on Saga
As of now, one should use Anaconda to get a working GPU-enabled TensorFlow on Saga. The tensorflow-gpu Python package is then installed locally.
After that, the code from https://github.com/akutuzov/bilm-tf can be used to train a model. More instructions to appear later.
Example SLURM file:
#!/bin/bash
#SBATCH --job-name=elmo
#SBATCH --mail-type=FAIL
#SBATCH --account=nn9447k # Use your project number
#SBATCH --partition=accel # To use the accelerator nodes
#SBATCH --gres=gpu:2 # To specify how many GPUs to use
#SBATCH --time=10:00:00 # Max walltime is 14 days.
#SBATCH --mem-per-cpu=6G
#SBATCH --ntasks=8
set -o errexit # Recommended for easier debugging
module purge # Recommended for reproducibility
module load Anaconda3/2019.03
# >>> conda initialize >>>
# !! Contents within this block are managed by 'conda init' !!
__conda_setup="$('/cluster/software/Anaconda3/2019.03/bin/conda' 'shell.bash' 'hook' 2> /dev/null)"
if [ $? -eq 0 ]; then
    eval "$__conda_setup"
else
    if [ -f "/cluster/software/Anaconda3/2019.03/etc/profile.d/conda.sh" ]; then
        . "/cluster/software/Anaconda3/2019.03/etc/profile.d/conda.sh"
    else
        export PATH="/cluster/software/Anaconda3/2019.03/bin:$PATH"
    fi
fi
unset __conda_setup
# <<< conda initialize <<<
conda activate python3.6
python3 bin/train_elmo.py --train_prefix "$DATA" --size "$SIZE" --vocab_file "$VOCAB" --save_dir "$OUT"
$DATA is a path to the directory containing any number of (possibly gzipped) plain text files: your training corpus.
$SIZE is the number of word tokens in $DATA (necessary to properly construct and log batches).
$VOCAB is a (possibly gzipped) one-word-per-line vocabulary file; it should always contain at least <S>, </S> and <UNK>.
$OUT is a directory where the TensorFlow checkpoints will be saved.
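The values for $SIZE and $VOCAB can be checked before submitting the job. The following is a minimal sketch, not part of the bilm-tf code: the helper names (open_text, count_tokens, missing_vocab_tokens) are hypothetical, files are assumed to be UTF-8, a file is treated as gzipped only if its name ends in .gz, and tokens are assumed to be whitespace-separated, which may differ from how your corpus was tokenized.

```python
import gzip
from pathlib import Path

# Special tokens the tutorial says the vocabulary must always contain.
REQUIRED_TOKENS = ("<S>", "</S>", "<UNK>")


def open_text(path):
    """Open a plain or gzipped text file for reading as UTF-8 (assumed encoding)."""
    path = Path(path)
    if path.suffix == ".gz":
        return gzip.open(path, "rt", encoding="utf-8")
    return open(path, "r", encoding="utf-8")


def count_tokens(data_dir):
    """Count whitespace-separated tokens in all regular files of data_dir."""
    total = 0
    for path in sorted(Path(data_dir).iterdir()):
        if not path.is_file():
            continue  # skip subdirectories
        with open_text(path) as handle:
            for line in handle:
                total += len(line.split())
    return total


def missing_vocab_tokens(vocab_path):
    """Return the required special tokens that are absent from the vocabulary."""
    with open_text(vocab_path) as handle:
        words = {line.strip() for line in handle}
    return [token for token in REQUIRED_TOKENS if token not in words]
```

Under these assumptions, count_tokens applied to the $DATA directory gives a candidate value for $SIZE, and missing_vocab_tokens applied to the $VOCAB file should return an empty list before training starts.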