From Nordic Language Processing Laboratory

Using the mttools module

  • Activate the NLPL software repository and load the module:
    module use -a /projappl/nlpl/software/modules/etc         # Puhti
    module use -a /cluster/shared/nlpl/software/modules/etc   # Saga
    module load nlpl-mttools/
  • Module-specific help is available by typing:
    module help nlpl-mttools/20191218
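
Once the module is loaded, it can be useful to confirm that the executables described on this page actually resolve on PATH. The following is a minimal sketch, not part of the module itself: check_tools is a hypothetical helper name, and the three executable names are the ones this page lists as being in PATH.

```shell
# Hypothetical helper (not shipped with the module): report where each
# executable resolves, or flag it as missing.
check_tools() {
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      printf '%s: %s\n' "$tool" "$(command -v "$tool")"
    else
      printf '%s: not found (is the module loaded?)\n' "$tool"
    fi
  done
}

# Executable names taken from the tool list on this page.
check_tools subword-nmt sacrebleu compare-mt
```

If any line reports "not found", the module use and module load steps above probably did not take effect in the current shell.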

The following scripts are part of this module:

  • moses-scripts
    • Tokenization, casing, corpus cleaning and evaluation scripts from Moses
    • Source: (scripts directory)
    • Installed revision: a89691f
    • The subfolders generic, recaser, tokenizer and training are in PATH
  • sacremoses
  • subword-nmt
    • Unsupervised Word Segmentation (a.k.a. Byte Pair Encoding) for Machine Translation and Text Generation
    • Source:
    • Installed version: 0.3.7
    • The subword-nmt executable is in PATH
  • sentencepiece
  • sacreBLEU
    • Reference BLEU implementation that auto-downloads test sets and reports a version string to facilitate cross-lab comparisons
    • Source:
    • Installed version: 1.4.3
    • The sacrebleu executable is in PATH
  • multeval
    • Tool to evaluate machine translation with various scores (BLEU, TER, METEOR) and to perform statistical significance testing with bootstrap resampling
    • Source:
    • Installed version: 0.5.1 with METEOR 1.5
    • The script is in PATH
  • compare-mt
    • Compare the output of multiple systems for language generation, including machine translation, summarization, and dialog response generation. Computes common evaluation scores and runs analyses to find salient differences between the systems.
    • To run METEOR, consult the module-specific help page for the exact path.
    • Source:
    • Installed version: 0.2.7
    • The compare-mt executable is in PATH
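
The tools above are often chained: Moses tokenization, BPE segmentation with subword-nmt, and BLEU scoring with sacreBLEU on detokenized output. The functions below are a rough sketch of such a chain, assuming the module is loaded so that tokenizer.perl, detokenizer.perl, subword-nmt and sacrebleu are in PATH. The function names, the 10000 merge operations and any file names are illustrative assumptions, not settings prescribed by the module.

```shell
# Merge BPE subword units back into words ("@@" is the default
# subword-nmt separator).
undo_bpe() {
  sed -E 's/(@@ )|(@@ ?$)//g'
}

# Learn BPE codes on tokenized training text.
# $1: language code, $2: raw training file, $3: output codes file
learn_codes() {
  tokenizer.perl -l "$1" < "$2" | subword-nmt learn-bpe -s 10000 > "$3"
}

# Tokenize and segment raw text read from stdin.
# $1: language code, $2: BPE codes file
preprocess() {
  tokenizer.perl -l "$1" | subword-nmt apply-bpe -c "$2"
}

# Undo segmentation and tokenization, then score against a raw reference.
# $1: language code, $2: system output with BPE, $3: raw reference file
score() {
  undo_bpe < "$2" | detokenizer.perl -l "$1" | sacrebleu "$3"
}
```

A possible call sequence, with placeholder file names, would be learn_codes en train.en codes.bpe, then preprocess en codes.bpe < test.en > test.bpe.en, and finally score en output.bpe.en test.ref.en. Scoring detokenized text against an untokenized reference lets sacreBLEU apply its own tokenization, which is what makes its reported scores comparable across labs.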

Contact: Yves Scherrer, University of Helsinki