Difference between revisions of "Translation/mttools"

From Nordic Language Processing Laboratory
(Using the mttools module)
 
Line 8: Line 8:
 
</li>
 
</li>
 
<li>Module-specific help is available by typing:
 
<li>Module-specific help is available by typing:
<pre>module help nlpl-mttools/20191218</pre>
+
<pre>module help nlpl-mttools</pre>
 
</li>
 
</li>
 
</ul>
 
</ul>
Line 18: Line 18:
 
<li>Tokenization, casing, corpus cleaning and evaluation scripts from Moses</li>
 
<li>Tokenization, casing, corpus cleaning and evaluation scripts from Moses</li>
 
<li>Source: https://github.com/moses-smt/mosesdecoder (scripts directory)</li>
 
<li>Source: https://github.com/moses-smt/mosesdecoder (scripts directory)</li>
<li>Installed revision: a89691f</li>
+
<li>Installed revision: 3990724</li>
 
<li>The subfolders <code>generic</code>, <code>recaser</code>, <code>tokenizer</code>, <code>training</code> are in PATH</li>
 
<li>The subfolders <code>generic</code>, <code>recaser</code>, <code>tokenizer</code>, <code>training</code> are in PATH</li>
 
</ul>
 
</ul>
Line 31: Line 31:
 
<li>Unsupervised Word Segmentation (a.k.a. Byte Pair Encoding) for Machine Translation and Text Generation</li>
 
<li>Unsupervised Word Segmentation (a.k.a. Byte Pair Encoding) for Machine Translation and Text Generation</li>
 
<li>Source: https://github.com/rsennrich/subword-nmt</li>
 
<li>Source: https://github.com/rsennrich/subword-nmt</li>
<li>Installed version: 0.3.7</li>
+
<li>Installed version: 0.3.8</li>
 
<li>The <code>subword-nmt</code> executable is in PATH</li>
 
<li>The <code>subword-nmt</code> executable is in PATH</li>
 
</ul>
 
</ul>
Line 38: Line 38:
 
<li>Unsupervised text tokenizer for Neural Network-based text generation</li>
 
<li>Unsupervised text tokenizer for Neural Network-based text generation</li>
 
<li>Source: https://github.com/google/sentencepiece</li>
 
<li>Source: https://github.com/google/sentencepiece</li>
<li>Installed version: 0.1.85</li>
+
<li>Installed version: 0.1.97</li>
 
<li>The <code>spm_*</code> executables are in PATH</li>
 
<li>The <code>spm_*</code> executables are in PATH</li>
 
</ul>
 
</ul>
Line 45: Line 45:
 
<li>Reference BLEU implementation that auto-downloads test sets and reports a version string to facilitate cross-lab comparisons</li>
 
<li>Reference BLEU implementation that auto-downloads test sets and reports a version string to facilitate cross-lab comparisons</li>
 
<li>Source: https://github.com/mjpost/sacreBLEU</li>
 
<li>Source: https://github.com/mjpost/sacreBLEU</li>
<li>Installed version: 1.4.3</li>
+
<li>Installed version: 2.2.1</li>
 
<li>The <code>sacrebleu</code> executable is in PATH</li>
 
<li>The <code>sacrebleu</code> executable is in PATH</li>
 
</ul>
 
</ul>
Line 60: Line 60:
 
<li>To run METEOR, consult the module-specific help page for the exact path.</li>
 
<li>To run METEOR, consult the module-specific help page for the exact path.</li>
 
<li>Source: https://github.com/neulab/compare-mt</li>
 
<li>Source: https://github.com/neulab/compare-mt</li>
<li>Installed version: 0.2.7</li>
+
<li>Installed version: 0.2.10</li>
 
<li>The compare-mt executable is in PATH</li>
 
<li>The compare-mt executable is in PATH</li>
 
</ul>
 
</ul>

Latest revision as of 11:55, 21 October 2022

Using the mttools module

  • Activate the NLPL software repository and load the module:
    module use -a /projappl/nlpl/software/modules/etc         # Puhti
    module use -a /cluster/shared/nlpl/software/modules/etc   # Saga
    module load nlpl-mttools/
  • Module-specific help is available by typing:
    module help nlpl-mttools
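
The two module use lines above differ only in the cluster-specific path. A script shared between Puhti and Saga could pick whichever directory exists, as in the minimal sketch below. The helper name pick_module_dir is hypothetical and not part of the NLPL installation; the sketch assumes that only one of the two directories is present on a given machine, and it leaves loading the module itself to the commands shown above.

```shell
#!/bin/sh
# Hypothetical helper (not part of the NLPL installation):
# print the first argument that is an existing directory, or fail.
pick_module_dir() {
    for dir in "$@"; do
        if [ -d "$dir" ]; then
            echo "$dir"
            return 0
        fi
    done
    return 1
}

# Paths taken from this page: the first is for Puhti, the second for Saga.
if nlpl_dir=$(pick_module_dir \
        /projappl/nlpl/software/modules/etc \
        /cluster/shared/nlpl/software/modules/etc); then
    # Only call `module` if the shell actually provides it.
    if command -v module >/dev/null 2>&1; then
        module use -a "$nlpl_dir"
    else
        echo "found $nlpl_dir, but the module command is not available" >&2
    fi
else
    echo "no NLPL module directory found on this machine" >&2
fi
```

After this runs on one of the two clusters, the module can be loaded and its help page consulted as described above.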

The following scripts are part of this module:

  • moses-scripts
    • Tokenization, casing, corpus cleaning and evaluation scripts from Moses
    • Source: https://github.com/moses-smt/mosesdecoder (scripts directory)
    • Installed revision: 3990724
    • The subfolders generic, recaser, tokenizer, training are in PATH
  • sacremoses
  • subword-nmt
    • Unsupervised Word Segmentation (a.k.a. Byte Pair Encoding) for Machine Translation and Text Generation
    • Source: https://github.com/rsennrich/subword-nmt
    • Installed version: 0.3.8
    • The subword-nmt executable is in PATH
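  • sentencepiece
    • Unsupervised text tokenizer for Neural Network-based text generation
    • Source: https://github.com/google/sentencepiece
    • Installed version: 0.1.97
    • The spm_* executables are in PATH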
  • sacreBLEU
    • Reference BLEU implementation that auto-downloads test sets and reports a version string to facilitate cross-lab comparisons
    • Source: https://github.com/mjpost/sacreBLEU
    • Installed version: 2.2.1
    • The sacrebleu executable is in PATH
  • multeval
    • Tool to evaluate machine translation with various scores (BLEU, TER, METEOR) and to perform statistical significance testing with bootstrap resampling
    • Source: https://github.com/jhclark/multeval
    • Installed version: 0.5.1 with METEOR 1.5
    • The multeval.sh script is in PATH
  • compare-mt
    • Compares the output of multiple language generation systems (machine translation, summarization, dialog response generation). Computes common evaluation scores and runs analyses to find salient differences between the systems.
    • To run METEOR, consult the module-specific help page for the exact path.
    • Source: https://github.com/neulab/compare-mt
    • Installed version: 0.2.10
    • The compare-mt executable is in PATH
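
Once the module is loaded, the executables listed above can be called directly. The sketch below first checks that a tool is on PATH, then learns and applies a BPE segmentation with subword-nmt and scores a system output with sacrebleu. The file names (train.txt, hyp.txt, ref.txt, bpe.codes) and the number of merge operations are placeholders chosen for illustration, and the helper have is hypothetical; each step is skipped if its tool or input file is missing.

```shell
#!/bin/sh
# Placeholder file names, assumed for illustration only.
TRAIN=train.txt   # plain-text training corpus, one sentence per line
HYP=hyp.txt       # system output, one sentence per line
REF=ref.txt       # reference translation, one sentence per line

# Hypothetical helper: succeed if the named executable is on PATH.
have() {
    command -v "$1" >/dev/null 2>&1
}

# Learn a BPE model and apply it to the training corpus.
# The value 8000 (number of merge operations) is an arbitrary example.
if have subword-nmt && [ -f "$TRAIN" ]; then
    subword-nmt learn-bpe -s 8000 < "$TRAIN" > bpe.codes
    subword-nmt apply-bpe -c bpe.codes < "$TRAIN" > train.bpe.txt
else
    echo "skipping BPE step (subword-nmt or $TRAIN not found)" >&2
fi

# Score the system output against the reference with BLEU.
if have sacrebleu && [ -f "$HYP" ] && [ -f "$REF" ]; then
    sacrebleu "$REF" -i "$HYP" -m bleu
else
    echo "skipping BLEU step (sacrebleu or input files not found)" >&2
fi
```

The sacrebleu output includes the version string mentioned above, which is worth reporting alongside the score.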


Contact: Yves Scherrer, University of Helsinki, firstname.lastname@helsinki.fi