Parsing/ud

From Nordic Language Processing Laboratory
Revision as of 10:06, 31 December 2018 by Oe (talk | contribs)

Universal Dependencies

For syntactic parsing experiments, we provide data from the Universal Dependencies (UD) project for a large number of languages. Four releases are available: v2.0 (used for the CoNLL 2017 shared task), v2.1, v2.2 (used for the CoNLL 2018 shared task), and v2.3.

All data is available on Abel at /projects/nlpl/data/parsing/ud and is automatically replicated to Taito at the corresponding path below /proj/nlpl/.
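As a minimal sketch of the path correspondence described above, the following Python helper rewrites an Abel path into its Taito counterpart. It assumes that "the corresponding path" means the same relative path with the /projects/nlpl prefix replaced by /proj/nlpl; the function name abel_to_taito is hypothetical and not part of any NLPL tooling.

```python
from pathlib import PurePosixPath

# Root directories as named on this page; the one-to-one prefix mapping
# between them is an assumption of this sketch.
ABEL_ROOT = PurePosixPath("/projects/nlpl")
TAITO_ROOT = PurePosixPath("/proj/nlpl")


def abel_to_taito(abel_path: str) -> str:
    """Map a path below /projects/nlpl (Abel) to the assumed Taito path."""
    # relative_to raises ValueError if the path is not below ABEL_ROOT.
    relative = PurePosixPath(abel_path).relative_to(ABEL_ROOT)
    return str(TAITO_ROOT / relative)


print(abel_to_taito("/projects/nlpl/data/parsing/ud/ud-treebanks-v2.3"))
# prints /proj/nlpl/data/parsing/ud/ud-treebanks-v2.3
```

Using PurePosixPath keeps the helper independent of the machine it runs on, since no file system access is needed for the rewrite.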

UD version 2.0

folders:
/projects/nlpl/data/parsing/ud/ud-treebanks-v2.0-conll2017
/projects/nlpl/data/parsing/ud/ud-test-v2.0-conll2017

info:
Version 2.0 treebanks, archived at http://hdl.handle.net/11234/1-1983.
70 treebanks, 50 languages, released March 1, 2017.
Test data 2.0 are archived at http://hdl.handle.net/11234/1-2184.
81 treebanks, 49 languages, released May 18, 2017.

For release 2.0, the test data was released separately from the training and development data, which is reflected in our folder structure. This data was released for the CoNLL 2017 shared task.

UD version 2.1

folders:
/projects/nlpl/data/parsing/ud/ud-treebanks-v2.1

info:
Version 2.1 treebanks are available at http://hdl.handle.net/11234/1-2515.
102 treebanks, 60 languages, released November 15, 2017.

UD version 2.2

folders:
/projects/nlpl/data/parsing/ud/ud-treebanks-v2.2

info:
Version 2.2 treebanks are available at http://hdl.handle.net/11234/1-2837.
122 treebanks, 71 languages, released July 1, 2018.

UD version 2.3

folders:
/projects/nlpl/data/parsing/ud/ud-treebanks-v2.3

info:
Version 2.3 treebanks are available at http://hdl.handle.net/11234/1-2895.
129 treebanks, 76 languages, released November 15, 2018.
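
To check a release folder against the treebank counts listed on this page, a short script along the following lines may help. It is a sketch under stated assumptions: each treebank is taken to be a subdirectory whose name starts with UD_ and which contains .conllu files (the usual layout of UD releases, but not verified here for the NLPL copies), and sentences in a CoNLL-U file are taken to be separated by blank lines. The function names list_treebanks and count_sentences are hypothetical.

```python
from pathlib import Path


def list_treebanks(release_dir):
    """Return {treebank directory name: sorted .conllu file names}."""
    treebanks = {}
    for entry in sorted(Path(release_dir).iterdir()):
        # Assumption: treebank directories are named UD_<Language>[-<Name>].
        if entry.is_dir() and entry.name.startswith("UD_"):
            treebanks[entry.name] = sorted(p.name for p in entry.glob("*.conllu"))
    return treebanks


def count_sentences(conllu_path):
    """Count blank-line-separated sentence blocks in a CoNLL-U file."""
    count = 0
    in_sentence = False
    with open(conllu_path, encoding="utf-8") as handle:
        for line in handle:
            if line.strip():
                in_sentence = True
            elif in_sentence:
                # A blank line closes the current sentence block.
                count += 1
                in_sentence = False
    if in_sentence:
        # The file ended without a trailing blank line.
        count += 1
    return count


if __name__ == "__main__":
    release = Path("/projects/nlpl/data/parsing/ud/ud-treebanks-v2.3")
    if release.is_dir():  # only meaningful on a system where the path exists
        print(len(list_treebanks(release)), "treebank directories found")
```

The sentence counter deliberately avoids parsing token lines, so it stays valid regardless of which CoNLL-U columns a given treebank fills in.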

Contact

Joakim Nivre, Uppsala University
Sara Stymne, Uppsala University
firstname.lastname@lingfil.uu.se