Difference between revisions of "Parsing/home"
= Background =

An experimentation environment for data-driven dependency parsing is maintained for NLPL under the coordination of Uppsala University (UU). Initially, the software and data are commissioned on the Norwegian Abel supercluster.
+ | |||
+ | = Preprocessing Tools = | ||
+ | |||
+ | * [http://wiki.nlpl.eu/index.php/Parsing/repp REPP Tokenizer (English and Norwegian)] | ||
+ | |||
Additionally, a variety of tools for sentence splitting, tokenization, lemmatization, etc. are available through the NLPL installations of the [http://nltk.org Natural Language Toolkit (NLTK)] and [https://spacy.io spaCy] libraries.
+ | |||
+ | = Parsing Systems = | ||
+ | |||
+ | * [http://wiki.nlpl.eu/index.php/Parsing/uuparser The Uppsala Parser] | ||
+ | * [http://wiki.nlpl.eu/index.php/Parsing/udpipe UDPipe] | ||
+ | * [http://wiki.nlpl.eu/index.php/Parsing/dozat Stanford Graph-Based Parser by Tim Dozat] | ||
+ | |||
= Training and Evaluation Data =

* [http://wiki.nlpl.eu/index.php/Parsing/ud Universal Dependencies v2.0–2.3]
* [http://wiki.nlpl.eu/index.php/Parsing/sdp Semantic Dependency Parsing]
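The Universal Dependencies treebanks are distributed in the tab-separated CoNLL-U format (one token per line, ten columns, blank line between sentences). A minimal sketch of reading such data, assuming a well-formed file and skipping comment lines, multiword-token ranges, and empty nodes; the sample sentence is invented:

```python
# Minimal sketch: reading token lines from CoNLL-U data as distributed
# with the Universal Dependencies treebanks. Comment lines ("#") and
# multiword-token / empty-node IDs (containing "-" or ".") are skipped.
def read_conllu(lines):
    """Yield sentences as lists of (id, form, lemma, upos, head, deprel)."""
    sentence = []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("#"):
            continue
        if not line:                      # blank line ends a sentence
            if sentence:
                yield sentence
                sentence = []
            continue
        cols = line.split("\t")           # ID FORM LEMMA UPOS XPOS FEATS HEAD DEPREL DEPS MISC
        if "-" in cols[0] or "." in cols[0]:
            continue                      # skip multiword tokens / empty nodes
        sentence.append((int(cols[0]), cols[1], cols[2], cols[3],
                         int(cols[6]), cols[7]))
    if sentence:
        yield sentence

# Invented two-word example in CoNLL-U layout:
sample = """# text = Dogs bark.
1\tDogs\tdog\tNOUN\t_\t_\t2\tnsubj\t_\t_
2\tbark\tbark\tVERB\t_\t_\t0\troot\t_\t_
3\t.\t.\tPUNCT\t_\t_\t2\tpunct\t_\t_

""".splitlines(keepends=True)

sentences = list(read_conllu(sample))
print(sentences)
```

All three parsing systems listed above consume and produce this format, so a reader like this is a common first step in any evaluation pipeline.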
Revision as of 17:42, 30 January 2019