Difference between revisions of "Parsing/home"
(→Available Data Sets) | (→Parsing Systems)
Line 3: | Line 3:
An experimentation environment for data-driven dependency parsing is maintained for NLPL under the coordination of Uppsala University (UU).
Initially, the software and data are commissioned on the Norwegian Abel supercluster.
+
+ = Preprocessing Tools =
+
+ * [http://wiki.nlpl.eu/index.php/Parsing/repp REPP Tokenizer (English and Norwegian)]
+
+ Additionally, a variety of tools for sentence splitting, tokenization, lemmatization, etc.
+ are available through the NLPL installations of the
+ [http://nltk.org Natural Language Processing Toolkit (NLTK)] and
+ [http spaCy: Natural Language Processing in Python] tools.
= Parsing Systems =
Line 8: | Line 17:
* [http://wiki.nlpl.eu/index.php/Parsing/uuparser The Uppsala Parser]
* [http://wiki.nlpl.eu/index.php/Parsing/udpipe UDPipe]
+ * [http://wiki.nlpl.eu/index.php/Parsing/dozat Stanford Graph-Based Parser by Tim Dozat]
= Training and Evaluation Data =
Revision as of 22:34, 7 January 2019
Background
An experimentation environment for data-driven dependency parsing is maintained for NLPL under the coordination of Uppsala University (UU). Initially, the software and data are commissioned on the Norwegian Abel supercluster.
Preprocessing Tools
* REPP Tokenizer (English and Norwegian)
Additionally, a variety of tools for sentence splitting, tokenization, lemmatization, etc. are available through the NLPL installations of the Natural Language Processing Toolkit (NLTK) and [http spaCy: Natural Language Processing in Python] tools.
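To illustrate what the sentence splitting and tokenization steps produce before parsing, here is a minimal sketch using only the Python standard library. It is a naive regular-expression stand-in, not the API of NLTK, spaCy, or REPP; the function names `split_sentences` and `tokenize` and the sample text are hypothetical and chosen only for this example.

```python
import re

# Naive, illustrative patterns (assumptions, not taken from any NLPL tool):
# a sentence ends at ., ! or ? followed by whitespace and a capital letter.
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+(?=[A-Z])")
# A token is a run of word characters or a single punctuation mark.
_TOKEN = re.compile(r"\w+|[^\w\s]")


def split_sentences(text):
    """Split text into sentences at the boundaries matched by _SENTENCE_END."""
    return [s for s in _SENTENCE_END.split(text.strip()) if s]


def tokenize(sentence):
    """Return the word and punctuation tokens of one sentence, in order."""
    return _TOKEN.findall(sentence)


text = "The parser reads tokens. It builds a dependency tree!"
sentences = split_sentences(text)
tokens = [tokenize(s) for s in sentences]
```

A rule this simple mishandles abbreviations such as "Dr. Smith", which is one reason to use the dedicated tools listed above for real data.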