Parsing/home

From Nordic Language Processing Laboratory
= Background =

An experimentation environment for data-driven dependency parsing is maintained for NLPL under the coordination of Uppsala University (UU). Initially, the software and data are commissioned on the Norwegian Abel supercluster.
= Preprocessing Tools =
* [http://wiki.nlpl.eu/index.php/Parsing/repp REPP Tokenizer (English and Norwegian)]

Additionally, a variety of tools for sentence splitting, tokenization, lemmatization, and more are available through the NLPL installations of the [http://nltk.org Natural Language Toolkit (NLTK)] and [https://spacy.io spaCy: Natural Language Processing in Python].
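As a quick illustration of the kind of preprocessing these installations provide, the sketch below uses spaCy alone with a blank English pipeline and its rule-based sentencizer, so no pretrained model needs to be downloaded; NLTK offers analogous functions (nltk.sent_tokenize, nltk.word_tokenize). This is a minimal sketch assuming a spaCy v3 installation, not a prescribed NLPL workflow.

```python
# Minimal preprocessing sketch with spaCy (v3 API assumed):
# rule-based tokenization and sentence splitting, no trained model.
import spacy

nlp = spacy.blank("en")       # blank pipeline: tokenizer only
nlp.add_pipe("sentencizer")   # add rule-based sentence splitting

doc = nlp("Tokenize first. Then parse the sentences.")

# Sentence splitting and word tokenization.
sentences = [sent.text for sent in doc.sents]
tokens = [tok.text for tok in doc]
```

Lemmatization and tagging additionally require a trained pipeline (for example one of spaCy's downloadable models), after which each token exposes attributes such as `tok.lemma_` and `tok.pos_`.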
= Parsing Systems =

* [http://wiki.nlpl.eu/index.php/Parsing/uuparser The Uppsala Parser]
* [http://wiki.nlpl.eu/index.php/Parsing/udpipe UDPipe]
* [http://wiki.nlpl.eu/index.php/Parsing/dozat Stanford Graph-Based Parser by Tim Dozat]
= Training and Evaluation Data =

* [http://wiki.nlpl.eu/index.php/Parsing/ud Universal Dependencies v2.0–2.3]
* [http://wiki.nlpl.eu/index.php/Parsing/sdp Semantic Dependency Parsing]

Revision as of 17:42, 30 January 2019
