Parsing/home

From Nordic Language Processing Laboratory
Revision as of 12:20, 14 January 2020

= Background =

An experimentation environment for data-driven dependency parsing is maintained for NLPL under the coordination of Uppsala University (UU). The data is available on the Norwegian Saga cluster and on the Finnish Puhti cluster. The software is available on the Norwegian Saga cluster.

Software and data were initially commissioned on the Norwegian Abel supercluster; see [http://wiki.nlpl.eu/index.php/Parsing/abel The Abel page] for legacy information.

= Preprocessing Tools =

* [http://wiki.nlpl.eu/index.php/Parsing/repp REPP Tokenizer (English and Norwegian)]
* [http://wiki.nlpl.eu/index.php/Parsing/udpipe UDPipe]

Additionally, a variety of tools for sentence splitting, tokenization, lemmatization, and related tasks are available through the NLPL installations of the Natural Language Processing Toolkit (NLTK) and spaCy.

= Parsing Systems =

* [http://wiki.nlpl.eu/index.php/Parsing/uuparser The Uppsala Parser]
* [http://wiki.nlpl.eu/index.php/Parsing/udpipe UDPipe]
* [http://wiki.nlpl.eu/index.php/Parsing/dozat Stanford Graph-Based Parser by Tim Dozat]
* [http://wiki.nlpl.eu/index.php/Parsing/turboparser TurboParser]

= Training and Evaluation Data =
