Difference between revisions of "Home"

From Nordic Language Processing Laboratory
(Activities)
(Activities)
(20 intermediate revisions by 2 users not shown)
Line 1: Line 1:
 
The Nordic Language Processing Laboratory (NLPL) is a collaboration of
 
The Nordic Language Processing Laboratory (NLPL) is a collaboration of
academic research groups in Natural Language Processing (NLP) in Northern Europe.
+
university research groups in Natural Language Processing (NLP) in Northern Europe.
 
Our vision is to implement a virtual laboratory for large-scale NLP research by
 
Our vision is to implement a virtual laboratory for large-scale NLP research by
(a) piloting innovative ways to share high-performance computing and data resources
+
(a) creating new ways to enable data- and compute-intensive Natural Language Processing research by implementing a common software, data and service stack in multiple Nordic HPC centres,
across country borders,
 
 
(b) by pooling competencies within the user community and among expert support teams,
 
(b) by pooling competencies within the user community and among expert support teams,
 
and (c) by enabling internationally competitive, data-intensive research and experimentation
 
and (c) by enabling internationally competitive, data-intensive research and experimentation
 
on a scale that would be difficult to sustain on commodity computing resources.
 
on a scale that would be difficult to sustain on commodity computing resources.
 +
 +
  
 
= Activities =
 
= Activities =
  
As part of its ‘virtual laboratory’, NLPL will prepare software and data infrastructures for
+
<br/>
 +
[[File:neic.png|center]]
 +
<br/><br/>
 +
 
 +
As part of its ‘virtual laboratory’, NLPL prepares and maintains
 +
[http://wiki.nlpl.eu/index.php/Infrastructure/software/catalogue software] and data infrastructures for
 
(A) [http://wiki.nlpl.eu/index.php/Infrastructure/home Collaboration and Software Management];
 
(A) [http://wiki.nlpl.eu/index.php/Infrastructure/home Collaboration and Software Management];
(B) http://wiki.nlpl.eu/index.php/Translation/home Statistical and Neural Machine Translation];
+
(B) [http://wiki.nlpl.eu/index.php/Translation/home Statistical and Neural Machine Translation];
(C) Data-Driven Dependency Parsing;
+
(C) [http://wiki.nlpl.eu/index.php/Parsing/home Data-Driven Dependency Parsing];
(D) Very Large Corpora;
+
(D) [http://wiki.nlpl.eu/index.php/Corpora/home Very Large Corpora];
(E) Pre-Trained Word Embeddings;
+
(E) [http://wiki.nlpl.eu/index.php/Vectors/home Pre-Trained Word Embeddings];
(F) Automated Extrinsic Evaluation;
+
(F) [http://wiki.nlpl.eu/index.php/Evaluation/home Automated Extrinsic Evaluation];
(G) Parallel Corpora and OPUS; and
+
(G) [http://wiki.nlpl.eu/index.php/Corpora/OPUS Parallel Corpora and OPUS]; and
 
(H) [http://wiki.nlpl.eu/index.php/Community/home Community Formation and Outreach].
 
(H) [http://wiki.nlpl.eu/index.php/Community/home Community Formation and Outreach].
 +
Please see the [http://wiki.nlpl.eu/index.php/Infrastructure/software/catalogue catalogue of available software]
 +
and above links for information on how to gain access to and utilize the NLPL virtual laboratory.
 +
 +
= Resources =
  
In mid-2017, NLPL is starting to make available some of its resources and services to the public:
+
Since mid-2017, NLPL has started to make available some of its resources and services to the public:
  
* [http://hdl.handle.net/11234/1-1989 90 billion tokens of ‘raw’ text] extracted from web data, covering the 45 languages in the 2017 UD Parsing Shared Task
+
* [http://hdl.handle.net/11234/1-1989 90 billion tokens of ‘raw’ text] extracted from web data, covering the 45 languages in the 2017 UD Parsing Shared Task;
* The [http://epe.nlpl.eu Extrinsic Parser Evaluation 2017] (EPE) Shared Task at the DepLing and IWPT 2017 conferences
+
* The [http://corpora.nlpl.eu/engc3/ EngC3] corpus of some [http://corpora.nlpl.eu/engc3/ 130 billion tokens of clean English text] extracted from the Common Crawl;
* A [http://vectors.nlpl.eu/repository repository of pre-trained word embeddings] on very large corpora and [http://vectors.nlpl.eu/explore on-line explorer] for various applications of these models
+
* The [http://epe.nlpl.eu Extrinsic Parser Evaluation 2017] (EPE) Shared Task at the DepLing and IWPT 2017 conferences;
* The [http://opus.nlpl.eu Open Parallel Corpus] (OPUS; now hosted under the NLPL umbrella)
+
* A [http://vectors.nlpl.eu/repository repository of pre-trained word embeddings] on very large corpora and [http://vectors.nlpl.eu/explore on-line explorer] for these models;
 +
* The [http://opus.nlpl.eu Open Parallel Corpus] (OPUS; now maintained as a dedicated service under the NLPL umbrella);
 +
* An annual [http://wiki.nlpl.eu/index.php/Community/training winter school series] on machine learning and scientific programming for NLP research.
  
 
= Partners =
 
= Partners =
  
The NLPL consortium is comprised of research groups in NLP at the
+
The NLPL consortium is comprised of Nordic research groups in NLP and
Universities of Copenhagen (Denmark), Helsinki (Finland), Oslo (Norway), Turku (Finland), and Uppsala (Sweden).
+
the national e-infrastructure providers of Finland and Norway:
 +
Helsinki University (Finland), IT University Copenhagen (Denmark),
 +
University of Copenhagen (Denmark), University of Oslo (Norway),
 +
Turku University (Finland), and Uppsala University (Sweden) are the
 +
academic partners.
  
 
Between 2017 and 2020, NLPL is supported by the [https://neic.nordforsk.org/ Nordic e-Infrastructure Collaboration]
 
Between 2017 and 2020, NLPL is supported by the [https://neic.nordforsk.org/ Nordic e-Infrastructure Collaboration]
 
(NeIC) and the national e-Infrastructure providers in Finland ([http://www.csc.fi CSC]) and Norway ([https://www.sigma2.no/ Sigma2]).
 
(NeIC) and the national e-Infrastructure providers in Finland ([http://www.csc.fi CSC]) and Norway ([https://www.sigma2.no/ Sigma2]).
 +
 +
= Associates =
 +
 +
NLPL welcomes involvement of additional research groups in Language Technology in the Nordics, including the Baltic region, to make use of the virtual laboratory. The project has established an associate program where users can get access to NLPL resources.
 +
Please email the contact address below to ask for access.
 +
As part of your initial contact, please provide an indication of the
 +
expected types of computing, software, and data to be used and the
 +
anticipated group of users (including details on affiliation).
 +
 +
As of October 2018, the following research groups are NLPL associates:
 +
 +
* [https://clasp.gu.se/ Center for Linguistic Theory and Studies of Probability] at Gothenburg University (Sweden)
 +
* [https://www.ling.su.se/english/nlp Section for Computational Linguistics] at Stockholm University (Sweden)
 +
* [https://nlp.cs.ut.ee/ Natural Language Processing Research Group] at the University of Tartu (Estonia)
  
 
= Contact =
 
= Contact =

Revision as of 08:34, 17 November 2018

The Nordic Language Processing Laboratory (NLPL) is a collaboration of university research groups in Natural Language Processing (NLP) in Northern Europe. Our vision is to implement a virtual laboratory for large-scale NLP research by (a) creating new ways to enable data- and compute-intensive NLP research through a common software, data, and service stack in multiple Nordic HPC centres, (b) pooling competencies within the user community and among expert support teams, and (c) enabling internationally competitive, data-intensive research and experimentation on a scale that would be difficult to sustain on commodity computing resources.


Activities





As part of its ‘virtual laboratory’, NLPL prepares and maintains software and data infrastructures for (A) Collaboration and Software Management; (B) Statistical and Neural Machine Translation; (C) Data-Driven Dependency Parsing; (D) Very Large Corpora; (E) Pre-Trained Word Embeddings; (F) Automated Extrinsic Evaluation; (G) Parallel Corpora and OPUS; and (H) Community Formation and Outreach. Please see the catalogue of available software and the links above for information on how to gain access to and use the NLPL virtual laboratory.

Resources

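Since mid-2017, NLPL has been making some of its resources and services available to the public:

* 90 billion tokens of ‘raw’ text extracted from web data, covering the 45 languages in the 2017 UD Parsing Shared Task;
* The EngC3 corpus of some 130 billion tokens of clean English text extracted from the Common Crawl;
* The Extrinsic Parser Evaluation 2017 (EPE) Shared Task at the DepLing and IWPT 2017 conferences;
* A repository of pre-trained word embeddings on very large corpora and an on-line explorer for these models;
* The Open Parallel Corpus (OPUS; now maintained as a dedicated service under the NLPL umbrella);
* An annual winter school series on machine learning and scientific programming for NLP research.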

Partners

The NLPL consortium comprises Nordic research groups in NLP and the national e-infrastructure providers of Finland and Norway. The academic partners are Helsinki University (Finland), IT University Copenhagen (Denmark), University of Copenhagen (Denmark), University of Oslo (Norway), Turku University (Finland), and Uppsala University (Sweden).

Between 2017 and 2020, NLPL is supported by the Nordic e-Infrastructure Collaboration (NeIC) and the national e-Infrastructure providers in Finland (CSC) and Norway (Sigma2).

Associates

NLPL welcomes involvement of additional research groups in Language Technology in the Nordics, including the Baltic region, to make use of the virtual laboratory. The project has established an associate program where users can get access to NLPL resources. Please email the contact address below to ask for access. As part of your initial contact, please provide an indication of the expected types of computing, software, and data to be used and the anticipated group of users (including details on affiliation).

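As of October 2018, the following research groups are NLPL associates:

* Center for Linguistic Theory and Studies of Probability at Gothenburg University (Sweden)
* Section for Computational Linguistics at Stockholm University (Sweden)
* Natural Language Processing Research Group at the University of Tartu (Estonia)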

Contact

To email NLPL project management and its Steering Group, please use the address contact@nlpl.eu. In mid-2017, the project welcomes expressions of interest from additional NLP research groups in Northern Europe.

For additional background and the archive of official project documents (including the work plan and Steering Group minutes), please see the NLPL page on the NeIC wiki.