The Nordic Language Processing Laboratory (NLPL) is a collaboration of academic research groups in Natural Language Processing (NLP) in Northern Europe. Our vision is to implement a virtual laboratory for large-scale NLP research by (a) piloting innovative ways to share high-performance computing and data resources across country borders, (b) pooling competencies within the user community and among expert support teams, and (c) enabling internationally competitive, data-intensive research and experimentation on a scale that would be difficult to sustain on commodity computing resources.
As part of its ‘virtual laboratory’, NLPL will prepare software and data infrastructures for (A) Collaboration and Software Management; (B) Statistical and Neural Machine Translation; (C) Data-Driven Dependency Parsing; (D) Very Large Corpora; (E) Pre-Trained Word Embeddings; (F) Automated Extrinsic Evaluation; (G) Parallel Corpora and OPUS; and (H) Community Formation and Outreach.
In mid-2017, NLPL is starting to make some of its resources and services available to the public:
- 90 billion tokens of ‘raw’ text extracted from web data, covering the 45 languages in the 2017 UD Parsing Shared Task
- The Extrinsic Parser Evaluation 2017 (EPE) Shared Task at the DepLing and IWPT 2017 conferences
- A repository of word embeddings pre-trained on very large corpora, and an on-line explorer for various applications of these models
The NLPL consortium comprises research groups in NLP at the Universities of Copenhagen (Denmark), Helsinki (Finland), Oslo (Norway), Turku (Finland), and Uppsala (Sweden).
To email NLPL project management and its Steering Group, please use the address
In mid-2017, the project welcomes expressions of interest from additional NLP research groups in Northern Europe.
For additional background and the archive of official project documents (including the work plan and Steering Group minutes), please see the NLPL page on the NeIC wiki.