Revision as of 11:36, 9 November 2017 by Matthies (Talk | contribs)


Note: This draft document is internal working material for project management. To avoid misinterpretation and confusion, it is not to be shared outside of NeIC without project management consent. Approved steering group minutes are made publicly available on the NeIC external wiki: https://wiki.neic.no/wiki/Nordic_Language_Processing_Laboratory


NLPL steering group meeting

Time: 2017-11-09

9 - 12 CET

Place: Radisson Blu Airport Hotel, OSL, Gardermoen, Norway


  • Tomasz Malkiewicz, NeIC (PO)
  • Joakim Nivre, Uppsala University
  • Jörg Tiedemann, University of Helsinki
  • Martin Matthiesen, CSC-IT Center for Science Ltd
  • Stephan Oepen, University of Oslo
  • Anders Søgaard, University of Copenhagen
  • Filip Ginter, University of Turku
  • Gunnar Bøe, UNINETT Sigma2 AS
  • Bjørn Lindi, NeIC (PM)

NLPL SG 17-35. Agenda for the SG meeting

  • 09:00 NLPL SG 17-35. Attendance and agenda (5’)
  • 09:05 NLPL SG 17-36. SG responsibilities according to the PPS model (15')
  • 09:20 NLPL SG 17-37. Status of the project - including status of NLPL task forces (30’)
  • 09:50 NLPL SG 17-38. Review of personnel situation (15’)
  • 10:05 NLPL SG 17-39. A new NLPL partner? Computer Science Department @ IT University of Copenhagen (10')
  • 10:15 NLPL SG 17-40. Status report to NeIC board (10')
  • 10:25 Coffee break (5')
  • 10:30 NLPL SG 17-41. Mid-term evaluation in 2H/3H 2018 (10')
  • 10:40 NLPL SG 17-42. Priorities in 2018 (1h10')
  • 11:50 NLPL SG 17-43. Next meetings (5’)
  • 11:55 NLPL SG 17-44. AOB (5’)

Suggestion: swap items 17-38 and 17-39.

NLPL SG 17-36. SG responsibilities according to the PPS model

SG responsibilities according to the PPS model (15'): http://www.ppsonline.se/nordforsk/en/main/skill/ah134

NLPL SG 17-37. Status of the project - including status of NLPL task forces

Milestone Milestone Description lead month 6 month 12 month 14 month 18
A1.1 Project Report Year one PM due
A2.1 Setup of collaboration infrastructure PM DONE (M2)
A2.2 Update of collaboration infrastructure PM due
A3.1 Trial environment for portable, modular installation PM due
A3.2 Survey needs and use of emerging technologies PM due
A3.3 Facilitate access to resources at Sigma2 and CSC PM due
A3.4 Cost-benefit Analysis of the laboratory PM due
B1.1 Install Moses Release 3.0 and support tools UoH DONE
B1.2 Moses Development Environment UoH due
B1.3 Moses Documentation and tutorials UoH due
B2.2 MT data sets and documentation UoH due
B3.1 Helsinki NMT system with documentation UoH due
C1.1 Dependency Parsing Data version 1 UU due
C2.1 Dependency Parsing Parsers version 1 UU due
C3.1 Dependency Parsing Parsing tutorial UU due
D1.1 Clarification of applicable licensing schemes UoT DONE (M3)
D1.2 Relevant data sets installed with license management infrastructure UoT Will not be done
E1 Pre-trained embeddings for ENG,DAN,FIN,NNO,NOB,SWE UiO DONE
F1.1 Extrinsic Evaluation Data First Batch UiO(UoC) DONE
F2.1 Extrinsic Evaluation Code for First Batch UiO(UoC) DONE
G1.1 Running OPUS Server UoH DONE
G1.2 Mirror OPUS data UoH waiting on policy?
H1 Winter School UiO due M15
H2 Web site UiO DONE (M3)
H2.2 Position paper on NoLaiDa UiO DONE (M6)

NLPL SG 17-38. Review of personnel situation

NLPL SG 17-39. A new NLPL partner? Computer Science Department @ IT University of Copenhagen

From Stephan's email (7 Nov):
To formalize this hand-over, we will need to ‘sprinkle’ person months
into our spreadsheet.  i have revised the original allocations to
* reduce CU to a total of 0.6, 1.2, and 1.2 in 2017, 2018, and 2019, respectively;
* introduce ITU at 0.4, 1.2, and 1.2 in 2017, 2018, and 2019, respectively;
* make up for the ‘loss’ of 1.4 months in 2017 by increasing UiO accordingly.

please see my current proposal here: https://goo.gl/rwqiru

NLPL SG 17-40. Status report to NeIC board

NLPL SG 17-41. Mid-term evaluation in 2H/3H 2018

NLPL SG 17-42. Priorities in 2018

We will work with the prospects and outcomes of next year in two groups:

  • Group 1: TM,OE,MM,FG
  • Group 2: JN,AS,JT,GB,BL

Do a cost-benefit analysis of the NLPL for the current year

  • What has the involvement in NLPL cost you?
  • What are the immediate benefits, if any, and what are the benefits for 2018?


  • Stephan: ~2 PMs; experience with a different system (Taito)
  • Filip: The project was bureaucracy-heavy at the beginning, but is getting better now. Plus: setting up infrastructure as an end in itself (and not as a side effect)
  • Martin: More work than expected, but a broader perspective on infrastructure issues as a benefit

Do an emerging technology analysis

  • What are the technologies you would like to see available/used in the project?
  • What are the consequences for the project if the identified technologies are incorporated in the project?

Elaborate the milestones for 2018

These are the milestones due in a year:

Milestone | Milestone description | Lead | Usefulness | Risk
A.1.1 | Project Report Year two | PM | 5 | 1
B.1.1 | Moses Development environment update | UoH | 3 (Useful as baseline, for teaching; users from Oslo using Taito; but: MT moved on to neural networks) | 1
B.2.2 | Updated MT data sets and documentation | UoH | |
B.3.2 | Helsinki NMT system updates | UoH | 5 (Important emerging technology) |
B.4.1 | Documented Helsinki NMT Baselines | UoH | 4-5 (NMT is new technology) |
B.4.2 | Documented SMT Baselines | UoH | 3-4 |
C.1.2 | Dependency Parsing version 2 | UU | 5 | 1 (Software exists.)
C.2.2 | Dependency Data version 2 | UU | 5 | 1 (Data exists.)
C.3.2 | Dependency Parsing tutorial version 2 | UU | 5 | 2 (Does not yet exist, but funds available.)
D.2.2 | Common-Crawl-derived corpora for at least five languages | UoT | 5 (Used by dozens of teams in the CoNLL Shared Task 2017; 100+ registered users) | 1
E.2 | Updated Embeddings, including additional languages | UiO | 4 (Downloadable, should be made available in Taito/Abel) |
G.3.1 | Web services and their documentation | UoH | 5 | 1 (Exists.)
H.1.2 | Winter School ’19 | UiO | 3-4 (We know more after WS 18) | 3 (Funding, Organization, Visibility, Involvement)

One way to discuss the milestones for 2018 is to identify more specific targets under each milestone, keeping the benefits and the emerging technologies in mind. Each group member then characterises each target by usefulness and risk. Use a value from the set {1,2,3,4,5} for usefulness, where 1 = not so useful and 5 = very useful. Do the same for risk, where 1 = low risk and 5 = high risk. Summarise and average the results, and use the averages to sort the possible targets by usefulness and/or risk. Discuss the list.
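The scoring procedure described above could be tallied with a few lines of Python. This is only a minimal sketch: the target names and the individual ratings are hypothetical placeholders, not values from the meeting, and the tie-breaking rule (lower average risk first) is an assumption rather than something the group agreed on.

```python
from statistics import mean

VALID_SCORES = {1, 2, 3, 4, 5}

# Hypothetical input: target -> one (usefulness, risk) pair per group member.
ratings = {
    "target A": [(5, 1), (4, 2), (5, 1)],
    "target B": [(3, 1), (3, 2), (4, 1)],
    "target C": [(4, 3), (5, 4), (3, 3)],
}

def summarise(ratings):
    """Average usefulness and risk per target and sort the targets.

    Sorting is by average usefulness (highest first); ties are broken
    by average risk (lowest first). The tie-break is an assumption.
    """
    rows = []
    for target, scores in ratings.items():
        for usefulness, risk in scores:
            if usefulness not in VALID_SCORES or risk not in VALID_SCORES:
                raise ValueError(f"scores for {target!r} must be in 1..5")
        avg_usefulness = mean(u for u, _ in scores)
        avg_risk = mean(r for _, r in scores)
        rows.append((target, avg_usefulness, avg_risk))
    return sorted(rows, key=lambda row: (-row[1], row[2]))

if __name__ == "__main__":
    for target, usefulness, risk in summarise(ratings):
        print(f"{target}: usefulness {usefulness:.2f}, risk {risk:.2f}")
```

To sort by risk instead, the sort key could be changed to `(row[2], -row[1])`; the resulting list is then the starting point for the group discussion.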

NLPL SG 17-43. Next meetings

NLPL SG 17-44. AOB
