Home/SG171109



NLPL steering group meeting

Time: 2017-11-09

9 - 12 CET

Place: Radisson Blu Airport Hotel, OSL, Gardermoen, Norway

Invited:

  • Tomasz Malkiewicz, NeIC (PO)
  • Joakim Nivre, Uppsala University
  • Jörg Tiedemann, University of Helsinki
  • Martin Matthiesen, CSC-IT Center for Science Ltd
  • Stephan Oepen, University of Oslo
  • Anders Søgaard, University of Copenhagen
  • Filip Ginter, University of Turku
  • Gunnar Bøe, UNINETT Sigma2 AS
  • Bjørn Lindi, NeIC (PM)

Present:

  • Tomasz Malkiewicz, NeIC (PO)
  • Joakim Nivre, Uppsala University
  • Jörg Tiedemann, University of Helsinki
  • Martin Matthiesen, CSC-IT Center for Science Ltd
  • Stephan Oepen, University of Oslo
  • Anders Søgaard, University of Copenhagen
  • Filip Ginter, University of Turku
  • Bjørn Lindi, NeIC (PM)

Absent:

  • Gunnar Bøe, UNINETT Sigma2 AS

NLPL SG 17-35. The agenda for the SG meeting

  • 09:00 NLPL SG 17-35. Attendance and agenda (5’)
  • 09:05 NLPL SG 17-36. SG responsibilities according (15')
  • 09:20 NLPL SG 17-37. Status of the project - including status of NLPL task forces (30’)
  • 09:50 NLPL SG 17-38. A new NLPL partner? Computer Science Department @ IT University Of Copenhagen (10')
  • 10:05 NLPL SG 17-39. Review of personnel situation (15’)
  • 10:15 NLPL SG 17-40. Status report to NeIC board (10')
  • 10:25 Coffee break (5')
  • 10:30 NLPL SG 17-41. Mid-term evaluation in 2H/3H 2018 (10')
  • 10:40 NLPL SG 17-42. Priorities in 2018 (1h10')
  • 11:50 NLPL SG 17-43. Next meetings (5’)
  • 11:55 NLPL SG 17-44. AOB (5’)

Agenda items 17-38 and 17-39 were switched.


NLPL SG 17-36. SG responsibilities according

NeIC was presented. NeIC's funding comes from:

  • 1/3 projects
  • 1/3 Nordforsk
  • 1/3 National e-Infrastructure providers

The Steering Group responsibilities according to the [1]

The project is currently in the execution phase. In six months the Steering Group needs to discuss whether to continue the project by formulating a follow-up project, or how to end it. If the project ends, the Steering Group needs to discuss how project results can be transferred; see item NLPL SG 17-41.

NLPL SG 17-37. Status of the project - including status of NLPL task forces

The table gives the status of the project. All milestones for M12 are in progress.

Milestone | Milestone description | Lead | Status (month 6 / month 12 / month 14 / month 18)
A1.1 | Project Report Year one | PM | In progress
A2.1 | Setup of collaboration infrastructure | PM | DONE (M2)
A2.2 | Update of collaboration infrastructure | PM | In progress
A3.1 | Trial environment for portable, modular installation | PM | In progress
A3.2 | Survey needs and use of emerging technologies | PM | In progress
A3.3 | Facilitate access to resources at Sigma2 and CSC | PM | In progress
A3.4 | Cost-benefit Analysis of the laboratory | PM | In progress
B1.1 | Install Moses Release 3.0 and support tools | UoH | DONE
B1.2 | Moses Development Environment | UoH | In progress
B1.3 | Moses Documentation and tutorials | UoH | In progress
B2.2 | MT data sets and documentation | UoH | In progress
B3.1 | Helsinki NMT system with documentation | UoH | In progress
C1.1 | Dependency Parsing Data version 1 | UU | In progress
C2.1 | Dependency Parsing Parses version 1 | UU | In progress
C3.1 | Dependency Parsing Parsing tutorial | UU | In progress
D1.1 | Clarification of applicable licensing schemes | UoT | DONE (M3)
D1.2 | Relevant data sets installed with license management infrastructure | UoT | Will not be done
E1 | Pre-trained embeddings for ENG, DAN, FIN, NNO, NOB, SWE | UiO | DONE
F1.1 | Extrinsic Evaluation Data First Batch | UiO (UoC) | DONE
F2.1 | Extrinsic Evaluation Code for First Batch | UiO (UoC) | DONE
G1.1 | Running OPUS Server | UoH | DONE
G1.2 | Mirror OPUS data | UoH | In progress
H1 | Winter School | UiO | due M15
H2 | Web site | UiO | DONE (M3)
H2.2 | Position paper on NoLaiDa | UiO | DONE (M6)

We have established task forces for Infrastructure and for Outreach. A few more are needed to cover all milestones. The following task forces are suggested:

Task force | Area covered | Status
Infrastructure | A Technical Infrastructure | Established
Parsing | C Dependency Parsing |
Data | D Large Corpora, E Embeddings |
Translation | B Machine Translation, G Parallel Corpora and OPUS |
Outreach | H Community Building | Established

The project will no longer use Slack, although teams are free to keep using it. Email and Google Hangouts/Zoom video meetings are the channels for communication.

NLPL SG 17-38. A new NLPL partner? Computer Science Department @ IT University Of Copenhagen

The Machine Learning Research Group at the IT University of Copenhagen is now a partner of the project.

The Person Months spreadsheet has been revised:

  • The University of Copenhagen's part is reduced to a total of 0.6, 1.2, and 1.2 in 2017, 2018, and 2019, respectively;
  • The IT University will have the following contributions (fractions of a PM): 0.4, 1.2, and 1.2 in 2017, 2018, and 2019, respectively;
  • The University of Oslo's share of work has been increased in 2017 (1.4).

NLPL SG 17-39. Review of personnel situation

The revised Person Month budget is available on the NeIC wiki: PM budget approved

The personnel list is available on the NeIC wiki: Project Personnel


NLPL SG 17-40. Status report to NeIC board

The PM gives a brief status of the project to the NeIC Board by setting "traffic lights" for the project's results goals, cost/resource goals, and time goal. The status is given at each NeIC Board meeting.

The following was reported to the NeIC Board meeting in September: NLPL Cost/Resource goal is changed from green to yellow: The project partner University of Copenhagen has reported that they currently do not have personnel who can contribute to the project. At the earliest, UoC can work on the project from next summer, one year behind schedule.

For the upcoming NeIC Board meeting in December, the PM will report: NLPL Cost/Resource goal is changed from yellow to green: The IT University (ITU), Copenhagen, Denmark has been included in the project. ITU and UoC will split the work previously agreed upon by UoC. The personnel issue is resolved.

NLPL SG 17-41. Mid-term evaluation in 2H/3H 2018

There will be a mid-term evaluation of the project by the NeIC Board in either June or September next year. The PM or PO will present the future direction of the project. The Steering Group must work beforehand on a proposal on how to proceed. The next face-to-face Steering Group meeting should be held before the mid-term evaluation. This is very important since it will impact the continuation of the project; see NLPL SG 17-43.

NLPL SG 17-42. Priorities in 2018

Do a cost-benefit analysis of the NLPL for the current year

  • What has the involvement in NLPL cost you?
  • What are the immediate benefits, if any, and what are the benefits for 2018?

Group 1:

  • Stephan: ~2 PMs, experience with a different system (Taito)
  • Filip: Project was bureaucracy-heavy at the beginning, getting better now. Plus: Setting up infrastructure as an end (and not as a side effect)
  • Martin: More work than expected, but broader perspective on infrastructure issues as a benefit

Group 2:

  • Access to Taito and Abel is a real benefit.
  • Some of the things we do in the project we would have done anyway, but there is some extra work to comply with the project.
  • It is a benefit that software/infrastructure are equal on Taito and Abel.
  • There is mixed experience with moving around from system to system, which has been the case previously.
  • OPUS has a more permanent "home": better infrastructure for the OPUS service now than previously (benefit).
  • Having baselines available from a common place is useful.
  • Increased visibility for NLP research when applying for compute time.
Do an emerging technology analysis

  • What are the technologies you would like to see available/used in the project?
      • GPUs to be investigated further.
      • Exchange of experiences, CPU/GPU comparison (FG: factor 10)
  • What are the consequences for the project if the identified technologies are incorporated in the project?
      • Enabling new technologies.

Elaborate the milestones for 2018

These are the milestones due in a year:

Milestone | Milestone description | Lead | Usefulness | Risk
A.1.1 | Project Report Year two | PM | 5 | 1
B.1.1 | Moses Development environment update | UoH | 3 (Useful as baseline, for teaching; users from Oslo using Taito; but: MT moved on to neural networks) | 1
B.2.2 | Updated MT data sets and documentation | UoH | |
B.3.2 | Helsinki NMT system updates | UoH | 5 (Important emerging technology) |
B.4.1 | Documented Helsinki NMT Baselines | UoH | 4-5 (NMT is new technology) |
B.4.2 | Documented SMT Baselines | UoH | 3-4 |
C.1.2 | Dependency Parsing version 2 | UU | 5 | 1 (Software exists.)
C.2.2 | Dependency Data version 2 | UU | 5 | 1 (Data exists.)
C.3.2 | Dependency Parsing tutorial version 2 | UU | 5 | 2 (Does not yet exist, but funds available.)
D.2.2 | Common-Crawl-derived corpora for at least five languages | UoT | 5 (Used by dozens of teams in the CoNLL Shared Task 2017; 100+ registered users) | 1
E.2 | Updated Embeddings, including additional languages | UiO | 4 (Downloadable, should be made available in Taito/Abel) |
G.3.1 | Web services and their documentation | UoH | 5 | 1 (Exists.)
H.1.2 | Winter School ’19 | UiO | 3-4 (We know more after WS 18) | 3 (Funding, Organization, Visibility, Involvement)

One way to discuss the milestones for 2018 is to identify more specific targets under each milestone, keeping the benefits and the emerging technologies in mind. Each group member then characterises each target by usefulness and risk. Use a score from the set {1,2,3,4,5} for usefulness, where 1 = not so useful and 5 = very useful. Do the same for risk, where 1 = low risk and 5 = high risk. Summarise and average the results and use them to sort the possible targets by usefulness and/or risk. Discuss the list.
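
The rating procedure described above could be sketched in a few lines of Python. This is only an illustration under stated assumptions: the target names and scores below are hypothetical placeholders, not ratings given at the meeting, and sorting by descending usefulness with ties broken by ascending risk is just one possible ordering.

```python
from statistics import mean

# Hypothetical ratings (not from the meeting): for each target, one
# (usefulness, risk) pair per group member, each on the 1-5 scale.
ratings = {
    "target_a": [(5, 1), (4, 2), (5, 1)],
    "target_b": [(3, 3), (4, 3), (3, 4)],
    "target_c": [(4, 1), (4, 1), (5, 2)],
}


def summarise(ratings):
    """Return {target: (average usefulness, average risk)}."""
    summary = {}
    for target, scores in ratings.items():
        for usefulness, risk in scores:
            # Reject scores outside the agreed set {1,2,3,4,5}.
            if usefulness not in range(1, 6) or risk not in range(1, 6):
                raise ValueError(f"score out of range for {target}")
        avg_usefulness = mean([u for u, _ in scores])
        avg_risk = mean([r for _, r in scores])
        summary[target] = (avg_usefulness, avg_risk)
    return summary


def rank(summary):
    """Sort targets by descending usefulness, then ascending risk."""
    return sorted(summary, key=lambda t: (-summary[t][0], summary[t][1]))


summary = summarise(ratings)
ranking = rank(summary)

for target in ranking:
    avg_usefulness, avg_risk = summary[target]
    print(f"{target}: usefulness {avg_usefulness:.2f}, risk {avg_risk:.2f}")
```

To sort on risk alone, the key in `rank` could be changed to `summary[t][1]`; the resulting list is then the starting point for the group discussion.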

The overall conclusion is that the current work plan has milestones with high usefulness but very low risk. Looking ahead to 2018, the project should reach all its targets for the year.

NLPL SG 17-43. Next meetings

  • Tuesday, January 30, 13:00–14:30 at Skeikampen, Norway during the Winter School '18
  • Tuesday, May 15, 9:00–12:00 at Arlanda or Uppsala

NLPL SG 17-44. AOB

No items.