Infrastructure/resources
Background
The NLPL initiative is supported by CSC and Sigma2, the Finnish and Norwegian national e-infrastructure providers, respectively.
The project has received computing allocations on the Puhti (Finland; the successor to Taito) and Saga (Norway; successor to Abel) superclusters, which can be used by all project members. Additionally, moderate ‘on-line’ storage allocations are provided on both machines (accessible as /projappl/nlpl/ on Puhti and /cluster/shared/nlpl/ on Saga), as well as a more generous storage allocation of 50 terabytes (in mid-2017) on NIRD.
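Because the community directory lives at a different path on each system, scripts that should run on both machines need to find out which one is present. The following is a minimal sketch, assuming only the two paths named above; the function name `find_nlpl_root` and the `NLPL_ROOTS` mapping are hypothetical helpers, not part of any NLPL tooling.

```python
from pathlib import Path

# Community directories as stated above; the mapping itself is illustrative.
NLPL_ROOTS = {
    "Puhti": Path("/projappl/nlpl"),
    "Saga": Path("/cluster/shared/nlpl"),
}


def find_nlpl_root(roots=NLPL_ROOTS):
    """Return (system, path) for the first community directory that exists.

    Returns None when none of the candidate directories is present,
    e.g. when running on a machine outside the two clusters.
    """
    for system, root in roots.items():
        if root.is_dir():
            return system, root
    return None


if __name__ == "__main__":
    found = find_nlpl_root()
    if found is None:
        print("No NLPL community directory found on this machine.")
    else:
        print(f"Running on {found[0]}; NLPL directory is {found[1]}")
```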
Gaining Access
For the time being, there is no synchronization of user accounts across countries. Thus, project members need to obtain accounts separately for Saga and Puhti. As of mid-2017, staff and students at project member sites (and associate partners) are welcome to use the NLPL virtual laboratory. (Sub-)allocation decisions within NLPL are made by the infrastructure task force, so please contact the task force before submitting account applications.
For Saga, there is an on-line account application form at https://www.metacenter.no/user/application/. Please request association with Notur project NN9447K and suggest a plausible end date for your association with the project. It will usually take a day or two before account activation is complete, and you will receive status updates by email and text messages. Stephan Oepen (UiO) is the Norwegian point of contact for these projects, but please direct all inquiries to the NLPL infrastructure task force.
For Puhti, there is an on-line account application form at https://sui.csc.fi/web/guest/csc-customer-registration/. Martin Matthiesen will need to approve NLPL-related account requests and manages allocation ‘projects’ (one per site). Stephan Oepen manages access rights to the community directory on Puhti (‘/projappl/nlpl/’). For all technical questions, please contact the NLPL infrastructure task force.
Resource Allocations
At present (mid-2019), there are different allocation mechanisms for the two systems.
For Abel (and Saga), allocations are made by the Norwegian Resource Allocation Committee for six-month periods, which start on April 1 and October 1 each year. NLPL received an allocation of 500,000 CPU hours for the allocation period 2016.2; of these, less than one third had been used by March 2017, and the allocation for period 2017.1 (lasting until the end of September 2017) was reduced (from the original estimates, in mutual agreement) to a fresh 500,000 hours. Towards the end of August 2017, it appears that NLPL usage on Abel has actually declined to some 50,000 hours in allocation period 2017.1; Stephan Oepen will optimistically request 200,000 hours for the next six-month period. All NLPL project members share these allocations and, over time, will need to find ways of using and distributing these resources fairly.
For Taito (and Puhti), each NLPL member site can be allocated its ‘own’ project, and CSC tends to make smaller allocations more frequently. The Steering Group has yet to develop a policy for making sure that the sum of these micro-allocations fits (fairly) within the bounds of the ‘blanket’ allocation of three million core hours per year granted to NLPL as the CSC in-kind contribution, but so far this has not been a practical concern.
Taito Usage Statistics
Id | Project | Principal Investigator | Users 2017 | Units 2017 | Users 2018 | Units 2018 | Users 2019 | Units 2019 |
---|---|---|---|---|---|---|---|---|
2000509 | Deep Learning for Natural Language Processing | Joakim Nivre | 7 | 531,099 | 10 | 609,617 | | |
2000582 | Neic-NLPL | Stephan Oepen | - | - | 3 | 62,298 | | |
2000661 | NLPL-OPUS | Jörg Tiedemann | 1 | 28,959 | 2 | 270,915 | | |
2000288 | BAULT | Jörg Tiedemann | 1 | 992 | 1 | 7,089 | | |
2000309 | CrossNLP | Jörg Tiedemann | 10 | 844,404 | 15 | 1,808,416 | | |
tuy4622 | Textual Data Mining for Bioinformation Management | Filip Ginter | 9 | 891,837 | 10 | 558,113 | | |
2000391 | TurkuNLP EDU | Filip Ginter | 1 | 142,808 | 2 | 177,932 | | |
2000989 | UCPH part of NeIC-NLPL | Anders Søgaard | - | - | 5 | 107,657 | | |
2001006 | NLPL-ITUNLP | Leon Derczynski | - | - | 7 | 345 | | |
Total | | | 26 | 2,440,099 | 47 | 3,602,382 | | |
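
The Total row can be cross-checked against the per-project figures. The sketch below re-adds the ‘Units’ columns (with digit grouping normalized) and compares them to the reported totals; the names `UNITS`, `REPORTED_TOTALS`, `check_totals`, and `project_shares` are illustrative and not part of any CSC or NLPL tool.

```python
# Per-project units copied from the Taito table above, keyed by project id.
UNITS = {
    2017: {
        "2000509": 531_099,
        "2000661": 28_959,
        "2000288": 992,
        "2000309": 844_404,
        "tuy4622": 891_837,
        "2000391": 142_808,
    },
    2018: {
        "2000509": 609_617,
        "2000582": 62_298,
        "2000661": 270_915,
        "2000288": 7_089,
        "2000309": 1_808_416,
        "tuy4622": 558_113,
        "2000391": 177_932,
        "2000989": 107_657,
        "2001006": 345,
    },
}

# Values from the Total row of the table.
REPORTED_TOTALS = {2017: 2_440_099, 2018: 3_602_382}


def check_totals(units, reported):
    """Map each year to True if the per-project units add up to the reported total."""
    return {year: sum(units[year].values()) == total
            for year, total in reported.items()}


def project_shares(units, year):
    """Return each project's fraction of the units consumed in the given year."""
    total = sum(units[year].values())
    return {project: used / total for project, used in units[year].items()}


if __name__ == "__main__":
    print(check_totals(UNITS, REPORTED_TOTALS))
    for project, share in sorted(project_shares(UNITS, 2018).items(),
                                 key=lambda item: item[1], reverse=True):
        print(f"{project}: {share:.1%}")
```

With the figures above, `check_totals(UNITS, REPORTED_TOTALS)` returns `{2017: True, 2018: True}`. The user counts are deliberately not re-added: the Total row reports fewer users than the column sums, presumably because one person can be active in several projects.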
Abel Usage Statistics
Allocations (and statistics) on Abel are organized into six-month periods (called, for example, 2018.1), which start in April and October each year. The table below shows the sum of hours for the two allocation periods in a given ‘year’ (e.g. 2016.2 and 2017.1 for the first NLPL project year), and the maximum count of active users across the two periods.
Id | Project | Principal Investigator | Users 2017 | Hours 2017 | Users 2018 | Hours 2018 | Users 2019 | Hours 2019 |
---|---|---|---|---|---|---|---|---|
NN9447K | NLPL | Stephan Oepen | 7 | 202,080 | 16 | 203,590 | 41 | 960,000 |
NN9107K | DELPH-IN21 | Stephan Oepen | 8 | 1,981,377 | 10 | 214,644 | | |
Total | | | 15 | 2,183,457 | 26 | 418,234 | 41 | 960,000 |
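
The period naming used for Abel can be made concrete with a small sketch. It assumes, as described on this page, that period YYYY.1 runs from April 1 to the end of September and YYYY.2 from October 1 to the end of the following March. The mapping from periods to table ‘years’ is inferred from the single example given (2016.2 and 2017.1 forming the year labelled 2017) and may not match how the statistics were actually compiled. The function names are hypothetical.

```python
from datetime import date


def allocation_period(day: date) -> str:
    """Return the six-month allocation period label (e.g. '2017.1') for a date."""
    if day.month >= 10:
        # October through December: second period of the same calendar year.
        return f"{day.year}.2"
    if day.month >= 4:
        # April through September: first period of the calendar year.
        return f"{day.year}.1"
    # January through March still belong to the period that began last October.
    return f"{day.year - 1}.2"


def project_year(period: str) -> int:
    """Return the table 'year' that a period is counted under (inferred rule)."""
    year, half = period.split(".")
    return int(year) + 1 if half == "2" else int(year)


if __name__ == "__main__":
    for day in (date(2017, 3, 15), date(2017, 9, 30), date(2017, 10, 1)):
        period = allocation_period(day)
        print(day.isoformat(), period, project_year(period))
```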