Difference between revisions of "Infrastructure/resources"

From Nordic Language Processing Laboratory

Latest revision as of 14:04, 16 October 2020

Background

The NLPL initiative is supported by CSC and Sigma2, the Finnish and Norwegian national e-infrastructure providers, respectively. The project has received computing allocations on the Puhti (Finland; the successor to Taito) and Saga (Norway; successor to Abel) superclusters, which can be used by all project members. Additionally, moderate ‘on-line’ storage allocations are provided on both machines (accessible as /projappl/nlpl/ on Puhti and /cluster/shared/nlpl/ on Saga), as well as a more generous storage allocation of 50 terabytes (in mid-2017) on NIRD.

Gaining Access

For the time being, there is no synchronization of user accounts across countries. Thus, project members need to obtain accounts separately for Saga and Puhti. As of mid-2017 at least, staff and students at project member sites (and associate partners) are welcome to use the NLPL virtual laboratory. (Sub-)Allocation decisions (within NLPL) are made by the infrastructure task force; thus, please contact us before submitting account applications.

For Saga, there is an on-line account application form at https://www.metacenter.no/user/application/. Please request association with Notur project NN9447K and suggest a plausible end date for your association with the project. Account activation usually takes a day or two, and you will receive status updates by email and text message. Stephan Oepen (UiO) is the Norwegian point of contact for these projects, but please direct all inquiries to the NLPL infrastructure task force.

For Puhti, there is an on-line account application form at https://sui.csc.fi/web/guest/csc-customer-registration/. Martin Matthiesen approves NLPL-related account requests and manages the allocation ‘projects’ (one per site). Stephan Oepen manages access rights to the community directory on Puhti (‘/projappl/nlpl/’). For all technical questions, please contact the NLPL infrastructure task force.

Resource Allocations

At present (mid-2019), there are different allocation mechanisms for the two systems.

For Abel (and Saga), allocations are made by the Norwegian Resource Allocation Committee for six-month periods, which start on April 1 and October 1 each year. NLPL received an allocation of 500,000 CPU hours for the allocation period 2016.2; less than one third of these had been used by March 2017, and the allocation for period 2017.1 (lasting until the end of September 2017) was reduced (from the original estimates, by mutual agreement) to a fresh 500,000 hours. Towards the end of August 2017, NLPL usage on Abel appears to have declined to some 50,000 hours in allocation period 2017.1; Stephan Oepen will optimistically request 200,000 hours for the next six-month period. All NLPL project members share these allocations and, over time, will need to find ways of using and distributing these resources fairly.
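The period naming used above can be sketched in a few lines of Python. This is only an illustration, not an official Sigma2 tool: the function names are hypothetical, and the rule that January to March belong to the ".2" period of the previous year is inferred from the statement that period 2016.2 was still running in March 2017.

```python
from datetime import date


def allocation_period(day: date) -> str:
    """Return the label of the six-month allocation period containing `day`.

    Assumed convention: YYYY.1 runs from April 1 to September 30 of year YYYY,
    and YYYY.2 runs from October 1 of YYYY to March 31 of the following year.
    """
    if 4 <= day.month <= 9:
        return f"{day.year}.1"
    if day.month >= 10:
        return f"{day.year}.2"
    # January to March still belong to the period that began the previous October.
    return f"{day.year - 1}.2"


def period_bounds(label: str) -> tuple[date, date]:
    """Return the (first day, last day) of the period with the given label."""
    year_text, half = label.split(".")
    year = int(year_text)
    if half == "1":
        return date(year, 4, 1), date(year, 9, 30)
    if half == "2":
        return date(year, 10, 1), date(year + 1, 3, 31)
    raise ValueError(f"unknown allocation period label: {label!r}")


if __name__ == "__main__":
    print(allocation_period(date(2017, 3, 15)))
    print(period_bounds("2017.1"))
```

Under these assumptions, a date in March 2017 falls into period 2016.2, and period 2017.1 ends on September 30, 2017, in line with the dates given above.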

For Taito (and Puhti), each NLPL member site can be allocated its ‘own’ project, and CSC tends to make smaller allocations more frequently. The Steering Group has yet to develop a policy for how to make sure that the sum of these micro-allocations fits (fairly) within the bounds of the ‘blanket’ allocation of three million core hours per year granted to NLPL as the CSC in-kind contribution, but so far this has not been a practical concern.

Taito Usage Statistics

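{| class="wikitable"
|-
! Id !! Project !! Principal Investigator !! Users 2017 !! Units 2017 !! Users 2018 !! Units 2018 !! Users 2019 !! Units 2019
|-
| 2000509 || Deep Learning for Natural Language Processing || Joakim Nivre || 7 || 531,099 || 10 || 609,617 || ||
|-
| 2000582 || Neic-NLPL || Stephan Oepen || - || - || 3 || 62,298 || ||
|-
| 2000661 || NLPL-OPUS || Jörg Tiedemann || 1 || 28,959 || 2 || 270,915 || ||
|-
| 2000288 || BAULT || Jörg Tiedemann || 1 || 992 || 1 || 7,089 || ||
|-
| 2000309 || CrossNLP || Jörg Tiedemann || 10 || 844,404 || 15 || 1,808,416 || ||
|-
| tuy4622 || Textual Data Mining for Bioinformation Management || Filip Ginter || 9 || 891,837 || 10 || 558,113 || ||
|-
| 2000391 || TurkuNLP EDU || Filip Ginter || 1 || 142,808 || 2 || 177,932 || ||
|-
| 2000989 || UCPH part of NeIC-NLPL || Anders Søgaard || - || - || 5 || 107,657 || ||
|-
| 2001006 || NLPL-ITUNLP || Leon Derczynski || - || - || 7 || 345 || ||
|-
| '''Total''' || || || '''26''' || '''2,440,099''' || '''47''' || '''3,602,382''' || ||
|}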
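
The unit totals in the table can be cross-checked with a short script. The sketch below is illustrative only: the variable and function names are hypothetical, entries shown as "-" are treated as zero (an assumption), and the NLPL-OPUS 2018 and CrossNLP 2017 figures are taken as 270,915 and 844,404.

```python
# (units_2017, units_2018) per Taito project id, copied from the table.
# Entries shown as "-" in the table are treated as 0 here (assumption).
TAITO_UNITS = {
    "2000509": (531_099, 609_617),
    "2000582": (0, 62_298),
    "2000661": (28_959, 270_915),
    "2000288": (992, 7_089),
    "2000309": (844_404, 1_808_416),
    "tuy4622": (891_837, 558_113),
    "2000391": (142_808, 177_932),
    "2000989": (0, 107_657),
    "2001006": (0, 345),
}

# Totals as reported in the last row of the table.
REPORTED_TOTALS = (2_440_099, 3_602_382)


def column_sums(rows):
    """Sum the 2017 and 2018 unit columns over all projects."""
    total_2017 = sum(units_2017 for units_2017, _ in rows.values())
    total_2018 = sum(units_2018 for _, units_2018 in rows.values())
    return total_2017, total_2018


if __name__ == "__main__":
    computed = column_sums(TAITO_UNITS)
    for year, got, want in zip((2017, 2018), computed, REPORTED_TOTALS):
        status = "matches" if got == want else "differs from"
        print(f"{year}: computed {got:,} {status} reported {want:,}")
```

With these readings, both unit columns add up to the reported totals. The user totals (26 and 47) are lower than the plain column sums (29 and 55), presumably because people active in several projects are counted only once; the page does not state this.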

Abel Usage Statistics

Allocations (and statistics) on Abel are organized into six-month periods (called, for example, 2018.1), which start in April and October each year. The table below shows the sum of hours for the two allocation periods in a given ‘year’ (e.g. 2016.2 and 2017.1 for the first NLPL project year), and the maximum count of active users across the two periods.

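{| class="wikitable"
|-
! Id !! Project !! Principal Investigator !! Users 2017 !! Hours 2017 !! Users 2018 !! Hours 2018 !! Users 2019 !! Hours 2019
|-
| NN9447K || NLPL || Stephan Oepen || 7 || 202,080 || 16 || 203,590 || 41 || 960,000
|-
| NN9107K || DELPH-IN21 || Stephan Oepen || 8 || 1,981,377 || 10 || 214,644 || ||
|-
| '''Total''' || || || '''15''' || '''2,183,457''' || '''26''' || '''418,234''' || '''41''' || '''960,000'''
|}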