Eosc/clouds

From Nordic Language Processing Laboratory

Latest revision as of 02:10, 1 December 2020

Background

This page gathers information on the various cloud services that are available within the EOSC-Nordic consortium. In principle, cloud utilization may be of interest to the NLPL user community, even though today all researchers are very comfortable in a batch computing paradigm organized from the command line.

Candidate use cases for cloud resources include teaching and the hosting of interactive services, such as the OPUS Corpus Interface or the NLPL Vectors Explorer.

Prior to EOSC-Nordic, parts of the NLPL infrastructure task force (Bjørn Lindi and Stephan Oepen) performed a somewhat superficial assessment of the NIRD Toolkit, which at the time was found to be difficult to adopt (in part because of unclear allocation mechanisms, in part because of authentication barriers for UiO users).


NIRD Toolkit

Judging from the demo (https://drive.google.com/drive/folders/1rsnEx4YScmyqiaqI6I73Okyd4J-mgyg3) and the website (https://apps.sigma2.no/), NIRD Toolkit is a Kubernetes-based cloud infrastructure. It seems to be used mostly to create Jupyter notebook servers with access to GPU resources. It is one of the services provided by UNINETT Sigma2 (https://www.sigma2.no/). NIRD Toolkit is available to staff and students at Norwegian universities.

In theory, this can be useful for teaching, but no clear benefits come to mind in comparison to regular use of Saga, Puhti, or other HPC machines accessed via SSH. NLP researchers tend to place much more value on low-level access to their system environment: a predefined set of Docker containers with TensorFlow or PyTorch will hardly satisfy them.

This service will probably be more useful to researchers and teachers from the humanities, who tend to be less familiar with the command line, programming, etc.

UCloud

Judging from the demo (https://drive.google.com/drive/folders/1rsnEx4YScmyqiaqI6I73Okyd4J-mgyg3) and the website (https://docs.cloud.sdu.dk/#), UCloud is a kind of web proxy to an HPC cluster.

UCloud was created at the SDU eScience Center, University of Southern Denmark. It currently provides a GUI-based interface for inexperienced HPC users throughout Denmark. UCloud is developed mostly with interactive jobs in mind (although it can be used for standard batch-like jobs as well).

Similar to NIRD Toolkit, UCloud can be very beneficial for teaching and possibly for project management (resource allocation, etc.). It does not add much to daily NLP research activities: most members of the NLPL community seem to be entirely comfortable with SSH and batch jobs via SLURM. It might be good to have a GUI-based alternative to that, just in case, but this is really something that IT support teams should decide, not NLP researchers.

STACKn

STACKn is currently in closed beta (https://stackn.ai/), and the website does not say much about this piece of software. It seems to be a commercial (but open-source) product.

It is difficult to evaluate its usefulness for the NLPL community. Most probably, the same comments apply to STACKn as to NIRD Toolkit and UCloud.