Eosc/clouds

From Nordic Language Processing Laboratory
Revision as of 02:02, 1 December 2020 by Andreku (talk | contribs) (UCLOUD)

Background

This page gathers information on the various cloud services that are available within the EOSC-Nordic consortium. In principle, cloud utilization may be of interest to the NLPL user community, even though most researchers today seem very comfortable with a batch computing paradigm driven from the command line.

Candidate use cases for cloud resources include teaching and the hosting of interactive services, for example the OPUS Corpus Interface or the NLPL Vectors Explorer.

Prior to EOSC-Nordic, parts of the NLPL infrastructure task force (Bjørn Lindi and Stephan Oepen) performed a somewhat superficial assessment of the NIRD Toolkit, which at the time was found to be difficult to adopt (in part because of unclear allocation mechanisms, in part because of authentication barriers for UiO users).


NIRD Toolkit

Judging from the demo and from the website, the NIRD Toolkit seems to be mostly used to create Jupyter notebook servers with access to GPU resources.

In theory, this can be useful for teaching, but no clear benefits come to mind in comparison to regular usage of Saga, Puhti, or other HPC machines accessed via SSH. NLP researchers tend to place a much higher value on low-level access to their system environment: a pre-defined set of Docker containers with TensorFlow or PyTorch will hardly satisfy them.

This service will probably be more useful to researchers and teachers from the humanities, who tend to be less familiar with the command line, programming, etc.

UCloud

Judging from the demo and from the website, UCloud is a kind of web proxy to an HPC cluster.

It currently provides a GUI-based interface for inexperienced HPC users in Denmark. UCloud is developed mostly with interactive jobs in mind (although it can be used for standard batch-like jobs as well).

Similar to the NIRD Toolkit, UCloud can be very beneficial for teaching and perhaps for project management (resource allocation, etc.). It does not add much to daily NLP research activities: most members of the NLPL community seem to be perfectly comfortable using SSH and batch jobs via SLURM. It might be good to have a GUI-based alternative available just in case, but this is really something that IT support teams should decide, not NLP researchers.

STACKn

STACKn is currently in closed beta, and the website says little about the software.

It is difficult to evaluate its usefulness for the NLPL community. Most probably, the same comments apply to STACKn as to the NIRD Toolkit and UCloud.