Difference between revisions of "Community/training"

From Nordic Language Processing Laboratory
(Programme)
 
(290 intermediate revisions by 4 users not shown)
 +
'''HPLT & OpenEuroLLM 2026 Winter School on Multilinguality in LLM Development and Evaluation'''
 +
 +
[[File:Winter school 2025.jpg|center|thumb|upright=2.0]]
 +
 
= Background =
 
  
A desirable side-effect of the NLPL cooperation is ''community formation'',
+
In 2026, the NLPL network and Digital Europe
i.e. strengthening interaction and collaboration among Nordic research teams
+
project ''[https://openeurollm.eu OpenEuroLLM]''
in NLP and advancing a shared level of knowledge and experience in using
+
have joined forces to organize the successful winter school series on Web-scale NLP.
national e-Infrastructures for large-scale NLP research.
+
The winter school seeks to stimulate ''community formation'',
Towards these goals, the project organizes an annual three-day winter school.
+
i.e. strengthening interaction and collaboration among
 +
European research teams in NLP and advancing a shared level of knowledge
 +
and experience in using high-performance e-infrastructures for large-scale
 +
NLP research.
 +
This 2026 edition of the winter school puts special emphasis on
 +
NLP researchers from countries that participate in the EuroHPC
 +
[https://www.lumi-supercomputer.eu/lumi-consortium/ LUMI consortium]
 +
and is offered as a doctoral training event in the European
 +
[https://www.circle-u.eu Circle U] university alliance.
 
For additional background, please see the archival pages from the
 
[http://wiki.nlpl.eu/index.php/Community/training/2018 2018] and
+
[https://wiki.nlpl.eu/index.php/Community/training/2018 2018],
[http://wiki.nlpl.eu/index.php/Community/training/2019 2019]
+
[https://wiki.nlpl.eu/index.php/Community/training/2019 2019],
NLPL Winter Schools].
+
[https://wiki.nlpl.eu/index.php/Community/training/2020 2020],
 +
[https://wiki.nlpl.eu/index.php/Community/training/2023 2023],
 +
[https://wiki.nlpl.eu/index.php/Community/training/2024 2024], and
 +
[https://wiki.nlpl.eu/index.php/Community/training/2025 2025]
 +
NLPL Winter Schools.
  
For early 2020, NLPL will hold its winter school from Sunday, February 2, to
+
For early 2026, NLPL will hold its winter school from Monday, February 2, to
Tuesday, February 4, 2020, at a
+
Wednesday, February 4, 2026, at a
 
[https://www.thonhotels.com/our-hotels/norway/skeikampen/ mountain-side hotel]
 
(with skiing opportunities) about two hours north of Oslo.
+
(with skiing and walking opportunities) about two hours north of Oslo.
 
The project will organize group bus transfer from and to the Oslo
 
airport ''Gardermoen'', leaving the airport at 9:30 on Sunday morning
+
airport ''Gardermoen'', leaving the airport at 9:45 on Monday morning
and returning there around 17:30 on Tuesday afternoon.
+
and returning there around 17:30 on Wednesday afternoon.
  
The main instructors in 2020 will be
+
The winter school is subsidized by the OpenEuroLLM project: there is no fee for
[https://u.cs.biu.ac.il/~yogo/ Yoav Goldberg],
 
[https://bplank.github.io/ Barbara Plank], and
 
[https://natschluter.github.io/ Natalie Schluter]
 
The winter school programme will be complemented with an
 
evening ‘research bazar’ (by participants) to stimulate academic socializing
 
and possibly a ‘walk-through’ of available software, data, and service resources
 
in the NLPL Virtual Laboratory.
 
 
 
The winter school is subsidized by the project: there is no fee for
 
 
participants and no charge for the bus transfer to and from the
 
 
conference hotel.
 
All participants will have to cover their own travel and accomodation
+
All participants will have to cover their own travel and accommodation
 
at Skeikampen, however.
 
Two nights at the hotel, including all meals, will come to NOK 2865,  
+
Two nights at the hotel, including all meals, will come to NOK 3855 (NOK 3455 per person in a shared double room),  
to be paid to the hotel directly.
+
to be paid to the hotel directly upon arrival.
 +
 
 +
= Programme =
 +
 
 +
The 2026 winter school will have a thematic focus on ''Multilinguality in LLM Development and Evaluation''.
 +
The programme will consist of in-depth technical presentations (possibly including some
 +
hands-on elements) by seasoned experts, with special emphasis on open science and European languages, but also include critical reflections on current development trends in LLM-focussed NLP.
 +
The programme will be complemented with a ‘walk-through’ of example experience
 +
reports on the shared EuroHPC LUMI supercomputer.
 +
 
 +
Confirmed presenters and talks include:
 +
 
 +
* [https://sites.google.com/view/alexandra-birch Alexandra Birch], University of Edinburgh<br />'''EuroLLM and FinLLM – Stories from the Trenches'''
 +
* [https://laion.ai/team/ Jenia Jitsev] and [https://laion.ai/team/ Marianna Nezhurina], Jülich Supercomputing Centre / LAION<br />'''Open Foundation Models: Scaling Laws and Generalization'''
 +
* [https://huggingface.co/guipenedo Guilherme Penedo], Hugging Face<br />'''FineWeb2: Creating a Large Multilingual Dataset for LLM Pre-Training'''
 +
* [https://scholar.google.com/citations?user=f5FSgPwAAAAJ&hl=en Gema Ramírez-Sánchez], Prompsit Language Engineering<br />'''A look at Pre-Training Data through the Stats Glass'''
 +
* [https://annargrs.github.io Anna Rogers], IT University of Copenhagen<br />'''Large Language Models and Factuality'''
 +
* [https://portizs.eu Pedro Ortiz Suarez] and [https://commoncrawl.org/team/sebastian-nagel-engineer Sebastian Nagel], Common Crawl<br />'''Data Quality, Language Coverage and Ethical Considerations in Web Crawling'''
 +
* [https://scholar.google.com.tr/citations?user=fvotcRIAAAAJ&hl=tr Ahmet Üstün], Cohere AI<br />'''Recipe for multilingual post-training: How to collect high-quality data and use them?'''
 +
 
 +
= Schedule =
 +
{| class="wikitable"
 +
|-
 +
!colspan=3|Monday, February 2, 2026
 +
|-
 +
| 13:00 || 14:00 || Lunch
 +
|-
 +
| 14:00 || 15:30 || '''Session 1'''
 +
|-
 +
| 15:30 || 15:50 || Coffee Break
 +
|-
 +
| 16:00 || 17:30 || '''Session 2'''
 +
|-
 +
| 17:30 || 17:50 || Coffee Break
 +
|-
 +
| 17:50 || 19:20 || '''Session 3'''
 +
|-
 +
| 19:30 ||  || Dinner
 +
|}
 +
 
 +
{| class="wikitable"
 +
|-
 +
!colspan=3|Tuesday, February 3, 2026
 +
|-
 +
|colspan=3 | Breakfast is available from 07:30
 +
|-
 +
| 09:00 || 10:30 || '''Session 4'''
 +
|-
 +
|colspan=3| Free time (Lunch is available between 13:00 and 14:30)
 +
|-
 +
| 15:30 || 17:00 || '''Session 5'''
 +
|-
 +
| 17:00 || 17:20 || Coffee Break
 +
|-
 +
| 17:20 || 19:20 || '''Session 6'''
 +
|-
 +
| 19:30 ||  || Dinner
 +
|-
 +
| 21:00 || || '''Evening Session: Findings from OpenEuroLLM'''
 +
|}
 +
 
 +
 
 +
{| class="wikitable"
 +
|-
 +
!colspan=3|Wednesday, February 4, 2026
 +
|-
 +
|colspan=3| Breakfast is available from 07:30
 +
|-
 +
| 08:30 || 10:00 || '''Session 8'''
 +
|-
 +
| 10:00 || 10:30 || Coffee Break
 +
|-
 +
| 10:30 || 12:00 || '''Session 9'''
 +
|-
 +
| 12:30 || 13:30 || Lunch
 +
|-
 +
| 13:45 || 16:45 || Bus transfer to OSL Airport
 +
|}
  
 
= Registration =
 
  
In total, we anticipate 25–45 participants in the 2020 Winter School.
+
In total, this year we welcome 62 participants at the 2025 winter school.
Please register your intent of participation through our
+
The winter school is [https://nettskjema.no/a/381438 over-subscribed] and no longer accepting registrations.
[https://indico.neic.no/e/skeikampen20 on-line registration form].
+
We have processed requests for participation on a first-come, first-served basis, with an eye toward regional balance.
We will process requests for participation on a first-come, first-served
+
Interested parties who submitted the registration form were confirmed in three batches, on '''December 6''', on '''December 13''',
basis; the closing date for registration is Friday, December 13, 2019.
+
and on '''December 20''', which was also the closing date for winter school registration.
Once confirmed by the organizing team, registration will establish a
 
binding agreement with the hotel and a cancellation fee will be
 
incurred (unless we can find someone else to ‘take over’ last-minute
 
spaces).
 
  
 +
Once confirmed by the organizing team, participant names are published
 +
on this page, and registration establishes a
 +
''binding agreement'' with the hotel.
 +
Therefore, a cancellation fee will be incurred (unless we can find someone else to ‘take over’ last-minute
 +
spaces), and no-shows will be charged the full price for at least one night
 +
by the hotel.
  
= Contact =
+
= Logistics =  
  
The 2020 NLPL Winter School is organized by a team of volunteers,
+
With a few exceptions, winter school participants travel to and from the conference hotel
Li-Hsin Chang,
+
jointly on a chartered bus (the HPLT shuttle).
Filip Ginter,
+
The bus will leave OSL airport no later than 9:45 CET on Monday, February 2.
Bjørn Lindi,  
+
Thus, please meet up by 9:30 and make your arrival known to your assigned
Farrokh Mehryary,
+
‘tour guide’ (who will introduce themselves to you by email beforehand).
Joakim Nivre,
+
 
Stephan Oepen, and
+
The group will gather near the DNB currency exchange booth in the downstairs
Jörg Tiedemann.
+
arrivals area, just outside the international arrivals luggage claims and slightly
 +
to the left as one exits the customs area:
 +
the yellow dot numbered (18) on the
 +
[https://avinor.no/globalassets/_oslo-lufthavn/ankomst-arrivals.pdf OSL arrivals map].
 +
The group will then walk over to the bus terminal, to leave the airport not long after 9:40.
 +
The drive to the Skeikampen conference hotel will take us about three hours, and the bus
 +
will make one stop along the way to stretch our legs and fill up on coffee.
 +
 
 +
The winter school will end with lunch on Wednesday, February 4, before the group returns
 +
to OSL airport on the HPLT shuttle.
 +
The bus will leave Skeikampen at 14:00 CET, with an expected arrival time at OSL
 +
around 17:00 to 17:30 CET. After stopping at the OSL airport, the bus will continue to central Oslo.
 +
 
 +
= Organization =
 +
 
 +
The 2026 Winter School is organized by a team of volunteers at the University
 +
of Oslo, supported by a programme committee from the HPLT and NLPL network and beyond
 +
(please see below).
 
For all inquiries regarding registration, the programme, logistics,
 
or such, please contact <code>outreach@nlpl.eu</code>.
+
or such, please contact <code>hplt-training@ifi.uio.no</code>.
 
 
= Programme =
 
  
'''Program draft:'''
+
The programme committee consists of:
[https://docs.google.com/spreadsheets/d/e/2PACX-1vSA7R--zjrxnzhrxpr6cNNzlomy3hvfTk1hedPJkmIcqxk2-ZuBGOG2Spp1YlPK9PtOOdFqwHNO3i9u/pubhtml?gid=530428440&single=true Click here]
 
  
Complete program will be announced soon!
+
* Jenia Jitsev
 +
* Andrey Kutuzov (University of Oslo, Norway)
 +
* Stephan Oepen (University of Oslo, Norway)
 +
* Sampo Pyysalo (University of Turku, Finland)
 +
* David Salinas
 +
* Gema Ramírez-Sánchez (Prompsit Language Engineering, Spain)
 +
* Joaquin
  
 
= Participants =
 
 
# Pepa Atanasova (Copenhagen)
 
# Jeremy Barnes (Oslo)
 
# Ali Basirat (Uppsala)
 
# Maja Buljan (Oslo)
 
# Li-Hsin Chang (Turku, co-organizer)
 
# Manuel Ciosici (Copenhagen)
 
# Filip Ginter (Turku, co-organizer)
 
# Yoav Goldberg (Tel Aviv, presenter)
 
# Rob van der Goot (Copenhagen)
 
# Daniel Hershcovich (Copenhagen)
 
# Mateusz Jurewicz (Copenhagen)
 
# Suwisa Kaewphan (Turku)
 
# Jenna Kanerva (Turku)
 
# Artur Kulmizev (Uppsala)
 
# Andrey Kutuzov (Oslo)
 
# Ellinor Lindqvist (Uppsala)
 
# Juhani Luotolahti (Turku)
 
# Farrokh Mehryary (Turku, co-organizer)
 
# Joakim Nivre (Uppsala, co-organizer)
 
# Stephan Oepen (Oslo, co-organizer)
 
# Barbara Plank (Copenhagen, presenter)
 
# Arradi Nur Rizal (Uppsala)
 
# Samuel Rönnqvist (Turku)
 
# Natalie Schluter (Copenhagen, presenter)
 
# Marija Stepanovic (Copenhagen)
 
# Samia Touileb (Oslo)
 
# Erik Velldal (Oslo)
 
# Daniel Varab (Copenhagen)
 
# Antti Virtanen (Turku)
 
# Dustin Wright (Copenhagen)
 

Latest revision as of 21:49, 24 September 2025

HPLT & OpenEuroLLM 2026 Winter School on Multilinguality in LLM Development and Evaluation


Background

In 2026, the NLPL network and Digital Europe project OpenEuroLLM have joined forces to organize the successful winter school series on Web-scale NLP. The winter school seeks to stimulate community formation, i.e. strengthening interaction and collaboration among European research teams in NLP and advancing a shared level of knowledge and experience in using high-performance e-infrastructures for large-scale NLP research. This 2026 edition of the winter school puts special emphasis on NLP researchers from countries that participate in the EuroHPC LUMI consortium and is offered as a doctoral training event in the European Circle U university alliance. For additional background, please see the archival pages from the 2018, 2019, 2020, 2023, 2024, and 2025 NLPL Winter Schools.

For early 2026, NLPL will hold its winter school from Monday, February 2, to Wednesday, February 4, 2026, at a mountain-side hotel (with skiing and walking opportunities) about two hours north of Oslo. The project will organize group bus transfer from and to the Oslo airport Gardermoen, leaving the airport at 9:45 on Monday morning and returning there around 17:30 on Wednesday afternoon.

The winter school is subsidized by the OpenEuroLLM project: there is no fee for participants and no charge for the bus transfer to and from the conference hotel. All participants will have to cover their own travel and accommodation at Skeikampen, however. Two nights at the hotel, including all meals, will come to NOK 3855 (NOK 3455 per person in a shared double room), to be paid to the hotel directly upon arrival.

Programme

The 2026 winter school will have a thematic focus on Multilinguality in LLM Development and Evaluation. The programme will consist of in-depth technical presentations (possibly including some hands-on elements) by seasoned experts, with special emphasis on open science and European languages, but also include critical reflections on current development trends in LLM-focussed NLP. The programme will be complemented with a ‘walk-through’ of example experience reports on the shared EuroHPC LUMI supercomputer.

Confirmed presenters and talks include:

  • Alexandra Birch, University of Edinburgh
    EuroLLM and FinLLM – Stories from the Trenches
  • Jenia Jitsev and Marianna Nezhurina, Jülich Supercomputing Centre / LAION
    Open Foundation Models: Scaling Laws and Generalization
  • Guilherme Penedo, Hugging Face
    FineWeb2: Creating a Large Multilingual Dataset for LLM Pre-Training
  • Gema Ramírez-Sánchez, Prompsit Language Engineering
    A look at Pre-Training Data through the Stats Glass
  • Anna Rogers, IT University of Copenhagen
    Large Language Models and Factuality
  • Pedro Ortiz Suarez and Sebastian Nagel, Common Crawl
    Data Quality, Language Coverage and Ethical Considerations in Web Crawling
  • Ahmet Üstün, Cohere AI
    Recipe for multilingual post-training: How to collect high-quality data and use them?

Schedule

Monday, February 2, 2026
13:00-14:00  Lunch
14:00-15:30  Session 1
15:30-15:50  Coffee Break
16:00-17:30  Session 2
17:30-17:50  Coffee Break
17:50-19:20  Session 3
19:30        Dinner

Tuesday, February 3, 2026
Breakfast is available from 07:30
09:00-10:30  Session 4
Free time (Lunch is available between 13:00 and 14:30)
15:30-17:00  Session 5
17:00-17:20  Coffee Break
17:20-19:20  Session 6
19:30        Dinner
21:00        Evening Session: Findings from OpenEuroLLM

Wednesday, February 4, 2026
Breakfast is available from 07:30
08:30-10:00  Session 8
10:00-10:30  Coffee Break
10:30-12:00  Session 9
12:30-13:30  Lunch
13:45-16:45  Bus transfer to OSL Airport

Registration

In total, this year we welcome 62 participants at the 2025 winter school. The winter school is over-subscribed and no longer accepting registrations. We have processed requests for participation on a first-come, first-served basis, with an eye toward regional balance. Interested parties who submitted the registration form were confirmed in three batches, on December 6, on December 13, and on December 20, which was also the closing date for winter school registration.

Once confirmed by the organizing team, participant names are published on this page, and registration establishes a binding agreement with the hotel. Therefore, a cancellation fee will be incurred (unless we can find someone else to ‘take over’ last-minute spaces), and no-shows will be charged the full price for at least one night by the hotel.

Logistics

With a few exceptions, winter school participants travel to and from the conference hotel jointly on a chartered bus (the HPLT shuttle). The bus will leave OSL airport no later than 9:45 CET on Monday, February 2. Thus, please meet up by 9:30 and make your arrival known to your assigned ‘tour guide’ (who will introduce themselves to you by email beforehand).

The group will gather near the DNB currency exchange booth in the downstairs arrivals area, just outside the international arrivals luggage claims and slightly to the left as one exits the customs area: the yellow dot numbered (18) on the OSL arrivals map. The group will then walk over to the bus terminal, to leave the airport not long after 9:40. The drive to the Skeikampen conference hotel will take us about three hours, and the bus will make one stop along the way to stretch our legs and fill up on coffee.

The winter school will end with lunch on Wednesday, February 4, before the group returns to OSL airport on the HPLT shuttle. The bus will leave Skeikampen at 14:00 CET, with an expected arrival time at OSL around 17:00 to 17:30 CET. After stopping at the OSL airport, the bus will continue to central Oslo.

Organization

The 2026 Winter School is organized by a team of volunteers at the University of Oslo, supported by a programme committee from the HPLT and NLPL network and beyond (please see below). For all inquiries regarding registration, the programme, logistics, or such, please contact hplt-training@ifi.uio.no.

The programme committee consists of:

  • Jenia Jitsev
  • Andrey Kutuzov (University of Oslo, Norway)
  • Stephan Oepen (University of Oslo, Norway)
  • Sampo Pyysalo (University of Turku, Finland)
  • David Salinas
  • Gema Ramírez-Sánchez (Prompsit Language Engineering, Spain)
  • Joaquin

Participants