Difference between revisions of "Community/training"
(→Schedule) |
|||
(11 intermediate revisions by 2 users not shown) | |||
Line 1: | Line 1: | ||
− | '''HPLT & | + | '''HPLT & OpenEuroLLM 2026 Winter School on Multilinguality in LLM Development and Evaluation''' |
− | [[File: | + | [[File:Winter school 2025.jpg|center|thumb|upright=2.0]] |
= Background = | = Background = | ||
− | + | In 2026, the NLPL network and Digital Europe | |
− | project ''[https:// | + | project ''[https://openeurollm.eu OpenEuroLLM]'' |
have joined forces to organize the successful winter school series on Web-scale NLP. | have joined forces to organize the successful winter school series on Web-scale NLP. | ||
The winter school seeks to stimulate ''community formation'', | The winter school seeks to stimulate ''community formation'', | ||
Line 13: | Line 13: | ||
and experience in using high-performance e-infrastructures for large-scale | and experience in using high-performance e-infrastructures for large-scale | ||
NLP research. | NLP research. | ||
− | This | + | This 2026 edition of the winter school puts special emphasis on |
NLP researchers from countries who participate in the EuroHPC | NLP researchers from countries who participate in the EuroHPC | ||
− | [https://www.lumi-supercomputer.eu/lumi-consortium/ LUMI consortium]. | + | [https://www.lumi-supercomputer.eu/lumi-consortium/ LUMI consortium] |
+ | and is offered as a doctoral training event in the European | ||
+ | [https://www.circle-u.eu Circle U] university alliance. | ||
For additional background, please see the archival pages from the | For additional background, please see the archival pages from the | ||
[https://wiki.nlpl.eu/index.php/Community/training/2018 2018], | [https://wiki.nlpl.eu/index.php/Community/training/2018 2018], | ||
[https://wiki.nlpl.eu/index.php/Community/training/2019 2019], | [https://wiki.nlpl.eu/index.php/Community/training/2019 2019], | ||
[https://wiki.nlpl.eu/index.php/Community/training/2020 2020], | [https://wiki.nlpl.eu/index.php/Community/training/2020 2020], | ||
− | [https://wiki.nlpl.eu/index.php/Community/training/2023 2023], | + | [https://wiki.nlpl.eu/index.php/Community/training/2023 2023], |
− | [https://wiki.nlpl.eu/index.php/Community/training/2024 2024] | + | [https://wiki.nlpl.eu/index.php/Community/training/2024 2024], and |
+ | [https://wiki.nlpl.eu/index.php/Community/training/2025 2025] | ||
NLPL Winter Schools. | NLPL Winter Schools. | ||
− | For early | + | For early 2026, NLPL will hold its winter school from Monday, February 2, to |
− | Wednesday, February | + | Wednesday, February 4, 2026, at a |
[https://www.thonhotels.com/our-hotels/norway/skeikampen/ mountain-side hotel] | [https://www.thonhotels.com/our-hotels/norway/skeikampen/ mountain-side hotel] | ||
(with skiing and walking opportunities) about two hours north of Oslo. | (with skiing and walking opportunities) about two hours north of Oslo. | ||
Line 32: | Line 35: | ||
and returning there around 17:30 on Wednesday afternoon. | and returning there around 17:30 on Wednesday afternoon. | ||
− | The winter school is subsidized by the | + | The winter school is subsidized by the OpenEuroLLM project: there is no fee for |
participants and no charge for the bus transfer to and from the | participants and no charge for the bus transfer to and from the | ||
conference hotel. | conference hotel. | ||
Line 42: | Line 45: | ||
= Programme = | = Programme = | ||
− | The | + | The 2026 winter school will have a thematic focus on ''Multilinguality in LLM Development and Evaluation''. |
The programme will be comprised of in-depth technical presentations (possibly including some | The programme will be comprised of in-depth technical presentations (possibly including some | ||
− | hands-on elements) by seasoned experts, with special emphasis on open science and European languages, | + | hands-on elements) by seasoned experts, with special emphasis on open science and European languages, but also include critical reflections on current development trends in LLM-focussed NLP. |
− | but also include critical reflections on current development trends in LLM-focussed NLP. | ||
The programme will be complemented with a ‘walk-through’ of example experience | The programme will be complemented with a ‘walk-through’ of example experience | ||
reports on the shared EuroHPC LUMI supercomputer. | reports on the shared EuroHPC LUMI supercomputer. | ||
Line 51: | Line 53: | ||
Confirmed presenters and talks include: | Confirmed presenters and talks include: | ||
− | * [https://sites.google.com/view/alexandra-birch Alexandra Birch], University of Edinburgh</br>'''EuroLLM and FinLLM – | + | * [https://sites.google.com/view/alexandra-birch Alexandra Birch], University of Edinburgh</br>'''EuroLLM and FinLLM – Stories from the Trenches''' |
* [https://laion.ai/team/ Jenia Jitsev] and [https://laion.ai/team/ Marianna Nezhurina], Jülich Supercomputing Centre / LAION</br>'''Open Foundation Models: Scaling Laws and Generalization''' | * [https://laion.ai/team/ Jenia Jitsev] and [https://laion.ai/team/ Marianna Nezhurina], Jülich Supercomputing Centre / LAION</br>'''Open Foundation Models: Scaling Laws and Generalization''' | ||
* [https://huggingface.co/guipenedo Guilherme Penedo], Huggingface</br>'''FineWeb2: Creating a Large Multilingual Dataset for LLM Pre-Training''' | * [https://huggingface.co/guipenedo Guilherme Penedo], Huggingface</br>'''FineWeb2: Creating a Large Multilingual Dataset for LLM Pre-Training''' | ||
Line 62: | Line 64: | ||
{| class="wikitable" | {| class="wikitable" | ||
|- | |- | ||
− | !colspan=3|Monday, February | + | !colspan=3|Monday, February 2, 2025 |
|- | |- | ||
| 13:00 || 14:00 || Lunch | | 13:00 || 14:00 || Lunch | ||
|- | |- | ||
− | | 14:00 || 15:30 || '''Session 1''' | + | | 14:00 || 15:30 || '''Session 1''' |
− | |||
− | |||
|- | |- | ||
| 15:30 || 15:50 || Coffee Break | | 15:30 || 15:50 || Coffee Break | ||
|- | |- | ||
− | | 16:00 || 17:30 || '''Session 2''' | + | | 16:00 || 17:30 || '''Session 2''' |
− | |||
− | |||
|- | |- | ||
| 17:30 || 17:50 || Coffee Break | | 17:30 || 17:50 || Coffee Break | ||
|- | |- | ||
− | | 17:50 || 19:20 || '''Session 3''' | + | | 17:50 || 19:20 || '''Session 3''' |
− | |||
− | |||
|- | |- | ||
| 19:30 || || Dinner | | 19:30 || || Dinner | ||
Line 87: | Line 83: | ||
{| class="wikitable" | {| class="wikitable" | ||
|- | |- | ||
− | !colspan=3|Tuesday, February | + | !colspan=3|Tuesday, February 3, 2025 |
|- | |- | ||
|colspan=3 | Breakfast is available from 07:30 | |colspan=3 | Breakfast is available from 07:30 | ||
|- | |- | ||
− | | 09:00 || 10:30 || '''Session 4''' | + | | 09:00 || 10:30 || '''Session 4''' |
− | |||
|- | |- | ||
|colspan=3| Free time (Lunch is available between 13:00 and 14:30) | |colspan=3| Free time (Lunch is available between 13:00 and 14:30) | ||
|- | |- | ||
− | | 15:30 || 17:00 || '''Session 5''' | + | | 15:30 || 17:00 || '''Session 5''' |
− | |||
|- | |- | ||
| 17:00 || 17:20 || Coffee Break | | 17:00 || 17:20 || Coffee Break | ||
|- | |- | ||
− | | 17:20 || 19:20 || '''Session 6''' | + | | 17:20 || 19:20 || '''Session 6''' |
|- | |- | ||
| 19:30 || || Dinner | | 19:30 || || Dinner | ||
|- | |- | ||
− | | 21:00 || || '''Evening Session: Findings from | + | | 21:00 || || '''Evening Session: Findings from OpenEuroLLM''' |
|} | |} | ||
Line 111: | Line 105: | ||
{| class="wikitable" | {| class="wikitable" | ||
|- | |- | ||
− | !colspan=3|Wednesday, February | + | !colspan=3|Wednesday, February 4, 2025 |
|- | |- | ||
|colspan=3| Breakfast is available from 07:30 | |colspan=3| Breakfast is available from 07:30 | ||
|- | |- | ||
− | | 08:30 || 10:00 || '''Session 8''' | + | | 08:30 || 10:00 || '''Session 8''' |
|- | |- | ||
| 10:00 || 10:30 || Coffee Break | | 10:00 || 10:30 || Coffee Break | ||
|- | |- | ||
− | | 10:30 || 12:00 || '''Session 9''' | + | | 10:30 || 12:00 || '''Session 9''' |
− | |||
|- | |- | ||
| 12:30 || 13:30 || Lunch | | 12:30 || 13:30 || Lunch | ||
Line 140: | Line 133: | ||
Therefore, a cancellation fee will be incurred (unless we can find someone else to ‘take over’ last-minute | Therefore, a cancellation fee will be incurred (unless we can find someone else to ‘take over’ last-minute | ||
spaces), and no-shows will be charged the full price for at least one night | spaces), and no-shows will be charged the full price for at least one night | ||
− | by the hotel. | + | by the hotel. [https://sites.google.com/view/sogstikollen-24f <span style="colour: white;"></span>] |
= Logistics = | = Logistics = | ||
Line 174: | Line 167: | ||
The programme committee is comprised of: | The programme committee is comprised of: | ||
− | * | + | * Jenia Jitsev |
* Andrey Kutuzov (University of Oslo, Norway) | * Andrey Kutuzov (University of Oslo, Norway) | ||
* Stephan Oepen (University of Oslo, Norway) | * Stephan Oepen (University of Oslo, Norway) | ||
* Sampo Pyysalo (University of Turku, Finland) | * Sampo Pyysalo (University of Turku, Finland) | ||
− | * | + | * David Salinas |
+ | * Gema Ramirez-Sanches (Prompsit Language Engineering, Spain) | ||
+ | * Joaquin | ||
= Participants = | = Participants = | ||
Latest revision as of 21:49, 24 September 2025
HPLT & OpenEuroLLM 2026 Winter School on Multilinguality in LLM Development and Evaluation
Background
In 2026, the NLPL network and Digital Europe project OpenEuroLLM have joined forces to organize the successful winter school series on Web-scale NLP. The winter school seeks to stimulate community formation, i.e. strengthening interaction and collaboration among European research teams in NLP and advancing a shared level of knowledge and experience in using high-performance e-infrastructures for large-scale NLP research. This 2026 edition of the winter school puts special emphasis on NLP researchers from countries who participate in the EuroHPC LUMI consortium and is offered as a doctoral training event in the European Circle U university alliance. For additional background, please see the archival pages from the 2018, 2019, 2020, 2023, 2024, and 2025 NLPL Winter Schools.
For early 2026, NLPL will hold its winter school from Monday, February 2, to Wednesday, February 4, 2026, at a mountain-side hotel (with skiing and walking opportunities) about two hours north of Oslo. The project will organize group bus transfer from and to the Oslo airport Gardermoen, leaving the airport at 9:45 on Monday morning and returning there around 17:30 on Wednesday afternoon.
The winter school is subsidized by the OpenEuroLLM project: there is no fee for participants and no charge for the bus transfer to and from the conference hotel. All participants will have to cover their own travel and accommodation at Skeikampen, however. Two nights at the hotel, including all meals, will come to NOK 3855 (NOK 3455 per person in a shared double room), to be paid to the hotel directly upon arrival.
Programme
The 2026 winter school will have a thematic focus on Multilinguality in LLM Development and Evaluation. The programme will comprise in-depth technical presentations (possibly including some hands-on elements) by seasoned experts, with special emphasis on open science and European languages, and will also include critical reflections on current development trends in LLM-focussed NLP. The programme will be complemented with a ‘walk-through’ of example experience reports on the shared EuroHPC LUMI supercomputer.
Confirmed presenters and talks include:
- Alexandra Birch, University of Edinburgh
  EuroLLM and FinLLM – Stories from the Trenches
- Jenia Jitsev and Marianna Nezhurina, Jülich Supercomputing Centre / LAION
  Open Foundation Models: Scaling Laws and Generalization
- Guilherme Penedo, Huggingface
  FineWeb2: Creating a Large Multilingual Dataset for LLM Pre-Training
- Gema Ramírez-Sánchez, Prompsit Language Engineering
  A look at Pre-Training Data through the Stats Glass
- Anna Rogers, IT University of Copenhagen
  Large Language Models and Factuality
- Pedro Ortiz Suarez and Sebastian Nagel, Common Crawl
  Data Quality, Language Coverage and Ethical Considerations in Web Crawling
- Ahmet Üstün, Cohere AI
  Recipe for multilingual post-training: How to collect high-quality data and use them?
Schedule
Monday, February 2, 2026 | | |
---|---|---|
13:00 | 14:00 | Lunch |
14:00 | 15:30 | Session 1 |
15:30 | 15:50 | Coffee Break |
16:00 | 17:30 | Session 2 |
17:30 | 17:50 | Coffee Break |
17:50 | 19:20 | Session 3 |
19:30 | | Dinner |

Tuesday, February 3, 2026 | | |
---|---|---|
Breakfast is available from 07:30 | | |
09:00 | 10:30 | Session 4 |
Free time (Lunch is available between 13:00 and 14:30) | | |
15:30 | 17:00 | Session 5 |
17:00 | 17:20 | Coffee Break |
17:20 | 19:20 | Session 6 |
19:30 | | Dinner |
21:00 | | Evening Session: Findings from OpenEuroLLM |

Wednesday, February 4, 2026 | | |
---|---|---|
Breakfast is available from 07:30 | | |
08:30 | 10:00 | Session 8 |
10:00 | 10:30 | Coffee Break |
10:30 | 12:00 | Session 9 |
12:30 | 13:30 | Lunch |
13:45 | 16:45 | Bus transfer to OSL Airport |
Registration
In total, this year we welcome 62 participants at the 2025 winter school. The winter school is over-subscribed and no longer accepting registrations. We have processed requests for participation on a first-come, first-served basis, with an eye toward regional balance. Interested parties who had submitted the registration form have been confirmed in three batches, on December 6, on December 13, and on December 20, which was also the closing date for winter school registration.
Once confirmed by the organizing team, participant names are published on this page, and registration establishes a binding agreement with the hotel. Therefore, a cancellation fee will be incurred (unless we can find someone else to ‘take over’ last-minute spaces), and no-shows will be charged the full price for at least one night by the hotel.
Logistics
With a few exceptions, winter school participants travel to and from the conference hotel jointly on a chartered bus (the HPLT shuttle). The bus will leave OSL airport no later than 9:45 CET on Monday, February 2. Thus, please meet up by 9:30 and make your arrival known to your assigned ‘tour guide’ (who will introduce themselves to you by email beforehand).
The group will gather near the DNB currency exchange booth in the downstairs arrivals area, just outside the international arrivals luggage claims and slightly to the left as one exits the customs area: the yellow dot numbered (18) on the OSL arrivals map. The group will then walk over to the bus terminal, to leave the airport not long after 9:40. The drive to the Skeikampen conference hotel will take us about three hours, and the bus will make one stop along the way to stretch our legs and fill up on coffee.
The winter school will end with lunch on Wednesday, February 4, before the group returns to OSL airport on the HPLT shuttle. The bus will leave Skeikampen at 14:00 CET, with an expected arrival time at OSL around 17:00 to 17:30 CET. After stopping at the OSL airport, the bus will continue to central Oslo.
Organization
The 2025 Winter School is organized by a team of volunteers at the University of Oslo, supported by a programme committee from the HPLT and NLPL network and beyond (please see below).
For all inquiries regarding registration, the programme, logistics, or such, please contact hplt-training@ifi.uio.no.
The programme committee is comprised of:
- Jenia Jitsev
- Andrey Kutuzov (University of Oslo, Norway)
- Stephan Oepen (University of Oslo, Norway)
- Sampo Pyysalo (University of Turku, Finland)
- David Salinas
- Gema Ramírez-Sánchez (Prompsit Language Engineering, Spain)
- Joaquin