Latest revision as of 15:27, 1 February 2026
Circle U, NLPL, & OpenEuroLLM 2026 Winter School on Multilinguality in LLM Development and Evaluation
Background
In 2026, the NLPL network and the Digital Europe project OpenEuroLLM have joined forces to organize the successful winter school series on Web-scale NLP. The winter school seeks to stimulate community formation, i.e. to strengthen interaction and collaboration among European research teams in NLP and to advance a shared level of knowledge and experience in using high-performance e-infrastructures for large-scale NLP research. This 2026 edition of the winter school puts special emphasis on NLP researchers from countries that participate in the EuroHPC consortium and is endorsed as a doctoral training event in the European Circle U university alliance. For additional background, please see the archival pages from the 2018, 2019, 2020, 2023, 2024, and 2025 NLPL Winter Schools.
For early 2026, NLPL will hold its winter school from Monday, February 2, to Wednesday, February 4, 2026, at a mountain-side hotel (with skiing and walking opportunities) about two hours north of Oslo. The project will organize group bus transfer from and to the main Oslo airport Gardermoen (OSL), leaving the airport at 9:45 on Monday morning and returning there around 17:30 on Wednesday afternoon.
The winter school is subsidized by the OpenEuroLLM project: there is no fee for participants and no charge for the bus transfer to and from the conference hotel. All participants will have to cover their own travel and accommodation at Skeikampen, however. Two nights at the hotel, including all meals, will come to NOK 3885 (NOK 3485 per person in a shared double room), to be paid to the hotel directly upon arrival.
Programme
The 2026 winter school has a thematic focus on Multilinguality in LLM Development and Evaluation. The programme consists of in-depth technical presentations (possibly including some hands-on elements) by international experts, with special emphasis on open science and European languages, and also includes critical reflections on current development trends in LLM-focused NLP. The programme will be complemented with a ‘walk-through’ of example EuroHPC experience reports from the OpenEuroLLM consortium and with reflections about current LLM-oriented activities of the National Library of Norway.
Confirmed presenters and talks include:
- Barbara Plank, Ludwig Maximilian University of Munich
- Laurie Burchell and Pedro Ortiz Suarez, Common Crawl
- Max Idahl, ellamind
- Julia Kreutzer, Cohere Labs
- David Salinas, ELLIS Institute Tübingen
- François Yvon, Sorbonne Université
Schedule
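| Monday, February 2, 2026 | | |
|---|---|---|
| 13:00 | 14:00 | Lunch |
| 14:00 | 15:30 | Session 1, Laurie Burchell and Pedro Ortiz Suarez: Multilinguality at Common Crawl: improving language coverage for the largest open web corpus<br>The Common Crawl Foundation (CCF) provides the largest open corpus of web data, enabling a wide range of scientific and technical applications including large language model (LLM) development. However, our current data processing pipeline faces challenges when processing multilingual data, decreasing language representation and impacting downstream model performance. In this talk, we will discuss CCF’s initiatives to improve multilingual coverage and language identification of our web corpus. These efforts include soliciting crowd-sourced web seeds for under-served languages, running the First Workshop for Multilingual Data Quality Signals at COLM 2025, and creating CommonLID, a community-driven, human-annotated language identification benchmark for the web domain. Throughout, we emphasise the collaborative nature of our efforts, working in partnership with members of the NLP community to improve content available in their languages. |
| 15:30 | 15:50 | Coffee Break |
| 16:00 | 17:30 | Session 2, François Yvon: Evaluating Multilingual Models<br>Large Language Models introduced in recent years have proven extremely helpful in advancing the state of the art in many natural language applications, notably due to their ability to compute numerical, high-dimensional representations of linguistic units such as words or sentences. Multilingual language models go one step further and add the ability to handle multiple languages, sometimes even multiple scripts, with a single model. In this presentation, I will discuss multilingual language models at length, with a focus on the evaluation of their multilingual abilities, which raises two difficult questions: (a) how to evaluate their performance as if they were just a collection of monolingual models; (b) how to evaluate their performance as integrated multilingual models, capable of bridging between languages. |
| 17:30 | 17:50 | Coffee Break |
| 17:50 | 19:20 | Session 3, Julia Kreutzer: Evaluating Generations Multilingually: Current challenges and Lessons from Machine Translation<br>In this session we will dive into the particular challenge of evaluating LLMs across many languages in generative tasks. We will take a look at the "sister field" of machine translation and inspect what principles have led to advances in understanding quality across languages. |
| 19:30 | | Dinner |

| Tuesday, February 3, 2026 | | |
|---|---|---|
| Breakfast is available from 07:30 | | |
| 09:00 | 10:30 | Session 4, François Yvon: Text Generation: Know your Options!<br>Text generation, contextual or non-contextual, is ubiquitous in the current LLM era, as it serves as the most basic building block in multiple application contexts, from question answering and dialog systems to text summarization and machine translation, and many more. Generation is thus equally useful to compute deterministic and highly non-deterministic mappings with various levels of output constraints. Furthermore, text generation is also used as a sub-routine of more complex generation strategies, aiming to produce syntactically well-formed (e.g., for code generation) or semantically consistent outputs, possibly through multiple steps of generation (e.g., in chain-of-thought generation), or to collect diverse samples from the generating distribution. To cover this considerable diversity of uses, multiple text generation strategies have been proposed, some less well-known than others. In this talk I will review various families of generation algorithms, from the most basic ones to the more sophisticated approaches, so as to document, as much as possible, the options that are available to text generation users. The final part will survey some decoding issues that are specific to multilingual models. |
| Free time (Lunch is available between 13:00 and 14:30) | | |
| 15:30 | 17:00 | Session 5, Max Idahl: Multilingual Model-Based Quality Filtering for LLM Pretraining<br>Data quality is the highest-leverage factor for LLM performance, with recent work showing significant training efficiency gains through careful curation. This presentation traces the evolution from rule-based filtering to modern model-based approaches that now work across dozens of languages. We cover the progression from basic perplexity-based filters, to FastText and encoder-based scorers, to our newly released Propella models that annotate documents across 18 properties for 57 languages at scale. The talk includes practical insights into building multilingual filtering pipelines. |
| 17:00 | 17:20 | Coffee Break |
| 17:20 | 19:20 | Session 6, David Salinas: Challenges in Evaluating Generative Models<br>In this talk, we will discuss the evaluation of generative models, in particular Large Language Models (LLMs). Given that such models produce open-ended output, their evaluation requires different techniques than static evaluations such as simple question-answering benchmarks. We will first discuss human annotations and their use in leaderboards such as LMArena and ComparIA. We will then focus on automatic evaluation relying on LLM judges. In particular, we will describe current challenges with LLM judges before discussing their application in multilingual settings. |
| 19:30 | | Dinner |
| 21:00 | | Evening Session: National Library of Norway, OpenEuroLLM, MultiSynt |

| Wednesday, February 4, 2026 | | |
|---|---|---|
| Breakfast is available from 07:30 | | |
| 08:30 | 10:00 | Session 8, Barbara Plank: NLP Beyond the Standard: Dialects, Variation, and Shared Representations in Multilingual Language Models<br>Multilingual language models have primarily focused on cross-lingual differences, with intra-language variation only recently gaining more attention. Dialects and non-standard varieties challenge core assumptions about data, representation, and evaluation. In this talk, I discuss what makes dialects particularly challenging for multilingual models, review approaches starting from early encoder-based methods, and give an overview of resources developed for dialectal NLP, with a focus on German dialects. I then turn to recent work on multilingual training dynamics and shared representations, analyzing when linguistic information and shared concept spaces emerge during training and where alignment breaks down. Although dialects are not yet explicitly modeled in this analysis, the findings provide insight into multilingual representation learning during pre-training. |
| 10:00 | 10:30 | Coffee Break |
| 10:30 | 12:00 | Session 9, Julia Kreutzer: Optimizing data for multilingual post-training<br>In this session we will look into techniques for augmenting data collections for better multilingual coverage. We will discuss the role of translation and inference settings, and explore methods for optimizing multilingual data both on the prompt and the generation side. |
| 12:30 | 13:30 | Lunch |
| 13:45 | 16:45 | Bus transfer to OSL Airport |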
Registration
In total, we expect 60–70 participants at the 2026 winter school. Registration for interested participants is now closed. Requests for participation were processed on a first-come, first-served basis, with an eye toward regional balance. Interested parties who submitted the registration form were confirmed in three batches, on November 28, December 5, and December 19, which was also the closing date for winter school registration.
Once confirmed by the organizing team, participant names are published on this page, and registration establishes a binding agreement with the hotel. Therefore, a cancellation fee will be incurred (unless we can find someone else to ‘take over’ last-minute spaces), and no-shows will be charged the full price for at least one night by the hotel.
Logistics
With a few exceptions, winter school participants travel to and from the conference hotel jointly on a chartered bus (the OpenEuroLLM shuttle). The bus will leave OSL airport no later than 9:45 CET on Monday, February 2. Thus, please meet up by 9:30 and make your arrival known to your assigned ‘tour guide’ (who will introduce themselves to you by email beforehand).
The group will gather near the DNB currency exchange booth in the downstairs arrivals area, just outside the international arrivals luggage claims and slightly to the left as one exits the customs area: the yellow dot numbered (18) on the OSL arrivals map. The group will then walk over to the bus terminal, to leave the airport not long after 9:40. The drive to the Skeikampen conference hotel will take about two to three hours, and the bus will make one stop along the way so that we can stretch our legs and fill up on coffee.
The winter school will end with lunch on Wednesday, February 4, before the group returns to OSL airport on the OpenEuroLLM shuttle. The bus will leave Skeikampen at 14:00 CET, with an expected arrival time at OSL around 17:00 to 17:30 CET. After stopping at the OSL airport, the bus will continue to central Oslo.
Organization
The 2026 Winter School is organized by a team of volunteers at the University
of Oslo, supported by a programme committee from the OpenEuroLLM, Circle U, and
NLPL networks and beyond (see below).
For all inquiries regarding registration, the programme, logistics,
or similar matters, please contact nlpl-training@ifi.uio.no.
The programme committee consists of (in alphabetical order):
- Jenia Jitsev (Forschungszentrum Jülich, Germany)
- Andrey Kutuzov (University of Oslo, Norway)
- Alessandro Lenci (University of Pisa, Italy)
- Stephan Oepen (University of Oslo, Norway)
- Sampo Pyysalo (University of Turku, Finland)
- Gema Ramírez-Sánchez (Prompsit Language Engineering, Spain)
- David Salinas (ELLIS Institute, Germany)
- Jörg Tiedemann (University of Helsinki, Finland)
- Joaquin Vanschoren (Eindhoven University of Technology, The Netherlands)
- Guillaume Wisniewski (Paris Cité University, France)
Participants
- Adam Hrin, AMD Silo AI (Finland)
- Agnes Toftgård, The National Library (Sweden)
- Alicia Núñez Alcover, Prompsit (Spain)
- Anastasia Philipps, University of Oslo (Norway)
- Andrey Kutuzov, University of Oslo (Norway)
- Angelina Zanardi, National Library of Norway
- Anni Moisala, CSC – IT Center for Science (Finland)
- Artūrs Znotiņš, University of Latvia (Latvia)
- Barbara Heinisch, Eurac Research (Italy)
- Barbara Plank, Ludwig-Maximilians-Universität München (Germany)
- Charlotte Noel, LINAGORA Labs (France)
- Dalton Harmsen, Eindhoven University of Technology (Netherlands)
- David Salinas, ELLIS Institute Tübingen (Germany)
- Diana Kylymnyk, University of Exeter (UK)
- Elizaveta Kuzmenko, Université Libre de Bruxelles (Belgium)
- Etienne Simon, University of Oslo (Norway)
- Faton Rekathati, The National Library (Sweden)
- Fedor Vitiugin, University of Turku (Finland)
- François Yvon, CNRS (France)
- Fred Philippy, University of Luxembourg (Luxembourg)
- Ghulam Muhammed Khan, University of Exeter (United Kingdom)
- Gianluca Barmina, University of Southern Denmark (Denmark)
- Hannah Clausen, University of Oslo (Norway)
- Hannan Mahadik, ELLIS Institute Tübingen (Germany)
- Iglika Nikolova-Stoupak, Sorbonne Université (France)
- Jan Hajič, Charles University (Czech Republic)
- Jiajing Wan, University of Bergen (Norway)
- Jindřich Helcl, University of Oslo (Norway)
- Johannes Gabriel Sindlinger, IT University of Copenhagen (Denmark)
- Jouni Luoma, AMD Silo AI (Finland)
- Julia Kreutzer, Cohere Labs (Canada)
- Justyna Sikora, The National Library (Sweden)
- Katarina Strani, Heriot-Watt University (United Kingdom)
- Kevin Glocker, Linköping University (Sweden)
- Kristýna Onderková, Charles University (Czech Republic)
- Laurène Cave, Sorbonne Université (France)
- Lisa Yankovskaya, University of Tartu (Estonia)
- Maja Buljan, University of Oslo (Norway)
- Markus Heiervang, National Library of Norway
- Marthe Midtgaard, National Library of Norway
- Mattes Ruckdeschel, IT University of Copenhagen (Denmark)
- Maximilian Idahl, ellamind (Germany)
- Meihan Tong, University of Oslo (Norway)
- Muhammad Imran, University of A Coruña (Spain)
- Nam Luu, Charles University (Czech Republic)
- Neda Jamshidi, University of Siena (Italy)
- Nikolay Arefev, University of Oslo (Norway)
- Nils Grünefeld, IT University of Copenhagen (Denmark)
- Pedro Ortiz Suarez, Common Crawl Foundation (USA)
- Rolv-Arild Braaten, National Library of Norway
- Romina Oji, Linköping University (Sweden)
- Sampo Pyysalo, University of Turku (Finland)
- Shanshan Xu, University of Copenhagen (Denmark)
- Shenbin Qian, University of Oslo (Norway)
- Stephan Oepen, University of Oslo (Norway)
- Taja Kuzman Pungeršek, Jožef Stefan Institute (Slovenia)
- Tita Enstad, National Library of Norway
- Tommaso Green, University of Mannheim (Germany)
- Tudor Nicolae Mateiu, Prompsit (Spain)
- Vladislav Mikhailov, University of Oslo (Norway)
- Wafa Aissa, UCLouvain (Belgium)
- Xiaorui Yu, King's College London (UK)
- Yihang Lu, Sorbonne Université (France)
- Yiheng Wu, University of Helsinki (Finland)
- Yves Scherrer, University of Oslo (Norway)
- Zihao Li, University of Helsinki (Finland)