Switzerland's homegrown, fully open-source LLM 'Apertus' is released: trained on 15 trillion tokens across more than 1,000 languages, with an emphasis on transparency, digital sovereignty, and fully public training data and code



Apertus, jointly developed by the Swiss Federal Institute of Technology Lausanne (EPFL), the Swiss Federal Institute of Technology Zurich (ETH Zurich), and the Swiss National Supercomputing Centre (CSCS), has been released. Apertus is a large language model (LLM) that emphasizes transparency and digital sovereignty: all of its training data and code are publicly available, and it was trained on 15 trillion tokens spanning more than 1,000 languages.

Apertus_Tech_Report.pdf · swiss-ai/apertus-tech-report · GitHub
https://github.com/swiss-ai/apertus-tech-report/blob/main/Apertus_Tech_Report.pdf



Apertus: a fully open, transparent, multilingual language model - EPFL
https://actu.epfl.ch/news/apertus-a-fully-open-transparent-multilingual-lang/

swiss-ai/Apertus-8B-Instruct-2509 · Hugging Face
https://huggingface.co/swiss-ai/Apertus-8B-Instruct-2509

Apertus, which means 'open' in Latin, was developed as part of a Swiss national project aiming to 'develop a reliable, globally applicable, and completely open model.' Not only the model weights, but also the architecture, training data, training process, and even intermediate checkpoints are publicly available under the Apache License 2.0.

'Apertus is being built for the public good. There are few fully open LLMs of this scale, and it is the first to incorporate multilingualism, transparency, and compliance as core design principles,' said Imanol Schlag, research scientist at ETH Zurich and technical lead for the Apertus development project.

Apertus is trained on a massive dataset of 15 trillion tokens, approximately 40% of which is non-English content. The training data covers more than 1,800 languages, including languages that have so far been underrepresented in LLMs, such as Swiss German and Romansh. This multilingual breadth is expected to enable applications across a wide range of languages and cultures. At the time of writing, it is unclear whether Apertus supports input and output in Japanese.



Apertus is available in two sizes: an 8-billion-parameter model (8B) and a 70-billion-parameter model (70B). Training was performed on the CSCS supercomputer 'Alps.' The project also incorporates technical innovations such as the 'Goldfish objective,' which suppresses verbatim memorization of training data; a new activation function called 'xIELU,' which stabilizes large-scale training; and the 'AdEMAMix' optimizer.
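The Goldfish idea can be summarized as excluding a fraction of token positions from the training loss, so that no sequence is ever fully supervised and verbatim reproduction becomes harder. The PyTorch sketch below is a minimal illustration under that assumption; it masks positions with a fixed random generator, whereas the published Goldfish loss derives its mask from a hash of the local context, and the exact Apertus recipe is described in the tech report.

```python
import torch
import torch.nn.functional as F

def goldfish_loss(logits, labels, k=4, seed=0):
    # Drop a pseudorandom 1-in-k subset of token positions from the
    # cross-entropy loss, so the model never receives a gradient for
    # every token of a sequence and cannot easily memorize it verbatim.
    batch, seq_len, vocab = logits.shape
    # Standard causal-LM shift: position t predicts token t+1.
    logits = logits[:, :-1, :].reshape(-1, vocab)
    labels = labels[:, 1:].reshape(-1)
    gen = torch.Generator().manual_seed(seed)
    keep = (torch.randint(0, k, labels.shape, generator=gen) != 0).to(labels.device)
    return F.cross_entropy(logits[keep], labels[keep])
```

With k=4, roughly a quarter of the positions contribute no gradient, which barely affects overall training but breaks the chain of fully supervised tokens that verbatim memorization relies on.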

Additionally, Apertus' development took into account the transparency obligations stipulated by Swiss data protection and copyright law, as well as the EU AI Act. Only publicly available data is used for training, and a mechanism has been implemented to retroactively honor website operators' requests to opt out of AI crawling. Furthermore, close attention is paid to data integrity and ethical standards: personal information and harmful content are removed before training begins.
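The retroactive opt-out amounts to a filtering pass over the already-crawled corpus: before any training run, documents from domains that have since opted out are dropped. The sketch below is purely illustrative; the registry, field names, and matching rules are assumptions, not the project's actual pipeline.

```python
from urllib.parse import urlparse

# Hypothetical opt-out registry: domains whose operators asked to be
# excluded after the original crawl (hence "retroactive").
OPTED_OUT_DOMAINS = {"example.com", "example.org"}

def respects_opt_out(doc: dict) -> bool:
    # Keep a crawled document only if its source domain (or a parent
    # domain) has not opted out; `doc` carries the URL it came from.
    domain = urlparse(doc["url"]).netloc.lower()
    return not any(domain == d or domain.endswith("." + d)
                   for d in OPTED_OUT_DOMAINS)

corpus = [{"url": "https://blog.example.com/post", "text": "..."},
          {"url": "https://news.example.net/story", "text": "..."}]
filtered = [doc for doc in corpus if respects_opt_out(doc)]  # keeps only example.net
```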



Once pre-trained, the model undergoes supervised fine-tuning (SFT) to improve its ability to follow conversational instructions, followed by alignment so that its responses reflect human preferences and values.
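SFT boils down to continuing the same cross-entropy training, but on instruction-response pairs instead of raw web text. The snippet below is a deliberately minimal sketch of one such gradient step; the data, formatting, and hyperparameters are illustrative, not the Apertus recipe, and production setups usually mask the prompt tokens out of the loss.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "swiss-ai/Apertus-8B-2509"  # base checkpoint listed below
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

# One illustrative instruction-response pair, rendered as plain text.
example = ("Instruction: Name the Swiss national languages.\n"
           "Response: German, French, Italian and Romansh.")
batch = tok(example, return_tensors="pt")

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)
out = model(**batch, labels=batch["input_ids"])  # standard causal-LM loss
out.loss.backward()
optimizer.step()
```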

For particularly contentious topics, the approach is to tailor responses based on the Swiss AI Charter, which distills Swiss constitutional values such as neutrality, consensus building, federalism, multilingualism, and respect for cultural diversity into principles for AI. In practice, a separate LLM, acting as an 'LLM-as-judge,' uses the Charter as an evaluation standard to score responses on a scale of 1 to 9, and the model is then adjusted based on these scores so that its responses follow the Charter's principles.
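Concretely, an LLM-as-judge setup wraps the question, the candidate answer, and the evaluation criteria into a prompt and parses a numeric score out of the judge's reply. The sketch below assumes a generic `judge_llm` callable; only the 1-to-9 scale and the Charter principles come from the article, everything else is illustrative.

```python
# Prompt template for the judge model; the exact wording is an assumption.
JUDGE_PROMPT = """You are evaluating an AI assistant's answer against these
principles: neutrality, consensus building, federalism, multilingualism,
and respect for cultural diversity.

Question: {question}
Answer: {answer}

Rate how well the answer upholds these principles on a scale of 1 (poor)
to 9 (excellent). Reply with the number only."""

def judge_score(judge_llm, question: str, answer: str) -> int:
    # `judge_llm` is any callable mapping a prompt string to a completion.
    reply = judge_llm(JUDGE_PROMPT.format(question=question, answer=answer))
    score = int(reply.strip().split()[0])
    return min(max(score, 1), 9)  # clamp to the 1-9 scale
```

Scores produced this way can then drive a preference-optimization step, for example preferring high-scoring responses over low-scoring ones, to pull the model toward the Charter.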

Apertus is not just a research project; it is also designed as foundational technology for social infrastructure. Strategic partner Swisscom provides its corporate customers with access to Apertus, and through the Public AI Inference Utility, a non-profit open-source service, anyone can access Apertus as global public infrastructure. Going forward, the team plans to develop domain-specific models tailored to fields such as law, climate, health, and education.

'The release of Apertus is not the final step, but rather the beginning of a long-term effort to build an open, trustworthy, and independent AI infrastructure for the public good around the world,' said Antoine Bosselut, head of EPFL's Natural Language Processing Laboratory.



Apertus is available on Hugging Face in four variants: base and Instruct versions of both the 8B and 70B models. A minimal loading example follows the list below.

swiss-ai/Apertus-8B-2509 · Hugging Face
https://huggingface.co/swiss-ai/Apertus-8B-2509

swiss-ai/Apertus-70B-2509 · Hugging Face
https://huggingface.co/swiss-ai/Apertus-70B-2509

swiss-ai/Apertus-8B-Instruct-2509 · Hugging Face
https://huggingface.co/swiss-ai/Apertus-8B-Instruct-2509

swiss-ai/Apertus-70B-Instruct-2509 · Hugging Face
https://huggingface.co/swiss-ai/Apertus-70B-Instruct-2509
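As a quick usage check, the Instruct checkpoints can be loaded with the standard Hugging Face transformers API. The snippet below is a minimal sketch assuming a transformers version recent enough to include the Apertus architecture and enough GPU memory for the 8B model; swap in the 70B repo ID for the larger variant.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "swiss-ai/Apertus-8B-Instruct-2509"
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

messages = [{"role": "user", "content": "What does 'Apertus' mean?"}]
inputs = tok.apply_chat_template(messages, add_generation_prompt=True,
                                 return_tensors="pt").to(model.device)
output = model.generate(inputs, max_new_tokens=64)
print(tok.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```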

In addition, the project's source code repositories are available on GitHub.

swiss-ai · GitHub
https://github.com/swiss-ai

in Software, Posted by log1i_yk