this post was submitted on 25 Nov 2025
29 points (100.0% liked)

LocalLLaMA

3862 readers
1 users here now

Welcome to LocalLLaMA! Here we discuss running and developing machine learning models at home. Lets explore cutting edge open source neural network technology together.

Get support from the community! Ask questions, share prompts, discuss benchmarks, get hyped at the latest and greatest model releases! Enjoy talking about our awesome hobby.

As ambassadors of the self-hosting machine learning community, we strive to support each other and share our enthusiasm in a positive constructive way.

Rules:

Rule 1 - No harassment or personal character attacks of community members. I.E no namecalling, no generalizing entire groups of people that make up our community, no baseless personal insults.

Rule 2 - No comparing artificial intelligence/machine learning models to cryptocurrency. I.E no comparing the usefulness of models to that of NFTs, no comparing the resource usage required to train a model is anything close to maintaining a blockchain/ mining for crypto, no implying its just a fad/bubble that will leave people with nothing of value when it burst.

Rule 3 - No comparing artificial intelligence/machine learning to simple text prediction algorithms. I.E statements such as "llms are basically just simple text predictions like what your phone keyboard autocorrect uses, and they're still using the same algorithms since <over 10 years ago>.

Rule 4 - No implying that models are devoid of purpose or potential for enriching peoples lives.

founded 2 years ago
MODERATORS
 

cross-posted from: https://feddit.online/c/technology/p/1229433/apertus-switzerland-government-release-a-fully-open-transparent-multilingual-language-l

"Apertus: a fully open, transparent, multilingual language model

EPFL, ETH Zurich and the Swiss National Supercomputing Centre (CSCS) released Apertus 2 September, Switzerland’s first large-scale, open, multilingual language model — a milestone in generative AI for transparency and diversity.

Researchers from EPFL, ETH Zurich and CSCS have developed the large language model Apertus – it is one of the largest open LLMs and a basic technology on which others can build.

In brief Researchers at EPFL, ETH Zurich and CSCS have developed Apertus, a fully open Large Language Model (LLM) – one of the largest of its kind. As a foundational technology, Apertus enables innovation and strengthens AI expertise across research, society and industry by allowing others to build upon it. Apertus is currently available through strategic partner Swisscom, the AI platform Hugging Face, and the Public AI network. ...

The model is named Apertus – Latin for “open” – highlighting its distinctive feature: the entire development process, including its architecture, model weights, and training data and recipes, is openly accessible and fully documented.

AI researchers, professionals, and experienced enthusiasts can either access the model through the strategic partner Swisscom or download it from Hugging Face – a platform for AI models and applications – and deploy it for their own projects. Apertus is freely available in two sizes – featuring 8 billion and 70 billion parameters, the smaller model being more appropriate for individual usage. Both models are released under a permissive open-source license, allowing use in education and research as well as broad societal and commercial applications. ...

Trained on 15 trillion tokens across more than 1,000 languages – 40% of the data is non-English – Apertus includes many languages that have so far been underrepresented in LLMs, such as Swiss German, Romansh, and many others. ...

Furthermore, for people outside of Switzerland, the external pagePublic AI Inference Utility will make Apertus accessible as part of a global movement for public AI. "Currently, Apertus is the leading public AI model: a model built by public institutions, for the public interest. It is our best proof yet that AI can be a form of public infrastructure like highways, water, or electricity," says Joshua Tan, Lead Maintainer of the Public AI Inference Utility."

you are viewing a single comment's thread
view the rest of the comments
[–] domi@lemmy.secnd.me 3 points 1 day ago

I tested the 70b model at Q8_K_XL quantization on Strix Halo and it's not too slow to use for short queries but definitely much slower than something like gpt-oss-120b.

English to German leads to weird results. German to English seems to be fine, though. At least for the 2 texts I put in.

Didn't get to German to English before giving up, English to German was just too awful. Cases were constantly wrong, very weird word choices and incorrect grammatical genders.