"Apertus: a fully open, transparent, multilingual language model

On 2 September, EPFL, ETH Zurich and the Swiss National Supercomputing Centre (CSCS) released Apertus, Switzerland’s first large-scale, open, multilingual language model and a milestone in generative AI for transparency and diversity.

Researchers from EPFL, ETH Zurich and CSCS have developed the large language model Apertus – it is one of the largest open LLMs and a basic technology on which others can build.

In brief: Researchers at EPFL, ETH Zurich and CSCS have developed Apertus, a fully open Large Language Model (LLM) – one of the largest of its kind. As a foundational technology, Apertus enables innovation and strengthens AI expertise across research, society and industry by allowing others to build upon it. Apertus is currently available through strategic partner Swisscom, the AI platform Hugging Face, and the Public AI network. …

The model is named Apertus – Latin for “open” – highlighting its distinctive feature: the entire development process, including its architecture, model weights, and training data and recipes, is openly accessible and fully documented.

AI researchers, professionals, and experienced enthusiasts can either access the model through the strategic partner Swisscom or download it from Hugging Face – a platform for AI models and applications – and deploy it for their own projects. Apertus is freely available in two sizes – featuring 8 billion and 70 billion parameters, the smaller model being more appropriate for individual usage. Both models are released under a permissive open-source license, allowing use in education and research as well as broad societal and commercial applications. …

Trained on 15 trillion tokens across more than 1,000 languages – 40% of the data is non-English – Apertus includes many languages that have so far been underrepresented in LLMs, such as Swiss German, Romansh, and many others. …

Furthermore, for people outside of Switzerland, the Public AI Inference Utility will make Apertus accessible as part of a global movement for public AI. “Currently, Apertus is the leading public AI model: a model built by public institutions, for the public interest. It is our best proof yet that AI can be a form of public infrastructure like highways, water, or electricity,” says Joshua Tan, Lead Maintainer of the Public AI Inference Utility."
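
The article says the 8-billion-parameter model is the more appropriate one for individual use. A rough back-of-the-envelope sketch of why: the memory needed just to hold the weights is roughly parameter count times bytes per parameter. The precisions below (16-bit, 8-bit, 4-bit) are my own assumptions for illustration, not formats the article says Apertus ships in, and the estimate ignores KV cache, activations, and runtime overhead.

```python
# Rough weight-memory estimate: parameters x bytes per parameter.
# Ignores KV cache, activations, and runtime overhead.

# Parameter counts taken from the article (8 billion and 70 billion).
PARAM_COUNTS = {"Apertus 8B": 8e9, "Apertus 70B": 70e9}

# Assumed precisions for illustration only; not confirmed release formats.
BYTES_PER_PARAM = {"16-bit": 2.0, "8-bit": 1.0, "4-bit": 0.5}


def weight_memory_gb(n_params: float, bytes_per_param: float) -> float:
    """Return approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bytes_per_param / 1e9


if __name__ == "__main__":
    for name, n_params in PARAM_COUNTS.items():
        for precision, nbytes in BYTES_PER_PARAM.items():
            gb = weight_memory_gb(n_params, nbytes)
            print(f"{name} @ {precision}: ~{gb:.0f} GB for weights alone")
```

Under these assumptions the 8B model needs on the order of 16 GB at 16-bit, while the 70B model needs on the order of 140 GB, which is beyond typical consumer hardware without heavy quantization.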

  • xcjs@programming.dev · 10 upvotes · 17 hours ago (edited)

    In case you’re not aware, there are a decent number of open weight (and some open source) large language models.

    The Ollama project makes it very approachable to download and use these models.

    • PandaInSpace@kbin.earth · 2 upvotes · 3 hours ago

      Other than Apertus, are there any truly open source models? Mainly, I want to know about models that list their training data publicly to ensure no theft of art and stuff. (I replied to your comment as you seem to know about these models; I have no clue about this stuff.)

      • xcjs@programming.dev · 1 upvote · 3 hours ago (edited)

        DeepSeek R1 and OpenThinker are two more examples. There’s also SmolLM, which I believe also open sources its training data and ensures proper licensing for it.

    • Xylight@lemdro.id · 10 upvotes · 5 hours ago (edited)

      Ollama has taken a bad turn lately (such is the nature of VC-backed software). Maybe recommend kobold.cpp or jan.ai for LLM noobs instead.

      • mudkip@lemdro.id · 1 upvote · 15 minutes ago

        There is nothing wrong with Ollama. It runs models fast and easy: add a GGUF and you’re done. Unless you want to squeeze out extra performance and have time to figure out your exact flags (then use llama.cpp), Ollama just works for 99 percent of people.

        • Xylight@lemdro.id · 1 upvote · 5 hours ago

          That’s what I use, and it’s also the backend of the aforementioned software, but it’s still complicated for people to set up.

          I should also mention Jan; it makes things super easy and also has a very nice GUI.

      • xcjs@programming.dev · 5 upvotes · 16 hours ago

        I’m keeping an eye on Ollama’s service offerings - I don’t think they’re in enshittification territory yet, but I definitely share the concern.

        I still don’t believe the other LLM engines out there have reached an equivalent ease of use compared to Ollama, and I still recommend it for now. If nothing else, it can be a stepping stone to other solutions for some.