As Snowden told us, the video and audio recording capabilities of your devices are NSA spying vectors. OSS/Linux is a safeguard against such capabilities. The massive datacenter investments in the US will be used to classify us all into a patriotic (for Israel)/oligarchist social credit score, and every mega tech company can increase profits through NSA cooperation and is legally obligated to cooperate with all government orders.

Speech-to-text and speech automation are useful tech, though always-listening state-sponsored terrorists are a non-NSA-targeted path for sweeping future social credit classifications of your past life.

Some small open models that can be used for speech-to-text: https://modal.com/blog/open-source-stt

  • moodwrench@lemmy.world
    3 hours ago

    It’s not a lack of software, it’s a lack of hardware. Home Assistant is ready, as are others, but there’s no good, cheap mic/speaker/ESP-in-a-box hardware.

    • Beacon@fedia.io
      3 hours ago

      No, Home Assistant very much is not ready to replace an Alexa device. Home Assistant mainly just automates smart devices, and as far as I can see from their website it does nothing else. One of the main things people use Alexa for is playing music from services like Spotify, and Home Assistant doesn’t appear to do that.

      • moodwrench@lemmy.world
        2 hours ago

        Sorry… my experience has been trying to move my Google Home to something open with no cloud… it hasn’t been perfect for me since moving. There are definitely things missing, but lots of things are better. Spotify does work with Home Assistant… maybe look again or send a PR.
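
        For anyone wondering what driving Spotify through Home Assistant can look like once the integration is set up, here is a minimal sketch against Home Assistant's REST API (a POST to a service endpoint with a long-lived access token). The host name, token, entity ID and playlist URI are placeholders I made up, and the exact `media_content_type` value should be checked against the Spotify integration docs. The request is only built, not sent.

```python
import json
import urllib.request


def build_play_request(base_url: str, token: str, entity_id: str,
                       spotify_uri: str) -> urllib.request.Request:
    """Build (but do not send) a POST to the media_player.play_media service."""
    payload = {
        "entity_id": entity_id,            # hypothetical entity name
        "media_content_id": spotify_uri,   # hypothetical Spotify URI
        "media_content_type": "playlist",  # assumption: check the integration docs
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/services/media_player/play_media",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    req = build_play_request(
        "http://homeassistant.local:8123",  # placeholder host
        "YOUR_LONG_LIVED_TOKEN",            # placeholder token
        "media_player.spotify_example",
        "spotify:playlist:EXAMPLE_ID",
    )
    # Show what would be sent; urllib.request.urlopen(req) would actually send it.
    print(req.get_method(), req.full_url)
```

        It still assumes the Spotify integration has already been configured, which is the techie part.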

        • Beacon@fedia.io
          2 hours ago

          It isn’t listed anywhere on their homepage, in the example demos, or on any page listing its capabilities, so I did a web search and found that it sort of can do Spotify, but (1) that isn’t listed anywhere on the Home Assistant capability pages, which shows just how not ready for the mass market it is, and (2) it takes a ridiculous amount of very techie setup just to get it to work

          https://www.home-assistant.io/integrations/spotify/

          And also, out of the box, can I ask it to:

          • tell me the weather?

          • set a timer?

          • set an alarm?

          I don’t see anything on the website that says it can do these things. And even if it can (which doesn’t appear to be the case from their website), the fact that the website doesn’t say so is a problem in itself and shows it isn’t ready for the mass market

          Just look at the webpage for Alexa vs. Home Assistant and it’s clear that Alexa has a very wide variety of abilities and is designed to be easy for anyone to use, while the Home Assistant website only shows it doing smart device automation and looks like it’s not for regular folks

          https://www.amazon.com/dp/B0DCCNHWV5

          https://www.home-assistant.io/

          I would LOVE to replace my Alexa devices with a local FOSS system, but unfortunately Home Assistant isn’t close to being able to do that yet

  • brucethemoose@lemmy.world
    3 hours ago

    I mean, there are many. TTS and self-hosted automation are huge in the local LLM scene.

    We even have open source “omni” models now that can ingest and output speech tokens directly (which means they get more semantic understanding from tone and such, they ‘choose’ the tone to reply with, and the output is streamable word by word). They support all sorts of tool calling.

    …But they aren’t easy to run. It’s still in the realm of homelabs with at least an RTX 3060 + hacky Python projects.
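
    For anyone unfamiliar with the “tool calling” part: roughly, the model emits a structured request and your own code runs it. A minimal sketch of the host side, assuming the model outputs a JSON object with `name` and `arguments` fields (that shape, and the `set_timer`/`turn_on` tools, are hypothetical; real models and runtimes each have their own format):

```python
import json
from typing import Callable, Dict


# Hypothetical tools; a real setup would call into the automation system instead.
def set_timer(minutes: int) -> str:
    return f"Timer set for {minutes} minutes"


def turn_on(device: str) -> str:
    return f"Turned on {device}"


# Registry mapping tool names the model may emit to local functions.
TOOLS: Dict[str, Callable[..., str]] = {
    "set_timer": set_timer,
    "turn_on": turn_on,
}


def dispatch(tool_call_json: str) -> str:
    """Parse a model-emitted tool call and run the matching local function."""
    call = json.loads(tool_call_json)
    name = call.get("name")
    if name not in TOOLS:
        return f"Unknown tool: {name}"
    return TOOLS[name](**call.get("arguments", {}))


if __name__ == "__main__":
    print(dispatch('{"name": "set_timer", "arguments": {"minutes": 5}}'))
```

    The returned string would normally be fed back to the model so it can phrase (or speak) the reply.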


    If you’re mad, you can self-host Longcat Omni

    https://huggingface.co/meituan-longcat/LongCat-Flash-Omni

    And blow Alexa out of the water with an MIT-licensed model from, I kid you not, a Chinese food delivery company.


    EDIT

    For the curious, see:

    Audio-text-to-text (and sometimes TTS): https://huggingface.co/models?pipeline_tag=audio-text-to-text&num_parameters=min%3A6B&sort=modified

    TTS: https://huggingface.co/models?pipeline_tag=text-to-speech&num_parameters=min%3A6B&sort=modified

    “Anything-to-anything,” generally image/video/audio/text -> text/speech: https://huggingface.co/models?pipeline_tag=any-to-any&num_parameters=min%3A6B&sort=modified

    Bigger than 6B to exclude toy/test models.
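
    The three search links above differ only in the `pipeline_tag` query parameter, so they can be generated. A small sketch using only the Python standard library; the helper name `model_search_url` is mine, and it just reproduces the query strings from the links above rather than any official Hugging Face client API:

```python
from urllib.parse import urlencode

BASE = "https://huggingface.co/models"


def model_search_url(pipeline_tag: str, min_params: str = "6B") -> str:
    """Build a model search URL filtered by task and minimum parameter count."""
    query = urlencode({
        "pipeline_tag": pipeline_tag,
        "num_parameters": f"min:{min_params}",  # ':' gets percent-encoded as %3A
        "sort": "modified",
    })
    return f"{BASE}?{query}"


if __name__ == "__main__":
    for tag in ("audio-text-to-text", "text-to-speech", "any-to-any"):
        print(model_search_url(tag))
```

    Changing `min_params` moves the size cutoff if 6B is too strict for your hardware.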

  • grue@lemmy.world
    3 hours ago

    Home Assistant has been heavily working on that sort of functionality lately.

    • 9point6@lemmy.world
      3 hours ago

      Home Assistant continues to be fantastic. It feels like only fairly recently that all we had was openHAB, and although it was fine, it was a bit of an uphill struggle to do anything.