Edit to add: I also found someone who recorded a voice chat of the same thing. This isn’t a case of someone uploading a song, or of the AI not actually processing the file. These models really are this sycophantic:

https://m.youtube.com/shorts/JqvDLHshTtI

    • Anisette [any/all]@quokk.au · 5 hours ago

      yeah but that doesn’t mean anything, does it? I don’t think they just tokenize the raw audio, that wouldn’t make sense, right?

      • sobchak@programming.dev · 4 hours ago

        I mean, you could. Just encode 100 ms chunks or whatever into tokens, then push them through the same model. I’m pretty sure that’s what they claim to do (though with MoE/routing now, maybe).
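To make the "encode 100 ms chunks into tokens" idea concrete, here is a toy sketch. It is not how any particular vendor does it: production systems reportedly use learned neural codecs rather than a hand-written rule, and the function names, the 256-entry vocabulary, and the RMS-energy quantizer below are all assumptions made up for illustration.

```python
import math

def chunk_audio(samples, sample_rate, chunk_ms=100):
    """Split mono samples into fixed-length chunks; a trailing partial chunk is dropped."""
    chunk_len = sample_rate * chunk_ms // 1000
    return [samples[i:i + chunk_len]
            for i in range(0, len(samples) - chunk_len + 1, chunk_len)]

def chunk_to_token(chunk, vocab_size=256):
    """Toy quantizer (assumption): map the chunk's RMS energy to a token ID.

    Expects samples in [-1, 1]. A real codec would use a learned encoder
    and codebook instead of a single scalar feature.
    """
    rms = math.sqrt(sum(s * s for s in chunk) / len(chunk))
    return min(int(rms * vocab_size), vocab_size - 1)

def tokenize_audio(samples, sample_rate, chunk_ms=100, vocab_size=256):
    """Turn raw audio into one token ID per chunk."""
    return [chunk_to_token(c, vocab_size)
            for c in chunk_audio(samples, sample_rate, chunk_ms)]

# One second of a 440 Hz sine wave at 16 kHz yields ten 100 ms chunks.
sample_rate = 16000
samples = [math.sin(2 * math.pi * 440 * n / sample_rate)
           for n in range(sample_rate)]
tokens = tokenize_audio(samples, sample_rate)
print(len(tokens), tokens)
```

The resulting list of integers could then be fed to a sequence model the same way text token IDs are, which is the point being made above. Whether a given product actually works this way end to end is a separate question.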