• GamingChairModel@lemmy.world
      3 minutes ago

      Yeah, it’s counterintuitive because it’s a lot more work for a human to draw a picture (let alone a photorealistic one) than to write a few words, but human language actually has a lot of strict rules that a stream of letters has to follow just to count as “valid” output, let alone “decent” output that kinda matches the prompt/description. Transpose a pair of letters or even substitute a single letter (or token) and you’ve got output that just doesn’t work, whereas a generated image can be slightly off in a few spots and still read as fine.