Transcription
A comic in four panels:
Panel 1. Pepper, the Gothic Sorceress, sits at her workbench in the basement of the university, wearing her iconic clothes with her glasses pushed up on her head. She is surrounded by steampunk cogs, wires, circuits, and code snippets written on parchments. She looks determined, standing in front of her masterpiece: her own local AI Parrot, which looks like a big pigeon.
Pepper: “I did it! Running 100% locally now. My own machine, my own terms! Hehe.”
Panel 2. She excitedly asks it a question, but it takes an eternity to respond.
Pepper: “Avian Intelligence, what’s the airspeed velocity of an unladen swallow?”
Local AI Parrot: “1… 1… m… e… t… e… r… s… loading 2%”
Panel 3. Pepper starts to realize the immense computational power required to run AI models locally. She looks at her local AI Parrot and starts to wonder.
Pepper: “Ouch. That’s painfully slow, even with the largest magical stone I had!”
Local AI Parrot: “p…e…r… s…e…c…o…n…d… loading 4%”
Panel 4. A shot late at night: she sleeps deeply in a big armchair while the local AI Parrot is still finishing its output.
Local AI Parrot: “a…n…d… t…h…a…t…s… a…l…l… loading 100%”
Source: David Revoy — Pepper&Carrot.


I’ve tried that in the past, and that’s not quite accurate? Sure, it’s not instant, but I could get at least one word per second on an RTX 3070.
Nowadays I don’t use it, apart from rare cases like making concept art that I have a hard time visualising before actually turning it into proper 3D art. Always local models, and never showing them in public though.
Sort of. On CPU it’s slow, and if you’re running a large model out of system RAM it will be slower still, unless you can spread it across multiple GPUs. The main issue is the size of the model: I ran a few smaller ones that could fit in 16GB, but the problem is that they’re not really useful.
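The 16GB constraint mentioned above comes down to simple arithmetic: a model's weights alone take roughly (parameter count × bits per weight ÷ 8) bytes, before counting the KV cache and activations. A minimal sketch of that estimate, with illustrative model sizes that are assumptions rather than anything from this thread:

```python
# Rough memory estimate for running an LLM locally. Only the weights are
# counted; KV cache and activations add more in practice, so treat these
# numbers as lower bounds. Model sizes below are hypothetical examples.

def weight_memory_gb(n_params_billion: float, bits_per_weight: int) -> float:
    """Approximate memory needed just to hold the weights, in GB."""
    bytes_total = n_params_billion * 1e9 * bits_per_weight / 8
    return bytes_total / 1e9

# A 7B-parameter model at 16-bit precision: ~14 GB, barely fits in 16 GB.
print(round(weight_memory_gb(7, 16), 1))   # 14.0
# The same model quantized to 4 bits: ~3.5 GB, a comfortable fit.
print(round(weight_memory_gb(7, 4), 1))    # 3.5
# A 70B model even at 4 bits: ~35 GB, hence multiple GPUs or slow CPU RAM.
print(round(weight_memory_gb(70, 4), 1))   # 35.0
```

This is why quantization is the usual workaround: it trades some output quality for fitting a larger model into the same card, which matches the trade-off the comment describes between small-but-useless and large-but-slow.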