Windows users be like

Riskable@programming.dev · 2 days ago

unless you consider every single piece of software or code ever to be just “a way of giving instructions to computers”

Yes. Yes I do. That’s exactly what code is: instructions. That’s literally how computers work. That’s what people like me (software developers) do when we write software: We’re writing down instructions.

When you click or move your mouse, you’re giving the computer instructions (well, the driver is). When you type a key, that’s resulting in an instruction being executed (dozens to thousands, actually).

When I click “submit” on this comment, I’m giving a whole bunch of computers some instructions.

Insert meme of, “you mean computers are just running instructions?” “Always have been.”

Riskable@programming.dev · 2 days ago

In Kadrey v. Meta (court case), a group of authors sued Meta for copyright infringement but the case was thrown out by the judge because they couldn’t actually produce any evidence of infringement beyond, “Look! This passage is similar.” They asked for more time so they could keep trying thousands (millions?) of different prompts until they finally got one that matched enough that they might have some real evidence.

In Getty Images v. Stability AI (UK), the court threw out the case for the same reason: It was determined that even though it was possible to generate an image similar to something owned by Getty, that didn’t meet the legal definition of infringement.

Basically, the courts ruled in both cases, “AI models are not just lossy/lousy compression.”

IMHO: What we really need a ruling on is, “who is responsible?” When an AI model does output something that violates someone’s copyright, is it the owner/creator of the model that’s at fault or the person that instructed it to do so? Even then, does generating something for an individual even count as “distribution” under the law? I mean, I don’t think it does because to me that’s just like using a copier to copy a book. Anyone can do that (legally) for any book they own, but if they start selling/distributing that copy, then they’re violating copyright.

Even then, there’s differences between distributing an AI model that people can use on their PCs (like Stable Diffusion) VS using an AI service to do the same thing. Just because the model can be used for infringement should be meaningless because anything (e.g. a computer, Photoshop, etc) can be used for infringement. The actual act of infringement needs to be something someone does by distributing the work.

You know what? Copyright law is way too fucking complicated, LOL!

Riskable@programming.dev · 2 days ago

Hmmm… That’s all an interesting argument but it has nothing to do with my comparison to YouTube/Netflix (or any other kind of video) streaming.

If we were to compare a heavy user of ChatGPT to a teenager that spends a lot of time streaming videos, the ChatGPT side of the equation wouldn’t even amount to 1% of the power/water used by streaming. In fact, if you add up the power/water usage of all the popular AI services, it still doesn’t add up to much compared to video streaming.

Riskable@programming.dev · 2 days ago

Sell? Only “big AI” is selling it. Generative AI has infinite uses beyond ChatGPT, Claude, Gemini, etc.

Most generative AI research/improvement is academic in nature and it’s being developed by a bunch of poor college students trying to earn graduate degrees. The discoveries of those people are being used by big AI to improve their services.

You seem to be making some argument from the standpoint that “AI” == “big AI” but this is not the case. Research and improvements will continue regardless of whether or not ChatGPT, Claude, etc continue to exist. Especially image AI where free, open source models are superior to the commercial products.

Riskable@programming.dev · 3 days ago

but we can reasonably assume that Stable Diffusion can render the image on the right partly because it has stored visual elements from the image on the left.

No, you cannot reasonably assume that. It absolutely did not store the visual elements. What it did was store some floating point values related to some keywords that the source image had pre-classified. When training, it will increase or decrease those floating point values a small amount when it encounters further images that use those same keywords.
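To make the "small nudges to floating point values" idea concrete, here's a toy sketch. It is not how Stable Diffusion is actually trained (real models have billions of weights and a very different loss); the `train` function, the single weight, and the learning rate are all made up for illustration.

```python
# Toy illustration only: a single "weight" nudged a little for each example.
# The function name, learning rate, and values are hypothetical.

def train(weight: float, examples: list[float], learning_rate: float = 0.01) -> float:
    """Move one floating point weight a small step toward each example seen."""
    for target in examples:
        error = weight - target           # how far off the current weight is
        weight -= learning_rate * error   # small step toward the example
    return weight

start = 0.0
after_one = train(start, [1.0])           # one example barely moves the weight
after_many = train(start, [1.0] * 500)    # the same example repeated 500 times

print(after_one)
print(after_many)
```

One example barely moves the weight. The same example repeated hundreds of times with nothing else to balance it pulls the weight almost all the way to that one value, which is roughly what a not-diverse-enough training set looks like in miniature.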

What the examples demonstrate is a lack of diversity in the training set for those very specific keywords. There’s a reason why they chose Stable Diffusion 1.4 and not Stable Diffusion 2.0 (or later versions)… Because they drastically improved the model after that. These sorts of problems (with not-diverse-enough training data) are considered flaws by the very AI researchers creating the models. It’s exactly the type of thing they don’t want to happen!

The article seems to be implying that this is a common problem that happens constantly and that the companies creating these AI models just don’t give a fuck. This is false. It’s flaws like this that leave your model open to attack (and let competitors figure out your weights; not that it matters with Stable Diffusion since that version is open source), not just copyright lawsuits!

Here’s the part I don’t get: Clearly nobody is distributing copyrighted images by asking AI to do its best to recreate them. When you do this, you end up with severely shitty hack images that nobody wants to look at. Basically, if no one is actually using these images except to say, “aha! My academic research uncovered this tiny flaw in your model that represents an obscure area of AI research!” why TF should anyone care?

They shouldn’t! The only reason why articles like this get any attention at all is because it’s rage bait for AI haters. People who severely hate generative AI will grasp at anything to justify their position. Why? I don’t get it. If you don’t like it, just say you don’t like it! Why do you need to point to absolutely, ridiculously obscure shit like finding a flaw in Stable Diffusion 1.4 (from years ago, before 99% of the world had even heard of generative image AI)?

Generative AI is just the latest way of giving instructions to computers. That’s it! That’s all it is.

Nobody gave a shit about this kind of thing when Star Trek was pretending to do generative AI in the Holodeck. Now that we’ve got the pre-alpha version of that very thing, a lot of extremely vocal haters are freaking TF out.

Do you want the cool shit from Star Trek’s imaginary future or not? This is literally what computer scientists have been dreaming of for decades. It’s here! Have some fun with it!

Generative AI uses up less power/water than streaming YouTube or Netflix (yes, it’s true). So if you’re about to say it’s bad for the environment, I expect you’re just as vocal about streaming video, yeah?

Riskable@programming.dev · 3 days ago

Correction: Newer versions of ChatGPT (GPT-5.x) are failing in insidious ways. The article has no mention of the other popular services or the dozens of open source coding assist AI models (e.g. Qwen, gpt-oss, etc).

The open source stuff is amazing and gets better just as quickly as the big AI options. Yet they’re boring so they don’t make the news.

Riskable@programming.dev · 4 days ago

Well, the CSAM stuff is unforgivable but I seriously doubt even the soulless demon that is Elon Musk wants his AI tool generating that. I’m sure they’re working on it (it’s actually a hard computer science sort of problem because the tool is supposed to generate what the user asks for and there’s always going to be an infinite number of ways to trick it since LLMs aren’t actually intelligent).

Porn itself is not illegal.

Riskable@programming.dev · edit-2 5 days ago

Say you’re a young woman with a stable male partner. What would it take to convince you to have a baby?

At this point, I’m thinking: A guaranteed income that’s enough to pay for all the expenses a child requires, including housing. For twenty years.

Riskable@programming.dev · 5 days ago

I don’t know, man… Have you even seen Amber? It might be worth an alert 🤷

Riskable@programming.dev · 5 days ago

I don’t know how to tell you this but… Every body gives a shit. We’re born shitters.

Riskable@programming.dev · 5 days ago

The real problem here is that Xitter isn’t supposed to be a porn site (even though it’s hosted loads of porn since before Musk bought it). They basically deeply integrated a porn generator into their very publicly-accessible “short text posts” website. Anyone can ask it to generate porn inside of any post and it’ll happily do so.

It’s like showing up at Walmart and seeing everyone naked (and many fucking), all over the store. That’s not why you’re there (though: Why TF are you still using that shithole of a site‽).

The solution is simple: Everyone everywhere needs to classify Xitter as a porn site. It’ll get blocked by businesses and schools and the world will be a better place.

Riskable@programming.dev · 5 days ago

“To solve this puzzle, you have to get your dog to poop in the circle…”

Riskable@programming.dev · 5 days ago

Yep. Stadia also had a feature like this (that no one ever used).

Just another example of why software patents should not exist.

Riskable@programming.dev · 7 days ago

Working on (some) AI stuff professionally, the open source models are the only models that allow you to change the system prompt. Basically, that means that only open source models are acceptable for a whole lot of business logic.

Another thing to consider: There are models that are designed for processing. It’s hard to explain, but stuff like Qwen 3 “embedding” is made for in/out usage in automation situations:

https://huggingface.co/Qwen/Qwen3-Embedding-8B

You can’t do that effectively with the big AI models (as much as Anthropic would argue otherwise… It’s too expensive and risky to send all your data to a cloud provider in most automation situations).
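A rough sketch of what "in/out usage" can look like: text goes in, a vector of floats comes out, and your automation compares vectors. The vectors below are made-up 4-dimensional stand-ins (a real embedding model returns much longer ones), and `best_match` plus the category names are hypothetical, not part of any Qwen API.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def best_match(vector: list[float], categories: dict[str, list[float]]) -> str:
    """Return the category whose vector is most similar to the input vector."""
    return max(categories, key=lambda name: cosine_similarity(vector, categories[name]))

# Hypothetical vectors; in practice these would come from the embedding model.
ticket = [0.9, 0.1, 0.0, 0.2]
categories = {
    "billing": [0.8, 0.2, 0.1, 0.1],
    "shipping": [0.1, 0.9, 0.3, 0.0],
}

print(best_match(ticket, categories))
```

Everything after the embedding step is plain arithmetic you can run locally, which is a big part of why this kind of model fits automation so well.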

Riskable@programming.dev · 8 days ago

This doesn’t make sense when you look at it from the perspective of open source models. They exist and they’re fantastic. They also get better just as quickly as the big AI company services.

IMHO, the open source models will ultimately be what pops the big AI bubble.

Riskable@programming.dev · 11 days ago

There’s going to be some hilarious memes/videos when these get deployed:

Dominoes!
Someone walking up and kicking one over in protest, only to be surprised AF when another one trips over it and they start to pile up.
News story that reveals the “AI” behind these things stands for “Analog Intelligence” or “Actually Indians”.

Riskable@programming.dev · 12 days ago

It’s close enough. The key is that it’s not something that was just regurgitated based on a single keyword. It’s unique.

I could’ve generated hundreds and I bet a few would look a lot more like a banana.

Riskable@programming.dev · edit-2 12 days ago

Hard disagree. You just have to describe the shape and colors of the banana and maybe give it some dimensions. Here’s an example:

A hyper-realistic studio photograph of a single, elongated organic object resting on a wooden surface. The object is curved into a gentle crescent arc and features a smooth, waxy, vibrant yellow skin. It has distinct longitudinal ridges running its length, giving it a soft-edged pentagonal cross-section. The bottom end tapers to a small, dark, organic nub, while the top end extends into a thick, fibrous, greenish-brown stalk that appears to have been cut from a larger cluster. The yellow surface has minute brown speckles indicating ripeness.

It’s a lot of description but you’ve got 4096 tokens to play with so why not?

Remember: AI is just a method for giving instructions to a computer. If you give it enough details, it can do the thing at least some of the time (also remember that at the heart of every gen AI model is a RNG).

[Image: A terrible image of a banana generated by AI using a prompt that did not use the word banana]

Note: That was the first try and I didn’t even use the word “banana”.

Riskable@programming.dev · 12 days ago

It’s more like this: If you give a machine instructions to construct or do something, is the end result a creative work?

If I design a vase (using nothing but code) that’s meant to be 3D printed, does that count as a creative work?

https://imgur.com/bdxnr27

That vase was made using code (literally just text) I wrote in OpenSCAD. The model file is the result of the code I wrote and the physical object is the output of the 3D printer that I built. The pretty filament was store-bought, however.

If giving a machine instructions doesn’t count as a creative process then programming doesn’t count either. Because that’s all you’re doing when you feed a prompt to an AI: Giving it instructions. It’s just the latest tech for giving instructions to machines.

Riskable@programming.dev · 12 days ago

What that Afghanistan girl image demonstrates is simply a lack of diversity in Midjourney’s training data. They probably only had a single image categorized as “Afghanistan girl”. So the prompt ended up with an extreme bias towards that particular set of training values.

Having said that, Midjourney’s model is entirely proprietary so I don’t know if it works the same way as other image models.

It’s all about statistics. For example, there were so many quotes and literal copies of the first Harry Potter book in OpenAI’s training set that you could get ChatGPT to spit out something like 70% of the book with a lot of very, very specific prompts.

At the heart of every AI is a random number generator. If you ask it to generate an image of an Afghan girl—and it was only ever trained on a single image—it’s going to output something similar to that one image every single time.

On the other hand, if it had thousands of images of Afghan girls you’d get more varied and original results.
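Here's a toy sketch of that statistics-plus-RNG point. It's a big simplification: the "model" is just a mean and a spread learned from a few numbers standing in for image features, and `fit` and `generate` are hypothetical names, not anything from a real image model.

```python
import random
import statistics

def fit(training_values: list[float]) -> tuple[float, float]:
    """'Train' by remembering only the mean and spread of the examples."""
    mean = statistics.fmean(training_values)
    spread = statistics.pstdev(training_values)  # 0.0 when there is one example
    return mean, spread

def generate(model: tuple[float, float], seed: int) -> float:
    """Sample one output; the seed is the random number generator at the heart."""
    mean, spread = model
    rng = random.Random(seed)
    return rng.gauss(mean, spread)

single_example_model = fit([0.5])
diverse_model = fit([0.1, 0.3, 0.5, 0.7, 0.9])

narrow = {generate(single_example_model, seed) for seed in range(50)}
varied = {generate(diverse_model, seed) for seed in range(50)}

print(len(narrow))  # one distinct output no matter the seed
print(len(varied))  # many distinct outputs
```

With a single training example the spread is zero, so every seed produces the same output. With more varied examples the same random seeds produce different results.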

For reference, finding flaws in training data like that “Afghanistan girl” is one of the methods security researchers use to break large language models.

Flaws like this are easy to fix once they’re found. So it’s likely that over time, image models will improve and we’ll see fewer issues like this.

The “creativity” isn’t in the AI model itself, it’s in its use.

Riskable@programming.dev · 2 months ago

Windows users be like

Riskable@programming.dev · 1 year ago

Incident Postmortem

Riskable@programming.dev · 2 years ago

NTFS turns 30 years old today! I hear it's still in use by some crufty old legacy operating systems 😁

Riskable@programming.dev · 3 years ago

Looking for cool Fediverse communities
