LLM-Driven Large Code Rewrites With Relicensing Are The Latest AI Concern

cm0002 · 4 months ago

LLM-Driven Large Code Rewrites With Relicensing Are The Latest AI Concern

Hetare King@piefed.social · 4 months ago

I’m really having trouble understanding what’s failing to convey here.

Let’s assume for a moment that chardet is a completely new library that from the start consisted mostly of AI-generated code. Let’s also forget that in order to generate that code, the LLM had to absorb a lot of existing code, and its usage may very well itself be many license violations. In this hypothetical situation would there be any point in licensing that code under the LGPL? No, there wouldn’t be, because it wouldn’t be possible to be enforced. This is not a claim anyone is making, this is just the logical conclusion from it not being possible to copyright AI-generated works.

Of course in reality, chardet has been around for a long time and previously consisted of human-authored code, and this new version cannot be considered a product of clean room reverse engineering, both because the maintainers had access to the original code and because LLMs are the very opposite of a clean room, so they don’t have to right to change that license because it would be in violation of the license of the original code. But the maintainers clearly don’t see it that way, otherwise they wouldn’t have done it. So from their perspective, the fact that it’s pointless for AI-generated code to be licensed under the LGPL, could be a reason why they felt the need to change the license.

Was this really not obvious from the context set by the comment I was responding to? I swear, every time I decide to be not so damn wordy, I learn to regret it.

grue@lemmy.world · 4 months ago

In this hypothetical situation would there be any point in licensing that code under the LGPL? No, there wouldn’t be, because it wouldn’t be possible to be enforced.

There would be exactly equally as much point in licensing it under the LGPL as there would be under anything else (in particular: including the MIT license they apparently actually chose). If their argument were really that AI makes it uncopyrightable, they would’ve claimed it to be in the Public Domain rather than attempting to apply any license at all.

So obviously, that can’t be their argument. Their only possible argument has to be that AI magically lets them launder out the copyleft and make it permissive instead, which is straight-up obvious bullshit.

More to the point, you weren’t speaking hypothetically about what they might’ve thought. You were speaking concretely about what you thought. Read it again:

LGPL is unenforceable with AI-generated code.

That’s what you said. Not “the devs claim the LGPL is unenforceable with AI-generated code,” or “hypothetically maybe somebody could argue that the LGPL is unenforceable with AI-generated code” or anything like that. Nope, you just made a straight-up unambiguous claim on your own behalf, full stop.

Your follow-up could be “whoops, I didn’t mean to say that,” but it cannot be “you misunderstood me.” What you wrote was very unambiguous. Don’t insult us by trying to pretend we read it wrong.

Hetare King@piefed.social · 4 months ago

Yes, in principle the same would apply to the MIT license, but in practice it’s pretty much impossible to violate the terms of that license, so it would never get tested. LGPL on the other hand, could lead to real, practical problems. As for why they would insist on MIT, there’s more MIT-licensed code used in production than public domain code. They’re already cosplaying as programmers by producing this slop, who knows what else they’ll do for the sake of appearances?

Is it possible that the license change was the goal and the use of AI was the means to achieve it? Of course. Should I have expressed that what I proposed was only a possible reason? Yeah, probably. But putting it as something like “the devs claim yada yada…” would have been incorrect. While the way the original question was asked meant that obviously, any answers would be from the hypothetical perspective of the maintainers (which is why the fact that the new version of chardet violates the license of the original code is irrelevant, because they wouldn’t think so), I worded my comment as an assertion because it was an assertion. By me. Because it’s the only possibility that is consistent with current legal precedent. And whether or not that was the or a reason for the license change, it’s something that would have been a real issue had they kept the license.

Now, you accuse me of insulting people’s intelligence, but when two people respond to my comment in the same way, obviously the problem lies with me. But you have been very unclear in conveying what exactly that problem is. You went form pointing out that the generated code is still in violation of the original license, which while true, is again, irrelevant in this context (and I still don’t really understand why you would think that it was) to not liking how assertively I worded the consequences for the enforceability of the LGPL license when it comes to code than cannot have copyright, I guess?