The newest open-source concern around AI, which is seeing a lot of interest this weekend, is that large language models / AI code generators may rewrite large parts of a codebase, with the “developers” then claiming an alternative license incompatible with the original source license. This became a real concern this week when a popular Python project underwent an AI-driven code rewrite and was published under an alternative license that its original author does not agree with and that is incompatible with the original code's license.

Chardet is a Python character encoding detector, and its v7.0 release last week was a “ground-up, MIT-licensed rewrite of chardet.” This rewrite was largely driven by AI/LLMs and claims to be up to 41x faster and to offer an array of new features. But with this AI-driven rewrite, the license shifted from the LGPL to MIT.

  • grue@lemmy.world
    10 hours ago

    In this hypothetical situation would there be any point in licensing that code under the LGPL? No, there wouldn’t be, because it wouldn’t be possible to be enforced.

    There would be exactly as much point in licensing it under the LGPL as there would be under anything else (including, in particular, the MIT license they apparently actually chose). If their argument were really that AI makes it uncopyrightable, they would’ve claimed it to be in the Public Domain rather than attempting to apply any license at all.

    So obviously, that can’t be their argument. Their only possible argument has to be that AI magically lets them launder out the copyleft and make it permissive instead, which is straight-up obvious bullshit.

    More to the point, you weren’t speaking hypothetically about what they might’ve thought. You were speaking concretely about what you thought. Read it again:

    LGPL is unenforceable with AI-generated code.

    That’s what you said. Not “the devs claim the LGPL is unenforceable with AI-generated code,” or “hypothetically maybe somebody could argue that the LGPL is unenforceable with AI-generated code” or anything like that. Nope, you just made a straight-up unambiguous claim on your own behalf, full stop.

    Your follow-up could be “whoops, I didn’t mean to say that,” but it cannot be “you misunderstood me.” What you wrote was very unambiguous. Don’t insult us by trying to pretend we read it wrong.

    • Hetare King@piefed.social
      8 hours ago

      Yes, in principle the same would apply to the MIT license, but in practice it’s pretty much impossible to violate the terms of that license, so it would never get tested. The LGPL, on the other hand, could lead to real, practical problems. As for why they would insist on MIT, there’s more MIT-licensed code used in production than public domain code. They’re already cosplaying as programmers by producing this slop, who knows what else they’ll do for the sake of appearances?

      Is it possible that the license change was the goal and the use of AI was the means to achieve it? Of course. Should I have expressed that what I proposed was only a possible reason? Yeah, probably. But putting it as something like “the devs claim yada yada…” would have been incorrect. While the way the original question was asked meant that obviously, any answers would be from the hypothetical perspective of the maintainers (which is why the fact that the new version of chardet violates the license of the original code is irrelevant, because they wouldn’t think so), I worded my comment as an assertion because it was an assertion. By me. Because it’s the only possibility that is consistent with current legal precedent. And whether or not that was the or a reason for the license change, it’s something that would have been a real issue had they kept the license.

      Now, you accuse me of insulting people’s intelligence, but when two people respond to my comment in the same way, obviously the problem lies with me. But you have been very unclear in conveying what exactly that problem is. You went from pointing out that the generated code is still in violation of the original license, which, while true, is again irrelevant in this context (and I still don’t really understand why you would think that it was), to not liking how assertively I worded the consequences for the enforceability of the LGPL when it comes to code that cannot have copyright, I guess?