TL;DR: The advent of AI-based LLM coding applications like Anthropic’s Claude and ChatGPT has prompted maintainers to experiment with integrating LLM contributions into open source codebases.

This is a fast path to open source irrelevancy, since the US Copyright Office has deemed LLM outputs to be uncopyrightable. This means that as more uncopyrightable LLM outputs are integrated into nominally open source codebases, value leaks out of the project, since open source licences are not operative on public domain code.

That means that the public domain, AI-generated code can be reused without attribution and, in the case of copyleft licences, can even be used in closed source projects.

  • chicken@lemmy.dbzer0.com · 9 hours ago

    The only portions of the work that can be copyrighted are the actual creative work the person has put into the work.

    Ok, but it’s not like everyone is documenting exactly which parts are generated, curated, or human written.

    Maintainers cannot prevent the LLM code from being incorporated into closed source projects without reciprocity

    Say someone incorporates GPL code without attribution and gets sued for doing so. They try to argue in court that the source material they used is not copyrighted, because of AI. Won’t they have to prove that the parts they used were actually AI output for this defense to work? It isn’t like people go around ignoring the copyright on things in general just because they look like they were probably generated with AI; that isn’t enough to be safe from liability, because you usually can’t know the exact breakdown. It seems like preventing this loophole from being used would be as simple as keeping it ambiguous and not allowing submissions that positively affirm being entirely AI generated.

    • yoasif@fedia.io (OP) · 4 hours ago

      I don’t really think we need to go down the copyfraud path to see that AI code damages copyleft projects no matter what. We know that some projects are already accepting AI-generated code, and they don’t ask you to hide it; it is all in the open.

  • Buelldozer@lemmy.today · 17 hours ago

    This is a fast path to open source irrelevancy, since the US copyright office has deemed LLM outputs to be uncopyrightable.

    This is a misunderstanding of US copyright. Here’s a link to the Compendium so you can verify for yourself.

    Section 313 says “Although uncopyrightable material, by definition, is not eligible for copyright protection, the Office may register a work that contains uncopyrightable material, provided that the work as a whole contains other material that qualifies as an original work of authorship…”

    This means that LLM-created code that’s embedded in a larger work may be registered.

    Section 313.2 says “Similarly, the Office will not register works produced by a machine or mere mechanical process that operates randomly or automatically without any creative input or intervention from a human author.”

    Meaning that LLM-created code CAN be registered as long as an author has some creative input or intervention in the process. I’d posit that herding an LLM system to create the code definitely qualifies as “creative input or intervention”. If someone feels it doesn’t, then all they need to do is change something, literally anything, and suddenly it becomes a derivative work of an uncopyrighted source; the derivative can then be registered (to a human) and be subject to copyright.

    In short, it’s fine. Take a breath.

    • LukeZaz@beehaw.org · 8 hours ago

      In short, it’s fine. Take a breath.

      Ehhhhh, that depends on how you take it. Personally, no, I’m not very worried about the legal aspect. But,

      It’s still LLMs. FOSS communities have been better than average, but that’s a low bar, considering that coders in general have adopted LLMs most of all. And LLM usage is reckless, not to mention presently harmful in numerous ways. (And yes, this means the latest models too. “Looks good” doesn’t mean it is good.) I’d just as soon FOSS not use the tech at all.

  • slacktoid@lemmy.ml · 20 hours ago

    The way I see it (and I’m not saying this isn’t a valid concern) is that it still doesn’t help with code maintenance. Just because you can create code doesn’t mean you can maintain it. Many companies moved to open source (not free software) because of the financial incentives: security and long-term maintainability of the codebase. Think of how much better, say, TensorFlow and PyTorch got because they were open source. The engineers at Google and Meta could have kept them in-house, so what were their reasons for open sourcing them? I doubt those reasons have changed with AI. Nothing beats free QA testing.