

Ah, ok. This is a conversation about Linux, so that doesn’t apply. Linux is open source, so it wouldn’t matter if someone wanted to enforce a EULA; anyone else could just take the source and do what they want with it.


Generating code costs a lot of money, as does the expertise to review the code. People aren’t going to want to spend the many millions of dollars to do that when they could use a GPL kernel. Of course if the kernel is not only free, but basically public domain, it solves all of their problems. They can modify it and keep those modifications closed source, the complete antithesis of what the GPL stands for.


Sure, but if it’s open source, I can just take that code without agreeing to your contract. Since it’s public domain, I can do whatever I want with it. You can only enforce a contract if I agree to it.


So what happens thirty years from now when 95% of the kernel code is AI generated? It’ll be a lot easier to rewrite the parts that aren’t, and have a fully closed source kernel that you can use without following the GPL.


I mean, yeah, you can make the argument that owning the copyrights to all of the code in your project isn’t important. I don’t agree, but that’s certainly a valid stance. Apparently the Linux maintainers are on your side. That makes me sad. Copyright ownership of the things I produce is very important to me.


Wow, what an atrocious analogy. You just can’t determine what brand of keyboard someone uses, period. When someone uses an AI, there will be certain patterns that are somewhat more common in their code. Their code will also look different from their previous code. AI also tends to produce very large commits. You can also ask them why they did certain things and see how they answer. So you might not be 100% accurate, but there are ways to tell when someone is using AI.


Do you want to explain to me what, in those two paragraphs, shows that the use of spell checkers and LLMs is equivalent with regard to copyrightability? It seems to me those paragraphs make it clear that the use of spell checkers is not the same as the use of LLMs.
The policy I use bans “generative AI model” output. Generative AI is a pretty well defined term:
https://en.wikipedia.org/wiki/Generative_AI
https://www.merriam-webster.com/dictionary/generative%20AI
If you have trouble determining whether something is a generative AI model, you can usually just look up how it is described in the promotional materials or on Wikipedia.
Type: Large language model, Generative pre-trained transformer
- https://en.wikipedia.org/wiki/Claude_(language_model)
I never said it violates GPL to include public domain code. I’m not sure where you got that from. What I said is that public domain code can’t really be released under the GPL. You can try, but it’s not enforceable. As in, you can release it under that license, but I can still do whatever I want with it, license be damned, because it’s public domain.
I did that with this vibe-coded project:
https://github.com/hperrin/gnata
I just took it and re-released it as public domain, because that’s what it is anyway.


Nobody can verify that the output of an LLM isn’t from its training data except those with access to its training data.


If a work’s traditional elements of authorship were produced by a machine, the work lacks human authorship and the Office will not register it. For example, when an AI technology receives solely a prompt from a human and produces complex written, visual, or musical works in response, the “traditional elements of authorship” are determined and executed by the technology—not the human user. Based on the Office’s understanding of the generative AI technologies currently available, users do not exercise ultimate creative control over how such systems interpret prompts and generate material. Instead, these prompts function more like instructions to a commissioned artist—they identify what the prompter wishes to have depicted, but the machine determines how those instructions are implemented in its output. For example, if a user instructs a text-generating technology to “write a poem about copyright law in the style of William Shakespeare,” she can expect the system to generate text that is recognizable as a poem, mentions copyright, and resembles Shakespeare’s style. But the technology will decide the rhyming pattern, the words in each line, and the structure of the text. When an AI technology determines the expressive elements of its output, the generated material is not the product of human authorship. As a result, that material is not protected by copyright and must be disclaimed in a registration application.
That seems very clear to me. Generative AI output is not human authored, and therefore not copyrighted.
The policy I use also makes very clear the definition of AI generated material:
https://sciactive.com/human-contribution-policy/#Definitions
I’m not exactly sure how you can possibly think there is an equivalence between a tool like a spelling and grammar checker and a generative AI. There’s a reason the Copyright Office will register works that have been authored using spelling and grammar checkers, but not works that have been authored using LLMs.


Yes, that makes sense. People have always been able to intentionally commit copyright infringement. However, it has historically been fairly difficult to unintentionally commit copyright infringement. That’s no longer the case. AI makes it very easy to unintentionally commit copyright infringement. That’s a good reason to ban it outright.


There are so many reasons not to include any AI generated code.


Unless the code the AI generated is a copy of copyrighted code, of course. Then it would be copyright infringement.
I can cause the AI to spit out code that I own the copyright to, because it was trained on my code too. If someone used that code without including attribution to me (the requirement of the license I release my code under), that would be copyright infringement. Do you understand what I mean?


The Copyright Office said material generated by AI is not copyrighted, even if that material is subsequently revised by the AI through additional prompts. That includes code. The GPL can only be used on copyrighted code. It is a copyleft license because it uses copyright law as a mechanism to enforce its terms. If you believe you can enforce a license on public domain material, that’s simply a gross misunderstanding of copyright law.
Yes, it will hopefully be a very small part of the kernel, but what happens thirty years from now if the kernel is all AI generated code? It may be a slippery slope, but it’s a valid slippery slope. The more the kernel is AI generated, the less of it the license can cover.


Sure, you can license them, but that license is unenforceable, because you don’t own the copyrights, so you can’t sue anyone for copyright infringement. And you’d have to be a fool to agree to a license for public domain material. You can do whatever you want with it, no license necessary.


If the author is an LLM, then the author is not a human.


I think you’re misunderstanding what I’m saying. Any portions of the kernel that are public domain can be used by anyone for any purpose without following the terms of the GPL. AI generated code is public domain. To make sure all parts of the kernel are protected by the GPL, public domain code should not be accepted unless absolutely necessary.


Ok, well here are quotes from the US Copyright Office that establish that what I said is true:
https://sciactive.com/human-contribution-policy/#More-Information
You replied to me, man. xD