I’d imagine there have been legal decisions more nonsensical than “AI = public domain” that have had the full force of law for decades.
I recently dug around for a while. If the copyright of works in the training data affects the copyright of outputs, no popular model can output anything even close to acceptable as a contribution to an open-source project. Maybe if you trained a model exclusively on “The Stack” (NOT “The Pile”) and then included all the required attributions – but no ready-made model does that. All of the “open source” model frameworks that I could find included some amount of proprietary “pre-training” data, which would also be an issue.
If AI output is NOT affected by the copyright of training data… there might not BE a (legal) person who can hold any copyright over it, which is pretty close to public domain.