Manor Lords and Terra Invicta publishers Hooded Horse are imposing a strict ban on generative AI assets in their games, with company co-founder Tim Bender describing it as an “ethics issue” and “a very frustrating thing to have to worry about”.

“I fucking hate gen AI art and it has made my life more difficult in many ways… suddenly it infests shit in a way it shouldn’t,” Bender told Kotaku in a recent interview. “It is now written into our contracts if we’re publishing the game, ‘no fucking AI assets.’” I assume that’s not a verbatim quote, but I’d love to be proven wrong.

The publishers also take a dim view of using generative AI for “placeholder” work, or indeed any ‘non-final’ aspect of game development. “We’ve gotten to the point where we also talk to developers and we recommend they don’t use any gen AI anywhere in the process because some of them might otherwise think, ‘Okay, well, maybe what I’ll do is for this place, I’ll put it as a placeholder,’ right?” Bender went on.

  • Katana314@lemmy.world · 1 day ago

    If the models are in fact reading code that’s GPL-licensed, I think that’s a fair concern. Lots of code on sites like Stack Overflow is shared with the default assumption that it isn’t rights-protected (that varies for some coding sites). That’s helpful when the whole point is for people to copy-paste those solutions into large enterprise apps, especially if there’s no feasible way to write them differently.

    The main reason I don’t pursue that issue is that with so much public documentation, it becomes very hard to prove what was generated from code theft. I’ve worked with AI models that were able to make fully functioning apps just off a project’s documentation, without even seeing examples.

    • MountingSuspicion@reddthat.com · 1 day ago

      I don’t think training on all public information is super ethical regardless, but to the extent that others may support it, I understand that SO may be seen as fair game. As far as I know, though, all the big AIs have been trained on GitHub regardless of any individual project’s license.

      It’s not about proving individual code theft; it’s about recognizing that the model itself is built from theft. Just because an AI image output might not resemble any preexisting piece of art doesn’t mean it isn’t based on theft. Can I ask what you used that was trained on just a project’s documentation? Considering the amount of data usually needed for coherent output, I would be surprised if it didn’t need some additional data.

      • Katana314@lemmy.world · 24 hours ago

        The example I gave was more about “context” than “model”: data related to the question, not the model’s training history. I would ask the AI to design a system that interacts with XYZ, and it would be thoroughly confused and have no idea what to do. Then I would ask again, linking it to the project’s documentation page and granting it explicit access to fetch relevant webpages, and it would give a detailed response. That suggests to me it’s only working off the documentation. (A minimal sketch of this setup follows the thread.)

        That said, AIs are not strictly honest, so I think you have a point that the original model training may have grabbed data like that at some point regardless. If most AI models don’t track or cite the details of each source used for generation, be it artwork on DeviantArt or licensed GitHub repos, I think it’s fair to say any of those models should become legally liable; more so if there are ways of demonstrating “copying-like” actions from the original.
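
To make that “context, not model” workflow concrete, here is a minimal sketch of the setup Katana314 describes: fetch the project’s documentation page and place it in the prompt, so the model answers from the page rather than from its training data. This is illustrative only; `ask_model` is a hypothetical stand-in for whatever chat-completion client you use, and only `requests` is a real library here.

```python
import requests

def fetch_docs(url: str) -> str:
    """Fetch a documentation page so its text can be placed in the prompt."""
    resp = requests.get(url, timeout=10)
    resp.raise_for_status()
    return resp.text

def ask_model(prompt: str) -> str:
    """Hypothetical stand-in: replace with your provider's chat-completion call."""
    raise NotImplementedError("wire up your actual model client here")

def answer_from_docs(question: str, docs_url: str) -> str:
    # Without the fetched docs, the model has only its training data;
    # with them in the prompt, it can answer from the page itself.
    docs = fetch_docs(docs_url)
    prompt = (
        "Answer using ONLY the documentation below.\n\n"
        f"Question: {question}\n\n--- documentation ---\n{docs}"
    )
    return ask_model(prompt)
```

Note that this only illustrates the comment’s distinction: whether the answer improves here says nothing about what the base model was trained on, since the documentation is supplied at question time.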