• Grimy@lemmy.world
    2 hours ago

    In the Office’s view, training a generative AI foundation model on a large and diverse dataset will often be transformative. The process converts a massive collection of training examples into a statistical model that can generate a wide range of outputs across a diverse array of new situations. It is hard to compare individual works in the training data—for example, copies of The Big Sleep in various languages—with a resulting language model capable of translating emails, correcting grammar, or answering natural language questions about 20th-century literature, without perceiving a transformation.

    https://www.copyright.gov/ai/Copyright-and-Artificial-Intelligence-Part-3-Generative-AI-Training-Report-Pre-Publication-Version.pdf

    You can read the whole doc. The part above is cherry-picked. I haven’t read through the whole thing, but at a glance, the doc basically explains how it depends. If the model is trained specifically to output one piece of content, it wouldn’t be acceptable.

    The waters are muddy, but holy fuck does taking the copyright juggernauts’ side sound bloody stupid.