The arts community opposes generative AI, fearing theft of intellectual property. Accusations target Big Tech’s practices, particularly Meta’s, while a certification process for ethical sourcing in AI may offer reassurance.
Update: On Sept. 5, 2025, Anthropic and plaintiffs in a lawsuit against the AI company announced a $1.5 billion settlement, which will include payouts to writers who believe Anthropic stole their work to train its AI models.
The arts community has stiff-armed generative AI. Almost from the day OpenAI launched a public preview of ChatGPT in December 2022, writers and visual artists have accused the company and Big Tech of stealing their work to train AI models. When I speak to my writer friends about my use of ChatGPT and Claude.ai as writing assistants, a collective groan stops the conversation, as if I’ve brought up national politics.
Based as much on misunderstanding as on truth, the accusation threatens acceptance of AI as an aid to creativity. Creators working in AI, as well as software developers and entrepreneurs, appear ill-equipped to deal with the blowback. In fact, some may be willfully ignoring the problem.
The industry must reassure artists and writers that their work is respected and valued, and that they are entitled to compensation.
Meta is guilty of stealing
Big Tech isn’t helping. Piracy is evil, and Mark Zuckerberg’s Meta is guilty. After reading the revelations in Alex Reisner’s March 20 article in The Atlantic on AI and stolen content, I’ve concluded that Zuckerberg doesn’t give a damn about others’ intellectual property. His use of the Library Genesis (LibGen) stolen content database to train Meta’s Llama 3 model threatens to undermine the potential for generative artificial intelligence—text and images—already suffering from a sketchy reputation.
I’ll admit to ambivalence about AI training. For their prediction engines to work, large language models need mountains of data. Most of the leading AI developers have admitted scraping the web for text to train LLMs. Web scraping is common, going back to the early search engines that built listings with indexed web text. Technically, sites can opt out, but not every developer running a web crawler honors the request.
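The opt-out mentioned above is the robots.txt convention: a site publishes crawl rules, and a well-behaved crawler checks them before fetching a page, but nothing compels compliance. A minimal sketch in Python, using the standard library’s `urllib.robotparser` (the bot name “ExampleAIBot” and the URLs are invented for illustration):

```python
# Sketch of the robots.txt opt-out mechanism: sites declare crawl rules,
# but honoring them is voluntary on the crawler's part.
from urllib.robotparser import RobotFileParser

# A hypothetical robots.txt that blocks one AI-training crawler entirely
# while only fencing off a private directory from everyone else.
robots_txt = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Disallow: /private/
"""

rp = RobotFileParser()
rp.parse(robots_txt.splitlines())

# A compliant crawler asks before fetching; a non-compliant one simply doesn't.
print(rp.can_fetch("ExampleAIBot", "https://example.com/essays/"))   # False
print(rp.can_fetch("OtherBot", "https://example.com/essays/"))       # True
print(rp.can_fetch("OtherBot", "https://example.com/private/page"))  # False
```

The asymmetry is the point: the rules are advisory, which is why “technically, sites can opt out” carries so little weight in practice.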
Something similar is happening with AI data ingestion; the LibGen database swallowed books and articles without the authors’ permission. This set off lawsuits that led to revelations of Zuckerberg’s blessing of LibGen as a training tool for Meta AI. He knowingly accepted stolen goods for his own profit-making purposes. (LibGen contains my books Carbon Run and The Mother Earth Insurgency. No one asked permission.)
Have creatives overreacted?
Despite my doubts, I think many creatives have overreacted. Under the well-established “fair use” doctrine, artists can use limited amounts of others’ material without violating copyright. However, with AI, the boundaries are now blurred. A judge and jury will ultimately decide whether Meta, OpenAI and other developers committed theft. Unfortunately for the plaintiffs, when new ideas and applications pop up, judges tend to side with the defendants in the name of encouraging new technologies and ways of communicating.
The Meta scenario still pisses me off. It again underlines the arrogance of Zuckerberg and other tech titans’ “move fast and break things” MO, which almost always enriches them at the expense of those with limited resources to fight back. Any price Big Tech may pay for stolen work will simply be the cost of doing business, because the AI training horse has now left the barn.
A consumer seal of approval?
I’d rather use “ethically sourced” LLMs and other material for generative AI. It’s a murky term, though, usually referring to content licensed from a legitimate third-party media source or created in-house for the AI model. How does a creator—either of the training data or the end user—know whether the source data was acquired ethically?
How about an independent body’s certification, a kind of Good Housekeeping Seal of Approval or a high rating by the Better Business Bureau? At least it would be an attempt to separate the marginal actors from the upstanding ones and set a standard for excellence. A robust industry certification process could stop the reputational bleeding. It could help creatives feel more confident that their original works haven’t been stolen for AI model training or other purposes. And it could save the secret embarrassment some end users might feel when discussing their projects.
Tell me what you think. Have you used AI in your creative work, despite misgivings about the source?
Image: Microsoft Image Creator
One response to “Stolen work and AI: Why writers and artists worry about Meta”
There are good objections to AI, but the accusation that it’s ‘stealing’ isn’t one of them. The accusations of ‘theft’ are condemning activities that any legitimate user also does – reading, selecting, and building on other people’s work. https://charlesspicker.wordpress.com/2025/01/06/why-is-this-cover-controversial/