Angry Instagram posts won’t stop Meta AI from using your content

We've been here before (sort of).
[Image: Facebook users briefly tried a similar tactic more than a decade ago, with similarly useless results. Ahmet Serdar Eser/Anadolu via Getty Images]


Meta, the Mark Zuckerberg-led tech giant behind Instagram, surprised many of the app’s estimated 1.2 billion global users with a shock revelation last month: images, including original artwork and other creative assets uploaded to the company’s platforms, are now being used to train the company’s AI image generator. That admission, made public by Meta executive Chris Cox during an interview with Bloomberg, has elicited a fierce backlash from some creators. As of this writing, more than 130,000 Instagram users have reshared a message telling the company they do not consent to it using their data to train Meta AI. Those pleas, however, are founded on a fundamental misunderstanding of creators’ relationship with extractive social media platforms. These creators already gave away their work, whether they realize it or not.

The viral Instagram message came by way of a template—a pre-made post that users can copy onto their own Stories with customizable backgrounds. Those posts all feature the same string of text, which a viewer can see for around five seconds. The template, shared by a wide assortment of digital artists, photographers, and other creators, featured the following text:  

“I own the copyright to all images and posts submitted to my Instagram profile and therefore do not consent to Meta or other companies using them to train generative AI platforms. This includes all future AND past posts

@Instagram get rid of the Ai program.” 

These creator complaints, though understandable, nonetheless speak to a glaring lack of understanding about how social media economies function. Do creators actually own the copyright to works submitted to Instagram? The answer is a big mess. 

Instagram’s terms of service say the platform does not “claim ownership” of users’ content. But while Meta doesn’t own the copyright outright, every Instagram user already granted the company a license to use their works as it sees fit when they signed up for the service. In other words, once an image or video is uploaded to Instagram, Meta has free rein to modify, copy, or create derivative works of that content. That wide allowance, it turns out, now includes using the content to train AI models. 

To be absolutely clear, uploading a post telling a social platform you don’t consent to its data practices doesn’t make any material difference. A group of Facebook users briefly tried a similar tactic more than a decade ago, with similarly useless results.  

“While Instagram account owners retain the copyright in what they post, by using the platform they have granted a license to Meta, the terms of use states explicitly that the license they grant is non-exclusive, royalty-free, transferable, sub-licensable and worldwide,” Texas A&M Regents Professor of Law and Communication Peter K. Yu told Popular Science. “That license, however, will end when the content is removed.”

Yu went on to say that some jurisdictions with stronger online data privacy laws, like the European Union, may extend privacy protections to personal images, though that’s crucially not the case in the U.S. Similarly, certain U.S. states provide a “right of publicity” that protects individuals against the unauthorized commercial use of their name, likeness, image, voice, or other personal attributes. But Yu notes that safeguard’s focus on commercial use means it would likely offer little protection for Instagram users who have their data scraped to train a model. 

How Meta trains its AI 

Meta, like other tech companies building generative AI tools, such as OpenAI and Google, has amassed a vast database of text, images, audio, and video with which to train its systems. This data includes billions of web pages, digital books, and multimedia files shared across the web over many years. During his interview with Bloomberg, Cox clarified that this cornucopia of training data also includes public posts on Instagram: users’ images, videos, and public comments and captions. Cox said the company does not (for now, at least) train its models on explicitly private data like direct messages or content from private accounts. Meta detailed its updated approach to AI training in a blog post published last month.

“With the release of our AI experiences, we’ve shared details about the kinds of information we use to build and improve AI experiences—which includes public posts from Instagram and Facebook—consistent with our Privacy Policy and Terms of Service,” a Meta spokesperson told Popular Science. “Since no AI model is perfect, feedback is instrumental to AI’s continued development, and we’re engaging with experts, policy makers, and advocates as we strive to build AI responsibly.”

If the past is any guide, publicly available Instagram posts could contribute significantly to Meta’s AI image generator output. When Meta released its AI chatbot assistants for Facebook last year, the company’s President of Global Affairs Nick Clegg told Reuters the “vast majority” of the training data used to power the tools originated from (yup, you guessed it) publicly available Facebook and Instagram posts. Meta is far from the only company using AI to generate images, but it is one of the few with access to hundreds of millions of loyal users regularly uploading data-rich photos and videos to its platforms for free. That user base could give Meta an advantage over its competitors.  

Creators on Meta platforms aren’t happy about the company’s seemingly newfound interest in using their accounts as AI fodder. In addition to sharing the posts above, some creators have gone a step further and threatened to leave the platform entirely if the company continues on its current path. And while users can’t really revoke the consent they gave Meta when signing up, some nonetheless defended the creator backlash as a way of collectively voicing frustrations.

“It’s not fear mongering to warn people what rights have been taken from the artists on this app,” one user sharing the template wrote on their story this week. “AI has become integrated into corporate and public usage, we should at least show a little resistance to that??”

Artists are taking generative AI companies to court

If any of this sounds familiar, that’s probably because this isn’t the first time visual artists in particular have voiced their annoyance at AI companies. Last year, a group of professional visual artists, including illustrators Sarah Andersen, Kelly McKernan, and Karla Ortiz, sued Stability AI, Midjourney, and DeviantArt over claims their AI models were illegally producing images oddly similar to the artists’ copyright-protected works. Several of these artists have spoken publicly about their shock and dismay when an AI tool, prompted to create an image “in the style of” their name, produced work remarkably similar to their own.

A judge overseeing the artists’ suit was unconvinced that the AI generators’ outputs (the images they spit out) actually violate copyright, since they aren’t exact copies. That said, the artists were still allowed to pursue parts of the suit alleging the companies violated their copyright when they trained the models on their work without permission. But even if those artists eventually emerge victorious in their legal battle, it would likely have little effect on the artists crying foul at Meta on Instagram. In the former case, the artists claim AI companies illegally scraped copyrighted images of their work without their consent. Unfortunately for artists sharing their work on Instagram, they did technically give Meta consent to mine their work, whether they realized it at the time or not. 

What can Instagram artists actually do to protect their work from AI?  

Meta has released several tools to give users more control over how AI accesses their data, but the results are still limited. Last year, for example, Meta added a new form to its help center titled “Generative AI Data Subject Rights,” which lets users request access to, and even ask Meta to delete, third-party data about them used to train Meta AI models. In theory, this tool lets users attempt to opt out of having Meta train its models on works the company may have sucked up from other sites during its data gathering process. That can include images pulled from blogs, websites, or books. This opt-out option, however, notably does not apply to material shared directly on Instagram or other Meta-owned products; that material would be considered first-party data. 

More recently, the company released a separate form where users can formally “object” to having their data used to train its AI. Here, users can provide their email address, country of residence, and an explanation of why they don’t think Meta should scrape their images. It’s worth noting that while Meta says it will review these requests, it also says it may still choose to process a user’s information and use it to train its AI anyway. Of course, users could limit Meta’s ability to train on their data by making their accounts private, but that’s an unrealistic option for many artists and creators whose primary reason for posting on the platform is to attract an audience. This approach also wouldn’t apply to older public posts, which Meta has likely already scraped for training purposes. 

“While we don’t currently have an opt-out feature, we’ve built in-platform tools that allow people to delete their personal information from chats with Meta AI across our apps,” the Meta spokesperson said. “Depending on where people live, they can also object to the use of their personal information being used to build and train AI consistent with local privacy laws.”  

Providing consent online has become more complicated and less meaningful 

The apparent user confusion over just who gets to use public Instagram posts speaks to a larger, growing issue with the modern internet. “Consent,” as it’s commonly understood, has become harder to define. Cornell Tech professor and technology philosopher Helen Nissenbaum expanded on that point in a recent interview with the Harvard Business Review, where she said the combination of dense, verbose terms of service agreements and opaque data privacy practices has left the average internet user unsure of what they are ever actually consenting to. Indeed, according to a 2017 Deloitte survey, 91% of U.S. consumers consent to terms of service agreements without fully reading them. 

“There is a strong sense that consent is still fundamental to respecting people’s privacy,” Nissenbaum told HBR. “In some cases, yes, consent is essential. But what we have today is not really consent.”