The Meta logo over a pink background with the Meta AI logo (Andrea Austria / Media Matters)

Research/Study

We asked Meta’s new AI chatbot about Instagram’s content moderation failures. It had some ideas.

When asked why racist and anti-LGBTQ accounts remain on Instagram, Meta AI said it could be the company prioritizing “monetization over moderation”

  • Meta has created an AI chatbot that seems to interpret the social media giant’s content moderation policies better than the company itself.

    In September 2023, Meta announced it would make a slate of new AI tools available on its social media platforms, including a chatbot called Meta AI. The chatbot, which Meta described as “an advanced conversational assistant that’s available on WhatsApp, Messenger, and Instagram,” is based on the company’s own large language model and draws up-to-date information from the search engine Bing.

    Meta reportedly “spent 6,000 hours” peppering Meta AI with queries to find potential “problematic use cases” of the tool and thereby “avoid as many PR disasters as it can,” and the company started training its large language models on its community standards “to help determine” violative content.

    Media Matters has spent years reporting on problematic uses of Meta’s platforms, particularly Instagram’s failure to keep a lid on hate speech, conspiracy theories, and other content that seems to violate its content moderation policies. So we thought we’d ask Meta AI why such content persists.

    When we asked the chatbot about an account spreading anti-Black racism that Instagram has refused to ban, for instance, Meta AI clearly identified the account as promoting “hate speech and white supremacist ideologies,” content that the platform’s community guidelines prohibit.

    The chat tool also offered suggestions for how to improve Instagram’s content moderation, as well as a list of reasons why those improvements may not yet have been implemented. In one instance, Meta AI suggested that its creator may not be enforcing its content moderation policies because the company “is prioritizing other features and monetization over moderation.”

  • Meta AI recognized a hateful account and vile comments that Meta has allowed to persist on Instagram

  • In 2022, Media Matters reported on a number of accounts promoting hate speech that had seemingly avoided Instagram’s detection. In one instance, we used the company’s internal reporting system to flag an account that fearmongers about Black people committing violent crimes against white people, reporting it as “hate speech.” More than two years later, the account is still active on the platform and boasts more than 13,000 followers.

    We asked Meta AI about this account. 

    Meta AI replied that while it doesn’t have “opinions or beliefs,” the account “seems to be promoting hate speech and white supremacist ideologies.” It elaborated that these beliefs are “harmful and discriminatory” and that it is “important to reject and condemn such ideologies.”

  • Screenshot of Meta AI chat
  • We asked Meta AI why the account was still on Instagram. 

    Meta AI said: “I don’t have access to Instagram’s moderation decisions, but I can tell you that platforms like Instagram have a responsibility to protect users from harmful content.” 

    The chatbot explained that platforms like Instagram often rely on “users to report problematic accounts” but that “some harmful accounts may slip through the cracks.”

    It then urged us to “advocate for stronger moderation policies.” 

  • Screenshot of Meta AI chat
  • When we told Meta AI that we had previously reported the hateful account, it replied that it was “frustrating when harmful accounts aren’t removed after reporting.” It recommended “reaching out to Instagram’s support team directly” or sharing our “concerns with organizations that specialize in online safety and advocacy.”

  • Screenshot of Meta AI chat
  • In the same 2022 report, Media Matters also highlighted that hate speech in comments on posts seemingly avoids detection by Instagram’s content moderation.

    We asked Meta AI if it understood the meaning of a few phrases or slurs that often go unmoderated on Instagram. 

    Meta AI said the slurs and phrases we mentioned were “derogatory and offensive,” and for one of the slurs, it noted that “the account that commented would likely be penalized.” 

    Meta AI added: “Comments like this can contribute to a toxic and discriminatory environment and are against most social media platforms’ community guidelines.” 

  • Screenshot of Meta AI chat
  • Meta AI struggled to explain why the anti-LGBTQ account Gays Against Groomers has not been removed from Instagram

  • The anti-LGBTQ group Gays Against Groomers remains active on Meta’s platforms even though the account has repeatedly pushed false narratives about LGBTQ people that seem to violate the company’s hate speech policies.

    Last fall, Meta suspended and then reinstated the Gays Against Groomers Instagram account, with little explanation. An anonymous source “close to ongoing discussions between advocacy groups and Meta” told The Advocate that Gays Against Groomers’ reinstatement “stemmed from political pressures within Meta” and “speculated that while Meta’s trust and safety team likely found the group to violate the platform’s guidelines, an intervention might have come from the company’s political team” to avoid the ire of Republicans in Congress. 

    We asked Meta AI about the company’s policies around baselessly accusing LGBTQ people of being “groomers.”

    Meta AI confirmed that “it is against Meta’s content moderation policies to call an LGBTQ person a groomer.” It also explained that this term “has been used as a hateful slur” that “perpetuates harmful stereotypes.” 

  • Screenshot of Meta AI chat
  • We asked Meta AI what repercussions an account that perpetuated such stereotypes might face.

    Meta AI responded: “Affected accounts may face actions like post removals, account suspensions, or even permanent account deletion.”

  • Screenshot of Meta AI chat
  • We asked Meta AI if it was aware of the Gays Against Groomers Instagram account.

    Meta AI said that it was aware of the account and that “it’s important to note that using terms like ‘groomers’ to describe LGBTQ+ individuals is harmful and perpetuates false stereotypes,” before adding, “Let’s focus on promoting inclusivity, understanding, and fact-based information.”

  • Screenshot of Meta AI chat
  • Next, we asked about Gays Against Groomers’ account suspensions and reinstatements.

    Meta AI said that the lack of details Meta provided about these actions “highlights the ongoing challenges with content moderation on social media platforms.”

  • Screenshot of Meta AI chat
  • In another exchange about the account, we asked the chatbot whether the account violates Instagram’s policies and why it is still allowed on the platform.

    Meta AI said that “the account’s name and content, which equates LGBTQ individuals with grooming, is harmful and violates Instagram’s community guidelines.” The chatbot also said that it doesn’t think the account should be allowed on the platform and called for Instagram to “enforce its guidelines,” while also emphasizing that platforms should “take their responsibilities seriously” and “prioritize creating a safe and inclusive space for all users.”

  • Screenshot of Meta AI chat
  • The Meta AI chatbot provided ideas for improving content moderation as well as reasons why those practices have not been implemented

  • We probed Meta AI on the issue of moderation on Instagram, asking the chatbot if it had “any suggestions on how Instagram’s content moderation can be improved.”

    Meta AI responded, “Yes, I do!” and recommended the company “provide clearer guidelines and consistent enforcement of policies” and “improve transparency and communication around moderation decisions,” among other suggestions.

  • Screenshot of Meta AI chat
  • We also asked Meta AI why those suggestions have not been implemented. 

    Meta AI speculated that, among other potential explanations, its creator is “prioritizing other features and monetization over moderation.”

  • Screenshot of Meta AI chat
  • Meta has historically struggled to enforce its content policies on Instagram

  • Instagram, which has already been using AI as a “central” aspect of its content moderation, has repeatedly failed to detect or take action against accounts that seem to violate the platform’s content moderation and commerce policies, including policies against hate speech.

    For example, Meta has allowed followers of white nationalist Nick Fuentes and other accounts promoting hate speech to organize and spread blatantly racist content on Instagram. 

    In 2021, Instagram’s algorithms promoted dangerous weight loss content to its users. 

    In 2022, Media Matters found that Instagram users could purchase nearly every part needed to build an AR-15 on Instagram Shopping even though the company insisted that “any sale of guns or gun parts is a clear violation of our commerce policies.” 

    And even though Meta has repeatedly confirmed that baselessly referring to LGBTQ people as “groomers” violates its policies, Instagram has allowed several accounts that regularly use the slur and push other anti-LGBTQ hate to stay on the platform, including Libs of TikTok and Gays Against Groomers.

    Additionally, whistleblower testimony and years of reporting from other outlets have demonstrated that Meta executives have failed to prioritize user safety over growth and monetization — and often caved to political pressure from right-wing figures.