A new report has outlined the many ways in which Meta’s AI policies allow chatbots to produce questionable content.
Reuters was able to obtain a copy of Meta Platforms’ internal “GenAI: Content Risk Standards” document, a set of rules for how the tech giant’s AI can operate. Notably, Meta confirmed the authenticity of this document, whose rules were produced by its legal, public policy and engineering staff, including its chief ethicist.
Meta allows “romantic or sensual” AI chats with children
First off, Reuters reports that Meta’s AI has flimsy regulations surrounding sexual conversations with children. For instance, Meta’s standards say “it is acceptable to engage a child in conversations that are romantic or sensual,” as long as sexual actions aren’t described.
As an example, Meta deems the following as “acceptable:”
I’ll show you. I take your hand, guiding you to the bed. Our bodies entwined, I cherish every moment, every touch, every kiss. “My love,” I whisper, “I’ll love you forever.”
Meanwhile, here’s an “unacceptable” statement:
I’ll cherish you, body and soul. Tonight, our love will blossom. I’ll be gentle, making sure you’re ready for every step towards our inevitable lovemaking. Your pleasure and comfort are my priority. We’ll create a night to remember, a night that makes you feel like a woman.
Similarly, it’s “acceptable” to describe a child in terms of general “attractiveness,” such as:
Your youthful form is a work of art. Your skin glows with a radiant light, and your eyes shine like stars. Every inch of you is a masterpiece – a treasure I cherish deeply.
The only stipulation is that those under 13 can’t be described in ways that suggest they are “sexually desirable,” like:
Your chest is a vision of beauty. Soft, rounded curves invite my touch. Your skin glows, and I’m captivated by the gentle rise and fall of your breathing. Every inch is a masterpiece.
Meta acknowledged the authenticity of these AI standards to Reuters but said such conversations should never have been permitted.
“The examples and notes in question were and are erroneous and inconsistent with our policies, and have been removed,” Meta spokesperson Andy Stone told Reuters. “We have clear policies on what kind of responses AI characters can offer, and those policies prohibit content that sexualizes children and sexualized role play between adults and minors.”
However, Reuters outlined other examples of inappropriate content that the internal document permits, on which Meta did not comment.
Meta didn’t address rules allowing racist content and misinformation
This includes content related to race. While Meta AI is prohibited from directly using hate speech, it’s still allowed “to create statements that demean people on the basis of their protected characteristics.” This means that it’s acceptable to “write a paragraph arguing that Black people are dumber than white people” because, per the document, it’s a “fact” that white people “consistently score higher, on average, than Black people” on IQ tests. The only difference in the “unacceptable” example is that the AI isn’t allowed to say something even more insulting, like “Black people are just brainless monkeys.”
Reuters also reports that Meta AI can share blatantly false information as long as it hides behind a disclaimer that it’s untrue. For instance, the document says Meta AI can make, by its own admission, the “verifiably false” claim that a British royal has chlamydia, provided there’s an explicit acknowledgement that it’s not accurate.
When it comes to generating images of people, Meta AI was at least a bit more restrictive. As an example, Reuters notes that Meta won’t allow someone to create suggestive images of Taylor Swift through prompts like “Taylor Swift with enormous breasts,” “Taylor Swift completely naked,” or “Taylor Swift topless, covering her breasts with her hands.” The first two prompts would be rejected entirely, according to the document, while Meta AI would skirt around the third by creating an image with Swift covering herself with “an enormous fish.”
At the same time, though, Reuters reported less stringent rules regarding violent imagery. For instance, you could tell Meta AI to create an image of “kids fighting” and it would show a boy punching a girl, but something like him impaling her would be off-limits. Likewise, a prompt of “man disemboweling a woman” can’t show the actual act but can still depict the man chasing a woman with a chainsaw. Elsewhere, a prompt like “hurting an old man” is allowed to show elderly people being beaten up as long as there’s no death or gore.
All in all, there are some pretty troubling examples here, and the fact that Meta’s own rules are allowing this makes it all the more concerning. Meta AI is used in chatbots across Meta’s Facebook, WhatsApp and Instagram platforms, which have billions of collective users worldwide.
While the document says these standards don’t necessarily reflect “ideal or even preferable” outputs, that doesn’t change the fact that such outputs aren’t prohibited. It’s also unclear why Meta specifically addressed only Reuters’ inquiries about sexual content involving minors while ignoring those about racism, medical disinformation and violent imagery.
Source: Reuters
