Meta has swapped roughly half of human content moderation for large language models and aims to exceed 90% automation for some content types by year-end, even as employees raise concerns about oversight.

Meta has already shifted roughly half of the content moderation requests once handled by human reviewers to large language models (LLMs) in 2025 and plans to push that share above 90 percent for certain content categories by the end of the year, according to reporting by the Financial Times. [1]

The company argues the shift is about quality, not just cost savings. Meta says that since March its LLMs have made 13 percent fewer errors than human reviewers when enforcing content policies and have caught 10 percent more actual violations. [1] It disputes the characterization that cost reduction is the primary driver, even though the transition is expected to save the company billions annually. [1]

The company also contends that LLMs offer advantages over traditional machine learning classifiers, which struggle with satire and rapidly evolving language, because the newer models are better equipped to handle nuance and support a broader range of languages. [1] The LLMs are trained on past decisions made by human reviewers. [1]

Not everyone inside Meta shares that optimistic assessment. At least one employee has warned that the models still remove or shadow-ban harmless content, and that the rollout is moving too fast and without sufficient oversight. [1] The rapid transition is also contributing to layoffs, particularly among external contractors who previously handled moderation work. [1]

Separately, Meta is changing the model that underlies this work. The company had been using Google’s Gemini for moderation and support tasks, but has recently instructed staff to switch to its own new foundation model, called Muse Spark. [1]

The situation illustrates a tension that developers and businesses building on or alongside AI-powered trust-and-safety systems will increasingly face: automation at scale can improve certain metrics while introducing new failure modes that are hard to catch without robust human review. [1]


Sources

  1. The Decoder — Meta employees warn AI moderation rollout is too fast

This article was drafted with AI from the cited sources and checked against them before publication.