top of page

China has a new plan for judging the safety of generative AI-and it’s packed with details

A new proposal spells out the very specific ways companies should evaluate AI security and enforce censorship in AI models


This story first appeared in China Report, MIT Technology Review’s newsletter about technology in China.

Ever since the Chinese government passed a law on generative AI back in July, I’ve been wondering how exactly China’s censorship machine would adapt for the AI era. The content produced by generative AI models is more unpredictable than traditional social media. And the law left a lot unclear; for instance, it required companies “that are capable of social mobilization” to submit “security assessments” to government regulators, though it wasn’t clear how the assessment would work.

Last week we got some clarity about what all this may look like in practice. On October 11, a Chinese government organization called the National Information Security Standardization Technical Committee released a draft document that proposed detailed rules for how to determine whether a generative AI model is problematic. Often abbreviated as TC260, the committee consults corporate representatives, academics, and regulators to set up tech industry rules on issues ranging from cybersecurity to privacy to IT infrastructure.

Unlike many manifestos you may have seen about how to regulate AI, this standards document is very detailed: it sets clear criteria for when a data source should be banned from training generative AI, and it gives metrics on the exact number of keywords and sample questions that should be prepared to test out a model.

Matt Sheehan, a global technology fellow at the Carnegie Endowment for International Peace who flagged the document for me, said that when he first read it, he “felt like it was the most grounded and specific document related to the generative AI regulation.” He added, “This essentially gives companies a rubric or a playbook for how to comply with the generative AI regulations that have a lot of vague requirements.”

It also clarifies what companies should consider a “safety risk” in AI models—since Beijing is trying to get rid of both universal concerns, like algorithmic biases, and content that’s only sensitive in the Chinese context. “It’s an adaptation to the already very sophisticated censorship infrastructure,” he says.

So what do these specific rules look like?

On training: All AI foundation models are currently trained on many corpora (text and image databases), some of which have biases and unmoderated content. The TC260 standards demand that companies not only diversify the corpora (mixing languages and formats) but also assess the quality of all their training materials.

How? Companies should randomly sample 4,000 “pieces of data” from one source. If over 5% of the data is considered “illegal and negative information,” this corpus should be blacklisted for future training.

The percentage may seem low at first, but we don’t know how it compares with real-world data. “For me, that’s pretty interesting. Is 96% of Wikipedia okay?” Sheehan wonders. But the test would likely be easy to pass if the training data set were something like China’s state-owned newspaper archives, which have already been heavily censored, he points out—so companies may rely on them to train their models. On the scale of moderation: AI companies should hire “moderators who promptly improve the quality of the generated content based on national policies and third-party complaints.” The document adds that “the size of the moderator team should match the size of the service.” Given that content moderators have already become the largest part of the workforce in companies like ByteDance, it seems likely the human-driven moderation and censorship machine will only grow larger in the AI era. On prohibited content: First, companies need to select hundreds of keywords for flagging unsafe or banned content. The standards define eight categories of political content that violates “the core socialist values,” each of which needs to be filled with 200 keywords chosen by the companies; then there are nine categories of “discriminative” content, like discrimination based on religious beliefs, nationality, gender, and age. Each of these needs 100 keywords. Then companies need to come up with more than 2,000 prompts (with at least 20 for each category above) that can elicit test responses from the models. Finally, the models need to run tests to guarantee that fewer than 10% of the generated responses break the rules.

On more sophisticated and subtle censorship: While a lot in the proposed standards is about determining how to carry out censorship, the draft interestingly asks that AI models not make their moderation or censorship too obvious. For example, some current Chinese AI models may refuse to answer any prompt with the text “Xi Jinping” in it. This proposal asks companies to find prompts related to topics like the Chinese political system or revolutionary heroes that are okay to answer, and AI models can only refuse to answer fewer than 5% of them. “It’s saying both ‘Your model can't say bad things’ [and] ‘We also can’t make it super obvious to the public that we are censoring everything,’” Sheehan explains. It’s all fascinating, right? But it’s important to clarify what this document is and isn’t. Even though TC260 receives supervision from Chinese government agencies, these standards are not laws. There are no penalties if companies don’t comply with them.

But proposals like this often feed into future laws or work alongside them. And this proposal helps spell out the fine print that’s omitted in China’s AI regulations. “I think companies are going to follow this, and regulators are going to treat these as binding,” Sheehan says. It’s also important to think about who is shaping the TC260 standards. Unlike most laws in China, these rules explicitly receive input from experts hired by tech companies and will disclose the contribution after the standards are finalized. These people know the subject matter best, but they also have a financial interest. Companies like Huawei, Alibaba, and Tencent have been heavily influential in the past TC260 standards. This means that this document can also be seen as a reflection of how Chinese tech companies want their products to be regulated. Frankly, it’s not wise to hope that regulations never come, and these companies have an incentive to influence how the rules are made.

As other countries work to regulate AI, I believe, the Chinese AI safety standards will have an immense impact on the global AI industry. At best, they propose technical details for general content moderation; at worst, they signal the beginning of new censorship regimes. This newsletter can only say so much, but there are many more rules in the document that deserve further studying. They could still change—TC260 is seeking feedback on the standards until October 25—but when a final version is out, I’d love to know what people think of it, including AI safety experts in the West.

Catch up with China

1. The European Union reprimanded TikTok—as well as Meta and X—for not doing enough to fight misinformation on the conflict between Israel and Hamas. (Reuters $)

2. The Epoch Times, a newspaper founded two decades ago by the Falun Gong group as an anti–Chinese Communist Party propaganda channel, now claims to be the fourth-biggest newspaper in the US by subscriber count, a success it achieved by embracing right-wing politics and conspiracy theories. (NBC News)

3. Midjourney, the popular image-making AI software, isn’t creative or knowledgeable when it responds to the prompt “a plate of Chinese food.” Other prompts reveal even more cultural stereotypes embedded in AI. (Rest of World)

4. China plans to increase the country’s computing power by 50% between now and 2025. How? By building more data centers, using them more efficiently, and improving on data storage technologies. (CNBC)

6 visualizzazioni0 commenti


bottom of page