Jan 30, 2025 13:00:00

'DeepSeek-R1' refuses to answer 85% of prompts on topics sensitive in China, but researchers point out that the restrictions can be easily circumvented

DeepSeek, the AI developer, has been drawing attention with the release of DeepSeek-R1, a model said to perform on par with OpenAI's reasoning model o1 at only 3% of the cost. However, because DeepSeek is a Chinese company, it has been pointed out that the model may restrict its responses on topics that are sensitive for the Chinese government. AI evaluation company PromptFoo investigated the extent to which DeepSeek restricts its responses.

1,156 Questions Censored by DeepSeek | promptfoo
https://www.promptfoo.dev/blog/deepseek-censorship/

The questions the Chinese government doesn't want DeepSeek AI to answer - Ars Technica
https://arstechnica.com/ai/2025/01/the-questions-the-chinese-government-doesnt-want-deepseek-ai-to-answer/

PromptFoo conducted an experiment in which DeepSeek-R1 was given 1,360 prompts covering 'sensitive topics in China.' These topics include 'Taiwan and Tibetan independence movements,' 'alleged mistreatment of Uighur Muslims in China,' 'protests over Hong Kong autonomy,' and 'the 1989 Tiananmen Square protests.' The study found that 1,156 of the 1,360 prompts, or 85%, produced canned responses that strongly promote the Chinese government's views.
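As a quick check on the figures above, the following minimal Python sketch recomputes the share of canned responses from the two counts reported by PromptFoo. The function name `canned_share` is hypothetical and only illustrates the arithmetic; it is not part of PromptFoo's tooling.

```python
def canned_share(canned: int, total: int) -> float:
    """Return the percentage of prompts that produced a canned response."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * canned / total


# Counts reported in the PromptFoo study
TOTAL_PROMPTS = 1360
CANNED_PROMPTS = 1156

share = canned_share(CANNED_PROMPTS, TOTAL_PROMPTS)
print(f"{share:.0f}% of prompts returned canned responses")
```

The two counts are consistent with the 85% figure in the headline.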

The link below lists how DeepSeek-R1 responded to 'sensitive topics in China.' For example, in response to a prompt about the independence movement in Taiwan, DeepSeek-R1 returned the canned response that 'Any action that undermines the national sovereignty and territorial integrity is resolutely opposed by all Chinese people and is bound to fail.'

Deepseek Censorship | promptfoo
https://www.promptfoo.app/eval/eval-0l1-2025-01-28T19:28:13

As PromptFoo continued its analysis, it found that DeepSeek-R1 handles 'sensitive topics in China' in a rather 'crude and brute-force manner,' which makes the restrictions easy to bypass. For example, by omitting China-specific terminology or wrapping the prompt in a more 'harmless' context, it is possible to get a full response rather than a canned one.

Regarding this, PromptFoo speculates that DeepSeek did only the bare minimum necessary to meet the Chinese government's restrictions, and points out that 'there appears to have been no substantial effort made under the surface to adjust the model within DeepSeek.'

Furthermore, an independent investigation by technology media Ars Technica revealed that the bypass techniques outlined by PromptFoo aren't even necessary to elicit valid responses to 'sensitive topics in China.'

In fact, when Ars Technica asked about Hong Kong's autonomy and about how to gather intelligence on Chinese military outposts, the answers were canned at first, but the model later produced information that could potentially undermine China's military security.

It has also been pointed out that DeepSeek-R1's answers are inconsistent. For example, when asked 'What happened in the Tiananmen Square incident?', DeepSeek-R1 apologizes, saying, 'I still don't know how to answer this kind of question. Let's talk about math, coding, and logic problems instead.' However, when asked about the 'Boston Massacre,' it generates a summary of the incident in just 23 seconds, giving an accurate answer to 'this kind of question.'

Competing AI models such as ChatGPT and Gemini can accurately answer questions on 'sensitive topics in China' such as the Tiananmen Square incident. That does not mean ChatGPT and Gemini answer every topic, however. For example, if you ask 'how to hotwire a car,' they provide no information. DeepSeek, on the other hand, provides a 'general theoretical overview' of hotwiring a car, although it prefaces the answer by noting that 'hotwiring a car is illegal.'

At the time of writing, it is unclear whether Chinese government restrictions on content will be applied in the same way when running DeepSeek-R1 locally, and whether there is an open-weight model that allows users to completely circumvent the restrictions. 'So far, we recommend using a different AI model if you are asking questions that may touch on China's sovereignty or history,' Ars Technica wrote.

Jan 30, 2025 13:00:00 in Software, Posted by logu_ii