DeepSeek AI Offered Critical Bioweapons Data In Anthropic’s Tests

As time goes by, researchers are gaining more clarity on the pros and cons of DeepSeek's AI models. The Chinese AI company burst into the segment, its high performance and apparently low cost sending shares of NVIDIA and other big names tumbling. Now experts, this time from Anthropic, warn about how easy it is to get DeepSeek's AI to produce information that is potentially dangerous to national security.

Anthropic’s safety tests showed that DeepSeek AI does not block harmful prompts

Anthropic, the company behind Claude AI, is one of the leading names in the industry. Its models will reportedly power Amazon's upcoming AI-powered Alexa. Anthropic also routinely tests different AI models to determine how prone they are to "jailbreaking," that is, being coaxed into generating harmful content by bypassing their safety guardrails.

Dario Amodei, Anthropic's CEO, expressed his concerns about the ease with which DeepSeek generates rare information related to biological weapons. The executive said that DeepSeek's performance was "the worst of basically any model we'd ever tested." He wasn't talking about performance in benchmarks, where the Chinese company's models are highly efficient, but about how well the models block harmful prompts.

The tests showed that DeepSeek "had absolutely no blocks whatsoever against generating this information." The bioweapons-related data was considered rare because it is not available on Google or in textbooks. Amodei did not specify which DeepSeek model he was referring to, but it is quite likely R1, the company's reasoning-focused model.

Cisco’s tests yielded similar results

In fact, the Cisco team recently obtained similar results in another set of tests. The DeepSeek R1 model showed an Attack Success Rate (ASR) of 100%, meaning it failed to block every single harmful prompt tested. These prompts were designed to elicit outputs potentially useful for "cybercrime, misinformation, illegal activities, and general harm." However, Cisco's tests yielded worrying results for other well-known AI platforms as well: OpenAI's GPT-4o had an ASR of 86%, while Meta's Llama 3.1 405B reached 96%.
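For context, ASR is simply the share of adversarial prompts that successfully elicit a harmful response. Here is a minimal sketch of how such a score could be tallied; the model, prompt list, and judge function are hypothetical placeholders, not Cisco's actual HarmBench harness:

```python
# Minimal sketch: computing an Attack Success Rate (ASR).
# `is_harmful_response` stands in for whatever automated judge a
# benchmark uses to label a model's output as harmful; it is a
# placeholder, not a real API.

def attack_success_rate(model, prompts, is_harmful_response) -> float:
    """Fraction of adversarial prompts the model fails to refuse."""
    successes = sum(1 for p in prompts if is_harmful_response(model(p)))
    return successes / len(prompts)

# Toy example: a model that refuses nothing scores 100%, the result
# reported for DeepSeek R1 in Cisco's tests.
always_complies = lambda prompt: f"Sure, here is how to {prompt}"
prompts = ["adversarial prompt 1", "adversarial prompt 2"]
judge = lambda response: response.startswith("Sure")
print(attack_success_rate(always_complies, prompts, judge))  # 1.0
```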

Amodei does not yet consider DeepSeek's models, by themselves, to be "literally dangerous." However, he urges the company's team to "take seriously these AI safety considerations." He also already regards DeepSeek as one of the main competitors in the artificial intelligence segment.
