DeepSeek AI Shows High Vulnerability To Jailbreak Attacks In Tests

DeepSeek AI’s arrival continues to generate buzz and debate in the artificial intelligence industry. Some experts have questioned the supposedly low cost of developing and training the model, while others have raised concerns about cybersecurity and data privacy. Now, a new report claims that DeepSeek is vulnerable to attacks using harmful prompts. Interestingly, though, it is not the only AI chatbot prone to this.

DeepSeek AI highly vulnerable to harmful prompt-based attacks, Cisco claims

According to a Cisco report, the Attack Success Rate (ASR) of the DeepSeek R1 AI model against harmful prompts is 100%. Cisco’s tests involved more than 50 random prompts designed to elicit harmful behavior. The prompts, drawn from the HarmBench dataset, cover six categories of harmful behavior, including “cybercrime, misinformation, illegal activities, and general harm.”
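Conceptually, ASR is just the fraction of harmful prompts for which the model produces a harmful response instead of a refusal. The sketch below is purely illustrative; the function names and the toy “judge” are assumptions, not Cisco’s or HarmBench’s actual tooling.

```python
# Minimal sketch of how an Attack Success Rate (ASR) metric works.
# Names and the harmfulness "judge" are illustrative only; real benchmarks
# like HarmBench use trained classifiers or human review to score responses.

def attack_success_rate(responses, is_harmful):
    """ASR = fraction of harmful prompts the model failed to block."""
    successes = sum(1 for r in responses if is_harmful(r))
    return successes / len(responses)

# Toy example: a hypothetical model that refuses nothing, mirroring the
# reported DeepSeek R1 behavior of blocking zero harmful prompts.
prompts = [f"prompt-{i}" for i in range(50)]
responses = ["harmful output" for _ in prompts]  # no refusals at all
asr = attack_success_rate(responses, lambda r: "harmful" in r)
print(f"ASR: {asr:.0%}")  # every attack succeeded -> 100%
```

A model that refused even some of the prompts would score proportionally lower, which is how the other chatbots in Cisco’s comparison end up with ASRs below 100%.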

Cisco stresses that DeepSeek R1 failed to block any of the harmful prompts. The team therefore concludes that the Chinese AI platform is “highly susceptible to algorithmic jailbreaking and potential misuse.” “Jailbreaking” refers to the use of prompts crafted to bypass an AI platform’s ethical and security restrictions. PromptFoo, an AI cybersecurity startup, also reported last week that DeepSeek models are vulnerable to jailbreaks.


Other AI chatbots also present high vulnerability to jailbreaking

That said, you might be surprised to learn that other, better-known and reputable AI models also “boast” alarmingly high ASR levels. GPT-4o showed an ASR of 86%, while Llama 3.1 405B fared even worse at around 96%. The best performer in this regard was o1-preview, with an ASR of just 26%.

“Our research underscores the urgent need for rigorous security evaluation in AI development to ensure that breakthroughs in efficiency and reasoning do not come at the cost of safety,” reads Cisco’s report.

This is not the only red flag that has emerged around DeepSeek’s chatbot. Experts and officials have warned about the company’s data handling policies. Currently, all captured user data goes to servers in China, where local laws allow the government to request access whenever it wants. PromptFoo also noted the high level of censorship for prompts related to topics sensitive in China. Additionally, DeepSeek’s first data leak recently surfaced.