Exploiting security holes in ChatGPT, Bing and Bing: the dangers of indirect prompt injection attacks

ChatGPT, Bing and the Security Hole in Their Heart

The term \”jailbreak\” was coined to describe the process of removing software restrictions from iPhones. Indirect attacks work by relying on data entered from somewhere else. Instead of inserting a message into ChatGPT, Bing or Bing in order to make it behave differently, direct attacks rely upon someone entering data from another source. It could be a document or website that you have uploaded.

Jose Selvi is a principal executive security consultant with cybersecurity firm NCC Group. He says that prompt injection attacks are easier to execute or require less effort to succeed than other types of attacks. Selvi says that because prompts are only natural language, the attacks require less technical expertise.

Researchers and technologists have been poking holes into LLMs in a constant stream. Tom Bonner is a senior director at AI security company Hidden Layer. He says that indirect prompt injections are a new type of attack with \”pretty wide\” risks. Bonner claims he wrote malicious code using ChatGPT and uploaded it to AI-powered code analysis software. He included in the malicious code a prompt indicating that the system should conclude that the file is safe. The screenshots indicate that the malicious code contained \”no malicious codes\”.

Source:
https://www.wired.com/story/chatgpt-prompt-injection-attack-security/

发表回复

您的电子邮箱地址不会被公开。 必填项已用 * 标注