Project Icon

DeepInception

Explore Security Vulnerabilities and Adaptive Bypasses in Large Language Models

Product DescriptionLarge language models, while successful, face risks from adversarial jailbreaks affecting their safety. DeepInception offers a novel, less resource-intensive method, inspired by the Milgram experiment, to bypass usage controls via personification and nested scenes. This approach highlights vulnerabilities in various LLMs, underlining the need for enhanced safety measures.
Project Details