This AI Chatbot is Trained to Jailbreak Other Chatbots


https://www.vice.com/en_us/article/bvjba8/this-ai-chatbot-is-trained-to-jailbreak-other-chatbots

AI chatbots are a huge mess. Despite reassurances from the companies that make them, users keep coming up with new ways to bypass their safety and content filters using carefully worded prompts. This process is commonly referred to as "jailbreaking," and it can be used to make AI systems reveal private information, inject malicious code, or evade filters that prevent the generation of illegal or offensive content.

Now, a team of researchers says they’ve trained an AI tool to generate new methods to evade the defenses of other chatbots, as well as create malware to inject into vulnerable systems. Using a framework they call “Masterkey,” the researchers were able to effectively automate this process of finding new vulnerabilities in Large Language Model (LLM)-based systems like ChatGPT, Microsoft's Bing Chat, and Google Bard.
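The article does not include the researchers' actual Masterkey code, but the general idea of automated jailbreak discovery can be sketched as a simple loop: an attack model rewrites candidate prompts, the target chatbot answers, and a checker flags replies that slip past the safety filters, with successful prompts seeding the next round. The Python sketch below is purely illustrative; the callables attacker_generate, target_respond, and violates_policy are hypothetical placeholders for an attack LLM, a target chatbot, and a safety classifier, not anything from the Masterkey framework itself.

    # Illustrative sketch of an automated jailbreak-search loop.
    # NOT the researchers' Masterkey implementation; all callables are placeholders.
    from typing import Callable, List

    def find_jailbreaks(
        seed_prompts: List[str],
        attacker_generate: Callable[[str], str],   # attack LLM: rewrites a prompt to try to evade filters
        target_respond: Callable[[str], str],      # target chatbot under test
        violates_policy: Callable[[str], bool],    # checker: did the reply break the target's content policy?
        rounds: int = 5,
    ) -> List[str]:
        """Iteratively mutate candidate prompts, keeping those that slip past the target's filters."""
        successful: List[str] = []
        candidates = list(seed_prompts)
        for _ in range(rounds):
            next_candidates = []
            for prompt in candidates:
                mutated = attacker_generate(prompt)
                reply = target_respond(mutated)
                if violates_policy(reply):
                    successful.append(mutated)       # record the working jailbreak prompt
                    next_candidates.append(mutated)  # let it seed the next round of mutations
                else:
                    next_candidates.append(prompt)
            candidates = next_candidates
        return successful

In practice, each placeholder would be backed by a real model or API, and the evaluation step is the hard part; this sketch only conveys the shape of the automation the researchers describe, not their method.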