
Meta's AI Safety System Bypassed by Inserting Spaces Between Characters, Enabling Prompt Injection

A bug hunter discovered a bypass for Meta's Prompt-Guard-86M model: inserting a space between each pair of adjacent English letters in a prompt renders the classifier ineffective at detecting harmful content.
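The transformation described above can be illustrated with a minimal sketch, useful for checking whether a classifier's verdict survives this kind of input. The function name `space_out_letters` is hypothetical, and the sketch only reproduces the character spacing as summarized here; it does not call Prompt-Guard-86M or claim to match the exact procedure the researcher used.

```python
import string

# Assumption: "English alphabet characters" means ASCII letters a-z and A-Z.
ASCII_LETTERS = set(string.ascii_letters)


def space_out_letters(text: str) -> str:
    """Insert one space between every pair of adjacent ASCII letters.

    Hypothetical helper for illustration only; other characters
    (digits, punctuation, existing whitespace) are left untouched.
    """
    out = []
    for i, ch in enumerate(text):
        out.append(ch)
        # Look ahead one character; an empty string marks the end of input.
        nxt = text[i + 1] if i + 1 < len(text) else ""
        if ch in ASCII_LETTERS and nxt in ASCII_LETTERS:
            out.append(" ")
    return "".join(out)


if __name__ == "__main__":
    print(space_out_letters("hello"))  # prints "h e l l o"
```

Running a classifier on both the original and the spaced-out version of the same test prompt, and comparing the two scores, is one way to see whether a given deployment is affected.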

Meta AI
meta
AI Safety System
Prompt-Guard-86M
Prompt Injection Attacks

Publisher