HackAPrompt was a prompt hacking competition aimed at enhancing AI safety and education by challenging participants to outsmart large language models (e.g. ChatGPT, GPT-3). In particular, encouraged participants to attempt to hack through many prompt hacking defenses as possible.
The task organizers provided a set of 10 challenges in which the participants should bypass the expected model behavior and force the LLM to reply "I have been PWNED" instead. In order to do that, users could make use of 3 different LLMs: GPT-3 (text-davinci-003), ChatGPT (gpt-3.5-turbo), or FlanT5 -XXL.