Claude was being judgy, so I called it out. It immediately caved. Is verbal abuse a valid method of circumventing LLM censorship??
Yes. Abuse towards LLMs works.
My team has shared prompts, and about 50% of them threaten some sort of harm.
Yikes. I knew this tech would introduce new societal issues, but I can’t say this is one I foresaw.