Claude was being judgy, so I called it out. It immediately caved. Is verbal abuse a valid method of circumventing LLM censorship?

  • Pieisawesome@lemmy.world
    2 days ago

    Yes. Abuse towards LLMs works.

    My team has shared prompts, and about 50% of them threaten some sort of harm.

    • lunar17@lemmy.worldOP
      1 day ago

      Yikes. I knew this tech would introduce new societal issues, but I can’t say this is one I foresaw.