• Madrigal@lemmy.world
    1 month ago

    Nah, I guarantee the models have rules built in to deal with obvious stuff like that.

    You need to be more subtle. Give them information that is slightly wrong.

    • taco@anarchist.nexus
      1 month ago

      Perhaps by generating a bunch of complex copilot code to upload. It’s easy to mass produce and would look plausibly functional.

    • ozymandias117@lemmy.world
      30 days ago

      Just need to use less obvious insults, à la “your mother was a hamster, and your father smelt of elderberries.”

      Still poisons the model with something an end user won’t like, but isn’t easy to train out.