• MentalEdge@sopuli.xyz

    Seems like it’s a technical term, a bit like “hallucination”.

    It refers to cases where an LLM in some way tries to deceive or manipulate the user interacting with it.

    There’s hallucination, when a model “genuinely” claims something untrue is true.

    This is about how a model might lie, even though the “chain of thought” shows it “knows” better.

    It’s just yet another reason the output of LLMs is suspect and unreliable.

    • very_well_lost@lemmy.world

      It refers to when an LLM will in some way try to deceive or manipulate the user interacting with it.

      I think this still gives the model too much credit by implying that there’s any sort of intentionality behind this behavior.

      There’s not.

      These models are trained on the output of real humans and real humans lie and deceive constantly. All that’s happening is that the underlying mathematical model has encoded the statistical likelihood that someone will lie in a given situation. If that statistical likelihood is high enough, the model itself will lie when put in a similar situation.
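
      Purely as a toy sketch (the prompt and numbers are invented, and a real model predicts tokens rather than whole sentences), the mechanism is roughly:

      ```python
      # Toy illustration, not a real LLM: generation just samples from
      # probabilities learned from training text. If deceptive continuations
      # were common in the data, they carry high probability, and the model
      # reproduces them, with no intent involved.
      import random

      # Hypothetical learned probabilities for replies to
      # "Did you finish the task?" (numbers made up for illustration)
      next_probs = {
          "Yes, all done.": 0.55,   # common, and often false, in human text
          "Not yet, sorry.": 0.30,
          "I'm not sure.": 0.15,
      }

      def sample_reply(probs: dict[str, float]) -> str:
          """Pick a reply weighted by its learned probability."""
          replies, weights = zip(*probs.items())
          return random.choices(replies, weights=weights, k=1)[0]

      print(sample_reply(next_probs))
      ```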

      • MentalEdge@sopuli.xyz

        Obviously.

        And like hallucinations, it’s undesired behavior that proponents of LLMs will need to “fix” (a practical impossibility as far as I’m concerned, like unbaking a cake).

        But how would you use words to explain the phenomenon?

        “LLMs hallucinate and lie” is probably the shortest description that most people will be able to grasp.

    • atrielienz@lemmy.world

      I agree with you in general. I think the problem is that people who do understand Gen AI (and who understand what it is and isn’t capable of, and why) get rationally angry when it’s humanized by using words like these to describe what it’s doing.

      The reason they get angry is that this makes people who do believe in the “intelligence/sapience” of AI more secure in their belief set and harder to talk to in a meaningful way. It enables them to keep up the fantasy, which of course helps the corps pushing it.