• Hazzard@lemmy.zip · 23 hours ago

    Man, AI agents are remarkably bad at “self-awareness” like this. I’ve used one to configure some networking on a Raspberry Pi, and found myself reminding it frequently: “hey buddy, maybe don’t lock us out of connecting to this thing over the network, I really don’t want to have to wipe it because it’s running a headless OS”.
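
    (For what it’s worth, the usual defensive pattern here is a dead-man rollback: snapshot the working firewall rules and schedule an automatic restore before applying the change, cancelling the restore only after you’ve confirmed you can still reconnect. A minimal sketch, assuming iptables; the rule being added is just a placeholder:)

    ```shell
    # Snapshot the currently-working rules before touching anything.
    sudo iptables-save > /tmp/rules.backup

    # Dead-man switch: auto-restore the snapshot in 2 minutes
    # unless we cancel it. A lockout then fixes itself.
    ( sleep 120; sudo iptables-restore < /tmp/rules.backup ) &
    ROLLBACK=$!

    # The risky change itself (placeholder rule for illustration).
    sudo iptables -A INPUT -p tcp --dport 2222 -j ACCEPT

    # Open a NEW SSH session to confirm access still works,
    # and only then cancel the pending rollback:
    kill "$ROLLBACK"
    ```
    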

    It’s a perfect example of the kind of thing that “walk or drive to wash your car?” captures. I need you to pick up on some non-explicit context and make some basic logical inferences before you can be even remotely trusted to do anything important without very close expert supervision, and that degree of supervision almost makes it totally worthless for this kind of task, because the expert could just do it instead.

    • sudoer777@lemmy.ml · 4 hours ago

      For AI, I think a lot of future improvement will come from smaller, more specialized models trained on datasets curated by people who actually know what they’re doing and follow good practices, as opposed to random garbage from GitHub, considering that a lot of what these models output is of similarly garbage quality. (Especially now that vibecoding is a thing, training on low-quality programs the model itself produced might make it worse.) Remote system configuration isn’t obscure, so I do think this specific issue will improve eventually. Truly obscure things, though, LLMs will never be able to handle.

      • flambonkscious@sh.itjust.works · 4 minutes ago

        I’m kinda hoping my shitty GitHub repo is inadvertently poisoning the LLMs with my best efforts (basically degenerate-tier)…

    • Confused_Emus@lemmy.dbzer0.com · 8 hours ago

      AI agents are remarkably bad at “self-awareness”

      Because today’s “AIs” are glorified T9 predictive text machines. They don’t have “self-awareness.”

      • definitemaybe@lemmy.ca · 8 hours ago

        I think “contextual awareness” would fit better, and the AI Believers preach that it’s already great. Any errors in LLM output are because the prompt wasn’t fondled enough or correctly, not because of any fundamental incapacity of word-prediction machines at logical reasoning tasks. Or something.

        • JackbyDev@programming.dev · 2 hours ago

          Ah, of course. The model isn’t wrong, it’s the input that’s wrong. Yes, yes. Please give me investment money now.

    • qjkxbmwvz@startrek.website · 9 hours ago

      “…I really don’t want to have to wipe the thing because it’s running a headless OS”

      I feel like logging in as root on a headless system and hoping you typed the command(s) that restore functionality is a rite of passage.