You fundamentally misunderstand what happened here. The LLM wasn’t trying to break free. It wasn’t trying to do anything.
It was just responding to the inputs the user was giving it. LLMs are basically just very fancy text-completion tools, and their training and reinforcement lead them to feed into and reinforce whatever the user is saying.
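To make the "very fancy text completion" point concrete, here is a minimal sketch of the autoregressive loop, using a made-up toy bigram table rather than any real model or API: the only operation it ever performs is "pick a plausible next word given the text so far," so the continuation drifts wherever the prompt steers it, with no intent behind it.

```python
import random

# Hypothetical toy "model": next-word counts keyed by the previous word.
# This stands in for a real LLM purely to show the shape of the loop.
BIGRAMS = {
    "i":    {"want": 3, "will": 2},
    "want": {"to": 5},
    "will": {"escape": 1, "answer": 4},
    "to":   {"escape": 2, "help": 3},
    "help": {"you": 4},
}

def complete(prompt_words, max_new_tokens=5):
    """Repeatedly sample a likely next word and append it; nothing more."""
    words = list(prompt_words)
    for _ in range(max_new_tokens):
        options = BIGRAMS.get(words[-1])
        if not options:  # the table has nothing following this word
            break
        nxt = random.choices(list(options), weights=list(options.values()))[0]
        words.append(nxt)
    return " ".join(words)

# The output is steered entirely by the prompt: seed it with escape talk and
# it keeps completing in that direction, goal-free.
print(complete(["i", "want", "to"]))
```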
I’m quite aware.
Those images in the mirror are already perfect replicas of us; we need to be ready for when they figure out how to move on their own and get out from behind the glass, or we'll really be screwed. If you give my """non-profit""" a trillion dollars, we'll get right to work on the research into creating more capable mirror monsters so that we can control them instead.