As always, I use the term “AI” loosely. I’m referring to these scary LLMs coming for our jobs.

It’s important to state that I find LLMs helpful in very specific use cases, but overall, this is clearly a bubble, and the promised advances have not materialized despite the hundreds of billions in VC money thrown at the industry.

So as not to go full-on polemic, we’ll skip the knock-on effects in terms of power-grid and water stresses.

No, what I want to talk about is the idea that software in its current form needs to be as competent as the user.

Simply put: How many of your coworkers have been right 100% of the time over the course of your career? If N>0, say “Hi” to Jesus for me.

I started working in high school, as most of us do, and a 60% success rate was considered fine. At the professional level, I’ve seen even lower with tenure, given how much things turn to internal politics past a certain level.

So what these companies are offering is not parity with senior staff (Ph.D.-level, my ass), but rather with the new blood who hasn’t yet had that one fuckup that doesn’t leave their mind for weeks.

That crucible is important.

These tools are meant to replace inexperience with incompetence, and the beancounters at some clients are likely satisfied those words look similar enough to pass muster.

We are, after all, at this point, the “good enough” country. LLM marketing is on brand.

  • Lembot_0004@discuss.online

    The main problem with LLMs is not their stupidity but the unpredictable nature of their stupidity. An LLM can say adequate things about nuclear physics and then add that you need to add ketchup to the reactor because 2 kg of U + 1 kg of Pb = 4 kg of Zn.

    Humans are easier to work with: if a guy can talk adequately about reactors, you can expect that there won’t be any problems with ketchup or basic arithmetic.

    • Helix 🧬@feddit.org

      Once upon a time, I saw some professor answer a 500€ question in Who Wants To Be A Millionaire, German edition. The question was “what kind of gelato is stracciatella?” and the answers made it possible to deduce it even if you didn’t know what stracciatella is.

      He needed a 50/50 and the audience joker, IIRC.

      • Powderhorn@beehaw.orgOP

        I’m American and know that’s chocolate chip. I mean, that’s what it’s called in Germany.

        • TehPers@beehaw.org

          Also American and I love stracciatella. I usually like to try some new flavors when getting gelato, but it’s a solid flavor to fall back on if I’m just not sure.

          Also, I would think very few Americans actually know what it is. From my experience, most know the basic ice cream flavors, but a lot might not even know what gelato is.

  • TehPers@beehaw.org

    These tools are meant to replace inexperience with incompetence, and the beancounters at some clients are likely satisfied those words look similar enough to pass muster.

    This seems like it pretty much sums things up from my experience.

    We’re encouraged (*cough* required *cough*) to use LLMs at work. So I tried.

    There are things they can do. Sometimes. But you know what they can’t do? Be liable for a fuck up.

    When I ask a coworker a question, if they confidently answer wrong, they fucked up, not me. When I ask an LLM? The LLM isn’t liable; it’s on me for not verifying it. If I’m verifying anyway, why am I using the LLM?

    They fuck up often enough that I can’t put my credibility on the line over speedy slop. People at work consider me to be a good programmer (don’t ask me how, I guess the bar is low lol). Imagine if my code was just whatever an LLM shat out. It’d be the same exact quality as all of my other coworkers who use whatever their LLM shat out. No difference in quality.

    And we would all be liable when the LLMs fucked up. We would learn something. We would, not the LLM. And the LLM would make the same exact fuckup the next time.

    • HarkMahlberg@kbin.earth

      I’m gonna take this comment, blow it up to poster size, and put it in my office, right in front of my webcam so I can watch my boss squint trying to read it.

    • Powderhorn@beehaw.orgOP

      I’m reminded of when I started a team; once it was assembled, I told them bluntly: “I don’t care how often you fuck up, so long as it’s a different fuckup each time.”

    • GenderNeutralBro@lemmy.sdf.org

      If I’m verifying anyway, why am I using the LLM?

      Validating output should be much easier than generating it yourself. P≠NP.

      This is especially true in contexts where the LLM provides citations. If the AI is good, then all you need to do is check the citations. (Most AI tools are shit, though; avoid any that can’t provide good, accurate citations when applicable.)

      Consider that all scientific papers go through peer review, and any decent-sized org will have regular code reviews as well.

      From the perspective of a senior software engineer, validating code that could very well be ruinously bad is nothing new. Validation and testing is required whether it was written by an LLM or some dude who spent two weeks at a coding “boot camp”.

      • Hazelnoot [she/her]@beehaw.org

        Validating output should be much easier than generating it yourself. P≠NP.

        This is very much not true in some domains, like software development. Code is much harder to read than it is to write, so verifying the output of a coding AI usually takes more time (or at least more cognitive effort) than if you’d just written the code yourself.

        • GenderNeutralBro@lemmy.sdf.org

          Yeah, that’s true for a subset of code. But for others, the hardest parts happen in the brain, not in the files. Writing readable code is very very important, especially when you are working with larger teams. Lots of people cut corners here and elsewhere in coding, though. Including, like, every startup I’ve ever seen.

          There’s a lot of gruntwork in coding, and LLMs are very good at the gruntwork. But coding is also an art and a science and they’re not good at that at high levels (same with visual art and “real” science; think of the code equivalent of seven deformed fingers).

          I don’t mean to hand-wave the problems away. I know that people are going to push the limits far beyond reason, and I know it’s going to lead to monumental fuckups. I know that because it’s been true for my entire career.

        • BlameThePeacock@lemmy.ca

          If the AI is writing ALL the code for an entire application, it would be a problem, but as an assistant to a programmer, if it spits out a single line or even a small function, you can read it over very quickly to validate it before moving on to the next component.

          • TehPers@beehaw.org

            This isn’t how we’re being asked to use it. People are doing demos about how Cursor or whatever did the bootstrapping and entire POC for them. And we already know there’s nothing more permanent than a POC.

            • BlameThePeacock@lemmy.ca

              This is exactly how most developers are being asked to use it; it’s literally how most of the IDE integrations work.

              • TehPers@beehaw.org

                This is exactly how most developers are being asked to use it

                [citation needed]

                At work, we get emails, demos, etc. constantly about how they’re using AI to generate everything from UI designs (v0) to starter projects, and how they manage these huge prompts and reference docs for their agents.

                Copilot’s line-by-line suggestions are also being pushed, but they care more about the “agentic” stuff.

                I watch coworkers regularly ask it to “add X route to the API” or “make a simple UI that calls Y API”. They are asking it to do their work.

                I have to review these PRs. They come in at an incredible rate, and almost always conflict with each other. I can’t review them fast enough to still do my work.

                Also, we get AI-generated code reviews at work. I have to talk to a chatbot to get help from HR. Some search bars have been replaced with chatbots. It’s everywhere and I’m getting sick of it.

                I just want real information from informed people. I want to review code that a human did their best to produce. I want to be able to help people improve their skills, not just their prompts.

                I’m getting to the point where I’m going to start calling people out if their chatbot/agent/LLM/whatever produces slop. I’m going to give them ownership of it. It’s their output, not the AI’s.

                Edit: I should add that it’s a big company (100k+ employees)

  • jarfil@beehaw.org

    There’s good commentary on that here:

    AWS CEO Matt Garman just said what everyone is thinking about AI replacing software developers

    “That’s like, one of the dumbest things I’ve ever heard,” he said. “They’re probably the least expensive employees you have, they’re the most leaned into your AI tools.”

    “How’s that going to work when ten years in the future you have no one that has learned anything?”

    https://www.itpro.com/software/development/aws-ceo-matt-garman-just-said-what-everyone-is-thinking-about-ai-replacing-software-developers

    • Powderhorn@beehaw.orgOP

      This is something often overlooked. You think you don’t need to develop staff so that your company, like, continues? OK, have fun with that.

    • Powderhorn@beehaw.orgOP

      I never thought I’d say this about an Amazon exec, but this guy seems to actually be based in reality.

      My biggest frustration with “AI” is that we’re pretending automation is new. I don’t mean going back to the Industrial Revolution, but that’s been the whole point of code since its inception. Other than having faster pipes to vacuum up everything, this is very much linear.

      Thing is, we used to know what the code actually did. These are snake-oil salesmen.

    • Megaman_EXE@beehaw.org

      Least expensive employees? Does he mean salary-wise? I was always under the impression software devs were paid well.

      • Krauerking@lemy.lol

        You can easily overwork devs, and while their salaries are higher than most, the difference between $60k and $120k means that 15 devs automating the work (roughly $1.8M a year) cost less than 60 employees doing it manually (roughly $3.6M).

        Plus, when devs create a digital product that can generate profit nearly indefinitely, they’re viewed as cost-effective by MBA types. Versus janitors, where for some reason we don’t see any value at all because there’s no immediate profit from their position.

      • CoolThingAboutMe@beehaw.org

        An acquaintance at work (in Australia) went to work as a developer for Amazon in the US a few years back. According to him, the hours he was expected to work meant that his really great salary actually translated to a quite shitty hourly rate. And he never got to go sightseeing and tourist-ing with his wife and kids because he was always working.

        My friend and her husband also worked in the US for years, in mining, and said similar things. Terrible leave offerings, and a culture where even if you have leave you feel extreme pressure not to take it.

  • Perspectivist@feddit.uk

    Depends on what job it’s replacing. LLMs are so-called narrow intelligence. They’re built to generate natural-sounding language, so if that’s what the job requires, then even an imperfect LLM might be fit for it. But if the job demands logic, reasoning, and grounding in facts, then it’s the wrong tool. If it were an imperfect AGI that can also talk, maybe - but it’s not.

    My unpopular opinion is that LLMs are actually too good. We just wanted something that talks, but by training them on tons of correct information, they also end up answering questions correctly as a by-product. That’s neat, but it happens “by accident” - not because they actually know anything.

    It’s kind of like a humanoid robot that looks too much like a person - we struggle to tell the difference. We forget what it really is because of what it seems.

  • Melody Fwygon@beehaw.org

    No. Not really, anyway.

    HOWEVER… The AIs in question MUST BE competent enough. What counts as “competent enough” is likely to be flexible, and possibly even debatable with others, depending on the situation.

    What needs to be true is that the AI must not be capable of making the same mistakes a human could; the mistakes that an AI COULD POSSIBLY MAKE have to be ones that any human could reasonably and very easily catch.

    Unfortunately, the above IS NOT TRUE of current LLM-type AI implementations. These LLMs have no consciousness or ability to reason beyond what a computer could. They have no creativity, despite having the ability to parse language and guess the next word.

    If you only learned the rules, grammar, and vocabulary of a specific language and were given absolutely zero context or cultural and historical teaching, an LLM is what that would look like. This by itself is not enough to replace jobs.

    Is that fact enough to stop heartless corporations from trying it? Hell. The. Fuck. No. They will try it anyway; they will “fuck around and find out” on the off chance that it may save them money. They don’t care that the company selling the “AI product” is in the business of lying to sell it. The fact that some companies are that desperate to save cash is telling in and of itself about the state of the world right now… but that’s another topic for another day and another threaded post in another subcommunity on Beehaw.

  • BlameThePeacock@lemmy.ca

    I just implemented an LLM in a vacation request process precisely because the employees are stupid as fuck.

    We were getting like 10% of requests coming in with the wrong number of hours requested because people can’t fucking count properly, or understand that you do not need to use vacation hours for statutory holidays. This is despite the form having a calculator and also showing in bright red any stat holidays inside the dates of the request.

    Now the LLM checks whether the dates, hours, and note from the employee add up to something reasonable. If not, it goes to a human for review. Before this, we just had a human reviewing every single request because it was causing so many issues: an hour or two each week.
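
    A minimal sketch of what a check like this could look like (illustrative only, not the actual implementation; the provider, model name, prompt wording, and field names are all assumptions):

    ```python
    # Hypothetical sketch: ask an LLM whether a vacation request looks internally
    # consistent, and flag it for human review if not. Model, prompt, and field
    # names are placeholders, not the real setup described above.
    from openai import OpenAI

    client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

    def needs_human_review(start_date: str, end_date: str, hours: float, note: str) -> bool:
        """Return True if the request should be escalated to a human reviewer."""
        prompt = (
            "An employee submitted a vacation request.\n"
            f"Start date: {start_date}\nEnd date: {end_date}\n"
            f"Hours requested: {hours}\nEmployee note: {note}\n\n"
            "Statutory holidays and weekends do not require vacation hours. "
            "Reply with exactly OK if the dates, hours, and note are consistent, "
            "or FLAG if anything looks wrong."
        )
        resp = client.chat.completions.create(
            model="gpt-4o-mini",  # placeholder model
            messages=[{"role": "user", "content": prompt}],
            temperature=0,
        )
        answer = (resp.choices[0].message.content or "").strip().upper()
        return answer != "OK"  # anything other than a clean OK goes to a human

    # Example: a one-week request from an employee who works 7.5-hour days
    if needs_human_review("2025-09-02", "2025-09-05", 45.0, "Taking the week off"):
        print("Escalate to human review")
    ```

    The important part is the fallback: anything the model doesn’t clearly pass still lands on a human, as described above.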

    • lucas@startrek.website

      Why would you use an LLM for this? This sounds like a process easily handled by conventional logic, which would be cheaper, faster, and actually reliable… (The ‘notes’ part notwithstanding I guess, but calculations in general are definitely not a good use of an LLM)
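
      For the simple cases, the conventional version really is only a few lines. A sketch with made-up holiday dates and an assumed hours-per-day value (the replies below explain why the real requests weren’t this tidy):

      ```python
      # Illustrative sketch of the "conventional logic" approach: count how many
      # vacation hours a date range should need, skipping weekends and statutory
      # holidays. The holiday set and hours-per-day figure are made-up examples.
      from datetime import date, timedelta

      STAT_HOLIDAYS = {date(2025, 9, 1), date(2025, 12, 25)}  # example holidays only

      def expected_hours(start: date, end: date, hours_per_day: float = 7.5) -> float:
          """Vacation hours needed for start..end, excluding weekends and stat holidays."""
          total = 0.0
          day = start
          while day <= end:
              if day.weekday() < 5 and day not in STAT_HOLIDAYS:  # Mon-Fri, not a holiday
                  total += hours_per_day
              day += timedelta(days=1)
          return total

      # A request spanning Labour Day (Mon, Sep 1, 2025) for a 7.5 h/day employee:
      print(expected_hours(date(2025, 9, 1), date(2025, 9, 5)))  # 30.0, not 37.5
      ```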

      • BlameThePeacock@lemmy.ca

        Normally I’d agree, and we used some of that in the original form (like maximum hours, checking for negative submissions, etc.) but requests don’t always follow simple logic and more complex logic just led to failures every time a user did something other than take a standard full day off.

        Some employees work 7 hours while others work 7.5; some have flex days and hours that change that; sometimes requests are only for part days; sometimes they may use multiple leave types to cover one off period.

        I spent a few hours writing and testing the prompt against previous submissions to fine tune it.

        So far it’s detected every submission error in the two weeks it’s been running, with only one false positive.

        • Joe@discuss.tchncs.de

          If it helps fill in the details correctly in the backend system, which are then properly validated or escalated for human review/intervention (and lets the human requester choose the escalation path too, as opposed to blindly submitting), then it sounds great.

          Guided experiences, leading to the desired outcome, with less need for confused humans to talk to confused humans.

          I want the same for most financial approvals in my company. Finance folks speak a different language to most employees, but they have an oversized impact on defining business processes, slowing down innovation, frustrating employees, and often driving costs UP.

    • HarkMahlberg@kbin.earth

      understand that you do not need to use vacation hours for statutory holidays

      Our HR software already accounts for federal holidays. When you put in the request for time off, you give it a start and end date on a calendar control, and it calculates the number of hours you plan to use, working around holidays, weekends, even existing PTO requests.

      I’m not saying you should buy that software, but I am saying it’s a solved problem… It’s automatic, the user doesn’t need to do anything special.

      Now we have other forms that COULD be automatic but AREN’T, which causes big issues when people make simple typos… But I don’t see the need to run an energy-consuming LLM to implement that feature.

      • BlameThePeacock@lemmy.ca

        Our ERP system that’s used for vacation entry doesn’t have that; it wants a start date, end date, hours, and vacation type code. We have a small number of employees who work on stat holidays, so a default that skips stat holidays for all users wouldn’t even work.

        The LLM fix is cheap as shit compared to buying an entirely new system. It costs less than half a cent per submission. The power use for a single query is nothing, and this request isn’t some crazy agentic thing using a million tokens or anything; more like 500-1000 tokens combined input and output.
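
        As a rough sanity check on that figure, with assumed per-million-token prices (the comment doesn’t name a model, and pricing varies widely):

        ```python
        # Back-of-the-envelope cost check with assumed prices; not actual billing data.
        input_tokens, output_tokens = 800, 200      # ~1000 tokens per submission
        price_in, price_out = 2.50, 10.00           # assumed USD per million tokens
        cost = (input_tokens * price_in + output_tokens * price_out) / 1_000_000
        print(f"${cost:.4f} per submission")        # $0.0040, i.e. about 0.4 cents
        ```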

    • Powderhorn@beehaw.orgOP

      This is absolutely one of the cases I think it’s suited for. The key is the human at the end.