I Went All-In on AI. The MIT Study Is Right.

AutistoMephisto@lemmy.world · edit-2 3 months ago

edgemaster72@lemmy.world · 3 months ago

Not immediate failure—that’s the trap. Initial metrics look great. You ship faster. You feel productive.

And all they’ll hear is “not failure, metrics great, ship faster, productive” and go against your advice because who cares about three months later, that’s next quarter, line must go up now. I also found this bit funny:

I forced myself to use Claude Code exclusively to build a product. Three months. Not a single line of code written by me… I was proud of what I’d created.

Well you didn’t create it, you said so yourself, not sure why you’d be proud, it’s almost like the conclusion should’ve been blindingly obvious right there.

AutistoMephisto@lemmy.world · 3 months ago

The top comment on the article points that out.

It’s an example of a far older phenomenon: Once you automate something, the corresponding skill set and experience atrophy. It’s a problem that predates LLMs by quite a bit. If the only experience gained is with the automated system, the skills are never acquired. I’ll have to find it but there’s a story about a modern fighter jet pilot not being able to handle a WWII era Lancaster bomber. They don’t know how to do the stuff that modern warplanes do automatically.

LOGIC💣@lemmy.world · 3 months ago

It’s more like the ancient phenomenon of spaghetti code. You can throw enough code at something until it works, but the moment you need to make a non-trivial change, you’re doomed. You might as well throw away the entire code base and start over.

And if you want an exact parallel, I’ve said this from the beginning, but LLM coding at this point is the same as offshore coding was 20 years ago. You make a request, get a product that seems to work, but maintaining it, even by the same people who created it in the first place, is almost impossible.

Joe@discuss.tchncs.de · 3 months ago

Indeed… Throw-away code is currently where AI coding excels. And that is cool and useful - creating one off scripts, self-contained modules automating boilerplate, etc.

You can’t quite use it the same way for complex existing code bases though… Not yet, at least…

De Lancre@lemmy.world · 3 months ago

Yes, that’s exactly how I use Cursor and local LLMs. There are a ton of cases where you need a one-time script to prepare data, sort through data, fetch data via API, etc. Even something simple like adding a role on a Discord channel (god save you if your company uses that piece of crap for communication) can be done with a script too, especially if you need to add the role to thousands of users, for example. Of course, it can be done properly through a normal development cycle, but that’s expensive, while shitcoding through Cursor can be done by anyone.

ctrl_alt_esc@lemmy.ml · 3 months ago

I agree with you, though proponents will tell you that’s by design. Supposedly, it’s like with high-level languages. You don’t need to know the actual instructions in assembly anymore to write a program with them. I think the difference is that high-level language instructions are still (mostly) deterministic, while an LLM prompt certainly isn’t.

Scubus@sh.itjust.works · 3 months ago

Yep, that’s the key issue that so many people fail to understand. They want AI to be deterministic but it simply isn’t. It’s like expecting a human to get the right answer to any possible question, it’s just not going to happen. The only thing we can do is bring error rates with AI lower than a human doing the same task, and it will be at that point that the AI becomes useful. But even at that point there will always be the alignment issue and nondeterminism, meaning AI will never behave exactly the way we want or expect it to.
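
To make the determinism point concrete, here is a minimal Python sketch. It assumes a toy next-token distribution: the token names and probabilities are invented for illustration and do not come from any real model. Always picking the most likely token gives the same answer every time, while sampling in proportion to probability can differ between runs unless the random seed is fixed.

```python
import random

# Toy next-token distribution; tokens and probabilities are made up for illustration.
NEXT_TOKEN_PROBS = {"return": 0.5, "raise": 0.3, "pass": 0.2}


def greedy_pick(probs):
    """Always pick the most likely token: same input, same output."""
    return max(probs, key=probs.get)


def sampled_pick(probs, rng):
    """Draw a token in proportion to its probability: output can vary per call."""
    tokens = list(probs)
    weights = [probs[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]


if __name__ == "__main__":
    print(greedy_pick(NEXT_TOKEN_PROBS))
    rng = random.Random()  # unseeded, so the list below may differ between runs
    print([sampled_pick(NEXT_TOKEN_PROBS, rng) for _ in range(5)])
```

Passing `random.Random(42)` instead of an unseeded generator makes the sampled sequence repeatable, which is roughly the sense in which a sampler can be called deterministic once every source of randomness is pinned down.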

Cocodapuf@lemmy.world · 3 months ago

Once you automate something, the corresponding skill set and experience atrophy. It’s a problem that predates LLMs by quite a bit. If the only experience gained is with the automated system, the skills are never acquired.

Well, to be fair, different skills are acquired. You’ve learned how to create automated systems, that’s definitely a skill. In one of my IT jobs there were a lot of people who did things manually, updated computers, installed software one machine at a time. But when someone figures out how to automate that, push the update to all machines in the room simultaneously, that’s valuable and not everyone in that department knew how to do it.

So yeah, I guess my point is, you can forget how to do things the old way, but that’s not always bad. Like, so you don’t really know how to use a scythe, that’s fine if you have a tractor, and trust me, you aren’t missing much.

boonhet@sopuli.xyz · 3 months ago

I forced myself to use Claude Code exclusively to build a product. Three months. Not a single line of code written by me… I was proud of what I’d created.

Well you didn’t create it, you said so yourself, not sure why you’d be proud, it’s almost like the conclusion should’ve been blindingly obvious right there.

Does a director create the movie? They don’t usually edit it, they don’t have to act in it, nor do all directors write movies. Yet the person giving directions is seen as the author.

The idea is that vibe coding is like being a director or architect. I mean that’s the idea. In reality it seems it doesn’t really pan out.

rainwall@piefed.social · 3 months ago

You can vibe write and vibe edit a movie now too. They also turn out shit.

The issue is that an LLM isn’t a person with skills and knowledge. It’s a complex guessing box that gets things kinda right, but not actually right, and it absolutely can’t tell what’s right or not. It has no actual skills or experience or humanity that a director can expect a writer or editor to have.

k0e3@lemmy.ca · 3 months ago

What’s impressive about LLM is how good it is at sounding right.

edgemaster72@lemmy.world · 3 months ago

Just makes me think of this character from Adventure Time

maccentric@sh.itjust.works · 3 months ago

What season they from? I thought I’d seen most of it but don’t recall them

edgemaster72@lemmy.world · 3 months ago

This is from season 1 episode 18, titled “Dungeon”

MrSmith@lemmy.world · edit-2 3 months ago

Wrong, it’s just outsourcing.

You’re making a false-equivalence. A director is actively doing their job; they’re a puppeteer and the rest is their puppet. The puppeteer is not outsourcing his job to a puppet.

And I’m pretty sure you don’t know what architects do.

If I hire a coder to write an app for me, whether it’s a clanker or a living being, I’m outsourcing the work; I’m a manager.

It’s like tasking an artist to write a poem for you about love and flowers, and being proud about that poem.

jimmy90@lemmy.world · 3 months ago

yeah i don’t get why the ai can’t do the changes

don’t you just feed it all the code and tell it? i thought that was the point of 100% AI

raspberriesareyummy@lemmy.world · 3 months ago

So there’s actual developers who could tell you from the start that LLMs are useless for coding, and then there’s this moron & similar people who first have to fuck up an ecosystem before believing the obvious. Thanks fuckhead for driving RAM prices through the ceiling… And for wasting energy and water.

psycotica0@lemmy.ca · 3 months ago

I can at least kinda appreciate this guy’s approach. If we assume that AI is a magic bullet, then it’s not crazy to assume we, the existing programmers, would resist it just to save our own jobs. Or we’d complain because it doesn’t do things our way, but we’re the old way and this is the new way. So maybe we’re just being whiny and can be ignored.

So he tested it to see for himself, and what he found was that he agreed with us, that it’s not worth it.

Ignoring experts is annoying, but doing some of your own science and getting first-hand experience isn’t always a bad idea.

5too@lemmy.world · 3 months ago

And not only did he see for himself, he wrote up and published his results.

Knock_Knock_Lemmy_In@lemmy.world · 3 months ago

Yup. This was almost science. It’s just lacking measurements and repeatability.

bassomitron@lemmy.world · 3 months ago

100% this. The guy was literally a consultant and a developer. It’d just be bad business for him to outright dismiss AI without having actual hands on experience with said product. Clients want that type of experience and knowledge when paying a business to give them advice and develop a product for them.

raspberriesareyummy@lemmy.world · 3 months ago

Except that outright dismissing snake oil would not at all be bad business. Calling a turd a diamond neither makes it sparkle, nor does it get rid of the stink.

fruitycoder@sh.itjust.works · 3 months ago

I can’t just call everything snake oil without some actual measurements and tests.

Naive cynicism is just as naive as blind optimism

raspberriesareyummy@lemmy.world · 3 months ago

I can’t just call everything snake oil without some actual measurements and tests.

With all due respect, you have not understood the basic mechanic of machine learning and the consequences thereof.

Knock_Knock_Lemmy_In@lemmy.world · 3 months ago

With due respect, you have not understood how snake oil is detected.

raspberriesareyummy@lemmy.world · 3 months ago

Problem is that statistical word prediction has fuck-all to do with AI. It’s not and will never be. By “giving it a try” you contribute to the spread of this snake oil. And even if someone came up with actual AI, if it used enough resources to impact our ecosystem, instead of being a net positive, and if it was in the greedy hands of billionaires, then using it is equivalent to selling your executioner an axe.

jve@lemmy.world · edit-2 3 months ago

Terrible take. Thanks for playing.

It’s actually impressive the level of downvotes you’ve gathered in what is generally a pretty anti-ai crowd.

khepri@lemmy.world · 3 months ago

They are useful for doing the kind of boilerplate boring stuff that any good dev should have largely optimized and automated already. If it’s 1) dead simple and 2) extremely common, then yeah an LLM can code for you, but ask yourself why you don’t have a time-saving solution for those common tasks already in place? As with anything LLM, it’s decent at replicating how humans in general have responded to a given problem, if the problem is not too complex and not too rare, and not much else.

Lambda@lemmy.ca · 3 months ago

That’s exactly what I so often find myself saying when people show off some neat thing that a code bot “wrote” for them in x minutes after only y minutes of “prompt engineering”. I’ll say, yeah I could also do that in y minutes of (bash scripting/vim macroing/system architecting/whatever), but the difference is that afterwards I have a reusable solution that I understand, is automated, is robust, and didn’t consume a ton of resources. And as a bonus I got marginally better as a developer.

It’s funny that if you stick them in an RPG and give them an ability to “kill any level 1-x enemy instantly, but don’t gain any xp for it” they’d all see it as the trap it is, but can’t see how that’s what AI so often is.

raspberriesareyummy@lemmy.world · 3 months ago

As you said, “boilerplate” code can be script generated - and there are IDEs that already do this, but in a deterministic way, so that you don’t have to proof-read every single line to avoid catastrophic security or crash flaws.
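
As a rough illustration of deterministic boilerplate generation, here is a minimal Python sketch using the standard library's `string.Template`. The template layout and the `render_class` helper are assumptions made up for this example, not how any particular IDE works. The same name and fields always produce byte-for-byte the same source text.

```python
from string import Template

# Hypothetical template for a plain data class; the layout is an assumption for illustration.
CLASS_TEMPLATE = Template(
    "class $name:\n"
    "    def __init__(self, $args):\n"
    "$assignments\n"
)


def render_class(name, fields):
    """Render the same source text every time for the same name and fields."""
    args = ", ".join(fields)
    assignments = "\n".join(f"        self.{f} = {f}" for f in fields)
    return CLASS_TEMPLATE.substitute(name=name, args=args, assignments=assignments)


if __name__ == "__main__":
    print(render_class("User", ["id", "email"]))
```

Because the output depends only on the inputs, the template is the thing to review once, not every generated file.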

InvalidName2@lemmy.zip · 3 months ago

And then there are actual good developers who could or would tell you that LLMs can be useful for coding, in the right context and if used intelligently. No harm, for example, in having LLMs build out some of your more mundane code like unit/integration tests, have it help you update your deployment pipeline, generate boilerplate code that’s not already covered by your framework, etc. That it’s not able to completely write 100% of your codebase perfectly from the get-go does not mean it’s entirely useless.

Soggy@lemmy.world · 3 months ago

Other than that it’s work that junior coders could be doing, to develop the next generation of actual good developers.

SreudianFlip@sh.itjust.works · edit-2 3 months ago

Yes, and that’s exactly what everyone forgets about automating cognitive work. Knowledge or skill needs to be intergenerational or we lose it.

If you have no junior developers, who will turn into senior developers later on?

pinball_wizard@lemmy.zip · 3 months ago

If you have no junior developers, who will turn into senior developers later on?

At least it isn’t my problem. As long as I have CrowdStrike, Cloudflare, Windows11, AWS us-east-1 and log4j… I can just keep enjoying today’s version of the Internet, unchanged.

MisterOwl@lemmy.world · 3 months ago

AI, duh.

SreudianFlip@sh.itjust.works · 3 months ago

Al is a pretty good guy but he can’t be everywhere. Maybe he can use some A.I. to help!

JcbAzPx@lemmy.world · 3 months ago

If it’s boilerplate, copy/paste; find/replace works just as well without needing data centers in the desert to develop.

raspberriesareyummy@lemmy.world · 3 months ago

And then there are actual good developers who could or would tell you that LLMs can be useful for coding

The only people who believe that are managers and bad developers.

keegomatic@lemmy.world · 3 months ago

You’re wrong, whether you figure that out now or later. Using an LLM where you gatekeep every write is something that good developers have started doing. The most senior engineers I work with are the ones who have adopted the most AI into their workflow, and with the most care. There’s a difference between vibe coding and responsible use.

raspberriesareyummy@lemmy.world · 3 months ago

There’s a difference between vibe coding and responsible use.

There’s also a difference between the occasional evening getting drunk and alcoholism. That doesn’t make an occasional event healthy, nor does it mean you are qualified to drive a car in that state.

People who use LLMs in production code are - by definition - not “good developers”. Because:

a good developer has a clear grasp on every single instruction in the code - and critically reviewing code generated by someone else is more effort than writing it yourself
pushing code to production without critical review is grossly negligent and compromises data & security

This already means the net gain with use of LLMs is negative. Can you use it to quickly push out some production code & impress your manager? Possibly. Will it be efficient? It might be. Will it be bug-free and secure? You’ll never know until shit hits the fan.

Also: using LLMs to generate code, a dev will likely be violating copyrights of open source left and right, effectively copy-pasting licensed code from other people without attributing authorship, i.e. they exhibit parasitic behavior & outright violate laws. Furthermore the stuff that applies to all users of LLMs applies:

they contribute to the hype, fucking up our planet, causing brain rot and skill loss on average, and pumping hardware prices to insane heights.

keegomatic@lemmy.world · 3 months ago

We have substantially similar opinions, actually. I agree on your points of good developers having a clear grasp over all of their code, ethical issues around AI (not least of which are licensing issues), skill loss, hardware prices, etc.

However, what I have observed in practice is different from the way you describe LLM use. I have seen irresponsible use, and I have seen what I personally consider to be responsible use. Responsible use involves taking a measured and intentional approach to incorporating LLMs into your workflow. It’s a complex topic with a lot of nuance, like all engineering, but I would be happy to share some details.

Critical review is the key sticking point. Junior developers also write crappy code that requires intense scrutiny. It’s not impossible (or irresponsible) to use code written by a junior in production, for the same reason. For a “good developer,” many of the quality problems are mitigated by putting roadblocks in place to…

force close attention to edits as they are being written,
facilitate handholding and constant instruction while the model is making decisions, and
ensure thorough review at the time of design/writing/conclusion of the change.

When it comes to making safe and correct changes via LLM, specifically, I have seen plenty of “good developers” in real life, now, who have engineered their workflows to use AI cautiously like this.

Again, though, I share many of your concerns. I just think there’s nuance here and it’s not black and white/all or nothing.

raspberriesareyummy@lemmy.world · 3 months ago

While I appreciate your differentiated opinion, I strongly disagree. As long as there is no actual AI involved (and considering that humanity is dumb enough to throw hundreds of billions at a gigantic parrot, I doubt we would stand a chance to develop true AI, even if it was possible to create), the output has no reasoning behind it.

it violates licenses and denies authorship and - if everyone was indeed equal before the law, this alone would disqualify the code output from such a model because it’s simply illegal to use code in violation of license restrictions & stripped of licensing / authorship information
there is no point. Developing code is 95-99% solving the problem in your mind, and 1-5% actual code writing. You can’t have an algorithm do the writing for you and then skip on the thinking part. And if you do the thinking part anyways, you have gained nothing.

A good developer has zero need for non-deterministic tools.

As for potential use in brainstorming ideas / looking at potential solutions: that’s what the usenet was good for, before those very corporations fucked it up for everyone, who are now force-feeding everyone the snake oil that they pretend to have any semblance of intelligence.

keegomatic@lemmy.world · 3 months ago

violates licenses

Not a problem if you believe all code should be free. Being cheeky but this has nothing to do with code quality, despite being true

do the thinking

This argument can be used equally well in favor of AI assistance, and it’s already covered by my previous reply

non-deterministic

It’s deterministic

brainstorming

This is not what a “good developer” uses it for

Terrasque@infosec.pub · 3 months ago

You’re pushing code to prod without pr’s and code reviews? What kind of jank-ass cowboy shop are you running?

It doesn’t matter if an llm or a human wrote it, it needs peer review, unit tests and go through QA before it gets anywhere near production.

Randelung@lemmy.world · 3 months ago

Maybe they’ll listen to one of their own?

raspberriesareyummy@lemmy.world · 3 months ago

The kind of useful article I would expect then is one explaining why word prediction != AI

jali67@lemmy.zip · edit-2 3 months ago

Don’t worry. The people on LinkedIn and tech executives tell us it will transform everything soon!

ImmersiveMatthew@sh.itjust.works · 3 months ago

I really have not found AI to be useless for coding. I have found it extremely useful and it has saved me hundreds of hours. It is not without its faults or frustrations, but it really is a tool I would not want to be without.

raspberriesareyummy@lemmy.world · 3 months ago

That’s because you are not a proper developer, as proven by your comment. And you create tech legacy that will have a net cost in terms of maintenance or downtime.

ImmersiveMatthew@sh.itjust.works · 3 months ago

I am for sure not a coder as it has never been my strong suit, but I am without a doubt an awesome developer or I would not have a top rated multiplayer VR app that is pushing the boundaries of what mobile VR can do.

The only person who will have to look at my code is me so any and all issues be it my code or AI code will be my burden and AI has really made that burden much less. In fact, I recently installed Coplay in my Unity Engine Editor and OMG it is amazing at assisting not just with code, but even finding little issues with scene setup, shaders, animations and more. I am really blown away with it. It has allowed me to spend even less time on the code and more time imagineering amazing experiences which is what fans of the app care about the most. They couldn’t care less if I wrote the code or AI did as long as it works and does not break immersion. Is that not what it is all about at the end of the day?

As long as AI helps you achieve your goals and your goals are grounded, including maintainability, I see no issues. Yeah, misdirected use of AI can lead to hard to maintain code down the line, but that is why you need a human developer in the loop to ensure the overall architecture and design make sense. Any code base can become hard to maintain if not thought through, be it human or AI written.

raspberriesareyummy@lemmy.world · 3 months ago

Look, bless your heart if you have a successful app, but success / sales is not exclusive to products of quality. Just look around at all the slop that people buy nowadays.

As long as AI helps you achieve your goals and your goals are grounded, including maintainability, I see no issues.

Two issues with that

what you are using has nothing whatsoever to do with AI, it’s a glorified pattern repeater - an actual parrot has more intelligence
if the destruction of entire ecosystems for slop is not an issue that you see, you should not be allowed anywhere near technology (as by now probably billions of people)

ImmersiveMatthew@sh.itjust.works · 3 months ago

I do not understand the point you are making about my particular situation as I am not making slop. Plus one person’s slop is another’s treasure. What exactly are you suggesting, as the 2 issues you outlined seem like they are being directed at someone else perhaps?

I am calling it AI as that is what it is called, but you are correct, it is a pattern predictor
I am not creating slop but something deeply immersive and enjoyed by people. In terms of the energy used, I am on solar and run local LLMs.

raspberriesareyummy@lemmy.world · 3 months ago

I didn’t say your particular application that I know nothing about is slop, I said success does not mean quality. And if you use statistical pattern generation to save time, chances are high that your software is not of good quality.

Even solar energy is not harvested waste-free (chemical energy and production of cells). Nevertheless, even if it were, you are still contributing to the spread of slop and harming other people. Both through spreading acceptance of a technology used to harm billions of people for the benefit of a few, and through energy and resource waste.

ImmersiveMatthew@sh.itjust.works · 3 months ago

I am sure my code could be better. I am also sure the SDKs I use could be better and the game engine could be better. For what I need, they all work well enough to get the job done. I am sure issues will come up as a result, as they have many times in the past already, even before LLMs helped, but that is par for the course for a developer to tackle.

vpol@feddit.uk · 3 months ago

The developers can’t debug code they didn’t write.

This is a bit of a stretch.

Xyphius@lemmy.ca · 3 months ago

agreed. 50% of my job is debugging code I didn’t write.

funkless_eck@sh.itjust.works · 3 months ago

I mean I was trying to solve a problem t’other day (hobbyist) - it told me to create a

function foo(bar): await object.foo(bar)

then in object

function foo(bar): _foo(bar)

function _foo(bar): original_object.foo(bar)

like literally passing a variable between three wrapper functions in two objects that did nothing except pass the variable back to the original function in an infinite loop

add some layers and complexity and it’d be very easy to get lost
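
The wrapper chain described above can be sketched in Python roughly like this. The class names are hypothetical, and the `await` from the original snippet is dropped to keep the example synchronous. Each method only forwards its argument, so the call goes around in a circle until the interpreter gives up.

```python
class Original:
    """Stands in for the object the poster started with (names are hypothetical)."""

    def __init__(self):
        self.wrapper = None

    def foo(self, bar):
        # Delegates to the wrapper, which eventually calls back into this method.
        return self.wrapper.foo(bar)


class Wrapper:
    def __init__(self, original):
        self.original = original

    def foo(self, bar):
        # First layer: forwards to a private helper and does nothing else.
        return self._foo(bar)

    def _foo(self, bar):
        # Second layer: hands the argument straight back to the original object.
        return self.original.foo(bar)


def call_in_circle():
    original = Original()
    original.wrapper = Wrapper(original)
    try:
        original.foo("bar")
    except RecursionError:
        return "never terminates: every call only forwards the argument"
    return "finished"


if __name__ == "__main__":
    print(call_in_circle())
```

With only three methods the cycle is easy to spot; spread across several files and a few more layers, it would be much harder to trace.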

theparadox@lemmy.world · 3 months ago

The few times I’ve used LLMs for coding help, usually because I’m curious if they’ve gotten better, they let me down. Last time it was insistent that its solution would work as expected. When I gave it an example that wouldn’t work, it even broke down each step of the function giving me the value of its variables at each step to demonstrate that it worked… but at the step where it had fucked up, it swapped the value in the variable to one that would make the final answer correct. It made me wonder how much water and energy it cost me to be gaslit into a bad solution.

How do people vibe code with this shit?

vpol@feddit.uk · 3 months ago

As a learning process it’s absolutely fine.

You make a mess, you suffer, you debug, you learn.

But you don’t call yourself a developer (at least I hope) on your CV.

_g_be@lemmy.world · 3 months ago

Vibe coders can’t debug code because they didn’t write

embed_me@programming.dev · 3 months ago

Vibe coders can’t debug code because they can’t write code

_g_be@lemmy.world · 3 months ago

Yes, this is what I intended to write but I submitted it hastily.

It’s like a catch-22: they can’t write code so they vibecode, but to maintain vibed code you would necessarily need to write code to understand what’s actually happening

Evotech@lemmy.world · 3 months ago

I don’t get this argument. Isn’t the whole point that the ai will debug and implement small changes too?

Cyber Yuki@lemmy.world · 3 months ago

Think an interior designer having to reengineer the columns and load bearing walls of a masonry construction.

What are the proportions of cement and gravel for the mortar? What type of bricks to use? Do they comply with the PSI requirements? What caliber should the rebars be? What considerations for the pouring of concrete? Where to put the columns? What thickness? Will the building fall?

“I don’t know that shit, I only design the color and texture of the walls!”

And that, my friends, is why vibe coding fails.

And it’s even worse: Because there are things you can more or less guess and research. The really bad part is the things you should know about but don’t even know they are a thing!

Unknown unknowns: Thread synchronization, ACID transactions, resiliency patterns. That’s the REALLY SCARY part. Write code? Okay, sure, let’s give the AI a chance. Write stable, resilient code with fault tolerance, and EASY TO MAINTAIN? Nope. You’re fucked. Now the engineers are gone and the newbies are in charge of fixing bad code built by an alien intelligence that didn’t do its own homework and it’s easier to rewrite everything from scratch.

Evotech@lemmy.world · 3 months ago

If you need to refactor your program you might as well start from the beginning

anon_8675309@lemmy.world · 3 months ago

Some can’t because they never acquired the skill to read code. But most did and can.

Rooster326@programming.dev · 3 months ago

If you’ve never had to debug code. Are you really a developer?

There is zero chance you have never written a bug so… Who is fixing them?

Unless you just leave them because you work for Infosys or worse but then I ask again - are you really a developer?

mal3oon@lemmy.world · 3 months ago

I think it highly depends on the skill and experience of the dev. A lot of the people flocking to the vibe coding hype are not necessarily people who know about coding practices (including code review etc …), nor are they experienced in directing AI agents to achieve such goals. The result is what the MIT study predicted. Although, this will start to change soon.

pdxfed@lemmy.world · 3 months ago

Great article, brave and correct. Good luck getting through to the same leaders who blindly believe in a magical trend for this or next quarter’s numbers; they don’t care about things a year away, let alone 10.

I work in HR and was struck by the parallel with management jobs being gutted by major corps starting in the 80s and 90s during “downsizing”, corps who either never replaced them or offshored them. They had the Big 4 telling them it was the future of business. Know who is now providing consultation to them on why they have poor ops, processes, high turnover, etc? Take $ on the way in, and the way out. AI is just the next in a long line of smart people pretending they know your business while you abdicate knowing your business or employees.

Hope leaders can be a bit braver and wiser this go 'round so we don’t get to a cliff’s edge in software.

Ancalagon@lemmy.world · 3 months ago

Tbh I think the true leaders are high on coke.

borari@lemmy.dbzer0.com · 3 months ago

Wow I didn’t know that I was leading this whole time.

ripcord@lemmy.world · 3 months ago

I’m trying

HakunaHafada@lemmy.dbzer0.com · 3 months ago

Much appreciated 🫡

CarbonatedPastaSauce@lemmy.world · edit-2 3 months ago

Something any (real, trained, educated) developer who has even touched AI in their career could have told you. Without a 3 month study.

AutistoMephisto@lemmy.world · edit-2 3 months ago

What’s funny is this guy has 25 years of experience as a software developer. But three months was all it took to make it worthless. He also said it was harder than if he’d just written the code himself. Claude would make a mistake, he would correct it. Claude would make the same mistake again, having learned nothing, and he’d fix it again. Constant firefighting, he called it.

felbane@lemmy.world · 3 months ago

As someone who has been shoved in the direction of using AI for coding by my superiors, that’s been my experience as well. It’s fine at cranking out stackoverflow-level code regurgitation and mostly connecting things in a sane way if the concept is simple enough. The real breakthrough would be if the corrections you make would persist longer than a turn or two. As soon as your “fix-it prompt” is out of the context window, you’re effectively back to square one. If you’re expecting it to “learn” you’re gonna have a bad time. If you’re not constantly double checking its output, you’re gonna have a bad time.
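
A simplified way to picture a correction falling out of the context window is the Python sketch below. It assumes a fixed token budget and counts tokens by splitting on whitespace, which is a crude stand-in for a real tokenizer; actual tools may also summarize or pin earlier messages, so treat this as an illustration rather than a description of any specific product.

```python
from collections import deque


def visible_context(messages, max_tokens):
    """Keep the newest messages that fit in the budget; older ones are dropped.

    Token counting by whitespace split is a crude stand-in for a real tokenizer.
    """
    kept = deque()
    used = 0
    for message in reversed(messages):
        cost = len(message.split())
        if used + cost > max_tokens:
            break
        kept.appendleft(message)
        used += cost
    return list(kept)


if __name__ == "__main__":
    history = [
        "fix: never call the legacy parser here",
        "please add input validation to the form handler",
        "now write unit tests for the form handler module",
    ]
    # With a budget of 18 "tokens", the earliest message (the fix) no longer fits.
    print(visible_context(history, max_tokens=18))
```

Once the earlier fix is no longer part of what the model sees, nothing in the remaining messages tells it to avoid the same mistake.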

ctrl_alt_esc@lemmy.ml · 3 months ago

It’s still useful to have an actual “study” (I’d rather call it a POC) with hard data you can point to, rather than just “trust me bro”.

some_designer_dude@lemmy.world · 3 months ago

Untrained dev here, but the trend I’m seeing is spec-driven development where AI generates the specs with a human, then implements the specs. Humans can modify the specs, and AI can modify the implementation.

This approach seems like it can get us to 99%, maybe.

CaptDust@sh.itjust.works · edit-2 3 months ago

Trained dev with a decade of professional experience, humans routinely fail to get me workable specs without hours of back and forth discussion. I’d say a solid 25% of my work week is spent understanding what the stakeholders are asking for and how to contort the requirements to fit into the system.

If these humans can’t be explicit enough with me, a living thinking human that understands my architecture better than any LLM, what chance does an LLM have at interpreting them?

Piatro@programming.dev · 3 months ago

How is what you’re describing different to what the author is talking about? Isn’t it essentially the same as “AI do this thing for me”, “no not like that”, “ok that’s better”? The trouble the author describes, ie the solution being difficult to change, or having no confidence that it can be safely changed, is still the same.

some_designer_dude@lemmy.world · 3 months ago

This poster https://calckey.world/notes/afzolhb0xk is more articulate than my post.

The difference with this “spec-driven” approach is that the entire process is repeatable by AI once you’ve gotten the spec sorted. So you no longer work on the code, you just work on the spec, which can be a collection of files, files in folders, whatever, but the goal is some kind of determinism, I think.

I use it on a much smaller scale and haven’t really cared much for the “spec as truth” approach myself, at this level. I also work almost exclusively on NextJS apps with the usual Tailwind + etc stack. I would certainly not trust a developer without experience with that stack to generate “correct” code from an AI, but it’s sort of remarkable how I can slowly document the patterns of my own codebase and just auto-include it as context on every prompt (or however Cursor does it) so that everything the LLMs suggest gets LLM-reviewed against my human-written “specs”. And doubly neat is that the resulting documentation of patterns turns out to be really helpful to developers who join or inherit the codebase.
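
The idea of including documented patterns as context on every prompt could be sketched like this in Python. The `build_prompt` helper and the section layout are hypothetical and are not how Cursor or any other tool actually assembles its context; the sketch only shows that the human-written pattern notes travel with every request in a stable order.

```python
def build_prompt(pattern_docs, request):
    """Prepend every human-written pattern doc to the request, in a stable order.

    pattern_docs maps a title to the text of that pattern note (hypothetical format).
    """
    sections = [
        f"## {title}\n{body.strip()}"
        for title, body in sorted(pattern_docs.items())
    ]
    return "\n\n".join(sections + [f"## Request\n{request.strip()}"])


if __name__ == "__main__":
    docs = {
        "Styling": "Use utility classes; no inline styles.",
        "Data fetching": "Fetch in server components where possible.",
    }
    print(build_prompt(docs, "Add a settings page."))
```

Sorting by title keeps the assembled prompt identical for identical inputs, so a change in the output can be traced to a change in the notes or the request.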

I think the author / developer in the article might not have been experienced enough to direct the LLMs to build good stuff, but these tools like React, NextJS, Tailwind, and so on are all about patterns that make us all build better stuff. The LLMs are like “8 year olds” (someone else in this thread) except now they’re more like somewhat insightful 14 year olds, and where they’ll be in another 5 years… Who knows.

Anyway, just saying. They’re here to stay, and they’re going to get much better.

ChunkMcHorkle@lemmy.world · edit-2 3 months ago

They’re here to stay

Eh, probably. At least for as long as there is corporate will to shove them down the rest of our throats. But right now, in terms of sheer numbers, humans still rule, and LLMs are pissing off more and more of us every day while their makers are finding it increasingly harder to forge ahead in spite of us, which they are having to do ever more frequently.

and they’re going to get much better.

They’re already getting so much worse, with what is essentially the digital equivalent of kuru, that I’d be willing to bet they’ve already jumped the shark.

If their makers and funders had been patient, and worked the present nightmares out privately, they’d have a far better chance than they do right now, IMO.

Simply put, LLMs/“AI” were released far too soon, and with far too much “I Have a Dream!” fairy-tale promotion that the reality never came close to living up to, and then shoved with brute corporate force down too many throats.

As a result, now you have more and more people across every walk of society pushed into cleaning up the excesses of a product they never wanted in the first place, being forced to share their communities AND energy bills with datacenters, depleted water reserves, privacy violations, EXCESSIVE copyright violations and theft of creative property, having to seek non-AI operating systems just to avoid it . . . right down to the subject of this thread, the corruption of even the most basic video search.

Can LLMs figure out how to override an angry mob, or resolve a situation wherein the vast majority of the masses are against the current iteration of AI even though the makers of it need us all to be avid, ignorant consumers of AI for it to succeed? Because that’s where we’re going, and we’re already farther down that road than the makers ever foresaw, apparently having no idea just how thin the appeal is getting on the ground for the rest of us.

So yeah, I could be wrong, and you might be right. But at this point, unless something very significant changes, I’d put money on you being mostly wrong.

floofloof@lemmy.ca · edit-2 3 months ago

Even more efficient: humans do the specs and the implementation. AI has nothing to contribute to specs, and is worse at implementation than an experienced human. The process you describe, with current AIs, offers no advantages.

AI can write boilerplate code and implement simple small-scale features when given very clear and specific requests, sometimes. It’s basically an assistant to type out stuff you know exactly how to do and review. It can also make suggestions, which are sometimes informative and often wrong.

If the AI were a member of my team it would be that dodgy developer whose work you never trust without everyone else spending a lot of time holding their hand, to the point where you wish you had just done it yourself.

pelespirit@sh.itjust.works · 3 months ago

Have you used any AI to try and get it to do something? It learns generally, not specifically. So you give it instructions and then it goes, “How about this?” You tell it that it’s not quite right and to fix these things and it goes off on a completely different tangent in other areas. It’s like working with an 8 year old who has access to the greatest stuff around.

SpaceNoodle@lemmy.world · 3 months ago

It doesn’t even actually learn, though.

Unlearned9545@lemmy.world · 3 months ago

Fractional CTO: Some small companies benefit from the senior experience of these kinds of executives but don’t have the money or the need to hire one full time. A fraction of the time they are C suite for various companies.

rekabis@lemmy.ca · 3 months ago

Sooo… he works multiple part-time jobs?

Weird how a forced technique of the ultra-poor is showing up here.

Jyek@sh.itjust.works · 3 months ago

It’s more like the MSP IT style of business. There are clients that consult you for your experience or that you spend a contracted amount of time with and then you bill them for your time as a service. You aren’t an employee of theirs.

Diplomjodler@lemmy.world · 3 months ago

Or he’s some deputy assistant vice president or something.

bitjunkie@lemmy.world · 3 months ago

Deputy assistant to the vice president

Agent641@lemmy.world · 3 months ago

I cannot understand and debug code written by AI. But I also cannot understand and debug code written by me.

Let’s just call it even.

I Cast Fist@programming.dev · 3 months ago

At least you can blame yourself for your own shitty code, which hopefully will never attempt to “accidentally” erase the entire project

PoliteDudeInTheMood@lemmy.ca · 3 months ago

I don’t know how that happens, I regularly use Claude Code and it’s constantly reminding me to push to git.

dejected_warp_core@lemmy.world · 3 months ago

To quote your quote:

I got the product launched. It worked. I was proud of what I’d created. Then came the moment that validated every concern in that MIT study: I needed to make a small change and realized I wasn’t confident I could do it. My own product, built under my direction, and I’d lost confidence in my ability to modify it.

I think the author just independently rediscovered “middle management”. Indeed, when you delegate the gruntwork under your responsibility, those same people are who you go to when addressing bugs and new requirements. It’s not on you to effect repairs: it’s on your team. I am Jack’s complete lack of surprise. The idea of relying on AI to do nuanced work like this and arrive at the exact correct answer to the problem is naive at best. I’d be sweating too.

fuck_u_spez_in_particular@lemmy.world · 3 months ago

The problem though (with AI compared to humans): The human team learns, i.e. at some point they probably know what the mistake was and avoid doing it again. AI instead of humans: well maybe the next or different model will fix it maybe…

And what is very clear to me after trying to use these models, the larger the code-base the worse the AI gets, to the point of not helping at all or even being destructive. Apart from dissecting small isolatable pieces of independent code (i.e. keep the context small for the AI).

Humans likely get slower with a larger code-base, but they (usually) don’t arrive at a point where they can’t progress any further.

Rimu@piefed.social · edit-2 3 months ago

FYI this article is written with an LLM.

Don’t believe a story just because it confirms your view!

AmbiguousProps@lemmy.today · 3 months ago

I’ve heard that these tools aren’t 100% accurate, but your last point is valid.

/home/pineapplelover@lemmy.dbzer0.com · 3 months ago

I agree but look at that third paragraph, it has the dash that nobody ever uses. Telltale sign right there

AmbiguousProps@lemmy.today · 3 months ago

Sure, but plenty of journalists use the em-dash. That’s where LLMs got it from originally. It alone is not a signature of LLM use in journalistic articles (I’m not calling this CTO guy a journalist, to be clear)

JcbAzPx@lemmy.world · 3 months ago

Context is everything. In publishing it’s standard; in online forums it’s either needlessly pretentious or AI and either way they deserve to be called out.

/home/pineapplelover@lemmy.dbzer0.com · 3 months ago

When I say “nobody uses it” I mean nobody other than people getting paid to write for a living would use it. This tech bro would not use that em dash and the quotation marks you also can’t find on the keyboard.

Rimu@piefed.social · 3 months ago

GPTZero is 99% accurate.

https://gptzero.me/news/gptzero-accuracy-stats/

AmbiguousProps@lemmy.today · edit-2 3 months ago

I mean… has anyone other than the company that made the tool said so? Like from a third party? I don’t trust that they’re not just advertising.

Rimu@piefed.social · edit-2 3 months ago

The answer to that is literally in the first sentence of the body of the article I linked to.

DSN9@lemmy.ml · 3 months ago

Ai says Ai correction tool about how crappy Ai is at coding’s article is 99 percent chance of being Ai, results generated by Ai. . .

LiveLM@lemmy.zip · 3 months ago

Aren’t these LLM detectors super inaccurate?

Rimu@piefed.social · 3 months ago

I’ve tested lots and lots of different ones. GPTZero is really good.

If you read the article again, with a critical perspective, I think it will be obvious.

Randelung@lemmy.world · 3 months ago

Yes, but also the opposite. Don’t discount a valid point just because it was formulated using an LLM.

Rimu@piefed.social · 3 months ago

The story was invented so people would subscribe to his substack, which exists to promote his company.

We’re being manipulated into sharing made-up rage-bait in order to put money in his pocket.

flamingo_pinyata@sopuli.xyz · edit-2 3 months ago

“fractional CTO”(no clue what that means, don’t ask me)

For those who were also interested to find out: Consultant and advisor in a part time role, paid to make decisions that would usually fall under the scope of a CTO, but for smaller companies who can’t afford a full-time experienced CTO

zerofk@lemmy.zip · 3 months ago

That sounds awful. You get someone who doesn’t really know the company or product, they take a bunch of decisions that fundamentally affect how you work, and then they’re gone.

… actually, that sounds exactly like any other company.

bigfondue@lemmy.world · edit-2 2 months ago

deleted by creator

rainwall@piefed.social · edit-2 3 months ago

I’ve worked with a fractional CISO. He was scattered, but was insanely useful about setting roadmaps, writing procedures/docs, working audits and correcting us when we were moving in bad cybersecurity directions.

Fractional is way better than none.

Telodzrum@lemmy.world · 3 months ago

That’s more what a consultant is. A “Fractional C[insert function here]O” is permanent or at least long-term. It just means the firm doesn’t have the resources and need for a full-time executive in that role. I’ve worked with fractional CTO, CIO, CFO, and CMO executives at different companies and they’ve all been required to have the company, industry, market, etc. knowledge that a non-fractional employee would. Honestly, this concept has been wonderful for small to midsize companies.

Nalivai@lemmy.world · 3 months ago

They never actually say what “product” they make, it’s always “shipped product” like they’re a fucking Amazon warehouse. I suspect because it’s some trivial webpage that takes an afternoon for a student to ship up, that they spent three days arguing with an autocomplete to shit out.

e461h@sh.itjust.works · 3 months ago

Cloudflare, AWS, and other recent major service outages are what come to mind re: AI code. I’ve no doubt it is getting forced into critical infrastructure without proper diligence.

Humans are prone to error so imagine the errors our digital progeny are capable of!

HugeNerd@lemmy.ca · 3 months ago

Computers are too powerful and too cheap. Bring back COBOL, painfully expensive CPU time, and some sort of basic knowledge of what’s actually going on.

Pain for everyone!

Thorry@feddit.org · 3 months ago

Yeah I think around the Pentium 200 MHz point was the sweet spot. Powerful enough to do a lot of things, but not so powerful that software can be as inefficient and wasteful as it is today.

Čauky Mňauky@lemmy.zip · 3 months ago

I share a similar sentiment, but I’d place the turning point somewhere between 1 and 2 GHz.

HC4L@lemmy.world · 3 months ago

Be careful what you wish for, with RAM prices soaring owning a home computer might become less of an option. Luckily we can get a subscription for computing power easily!

Omgpwnies@lemmy.world · 3 months ago

I built a new PC early October, literally 2 weeks later RAM prices went nuts… so glad I pulled the trigger when I did

Evotech@lemmy.world · 3 months ago

Just ask the ai to make the change?

BarneyPiccolo@lemmy.today · 3 months ago

I don’t know shit about anything, but it seems to me that the AI already thought it gave you the best answer, so going back to the problem for a proper answer is probably not going to work. But I’d try it anyway, because what do you have to lose?

Unless it gets pissed off at being questioned, and destroys the world. I’ve seen more than a few movies about that.

Evotech@lemmy.world · edit-2 3 months ago

You are in a way correct. If you keep sending the context of the “conversation” (in the same chat) it will reinforce its previous implementation.

The way ais remember stuff is that you just give it the entire thread of context together with your new question. It’s all just text in text out.

But once you start a new conversation (meaning you don’t give any previous chat history) it’s essentially a “new” ai which doesn’t know anything about your project.

This will have a new random seed and if you ask that to look for mistakes etc it will happily tell you that the last implementation was all wrong and here’s how to fix it.

It’s like a Minecraft world: the same seed will get you the same map every time. With AIs it’s the same thing, ish. Start a new conversation or ask a different model (GPT, Google, Claude etc.) and it will do things in a new way.
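Here’s a toy sketch of the "text in, text out" and seed points above. To be clear, `fake_model` and `Chat` are invented stand-ins, not any vendor’s API, and a real LLM is obviously not a random pick from a list. The only thing this tries to show is that the "memory" is the transcript you resend each turn, and that the same seed plus the same transcript gives the same answer.

```python
import random


def fake_model(transcript: str, seed: int) -> str:
    """Stand-in for an LLM: a pure function of (text in, seed).

    Hypothetical: it just picks a canned suggestion, seeded by the
    seed and the full transcript it was handed.
    """
    rng = random.Random(f"{seed}|{transcript}")
    approaches = ["a loop", "recursion", "a lookup table", "a state machine"]
    return f"Try {rng.choice(approaches)}."


class Chat:
    """One conversation: the only memory is the text resent on every turn."""

    def __init__(self, seed: int) -> None:
        self.seed = seed
        self.history: list[str] = []

    def ask(self, question: str) -> str:
        self.history.append(f"user: {question}")
        # The whole thread so far goes in together with the new question.
        reply = fake_model("\n".join(self.history), self.seed)
        self.history.append(f"assistant: {reply}")
        return reply
```

Two `Chat(seed=1)` objects asked the same questions give identical replies, like the same Minecraft seed. A fresh `Chat` starts with an empty `history`, so it knows nothing about the earlier thread, and a different seed may (not must) send it down a different path.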

TheBlackLounge@lemmy.zip · 3 months ago

Doesn’t work. Any semi complex problem with multiple constraints and your team of AIs keeps running circles. Very frustrating if you know it can be done. But what if you’re a “fractional CTO” and you get actually contradictory constraints? We haven’t gotten yet to AIs who will tell you that what you ask is impossible.

Evotech@lemmy.world · 3 months ago

Yeah right now you have to know what’s possible and nudge the ai in the right direction to use the correct approach according to you if you want it to do things in an optimized way

BarneyPiccolo@lemmy.today · 3 months ago

Maybe the solution is to keep sending the code through various AI requests, until it either gets polished up, or gains sentience, and destroys the world. 50-50 chance.

This stuff ALWAYS ends up destroying the world on TV.

Seriously, everybody is complaining about the quality of AI product, but the whole point is for this stuff to keep learning and improving. At this stage, we’re expecting a kindergartener to produce the work of a Harvard professor. Obviously, we’re going to be disappointed.

But give that kindergartener time to learn and get better, and they’ll end up a Harvard professor, too. AI may just need time to grow up.

And frankly, that’s my biggest worry. If it can eventually start producing results that are equal or better than most humans, then the Sociopathic Oligarchs won’t need worker humans around, wasting money that could be in their bank accounts.

And we know what their solution to that problem will be.

phed@lemmy.ml · 3 months ago

I do a lot with AI but it is not good enough to replace humans, not even close. It repeats the same mistakes after you tell it no, it doesn’t remember things from 3 messages ago when it should. You have to keep re-explaining the goal to it. It’s wholly incompetent. And yea when you have it do stuff you aren’t familiar with or don’t create, def. I have it write a commentary, or I take the time out right then to ask it what x or y does then I add a comment.

kahnclusions@lemmy.ca · edit-2 3 months ago

Even worse, the ones I’ve evaluated (like Claude) constantly fail to even compile because, for example, they mix usages of different SDK versions. When instructed to use version 3 of some package, it will add the right version as a dependency but then still code with missing or deprecated APIs from the previous version that are obviously unavailable.

More time (and money, and electricity) is wasted trying to prompt it towards correct code than simply writing it yourself and then at the end of the day you have a smoking turd that no one even understands.

LLMs are a dead end.

Echo Dot@feddit.uk · 3 months ago

There’s no point telling it not to do x because as soon as you mention x it goes into its context window.

It has no filter, it’s like if you had no choice in your actions, and just had to do every thought that came into your head, if you were told not to do a thing you would immediately start thinking about doing it.

kahnclusions@lemmy.ca · edit-2 3 months ago

I’ve noticed this too, it’s hilarious(ly bad).

Especially with image generation, which we were using to make some quick avatars for a D&D game. “Draw a picture of an elf.” Generates images of elves that all have one weird earring. “Draw a picture of an elf without an earring.” Great now the elves have even more earrings.

deathbird@mander.xyz · 3 months ago

I think this kinda points to why AI is pretty decent for short videos, photos, and texts. It produces outputs that one applies meaning to, and humans are meaning-making animals. A computer can’t overlook or rationalize a coding error the same way.