Newer AI Coding Assistants Are Failing in Insidious Ways

brianpeiris@lemmy.ca · 23 days ago

Newer AI Coding Assistants Are Failing in Insidious Ways

middlemanSI@lemmy.world · 23 days ago

Who would have thought…Oh wait, any sane person would. Don’t move, you might burst it.

TheFogan@programming.dev · 23 days ago

I mean it’s kind of obvious… they are giving their LLMs simulators. access to test etc…, IE chat gpt can run code in a python environment and detect errors. but obviously it can’t know what the intention is, so it’s inevitably going to stop when it gets it’s first “working” result.

of course I’m sure further issues will come from incestuous code… IE AIs train on all publicly listed github code.

Vibe coders begin working on a lot of “projects” that they upload to github. now new AI can pick up all the mistakes of it’s predicesors on top of making it’s new ones.

Kissaki@programming.dev · 23 days ago

A task that might have taken five hours assisted by AI, and perhaps ten hours without it, is now more commonly taking seven or eight hours, or even longer.

What kind of work do they do?

in my role as CEO of Carrington Labs, a provider of predictive-analytics risk models for lenders. My team has a sandbox where we create, deploy, and run AI-generated code without a human in the loop. We use them to extract useful features for model construction, a natural-selection approach to feature development.

I wonder what I have to imagine this is doing and how. How do they interface with the loop-without-a-human?

Either way, they do seem to have a (small, narrow) systematic test case and the product variance to be useful at least anecdotally/for a sample case.

lad@programming.dev · 22 days ago

I have a feeling that their test case is also a bit flawed. Trying to get index_value instead of index value is something I can imagine happening, and asking an LLM to ‘fix this but give no explanation’ is asking for a bad solution.

I think they are still correct in the assumption that output becomes worse, though

VoterFrog@lemmy.world · 21 days ago

It just emphasizes the importance of tests to me. The example should fail very obviously when you give it even the most basic test data.

lad@programming.dev · 21 days ago

Yeah, if only QA vere not the first ‘replaced’ by AI 😠

VoterFrog@lemmy.world · 21 days ago

This isn’t even a QA level thing. If you write any tests at all, which is basic software engineering practice, even if you had AI write the tests for you, the error should be very, very obvious. I mean I guess we could go down the road of “well what if the engineer doesn’t read the tests?” but at that point the article is less about insidious AI and just about bad engineers. So then just blame bad engineers.

lad@programming.dev · 21 days ago

Yeah, I understand that this case doesn’t require a QA, but in the wild companies seem to increasingly think that developers are necessary (yet), but QA are surely not

It’s not even bad engineers, it’s just squeezing of productivity as dry as possible, as I see it

xep@discuss.online · 22 days ago

As opposed to the older ones working perfectly well?

/s

themusicman@lemmy.world · 21 days ago

As opposed to older ones failing in obvious and fixable ways

Horrabin@programming.dev · 23 days ago

Sorry for the self-reference but… https://medium.com/@nandofm/ai-the-danger-of-endogamic-programming-8114b90dffb2

voodooattack@lemmy.world · 21 days ago

I’d really love to read that, but medium is just… not my thing. I hate that site so much.

Have you considered writing on dev.to? I won’t promote it, extol any virtues, or try to convince you to go there. Just asking if you’re aware of it and others like it!

Horrabin@programming.dev · 21 days ago

I joined Medium before dev.to… that’s the reason. But I have to ask, why dev.to? To be honest I don’t care one place or the other.

voodooattack@lemmy.world · 21 days ago

I used to write articles on medium too, but dev.to was where I ended up because I witnessed it being founded. As for why I mentioned it specifically? Not entirely sure to be honest. I think it’s the first thing that stuck out when I thought of medium because it’s literally the opposite in many ways… less focused on profits and ads. No non-negotiable paywalls for valuable knowledge that I can recall, developer/tech focused with a great and supportive community, easy access/exposure for new authors, and a whole gamut of other small but positive differences that aligned with me personally. These were the first things I noticed from my experience publishing stuff there.

There are many other sites with communities like that that I’ve come across, for example: writeas.com as an alternative to tumblr/blogger and such, devRant is great as a venting space for developer-specific trouble/humour/jokes and interesting stories. Etc.

I have a soft spot for small independent sites like that. The ones trying to revive the 2000s internet spirit/experience. No shareholders or algorithms to dictate what becomes popular and what gets buried based on profit-driven logic/metrics to steer the masses or influence opinions for the sake of ad revenue or sales.

Horrabin@programming.dev · 20 days ago

“The ones trying to revive the 2000s internet spirit/experience”

That was one of the reasons why I recently move from Twitter/X, Instagram… to the fediverse. And honestly I didn’t pay attention to Medium as I don’t write so often. I will take a look at dev.to and proably I will move my (three) Medium articles here. Thank you so much!

voodooattack@lemmy.world · edit-2 20 days ago

You’re welcome. :)

I just overcame my aversion to medium to read your article, then I read the rest. I have to admit I’m very impressed, not only with what you were doing back then, but the fact we were exploring the same corners before modern “data science” was even a thing! My parallel journey was on a different track from NLP though. I was exploring spike-train-based neural net architecture, unsupervised learning, and how to give neural networks “tools” to work with (not tool calls, actual tools like a paint brush and a virtual canvas, etc). Damn… I think I still have ancient videos of that on my YT channel.

My Intel Celeron™ CPU could never handle more than 4 layers of ~512 neurons maybe? I don’t really remember the specifics, but I think that’s why I stopped back then.

I think that’s why the 2000s were magical for me, although your “grandpa” comments are now hitting me right in the soul. Damn it. :P

Horrabin@programming.dev · 20 days ago

I’m glad that you like it but sorry for taking you to the dark side :-)

You are completely right, in the early 2000 and before, sites where built to share, help and learn. The word “enshitification” describes perfectly the current landscape. Luckily there are some oasis, like the fediverse and other places, where we can still feel Internet as it should be.