Does vibe coding risk destroying the Open Source ecosystem? According to a pre-print paper by a number of high-profile researchers, this might indeed be the case based on observed patterns and some…
Only until AI investor money dries up and vibe coding gets very expensive quickly. Kinda how Uber isn’t way cheaper than a taxi now.
Is that the latest term for “when hell freezes over”?
Microsoft steeply lowered expectations on the AI Sales team, though they have denied this since they got pummelled in their quarterly and there’s been a lot of news about how investors are not happy with all the circular AI investments pumping those stocks. When the bubble pops (and all signs point to that), investors will flee. You’ll see consolidation, buy-outs, hell maybe even some bullshit bailouts, but ultimately it has to be a sustainable model and that means it will cost developers or they will be pummeled with ads (probably both).
A majority of CEOs are saying their AI spend has not paid off. Those are the primary customers, not your average joe. MIT reports a 95% generative AI failure rate at companies. Altman still hasn’t turned a profit. There are serious power build-out problems for new AI centers (let alone the chips needed). It’s an overheated reactionary market. It’s the Dot Com bubble all over again.
There will be some more spending to make sure a good chunk of CEOs “add value” (FOMO) and then a critical juncture where AI spending contracts sharply when they continue to see no returns, accelerated if the US economy goes tits up. Then the dominoes fall.
Hah, they wish. It’s a business, and they need a return on investment eventually. Maybe if we were in a zero interest rate world again, but even that didn’t last.
This.
I wouldn’t be surprised if that’s only a temporary problem - if it becomes one at all. People are quickly discovering ways to use LLMs more effectively, and open source models are starting to become competitive with commercial models. If we can continue finding ways to get more out of smaller, open-source models, then maybe we’ll be able to run them on consumer or prosumer-grade hardware.
GPUs and TPUs have also been improving their energy efficiency. There seems to be a big commercial focus on that too, as energy availability is quickly becoming a bottleneck.
So far, there is a serious cognitive step needed that LLMs just can’t do to get productive. They can output code but they don’t understand what’s going on. They don’t grasp architecture. Large projects don’t fit in their token window. Debugging something vague doesn’t work. Fact checking isn’t something they do well.
There’s a remarkably effective solution for this, that helps both humans and models alike - write documentation.
It’s actually kind of funny how the LLM wave has sparked a renaissance of high-quality documentation. Who would have thought?
High-quality documentation assumes there’s someone with experience working on this. That’s not the vibe coding they’re selling.
I am not aware of what they are selling, but every vibe coder I know produces obsessive amounts of documentation. It’s kind of baked into the tool (if you use Claude Code at least): it will just naturally produce a lot of documentation.
Complete hands-off no-review no-technical experience vibe coding is obviously snake oil, yeah.
This is a pretty large problem when it comes to learning about LLM-based tooling: lots of noise, very little signal.
They don’t need the entire project to fit in their token windows. There are ways to make them work effectively in large projects. It takes some learning and effort, but I see it regularly in multiple large, complex monorepos.
I still feel somewhat new-ish to using LLMs for code (I was kinda forced to start learning), but when I first jumped into a big codebase with AI configs/docs from people who have been using LLMs for a while, I was kinda shocked. The LLM worked far better than I had ever experienced.
It actually takes a bit of skill to set up a decent workflow/configuration for these things. If you just jump into a big repo that doesn’t have configs/docs/optimizations for LLMs, or you haven’t figured out a decent workflow, then they’ll be underwhelming and significantly less productive.
(I know I’ll get downvoted just for describing my experience and observations here, but I don’t care. I miss the pre-LLM days very much, but they’re gone, whether we like it or not.)
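For what it's worth, here is a minimal sketch of the general idea behind "the whole project doesn't need to fit in the token window": keep docs per area, then load only the ones relevant to the task, under a budget. Everything here is an assumption for illustration (the `Doc` type, the file paths, the keyword scoring, the rough 4-characters-per-token estimate); real tools use their own config files and retrieval methods.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    path: str  # hypothetical doc location inside the repo
    text: str


def estimate_tokens(text: str) -> int:
    # Rough assumption: about 4 characters per token.
    return max(1, len(text) // 4)


def score(doc: Doc, task: str) -> int:
    # Count the longer task words that appear in the doc (case-insensitive).
    words = {w for w in task.lower().split() if len(w) > 3}
    body = doc.text.lower()
    return sum(1 for w in words if w in body)


def pick_context(docs: list[Doc], task: str, budget: int) -> list[Doc]:
    # Greedy selection: most relevant docs first, skipping anything
    # irrelevant or too large for the remaining token budget.
    ranked = sorted(docs, key=lambda d: score(d, task), reverse=True)
    chosen: list[Doc] = []
    used = 0
    for doc in ranked:
        cost = estimate_tokens(doc.text)
        if score(doc, task) == 0 or used + cost > budget:
            continue
        chosen.append(doc)
        used += cost
    return chosen
```

The point is only that small, well-scoped docs make this kind of selection possible at all; a repo with no docs gives the model nothing to select from.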
Exactly this. You can’t just replace experienced people with it, and that’s basically how it’s sold.
Yep, it’s a tool for engineers. People who try to ship vibe-coded slop to production will often eventually need an engineer when things fall apart.
This sounds a lot like every framework. 20 years ago you could have written that about Rails.
Which IMO makes sense because if code isn’t solving anything interesting then you can dynamically generate it relatively easily, and it’s easy to get demos up and running, but neither can help you solve interesting problems.
Which isn’t to say it won’t have a major impact on software for decades, especially low-effort apps.
They’ve thought of that as well: soon nobody will be able to afford consumer grade hardware.
Yeah true. I’m assuming (and hoping) that the problems with consumer grade hardware being less accessible will be temporary.
I have wristwatches with significantly higher CPU, memory, and storage specs than my first few computers, while consuming significantly less energy. I think the current state of LLMs is pretty rough but will continue to improve.
Can you cite some sources on the increased efficiency? Also, can you link to these lower priced, efficient (implied consumer grade) GPUs and TPUs?
Oh, sorry, I didn’t mean to imply that consumer-grade hardware has gotten more efficient. I wouldn’t really know about that, but I assume most of the focus is on data centers.
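Those were two separate thoughts:
Models are getting better, and the tooling built around them is getting better, so hopefully we can get to a point where small models (capable of running on consumer-grade hardware) become much more useful.
Some modern data center GPUs and TPUs compute more per watt-hour than previous generations.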
Can you provide evidence the “more efficient” models are actually more efficient for vibe coding? Results would be the best measure.
It also seems like costs for these models are increasing, and companies like Cursor had to stoop to offering people services below cost (before pulling the rug out from them).
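I wish I could, but it would kinda be PII for me. Though, to clarify some things:
I’m mostly not talking about vibe coding. Vibe coding might be okay for quickly exploring or (in)validating some concept/idea, but vibe-coded projects tend to get brittle and pile up a lot of tech debt if you let them.
I don’t think “more efficient” (in terms of energy and pricing) models are more efficient for work. I haven’t measured it, but the smaller/“dumber” models tend to require more cycles before they reach their goals, as they have to debug their code more along the way. However, with the right workflow (using subagents, etc.), you can often still reach the goals with smaller models.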
There’s a difference between efficiency and effectiveness. The hardware is becoming more efficient, while models and tooling are becoming more effective. The tooling/techniques to use LLMs more effectively also tend to burn a LOT of tokens.
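One way to make the efficiency/effectiveness distinction concrete is cost per finished task: price per token, times tokens burned per attempt, times the number of attempts needed. A back-of-the-envelope sketch follows; all names and numbers are made up for illustration, not measurements of any real model.

```python
def cost_per_task(price_per_million_tokens: float,
                  tokens_per_attempt: int,
                  attempts: float) -> float:
    # Total spend to get one task done, in the same currency as the price.
    return price_per_million_tokens * tokens_per_attempt * attempts / 1_000_000


def break_even_attempts(cheap_price: float, pricey_price: float,
                        pricey_attempts: float = 1.0) -> float:
    # How many attempts the cheaper model can take before it costs as much
    # as the pricier one, assuming equal tokens per attempt.
    return pricey_price * pricey_attempts / cheap_price


# Hypothetical prices: a small model at $0.50 per million tokens that needs
# 4 attempts vs. a large model at $5.00 per million tokens that needs 1.
small = cost_per_task(0.50, 200_000, attempts=4)
large = cost_per_task(5.00, 200_000, attempts=1)
```

With these invented numbers the small model still comes out cheaper, but the gap closes as the attempt count grows, which is why a cheaper model is not automatically cheaper for real work.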
TL;DR:
Hardware is getting more efficient.
Models, tools, and techniques are getting more effective.
I think this kind of claim really lies in a sour spot.
On the one hand it is trivial to get an IDE, plug it into GLM 4.5 or some other smaller, more efficient model, and see how it fares on a project. But that’s just anecdotal. On the other hand, model creators do this thing called benchmaxing where they fine-tune their model to hell and back to respond well to specific benchmarks. And the whole culture around benchmarks is… I don’t know, I don’t like the vibe. It’s all AGI maximalists wanking to percent changes in performance. Not fun. So, yeah, evidence is hard to come by when there are so many snake oil salesmen around.
Then again, it’s pretty easy to check on your own. Install opencode, get $20 of GLM credit, make it write, deploy, and monitor a simple SaaS product, and see how you like it. Then do another one. And do a third one with Claude Code as a control if you can get a guest pass (I have some, hit me up if you’re interested).
What is certain from casual observation is that yes, small models have improved tremendously in the last year, to the point where they’re starting to get usable. Code generation is a much more constrained world than generalist text gen, and can be tested automatically, so progress is expected to continue at breakneck pace. Large models are still categorically better but this is expected to change rapidly.
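To illustrate the "can be tested automatically" part: generated code can be gated on a test before anyone looks at it. The sketch below is a toy version under stated assumptions: the candidates are plain source strings standing in for model output, `passes` and `first_passing` are hypothetical helper names, and `exec` is used only for brevity (a real setup would run untrusted code in a sandbox or subprocess).

```python
from typing import Callable, Iterable, Optional


def passes(source: str, check: Callable[[dict], None]) -> bool:
    # Run the candidate in a fresh namespace, then run the test against it.
    # Any exception (including a failed assert) counts as a failure.
    namespace: dict = {}
    try:
        exec(source, namespace)  # unsafe for untrusted code; sketch only
        check(namespace)
        return True
    except Exception:
        return False


def first_passing(candidates: Iterable[str],
                  check: Callable[[dict], None]) -> Optional[str]:
    # Return the first candidate that passes the check, or None.
    for source in candidates:
        if passes(source, check):
            return source
    return None


def check_add(namespace: dict) -> None:
    # The "test suite": the candidate must define a working add().
    assert namespace["add"](2, 3) == 5


buggy = "def add(a, b):\n    return a - b\n"
fixed = "def add(a, b):\n    return a + b\n"
accepted = first_passing([buggy, fixed], check_add)
```

Generalist text has no equivalent of `check_add`, which is the asymmetry the comment above is pointing at.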