At work today we had a little presentation about Claude Cowork. And I learned someone used it to write a C (maybe C++?) compiler in Rust in two weeks at a cost of $20k and it passed 99% of whatever hell test suite they use for evaluating compilers. And I had a few thoughts.
99% pass rate? Maybe that’s super impressive because it’s a stress test, but if 1% of my code fails to compile I think I’d be in deep shit.
$20k in two weeks is a heavy burn. Imagine if what it wrote was… garbage.
“Write a compiler” is a complete project plan in three words. Find a business project that is that simple and I’ll show you software that is cheaper to buy than build. We are currently working on an authentication broker service at work, and we’ve spent two months on architecture and on getting everyone to agree on a design. There are thousands of words devoted to just the high-level stuff, plus complex flow diagrams.
A compiler might be somewhat unique in the sense that there are literally thousands of test cases available: download a FOSS project and try to compile it. If it fails, figure out the bug and fix it. Repeat. The ERP that your boss wants you to stand up in a month has zero test coverage and is going to be chock full of bugs, if for no other reason than you haven’t thought through every single edge case. Neither has the AI, because lots of times those are business questions.
There is not a single person who knows the code base well enough to troubleshoot any weird bugs and transient errors.
I think this is a cool thing in the abstract. But in reality, they cherry picked the best possible use case in the world and anyone expecting their custom project is going to go like this will be lighting huge piles of money on fire.
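To put the "compile it, see what fails, fix, repeat" loop and the 99% number side by side, here is a rough sketch. Everything in it is made up for illustration: `run_suite`, `fake_compiler`, and the file names are hypothetical, and the fake compiler just pretends that every 100th file fails. It is not how the actual project measured anything.

```python
from typing import Callable, Iterable


def run_suite(
    sources: Iterable[str], compiles_ok: Callable[[str], bool]
) -> tuple[float, list[str]]:
    """Try every source file and return (pass_rate, files_that_failed)."""
    failures = []
    total = 0
    for src in sources:
        total += 1
        if not compiles_ok(src):
            failures.append(src)
    pass_rate = (total - len(failures)) / total if total else 0.0
    return pass_rate, failures


def fake_compiler(src: str) -> bool:
    # Hypothetical stand-in for invoking the compiler under test:
    # every 100th file is treated as a failure.
    index = int(src.removeprefix("file_").removesuffix(".c"))
    return index % 100 != 0


sources = [f"file_{i}.c" for i in range(1, 1001)]
rate, failed = run_suite(sources, fake_compiler)
print(f"pass rate: {rate:.1%}, files still failing: {len(failed)}")
```

With 1,000 files, a 99% pass rate still leaves 10 that don't build, and each one is a bug somebody has to track down. The loop only works at all because the pass/fail check is automatic, which is exactly what the ERP project doesn't have.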
It’s even simpler than that: using an LLM to write a C compiler is the same as downloading an existing open source implementation of a C compiler from the Internet, but with extra steps, as the LLM was actually fed with that code and is just re-assembling it back together but with extra bugs - plagiarism hidden behind an automated text parrot interface.
A human can beat the LLM at that by simply finding and downloading an implementation of that more-than-solved problem from the Internet, which at worst will take maybe an hour.
The LLM can “solve” simple and well-defined problems because it’s basically plagiarizing existing code that solves those problems.
Also, software development is already the best possible use case for LLMs: you need to build something abiding by a set of rules (as in a literal language, lmao), and you can immediately test if it works.
In a legal use case, for example, you can jerk off to the confident-sounding text you generated, then you get chewed out by the judge for having hallucinated references. Even if you have a set of rules (laws) as guardrails, you cannot immediately test what the AI generated, and if an expert needs to read and check everything in detail, then why not just do it themselves in the same amount of time?
We can go on to business, where the rules the AI can work inside are much looser, or healthcare, where the cost of failure is extremely high. And we are not even talking about responsibility or official accountability for decisions.
I just don’t think what is claimed for AI is there. Maybe it will be, but I don’t see it as an organic continuation of the path we’re on. We might have another dot-com bust when investors realize this. LLMs are here to stay (same as the internet), but they will not become AGI.
A C compiler in two weeks is a difficult, but doable, grad school class project (especially if you use lex and yacc instead of hand-coding the parser). And I guarantee 80 hours of grad student time costs less than $20k.
Frankly, I’m not impressed with the presentation in your anecdote at all.
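For what it's worth, the back-of-the-envelope math behind that guarantee, using only the two numbers from this thread ($20k and 80 hours, i.e. two 40-hour weeks):

```python
budget_usd = 20_000  # price tag quoted in the thread
hours = 80           # two 40-hour weeks of one person's time

# Hourly rate at which 80 hours of human time would cost the same $20k.
break_even_rate = budget_usd / hours
print(f"break-even hourly rate: ${break_even_rate:.2f}")
```

That works out to $250 an hour before one person working two weeks costs as much as the LLM run did.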
Here is the original cite that my company pulled that from if you want more details.
I’ve never written a compiler, or anything in Rust, so I have no idea of the effort involved. I’m just boggling over the price tag. I’ll bet that’s the cost of an entire offshore team.
Agree with all points. Additionally, compilers are incredibly well specified via ISO standards etc., and have multiple open source codebases available, e.g. GCC, which is available in multiple builds and implementations for different versions of C and C++, and DQNEO/cc.go.
So there are many fully-functional and complete sources that Claude Cowork would have pulled routines and code from.
The vibe-coded compiler is likely unmaintainable, so it can’t be updated when the spec changes, even assuming it did work and was real. So you’d have to redo the entire thing. It’s silly.
I would be interested in knowing for sure what language it was, as there is a huge difference in complexity between a C compiler and a C++ compiler.
I just posted where I found the source in another comment. It probably has the information you’re interested in.
I think this is the repo: https://github.com/anthropics/claudes-c-compiler
And here’s a pretty good article about it: https://arstechnica.com/ai/2026/02/sixteen-claude-ai-agents-working-together-created-a-new-c-compiler/
I also often get assigned projects where all the tests are written out beforehand and I can look at an existing implementation while I work…
Don’t forget that there are tons of C compilers in the dataset already
Updates? You just vibecode a new compiler that follows the new spec