No apps, no code, just intent and execution.
So the only problems you’re left with are:
- Making a precise description of what you want, at high and low levels of detail with consistent terminology
 - Verifying that the system is behaving as you expect, by exercising specific parts of it in isolation
 - The ability to make small incremental steps from one complete working state to the next complete working state, so you don’t get stuck by painting yourself into a corner
 
Problems which… code is much better than English at handling.
And always will be.
Almost like there’s a reason code exists other than just “Idk let’s make it hard so normies can’t do it mwahaha”.
It’s really funny to think about.
Equations and algorithms used to be written in human language and there were many problems.
So over thousands of years we made this great thing called math language.
And now some people are saying it’s elitist to be against writing algorithms in human language.
Okay, this is fun, but it’s time for an old programmer to yell at the cloud, a little bit:
The cost per AI request is not trending toward zero.
Current ludicrous costs are subsidized by money from gullible investors.
The whole cost-model house of cards desperately depends on the poorly supported belief that costs will rocket downward very, very soon thanks to some future incredible discovery.
We’re watching an endurance test between irrational investors and the stubborn, boring, nearly completely spent tail end of Moore’s law.
My money is in a mattress waiting to buy a ten pack of discount GPU chips.
Hallucinating a new unpredictable result every time will never make any sense for work that even slightly matters.
But this test is still super fucking cool. I can think of half a dozen novel, valuable ways to apply it for real-world use. Of course, the reason I can think of those is that I’m an actual expert in computers.
Finally - I keep noticing that the biggest AI apologists I meet tend to be people who aren’t experts in computers, and are tired of their “million dollar” secret idea being ignored by actual computer experts.
I think it is great that the barrier of entry is going down for building each unique million dollar idea.
For the ideas that turn out to actually be market viable, I look forward to collaborating with some folks in exchange for hard cash, after the AI runs out of lucky guesses.
If we can’t make an equitable deal, I look forward to spending a few weeks catching up to their AI start-up proof-of-concept, and then spending 5 years courting their customers to my new solution using hard work and hard earned decades of expert knowledge.
This cool AI stuff does change things, but it changes things far less than the tech bros hope you will believe.
The conclusion of this experiment is objectively wrong when generalized. At work, to my disappointment, we have been trying for years to make this work, and it has been failure after failure (and I wish we’d just stop, but eventually we moved to more useful stuff like building tools adjacent to the problem, which is honestly the only reason I stuck around).
There are several reasons why this approach cannot succeed:
- The outputs of LLMs are nondeterministic. Most problems require determinism. For example, REST API standards require idempotency from some kinds of requests, and an LLM without a fixed seed and a temperature of 0 will return different responses at least some of the time (there’s a sketch of this point right after the list).
 - Most real-world problems are not simple input-output machines. When calling, let’s say for example, an API to post a message to Lemmy, that endpoint does a lot of work. It needs to store the message in the database, federate the message, and verify that the message is safe. It also needs to validate the user’s credentials before all of this, and it needs to record telemetry for observability purposes. LLMs are not able to do all this. They might, if you’re really lucky, be able to generate code that does this, but a single LLM call can’t do it by itself.
 - Some real-world problems operate on unbounded input sizes. Context windows are bounded and, as currently designed, cannot handle unbounded inputs. See signal processing for an example of this, and for an example of a problem an LLM cannot solve because it cannot even receive the input.
 - LLM outputs cannot be deterministically improved. You can make changes to prompts and so on but the output will not monotonically improve when doing this. Improving one result often means sacrificing another result.
 - The kinds of models you want to run are not in your control. Using Claude? OK, Anthropic updated the model, and now your outputs have all changed and you need to update your prompts again. This fucked us over many times.
 
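To make the first point concrete, here is a minimal sketch assuming an OpenAI-style chat API (the model name and prompt are placeholders, not anything from the article): even with every knob turned toward determinism, reproducibility is only best-effort.

```python
# Sketch: even with temperature=0 and a fixed seed, identical prompts are only
# *best-effort* reproducible -- the provider does not guarantee identical output.
# Assumes the OpenAI Python SDK and an OPENAI_API_KEY in the environment.
from openai import OpenAI

client = OpenAI()

def ask(prompt: str) -> str:
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # example model, not the one from the article
        messages=[{"role": "user", "content": prompt}],
        temperature=0,        # remove sampling randomness
        seed=42,              # best-effort reproducibility only
    )
    return resp.choices[0].message.content

a = ask("Return the JSON response for POST /comments with body {'text': 'hi'}")
b = ask("Return the JSON response for POST /comments with body {'text': 'hi'}")
print(a == b)  # not guaranteed to be True -- a problem for idempotent endpoints
```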
The list keeps going on. My suggestion? Just don’t. You’ll spend less time implementing the thing than trying to get an LLM to do it. You’ll save operating expenses. You’ll be less of an asshole.
The future is here! And it costs $10-$50 per 1000 HTTP requests.
Yes, it sounds ridiculous, but how does that ratio change once we take into account the cost of hiring a programmer and the cost of implementing a niche feature, versus getting what this experiment provides for the cost of LLM inference?
Also: we can cache and reuse endpoint implementations.
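Roughly what I mean, as a hypothetical sketch (llm_generate is a stand-in, not the experiment’s actual code): pay for inference once per endpoint and reuse the result afterwards.

```python
# Hypothetical "cache and reuse the endpoint implementation" sketch: only the
# first request to a given endpoint pays for LLM inference; later requests
# reuse the stored result. llm_generate() stands in for the real model call.
from typing import Dict, Tuple

_cache: Dict[Tuple[str, str], str] = {}

def llm_generate(method: str, path: str) -> str:
    # placeholder for the real (expensive) LLM call
    return f"<generated implementation for {method} {path}>"

def handle(method: str, path: str) -> str:
    key = (method, path)
    if key not in _cache:   # cache miss: pay for inference once
        _cache[key] = llm_generate(method, path)
    return _cache[key]      # cache hit: no inference cost

# Caveat: this only helps where the output does not depend on the request body
# or on server-side state, which rules out most real endpoints.
```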
Play tic-tac-toe a few times against ChatGPT. I wouldn’t trust an LLM that can’t win tic-tac-toe against four-year-olds with production code 🤣
The cost of an HTTP request with a normal web server is fractions of a penny, perhaps even less.
$50 for 1000 requests is $5 per request. Per request. One page load on Lemmy can be 100 requests.
Your company is bankrupt in 24 hours.
Yes, it’s much cheaper to hire a guy to create a feature than it is to _have an LLM hallucinate a new HTTP response in real time_ each time a browser sends a packet to your webserver.
And from a .ml user too, I’d like to think you’d see through this LLM horseshit, brother. It’s a capitalist mind trap, they’re creating a religion around it to allow magical thinking to drive profits.
$50 for 1000 requests is $5 per request
me when i use chat gpt to do maths
Considering that most techbro startups are going to be dead within a year, I’d say AI wins.
Plus most of the competent programmers already have high resistance to technobabble bullshit, and will simply refuse to work on something like an online contacts app (are you copying Facebook or what?)
I like writing code myself; it’s a process I enjoy. If the LLM writes it for me, then I would only do the worse part of the job: debugging. Also, for many people, letting the AI write the code means less understanding. Otherwise you could have written it yourself. However, there are things where the AI is helpful, especially for writing tests in a restrictive language such as Rust. People forget that writing the code is one part of the job; the other is to depend on it, debug it, and build other stuff on top.
However, there are things where the AI is helpful, especially for writing tests in a restrictive language such as Rust.
For generating the boilerplate surrounding it, sure.
But the contents of the tests are your specification. They’re the one part of the code where you should be thinking about what needs to happen, and they should be readable.

A colleague at work generated unit tests and it’s the stupidest code I’ve seen in a long while, with all imports repeated in each test case, as well as tons of random assertions also repeated in each test case, like some shotgun approach to regression testing.
It makes it impossible to know which parts of the asserted behaviour are actually intended and which parts just got caught in the crossfire.

I think maybe the biggest conceptual mistake in computer science was calling them “tests”.
That word has all sorts of incorrect connotations to it:
- That they should be made after the implementation
 - That they’re only useful if you’re unsure of the implementation
 - That they should be looking for deviations from intention, instead of giving you a richer palette with which to paint your intention
 
You get this notion of running off to apply a ruler and a level to some structure that’s already built, adding notes to a clipboard about what’s wrong with it.
You should think of it as a pencil and paper — a place where you can be abstract, not worry about the nitty-gritty details (unless you want to), and focus on what would be right about an implementation that adheres to this design.
Like “I don’t care how it does it, but if you unmount and remount this component it should show the previous state without waiting for an HTTP request”.
Very different mindset from “Okay, I implemented this caching system, now I’m gonna write tests to see if there are any off-by-one errors when retrieving indexed data”.
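For illustration only, with a made-up CachedStore and pytest-style assertions, the difference in mindset looks something like this: the test states the guarantee, not the mechanism.

```python
# Illustrative only: a hypothetical cache-backed store, tested as a specification.
# The test says *what* must hold ("a remount serves the previous state without a
# new fetch"), not how the cache is implemented internally.
class FakeBackend:
    def __init__(self):
        self.calls = 0

    def fetch(self, key):
        self.calls += 1
        return f"value-for-{key}"

class CachedStore:  # hypothetical implementation under test
    def __init__(self, backend):
        self.backend = backend
        self._cache = {}

    def get(self, key):
        if key not in self._cache:
            self._cache[key] = self.backend.fetch(key)
        return self._cache[key]

def test_remount_serves_previous_state_without_refetch():
    backend = FakeBackend()
    store = CachedStore(backend)
    first = store.get("profile")   # initial mount: one fetch is allowed
    second = store.get("profile")  # "remount": must not hit the backend again
    assert second == first
    assert backend.calls == 1
```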
I think that, very often, writing tests after the impl is worse than not writing tests at all. Cuz unless you’re some sort of wizard, you probably didn’t write the impl with enough flexibility for your tests to be flexible too. So you end up with brittle tests that break for bad reasons and reproduce all of the same assumptions that the impl has.
You spent extra time on the task, and the result is that when you have to come back and change the impl you’ll have to spend extra time changing the tests too. Instead of the tests helping you write the code faster in the first place, and helping you limit your tests to only what you actually care about keeping the same long-term.
It’s actually the first time I’ve tried AI-assisted unit test creation. There were multiple iterations, and sometimes it didn’t work well. And the most important part is, as you say, to think through and read every single test case and edit or replace it if necessary. Some tests are really stupid, especially stuff that is already encoded in Rust’s type system. I mean, you still need a head for revision and you need to know what you want to do.
I still wonder if I should have just given it the function signature without the inner workings of the function. That’s an approach I want to explore next time. I really enjoyed working with it for the tests, because writing tests is very time-consuming. Although I am not much of a test guy, so maybe the results aren’t that good anyway.
Edit: In about 250 unit tests (which sadly do not cover all functions) for a CLI JSON-based tool, several bugs were found thanks to this approach. I wouldn’t have written them manually.

But you can do all this anyway; this isn’t new or groundbreaking.
You can load up Claude Code in a completely empty directory and tell it to build something, and it will do it. It’ll do it slowly and most of the time incorrectly, but it’ll eventually build “something” that will sort of work. Unless I’m still waiting for my coffee to kick in and I’m missing something here, companies already do this. Hell, a lot of my current clients do this: no code, nothing to base anything off, they just tell Claude an idea for a project and to build it.
Yeah so what I’m getting from the description is that this LLM doesn’t generate code, at all.
This feeds HTTP traffic directly to an LLM that is prompted how to respond to those requests.
This isn’t an LLM being served prompts to write code to create an HTTP server; the model’s output IS the HTTP server. The model itself is being the webserver, instead of being an autocomplete for an IDE.
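Roughly, the shape of the thing is something like this minimal sketch (assuming an OpenAI-style API; the system prompt and model name are my own placeholders, not the author’s actual setup):

```python
# Minimal sketch of "the model IS the webserver": every incoming HTTP request is
# forwarded to an LLM, and whatever text it returns is sent back as the response.
from http.server import BaseHTTPRequestHandler, HTTPServer
from openai import OpenAI

client = OpenAI()

SYSTEM = ("You are a web application. Given a raw HTTP request, "
          "reply with only the body of an appropriate HTTP response.")

class LLMHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        self._respond()

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        self._respond(self.rfile.read(length).decode(errors="replace"))

    def _respond(self, body: str = ""):
        # Hand the raw request text to the model and use its output verbatim.
        raw = f"{self.command} {self.path} HTTP/1.1\n{self.headers}\n{body}"
        resp = client.chat.completions.create(
            model="gpt-4o-mini",  # example model
            messages=[{"role": "system", "content": SYSTEM},
                      {"role": "user", "content": raw}],
        )
        payload = resp.choices[0].message.content.encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

if __name__ == "__main__":
    HTTPServer(("127.0.0.1", 8080), LLMHandler).serve_forever()
```

Every request pays full inference cost and latency, and nothing guarantees that two identical requests get the same response back.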
The author seems to acknowledge that “the future where it’s just us and our LLMs and intent, no code and no apps” is “science fiction” but he wanted to see how close we could get with today’s tech.
Thanks for making this clear. Certainly a fun little experiment, but the sheer inefficiency of the whole thing just boggles the mind. Hopefully this is not the direction tech is going, though; it’s not like we should curb our energy needs anyway…
Ah ok, thanks for explaining, that makes sense. Yeah, I clearly missed it.
Kinda reminds me of this: I built the most expensive CPU ever! (Every instruction is a prompt)







