Interesting read, but I’m perhaps even less convinced than before that chat-oriented programming is a better use of my time than learning to write the thing myself. Having to worry about token quotas is especially frightening; I don’t have to spend any money to think about a problem for as long as it requires, but each “thought” you make the LLM agent do is cents if not dollars up in flames.
I’ll continue to play around with self-hosted, local models for now.
as Google’s L1 Cache
Idiot does not know what a cache is.
Thanks for this thoughtful write-up of your process. I’m increasingly thinking about what context the model has and keeping it as focused as possible - both to reduce token usage and to ensure there’s no cruft in it that could send the model down an unproductive path. The prompts in this post read like what I imagine a conversation with a junior developer would be when handing off a task.
In practice, this usually means clearing the context after quite small changes and then prompting for the next one with just what I think it is going to need. I guess this is ‘context engineering’, although that sounds like too fancy a term for it.




