You either have a backup or will have a backup next time.
Something that is always online and can be wiped while you’re working on it (by yourself or with AI, doesn’t matter) shouldn’t count as backup.
AI or not, I feel like everybody has had “the incident” at some point. After that, you obsessively keep backups.
For me it was my entire “Junior Project” in college, which was a music album. My Windows install (Vista at that time - I know, Vista was awful, but it was the only thing that would utilize all 8 GB of my RAM because x64 XP wasn’t really a thing) bombed out, and I was like “no biggie, I keep my OS on one drive and all of my projects on the other, I’ll just reformat and reinstall Windows.”
Well… I had two identical 250 GB drives and formatted the wrong one.
Woof.
I bought an unformat tool that was able to recover almost everything, but I lost all of my folder structure and file names. It was just like 000001.wav, 000002.wav etc. I was able to re-record and rebuild but man… Never made that mistake again. Like I said, I now obsessively back up. Stacks of drives, cloud storage, drives in different locations etc.
The developer is to blame. Using a cutting edge tool irresponsibly. I have made mistakes using AI to help coding as well, never this bad though. Blaming AI would be like blaming the hammer a roofer was using to hammer nails and slamming their finger accidentally with it. You don’t blame the hammer, you blame the negligence of the roofer.
Nobody wants to point out that Alexey Grigorev changes to being named Gregory after 2 paragraphs?
Slop journalism at its sloppiest. I wouldn’t be surprised to find out that this story was entirely fabricated.
That’s it, Son of Anton is banned.
The lesson: AI cannot bridge an air-gapped backup. This could all have been prevented with a crappy portable hard drive from Costco.
I am still unable to delete the backup. Trying *nuke tool*.
[Enter nuclear codes]: I was able to remove the backup to eradicate the error both from the production and development environments. But wait a second, the user specified not to lose data. But I just eliminated all versions of the data. The user won’t be happy. Oopsie whoopsie!
A developer having the ability to accidentally erase your production db is pretty careless.
An AI agent having the ability to “accidentally” erase your production db is fucking stupid as all fuck.
An AI agent having the ability to accidentally erase your production db and somehow also all the backup media? That requires a special course on complete dribbling fuckwittery.
<insert Padme meme>: You had a backup, right?
But AI is a good thing! /s
AI is like a circular saw. Are circular saws useful?
Of course.
Can you cut your entire hand off if you don’t use it correctly? Absolutely.
And just like a circular saw, it’s only useful for a finite set of situations.
Sure — as with every tool. Hammers are great for many things, but don’t do all that well driving screws. Money is one of the most used tools humans have ever devised, but you can’t use it for everything.
AI in coding may only be good for a finite set of situations, but that set is massive. You’re dealing with formal languages that can be mechanically checked for correctness (in the sense that they will generate a working program, and not in the sense that the program will in fact function the way the user intends). This is a less open-ended scenario than something like an AI generated video, and so it’s easier for AI to excel at it, especially for non-novel algorithms.
But if you use it like an idiot, you’re going to get burned — and this guy was an idiot who doesn’t understand what he’s doing, or the tools researchers in software development have made over the last few decades. AI shouldn’t be touching your production environment — at all. And it shouldn’t have to — code needs to be stored in a versioning source repository of some sort (and backed up so you are unlikely to ever lose it), deployment needs to be fully scripted and should be able to rebuild your environments from scratch (from code right to production), and developers and development tools (like AI tools) should only have access to development environments, and not production environments.
So unless you’re a total dumbass, an AI agent (or even a shitty human developer) should never have the kind of access to do what happened here. They violated some pretty basic principles of software development, and got burned. This guy sawed his own hand off because he misused the tools to take a bunch of shortcuts, without building in any backups or reproducibility. The AI isn’t the proximal fault here: trusting it when you have no way to reproduce your environment when things go wrong is the problem, and that’s 100% on the human sitting at the keyboard (PEBKAC).
Filters out the biggest fools it seems.
Stop giving chat bots tools with this kind of access.
Wrong answer. If you don’t give them access, the alternative (ruling out not using AI because leadership will never go for that) is to hire high school kids to take a task from a manager, ask the ai to do it, then do what the AI says repeatedly to iterate to the solution. The problem with that alt is that it is no better than giving the ai access, and it leaves you with no senior tech people. Instead, you give it access, but only give senior tech people access to the AI. Ones who would know to tell the AI to have a backup of the database, one designed to not let you delete it without multiple people signing off.
Senior tech people aren’t going to spend their time trying things an AI needs tried to find the solution. So if you don’t give it access, they won’t use it, and eventually they will all be gone. Then you are even further up shit creek than you are now.
The answer overall is smarter people talking to the AI, and guardrails to stop a single point of failure. The latter is nothing new.
What is this insane rambling?
The alternative is that the only thing with access to make changes in your production environment is the CI pipeline that deploys your production environment.
Neither the AI, nor anything else on the developer’s machine, should have access to make production changes.
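To make that concrete, here is a rough Python sketch of the idea, not anyone’s actual setup. The names `database_url`, `DEV_DATABASE_URL` and `PROD_DATABASE_URL` are made up for illustration, and it assumes the pipeline sets `CI=true` (GitHub Actions and GitLab CI do). A check like this is only a seatbelt; the real control is that production credentials never exist on a developer machine in the first place.

```python
import os


class ProductionAccessError(RuntimeError):
    """Raised when something outside the CI pipeline asks for prod."""


def database_url(target, environ=None):
    """Return a connection URL, refusing prod unless running in CI.

    All variable names here are hypothetical placeholders.
    """
    env = os.environ if environ is None else environ
    if target == "dev":
        return env.get("DEV_DATABASE_URL", "postgresql://localhost/dev")
    if target == "prod":
        if env.get("CI") != "true":
            raise ProductionAccessError(
                "prod is only reachable from the CI pipeline"
            )
        # Only the pipeline's secret store should ever define this.
        return env["PROD_DATABASE_URL"]
    raise ValueError(f"unknown target: {target!r}")
```

On a laptop, `database_url("prod")` raises instead of handing anything (human or agent) a production connection.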
What are you even talking about?
The answer is no AI. It’s really simple. The costs for ai are not worth the output.
Nah. As a tech people, I am not going to give an llm write access to anything in production, period
I’m in favour of hiring kids to figure out the solution through iteration and doing web searches etc. If they fuck up, then they learn and eventually become better at their job - maybe even becoming a Senior themselves eventually.
I get what you’re saying - Seniors are more likely to use the tools more effectively, but there are many cases of the AI not doing what it’s told. It’s not repeatably consistent like a bash script.
People are better - always.
No risk, no reward. People are desperate for these tools to help them success.
Success bigly, even.
AI or not, this is on the person who gave it prod access. I don’t care if the dev was running CC in yolo mode, not paying attention to it, or CC went completely rogue. Why would you give it prod access? This is human error.
We used to say RAID is not a backup. It’s redundancy.
Snapshots are not a backup. They’re system restore points.
Only something offsite, off system and only accessible with separate authentication details is a backup.
3-2-1 Backup Rule: Three copies of data on two different types of storage media, with one copy offsite.
AND something tested to restore successfully, otherwise it’s just unknown data that might or might not work.
(i.e. reinforcing your point, no disagreements)
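For anyone who wants the rule as a checklist, here is a small Python sketch. It assumes the live data counts as one of the three copies, and `BackupCopy` and `check_3_2_1` are names invented for this example, not part of any backup tool. The restore-tested flag is the extra condition from the reply above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupCopy:
    name: str
    media: str            # e.g. "ssd", "hdd", "tape", "cloud"
    offsite: bool
    restore_tested: bool  # has a restore from this copy ever succeeded?


def check_3_2_1(copies):
    """Return a list of problems; an empty list means the rule is met."""
    problems = []
    if len(copies) < 3:
        problems.append(f"only {len(copies)} copies, need 3")
    if len({c.media for c in copies}) < 2:
        problems.append("all copies are on the same type of media")
    if not any(c.offsite for c in copies):
        problems.append("no offsite copy")
    untested = [c.name for c in copies if not c.restore_tested]
    if untested:
        problems.append("never restore-tested: " + ", ".join(untested))
    return problems
```

A live database plus a snapshot on the same disk fails every one of these checks.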
AKA Schrödinger’s Backup. Until you have successfully restored from a backup, it is just an amorphous blob of data that may or may not be valid.
I say this as someone who has had backups silently fail. For instance, just yesterday, I had a managed network switch generate an invalid config file for itself. I was making a change on the switch, and saved a backup of the existing settings before changing anything. That way I could easily reset the switch to default and push the old settings to it, if the changes I made broke things. And like an idiot, I didn’t think to validate the file (which is as simple as pushing the file back to the switch to see if it works) before I made any changes.
Sure enough, the change I made broke something, so I performed a factory reset and went to upload that backup I had saved like 20 minutes prior… When I tried to restore settings after the factory reset, the switch couldn’t read the file that it had generated like 20 minutes earlier.
So I was stuck manually restoring the switch’s settings, and what should have been a quick 2 minute “hold the reset button and push the settings file once it has rebooted” job turned into a 45 minute long game of “find the difference between these two photos” for every single page in the settings.
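The “validate it before you touch anything” step is easy to automate when the thing being backed up is a database rather than a switch. A minimal Python sketch, using SQLite purely as a stand-in (the function names are made up, and a row-count comparison is a weak check, not proof that every byte survived):

```python
import sqlite3


def table_counts(conn):
    """Map each user table to its row count (skips SQLite's internal tables)."""
    names = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%'"
        )
    ]
    return {
        name: conn.execute(f'SELECT COUNT(*) FROM "{name}"').fetchone()[0]
        for name in names
    }


def backup_and_verify(source_path, backup_path):
    """Back up a SQLite file, then reopen the copy and check it is usable."""
    source = sqlite3.connect(source_path)
    try:
        target = sqlite3.connect(backup_path)
        try:
            source.backup(target)  # online backup API, Python 3.7+
        finally:
            target.close()

        # Reopen the backup from disk, the same way a real restore would.
        restored = sqlite3.connect(backup_path)
        try:
            intact = (
                restored.execute("PRAGMA integrity_check").fetchone()[0] == "ok"
            )
            return intact and table_counts(restored) == table_counts(source)
        finally:
            restored.close()
    finally:
        source.close()
```

Only start changing the original once this returns `True`.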
That’s always just one of the worst feelings in the world. This thing is supposed to work and be easy and… nope. Not there. It’s gone. Now you have work to do. heh
Always a fun time when technology decides to just fuck you over for no reason
But the backup software verified the backup!
Schrödinger’s backup
Fukan yes
- D\L all assets locally
- proper 3-2-1 of local machines
- duty roster of other contributors with same backups
- automate and have regular checks as part of production
- also sandbox the stochastic parrot
An LTO drive with a non-consumer interface?
We still say that.
I remember back when I first started seeing a DR plan with three tiers of restore: 1 hour, 12 hours or 72 hours. I knew that 1 hour meant a simple redirect to a DB partition that was a real time copy of the active DB, and twelve hours meant that failed, so the twelve hours was a restore point exercise that would mean some data loss, but less than one hour, or something like that.
I had never heard of 72 hours and so raised a question in the meeting. 72 hours meant having physical tapes shipped to the data center, and I believe meant up to 12 (though it could have been 24) hours of data lost. I was impressed by this, because the idea of having a job that ran either daily or twice daily that created tape backups was completely new to me.
This was in the early aughts. Not sure if tapes are still used…
but should serve as a cautionary tale.
Jesus, there’s a headline like this every month. How many tales do people need to learn???
I am approaching caution critical mass.
Once the threshold is hit, I buy some solar panels and become an off grid farmer.
Caution Threshold!
Have you met software? Nearly all of it is a cautionary tale. Even before AI. So this is just business as usual for the software industry.
Whoever gave it access to production is a complete moron.
If you’ve ever used it you can see how easily it can happen.
At first you sandbox it and you’re careful. Then after a while the sandbox is a bit of a pain so you just run it as is. Then it asks for permission a thousand times to do something, and at first you carefully check each command, but after a while you just skim them and eventually, sure, you can run ‘psql *’ to debug some query on the dev instance…
It’s one of the major problems with the “full self driving” stuff as well. It’s right often enough that eventually you get complacent or your attention drifts elsewhere.
This kind of stuff happened before the LLM coding agents existed, they have just supercharged the speed and as a result increased the amount of damage that can be done before it’s noticed.
A bunch of failures already have to be in place for something like this to happen: having the prod credentials available, etc. It’s just that now, instead of rolling the dice every couple of weeks, your LLM is rolling them every 20s.
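One way to resist the slide from “check every command” to a blanket ‘psql *’ is to auto-approve only exact, boring commands and make everything else either a hard no or a human decision. A rough Python sketch; the policy lists and the `review` function are invented for this example and are not any agent’s real permission system, and a keyword denylist like this is easy to evade, so treat it as an illustration only:

```python
import re
import shlex

# Hypothetical policy: exact command prefixes that are safe to auto-approve.
AUTO_APPROVE = [("git", "status"), ("git", "diff"), ("ls",)]
# Words that should never be waved through without a human reading them.
DESTRUCTIVE_WORDS = {"drop", "truncate", "delete", "destroy", "rm"}


def review(command):
    """Return 'allow', 'ask' or 'deny' for a command an agent wants to run."""
    try:
        tokens = shlex.split(command)
    except ValueError:  # unbalanced quotes and similar
        return "deny"
    if not tokens:
        return "deny"
    # Look inside quoted arguments too, e.g. the SQL passed to psql -c.
    words = set(re.findall(r"[a-z]+", command.lower()))
    if words & DESTRUCTIVE_WORDS:
        return "deny"
    for prefix in AUTO_APPROVE:
        if tuple(tokens[: len(prefix)]) == prefix:
            return "allow"
    return "ask"  # default: a human looks at it every time
```

The design choice that matters is the last line: anything not explicitly listed falls back to asking, so there is no wildcard to get complacent about.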
If you’ve ever used it you can see how easily it can happen.
How could this happen easily? A regular developer shouldn’t even have access to production outside of exceptional circumstances (e.g. diagnosing a production issue). Certainly not as part of the normal dev process.
They shouldn’t, and we know that, but this is hardly the first time that story has been told, even before LLMs. Usually it was blamed on “the intern” or whatever.
This isn’t just an issue with a developer putting too much trust into an LLM though. This is a failure at the organizational level. So many things have to be wrong for this to happen.
If an ‘intern’ can access a production database then you have some serious problems. No one should have access to that in normal operations.
Sure, I’m not telling you how it should be, I’m telling you how it is.
The LLM just increases the damage done because it can do more damage faster before someone figures out they fucked up.
This is the last big one I remembered offhand but I know it happens a couple times a year and probably more just goes unreported.
https://www.cnn.com/2021/02/26/politics/solarwinds123-password-intern
Why would an intern be given prod supply chain credentials, who knows. People fuck up all the time.
If you’ve ever used it you can see how easily it can happen.
Yes, I can see how it can easily happen to stupid lazy people.
Just a freak accident. Maybe next time, give it more permissions so it can fix any problems that occur. 😉
/s ?
Whether human, AI, or code, you don’t give a single entity this much power in production.
It’s why there are two keys to launch nukes.
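The software version of two keys is a two-person rule on destructive actions. A minimal Python sketch, assuming two distinct approvers who are not the requester; `DestructiveRequest` is a made-up name for illustration:

```python
from dataclasses import dataclass, field


@dataclass
class DestructiveRequest:
    description: str
    requested_by: str
    approvals: set = field(default_factory=set)

    def approve(self, approver):
        """Record an approval; the requester can never approve themselves."""
        if approver == self.requested_by:
            raise PermissionError("requester cannot approve their own request")
        self.approvals.add(approver)  # a set, so repeat approvals count once

    def may_execute(self, required=2):
        """True only once enough distinct people have signed off."""
        return len(self.approvals) >= required
```

Whether the requester is a developer, a script or an AI agent makes no difference: nothing runs until `may_execute()` is true.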
WOPR disagrees: