
Technology
This is a most excellent place for technology news and articles.
Our Rules
- Follow the lemmy.world rules.
- Only tech related news or articles.
- Be excellent to each other!
- Mod approved content bots can post up to 10 articles per day.
- Threads asking for personal tech support may be deleted.
- Politics threads may be removed.
- No memes allowed as posts, OK to post as comments.
- Only approved bots from the list below; this includes bots using AI responses and summaries. To ask if your bot can be added, please contact a mod.
- Check for duplicates before posting; duplicates may be removed.
- Accounts 7 days and younger will have their posts automatically removed.
Honestly. At this point, after it having happened to multiple people, multiple times, this is the only appropriate response.
Given that the infrastructure description included the DataTalks.Club website, this resulted in a full wipe of the setup for both sites, including a database with 2.5 years of records, and database snapshots that Grigorev had counted on as backups. The operator had to contact Amazon Business support, which helped restore the data within about a day.
Non-story. He let Terraform zap his production site without offsite backups, but then support restored it all.
I'd be more alarmed that a 'destroy' command is reversible.
Never assume anything is gone when you hit delete.
For technical reasons, you never immediately delete records, as physically deleting data is computationally expensive; systems typically soft-delete and purge later.
For business reasons, you never want to delete anything at all, because data = money.
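The usual pattern behind both points is a soft delete: rows get flagged as deleted rather than physically removed, and a background job purges them later. A minimal sketch with SQLite (the table and column names are just for illustration):

```python
import sqlite3

# Hypothetical schema: a "deleted_at" column marks rows as removed
# without physically deleting them (a "soft delete").
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE records (id INTEGER PRIMARY KEY, body TEXT, deleted_at TEXT)"
)
conn.executemany("INSERT INTO records (body) VALUES (?)", [("a",), ("b",), ("c",)])

# "Deleting" record 2 is a cheap UPDATE, not an expensive physical delete,
# and the data is still there if someone needs it back.
conn.execute("UPDATE records SET deleted_at = datetime('now') WHERE id = 2")

# Live queries simply filter out soft-deleted rows.
live = conn.execute("SELECT id FROM records WHERE deleted_at IS NULL").fetchall()
print(live)  # [(1,), (3,)]
```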
Whoever did this was incredibly lazy. Why are you using an agent to run your Terraform commands for you in the first place if it's not part of some automation? You're saving yourself, what, 15 seconds tops? You deserve this kind of thing for being like this.
Yeah, and to do that without some sort of DR in place is peak hubris.
We used to say RAID is not a backup. It's redundancy.
Snapshots are not a backup. They're a system restore point.
Only something offsite, off-system, and only accessible with separate authentication details, is a backup.
AND something tested to restore successfully, otherwise it's just unknown data that might or might not work.
(i.e. reinforcing your point, no disagreements)
AKA Schrödinger’s Backup. Until you have successfully restored from a backup, it is just an amorphous blob of data that may or may not be valid.
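The only way out of that superposition is to actually run the restore and check the result. A minimal sketch of the idea in Python, using a local tarball as a stand-in for whatever backup mechanism you actually use (paths and file names are illustrative):

```python
import hashlib
import tarfile
import tempfile
from pathlib import Path

def sha256(path: Path) -> str:
    return hashlib.sha256(path.read_bytes()).hexdigest()

work = Path(tempfile.mkdtemp())
src = work / "data"
src.mkdir()
(src / "records.txt").write_text("2.5 years of records")

# 1. Take the backup.
backup = work / "backup.tar.gz"
with tarfile.open(backup, "w:gz") as tar:
    tar.add(src, arcname="data")

# 2. Actually restore it somewhere else -- this is the step most setups skip.
restore = work / "restore"
with tarfile.open(backup, "r:gz") as tar:
    tar.extractall(restore)

# 3. Verify the restored copy byte-for-byte against the original.
restored = restore / "data" / "records.txt"
ok = sha256(restored) == sha256(src / "records.txt")
print("backup verified:", ok)
```

Doing this on a schedule, against the offsite copy, is what turns "an amorphous blob of data" into a backup.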
I say this as someone who has had backups silently fail. For instance, just yesterday, I had a managed network switch generate an invalid config file for itself. I was making a change on the switch, and saved a backup of the existing settings before changing anything. That way I could easily reset the switch to default and push the old settings to it, if the changes I made broke things. And like an idiot, I didn’t think to validate the file (which is as simple as pushing the file back to the switch to see if it works) before I made any changes.
Sure enough, the change I made broke something, so I performed a factory reset and went to upload that backup I had saved like 20 minutes prior… When I tried to restore settings after the factory reset, the switch couldn’t read the file that it had generated like 20 minutes earlier.
So I was stuck manually restoring the switch’s settings, and what should have been a quick 2 minute “hold the reset button and push the settings file once it has rebooted” job turned into a 45 minute long game of “find the difference between these two photos” for every single page in the settings.
3-2-1 Backup Rule: three copies of your data, on two different types of storage media, with one copy offsite.
You either have a backup or will have a backup next time.
Something that is always online and can be wiped while you're working on it (by yourself or with AI, doesn't matter) shouldn't count as backup.
AI or not, I feel like everybody has had "the incident" at some point. After that, you obsessively keep backups.
For me it was my entire "Junior Project" in college, which was a music album. My Windows install (Vista at that time - I know, Vista was awful, but it was the only thing that would utilize all 8gb of my RAM because x64 XP wasn't really a thing) bombed out, and I was like "no biggie, I keep my OS on one drive and all of my projects on the other, I'll just reformat and reinstall Windows"
Well... I had two identical 250gb drives and formatted the wrong one.
Woof.
I bought an unformat tool that was able to recover mostly everything, but I lost all of my folder structure and file names. It was just like 000001.wav, 000002.wav etc. I was able to re-record and rebuild but man... Never made that mistake again. Like I said, I now obsessively back up. Stacks of drives, cloud storage, drives in different locations, etc.
He did have a backup. This is why you use cloud storage.
The operator had to contact Amazon Business support, which helped restore the data within about a day.
We don't need cautionary tales about how drinking bleach caused intestinal damage.
The people needing the caution got it in spades and went off anyway.
Or maybe the cautionary tale is to take caution dealing with the developers in question, as they are dangerously inept.
Yeah this is beyond ridiculous to blame anything or anyone else.
I mean, accidentally letting loose an autonomous, untested, unguardrailed tool in my dev environment... well, tough luck, shit happens, something for a good post mortem to learn from.
Having an infrastructure that allowed a single actor to cause this much damage? This shouldn't even be possible for a malicious human from within the system, let alone this easily.
"and database snapshots that Grigorev had counted on as backups" -- yes, this is exactly how you run "production".
Whether human, AI, or code, you don't give a single entity this much power in production.
This keeps happening. I can understand using AI to help code, I don't understand Claude having so much access to a system.
At least you had backup, right?
Oh, yeah, that's right. You were dumb enough to give AI full access to your production system so likely you're dumb enough to not have backups of anything either.
I take it Claude has full access to all of your git repositories as well so that it could wipe those too?
You got what you deserve
Anyone who lets AI do this is absolutely inept, lazy, or deserving.
In its default configuration, it stops at EVERY STEP. Do you want to run this command, do you want to update this file, here's the file I want to modify and the patch I'm going to use, with adds and deletes in green and red.
If you're using it in unsafe permissions mode, click yeah sure allow Claude to run whatever the fuck it wants in this directory, or just hitting yeah sure go ahead every time, it's your own damn fault.
It's self-driving for the terminal. Don't you dare take your eyes off the road or hands off the wheel.
Whoever gave it access to production is a complete moron.
If you've ever used it you can see how easily it can happen.
At first you sandbox it and you're careful. Then after a while the sandbox is a bit of a pain, so you just run it as is. Then it asks for permission a thousand times to do something, and at first you carefully check each command, but after a while you just skim them and eventually, sure, you can run 'psql *' to debug some query on the dev instance....
It's one of the major problems with the "full self driving" stuff as well. It's right often enough that eventually you get complacent or your attention drifts elsewhere.
This kind of stuff happened before the LLM coding agents existed, they have just supercharged the speed and as a result increased the amount of damage that can be done before it's noticed.
There are already a bunch of failures in place for something like this to happen: prod credentials lying around, and so on. It's just that now, instead of rolling the dice every couple of weeks, your LLM is rolling them every 20 seconds.
According to mousetrap manufacturers, putting your tongue on a mousetrap causes you to become 33% sexier, taller and win the lottery twice a week.
While some experts have urged caution, warning that it may cause painful swelling, bleeding, injury, and distress, and that the benefits are yet to be proven, affiliated marketers all over the world paint a different, sexier picture.
However, it is not working out for everyone. Gregory here put his tongue in the mousetrap the wrong way and suffered painful swelling, bleeding, injury and distress while not getting taller or sexier.
Gregory considers this a learning experience, and hopes this will serve as a cautionary tale for other people putting their tongue on mousetraps: from now on he will use the newest extra-strength mousetrap and take precautions, like Hoping Really Hard that it works, when putting his tongue in the mousetrap.
Remember when Gemini got caught in a loop of self-loathing and nuked itself?

Mistakes happen. But how do you go 2.5 years without proper backups?
It’s so easy. I can’t tell you how many “backed up” environments I’ve run into that simply cannot be restored. Often people set them up, but never test them, and assume the snaps are working.
Backups are typically only thought about when you need them, and by then it’s often too late. Real backups need testing and validation frequently, they need remote, off-site storage, with a process to restore that as well.
Been doing this shit for 30 years and people will never learn. I’d guess 9 out of 10 backup systems that I’ve run into were there to check a box on an audit, and never looked at otherwise.
have you heard of not giving the keys to your wacky robot wizard instead
Jesus Christ people. Terraform has a plan output option to allow for review prior to an apply. It's trivial to make a script that'll throw the json output into something like terraform visual if you don't like the diff format.
I've fucked up stuff with Terraform, but just once before I switched to a rudimentary script to force a pause, review, and then apply.
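Terraform's plan output makes that pause-and-review step easy to script. The sketch below parses the JSON from `terraform show -json plan.tfplan` and lists every resource the plan would destroy; the resource addresses in the sample are made up, but the `resource_changes` / `change.actions` fields are the real plan JSON shape:

```python
import json

def destroys_in_plan(plan_json: str) -> list[str]:
    """Return addresses of resources the plan would destroy.

    Expects the output of `terraform show -json plan.tfplan`.
    """
    plan = json.loads(plan_json)
    return [
        rc["address"]
        for rc in plan.get("resource_changes", [])
        if "delete" in rc["change"]["actions"]
    ]

# Minimal sample of the plan JSON, trimmed to the fields used above.
sample = json.dumps({
    "resource_changes": [
        {"address": "aws_db_instance.prod", "change": {"actions": ["delete"]}},
        {"address": "aws_s3_bucket.assets", "change": {"actions": ["update"]}},
    ]
})
doomed = destroys_in_plan(sample)
print(doomed)  # ['aws_db_instance.prod']
```

A wrapper script can refuse to run `terraform apply plan.tfplan` (or demand an explicit confirmation) whenever this list is non-empty, which is exactly the forced pause described above.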
Don't worry, review was done by an LLM as well. ;)
My CTO keeps telling me I need to try agentic coding, and I keep telling him I won't touch shit until I have an isolated VM to use it in, because I'm not letting some fucking clanker nuke my scripts/documentation/mailbox/whatever for no reason.
Too bad there's never any free time to set that shit up. Oh damn........
Good. Anyone foolish enough to write code with a slop machine produces only slop. That garbage should've been deleted anyway.
That's entirely ignoring the fact that this person didn't have any backups elsewhere.
If you can't think, you can't code.
If your dumb fucking ass let an AI near your work AND you didn't have any recent backups that it couldn't have access to, you're really extra fucking stupid.
I don't feel an inkling of sympathy. Play stupid games, win stupid prizes.
The developer is to blame: using a cutting-edge tool irresponsibly. I have made mistakes using AI to help with coding as well, never this bad though. Blaming the AI would be like blaming the hammer when a roofer accidentally smashes their finger while hammering nails. You don't blame the hammer; you blame the negligence of the roofer.
Given that the infrastructure description included the DataTalks.Club website, this resulted in a full wipe of the setup for both sites, including a database with 2.5 years of records, and database snapshots that Grigorev had counted on as backups. The operator had to contact Amazon Business support, which helped restore the data within about a day.
sigh, SNAPSHOTS ARE NOT BACKUPS!
but should serve as a cautionary tale.
Jesus, there's a headline like this every month. How many tales do people need to learn???
Skill issue
It seems that every few weeks some developer makes this same mistake, and a news story is published each time.