I broke production and now my tech lead says he doesn't trust me

 


So, long story short, I was in charge of writing a data migration script that I had been testing on my local DB. Everything looked like it was working properly, so I moved on to the next step: running the script in a staging environment so that others could check the results. This is where the fuck-up happened. I pasted in the address of the remote DB environment but forgot to change the DB name to the staging name. It just so happens that my local DB has the same name as the production DB, so the script ran against production and corrupted data. Production was down for about 10 hours, but we were able to roll everything back without losing any data. By the way, the script was running from my local testing environment, so dev environments can reach production at this company. There are no safeguards in place.

This is the one and only time I have ever done anything like this, but now my tech lead is acting as if I do this kind of thing constantly. I'm being micromanaged and threatened with a PIP. My tech lead even said to me, "I don't trust you to not do this kind of thing now."

I know this was a careless error on my part, but is this warranted for a mistake like this?


Sarah:


In fact, I'm gonna take it a step further: Blaming yourself is counterproductive. Blameless postmortem culture really does exist, and it really is useful.

I have broken far larger things than OP. Things you have definitely heard of. Like, it's actually possible that everyone here noticed my largest outage.

When that happens, we blame the system. We figure out exactly which flaws in the system allowed me to fuck up the way I did, and then we go fix them.

Because it's much easier to fix automation than it is to fix human behavior. Because there will always be another junior. Because it's stressful enough handling a major outage without fearing for your job. And because if you're afraid for your job, you might try to fix it yourself and hope no one notices, instead of pulling in help immediately.

"your tech lead is also in the right to remove privileges..."

Yes, but from everyone, not from OP.

OP should not be on a PIP, not even threatened with one. OP should be leading the effort to implement the kind of safeguards that would've prevented this issue, because OP now knows better than anyone how you fuck up in this way.
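To make "safeguards" a bit more concrete, here is a minimal sketch of one possible guard: the migration script checks its target before connecting and refuses to touch a protected database unless someone explicitly opts in. Everything named here is an assumption for illustration only. The host `db.example.internal`, the database names `app` and `app_staging`, and the helper `check_target` are hypothetical, not OP's actual setup.

```python
from urllib.parse import urlsplit

# Hypothetical (host, database) pairs that must never be hit by accident.
# The real host and database names are not given in the post.
PROTECTED_TARGETS = {("db.example.internal", "app")}


class ProtectedTargetError(RuntimeError):
    """Raised when a script is pointed at a protected database."""


def check_target(db_url: str, allow_production: bool = False) -> str:
    """Return the database name if db_url is safe to run against.

    Raises ProtectedTargetError when the URL points at a protected
    database and the caller has not explicitly opted in.
    """
    parts = urlsplit(db_url)
    host = parts.hostname or ""
    db_name = parts.path.lstrip("/")
    if (host, db_name) in PROTECTED_TARGETS and not allow_production:
        raise ProtectedTargetError(
            f"Refusing to run against protected database {db_name!r} on {host!r}"
        )
    return db_name


# Same remote host, staging database name: allowed.
check_target("postgresql://user:pw@db.example.internal:5432/app_staging")

# Same remote host, production database name: blocked unless opted in.
try:
    check_target("postgresql://user:pw@db.example.internal:5432/app")
except ProtectedTargetError as err:
    print(err)
```

A check like this only protects scripts that actually call it, so it is a last line of defense at best. The stronger fix is the one implied above: dev machines should not be able to reach production at all, for example through separate credentials or network rules.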
