
March 11, 2019

We All Fail

I recently saw a tweet in my Twitter feed about failures and how we should share them, not just our successes. Everyone experiences failure, but if we don’t share our failures, new developers will feel it’s not OK to make mistakes. So here are some of my mistakes and what I learned from them.

This is the respective tweet:

Be sure to read the rest of the thread. It’s a good and short read.

Updating the Production Database Too Early

My first job as a developer was on an electronic health record application. We often had database changes to go with our monthly updates. These changes would be a collection of database scripts that had to be executed manually. We tried to avoid breaking changes, but that wasn’t always possible.

At one point, another application that fed us data had been updated in the beta environment, and the next step was to update our application, including breaking database changes.

So I opened up SQL Server Management Studio, connected to the database and started executing the scripts one by one. I created tables, removed columns, dropped tables, etc. Only halfway through did I notice that I was doing this on the production database! Meanwhile, the other application was continuously feeding us data in production, but not all of this data was being stored correctly, if at all. Yikes!

I immediately stopped what I was doing and phoned the DBAs for a backup, the hospital to tell them about the issue, and our COO too. Curiously, the backup took about 4 hours to restore. Luckily, nobody was really angry, but needless to say, I’ve had better days.

I learnt to always check my connections, to prefer not to have access to production environments, and to value automation. A few months later, the company invested in a technology to automate database deployments, which reduced the chance of human error.
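As an illustration only (this is not the tooling the company adopted), the "check your connection first" lesson can be sketched as a small guard that refuses to run any script unless the connection points at the environment you intended. Everything here is hypothetical: the `Target` type, the environment labels, and the `execute` callable, which stands in for whatever actually sends a script to the database.

```python
from dataclasses import dataclass


class WrongEnvironmentError(RuntimeError):
    """Raised when scripts are about to run against an unintended target."""


@dataclass(frozen=True)
class Target:
    # Hypothetical description of a database connection target.
    server: str
    database: str
    environment: str  # e.g. "beta" or "production"


def assert_environment(target: Target, expected: str) -> None:
    """Stop before any script runs if the target is not the expected environment."""
    if target.environment != expected:
        raise WrongEnvironmentError(
            f"Refusing to run: connected to {target.environment!r} "
            f"({target.server}/{target.database}), expected {expected!r}."
        )


def run_scripts(target: Target, expected: str, scripts: list[str], execute) -> int:
    """Run each script in order, but only after the environment check passes.

    `execute` is a placeholder callable that takes one script name.
    Returns the number of scripts that were run.
    """
    assert_environment(target, expected)
    for script in scripts:
        execute(script)
    return len(scripts)
```

The point of the design is that the check happens once, before the first script, so a wrong connection fails loudly instead of halfway through a set of breaking changes.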

Not Testing My Work

On numerous occasions, I’ve had to work in code that was not easily tested by unit or other automated tests. In fact, it was often even hard to test manually.

I have sometimes skipped testing in such cases, only to have to do my work over again, because QA did manage to test it and found problems.

I could have saved time by asking QA how I could test it, and testing it myself first. Wanting to move fast almost always ends up making you slower.

And More

I’ve written bugs that slipped through manual testing by myself or others. I’ve programmed features that broke something else because there was a dependency that I didn’t know of or didn’t think about. I’ve reviewed pull requests only superficially, which let bugs through.

If I’ve learned one thing from all these mistakes, it is that it pays off to prefer slow and secure work over being that lightning-fast developer. Take your time to double-check things.

But we’re still human, so we will make mistakes. That’s OK. If you work in an environment where that’s not OK, leave. If you feel bad when making a mistake, use that feeling to your advantage by learning not to make the mistake again. Or even better, set up a system that reduces the possibility of that mistake. But keep in mind that failure is just a derogatory term for experience.

What are your stories? Which errors have you made in the past and how have you learned from them?

2 Comments on “We All Fail”

Faceplant
March 22, 2019 at 18:11

I just face-flopped on something everyone loved so much! It went to prod and as soon as users started coming to the web app, BOOM they can’t log in! Turns out the data was missing a critical value that allowed them to log in and access their data. Admins were fine because they had role-based access. But, individual users…nope!

The problem presented as an authentication issue which usually means configuration. So, I spent a couple maddening hours looking for configuration errors. I didn’t have full details about the issue so I was flying partially blind. I spoke to a couple others about the issue and something a colleague said hit the button!

I checked the data and sure enough, it was missing the key values that allowed access to individuals! After some quick scripting, I was able to patch the issue, but this was a terrible lesson in having a good test plan!

petermorlion
March 22, 2019 at 20:09

Glad it worked out! Yes, every “failure” is a new opportunity to change the way we were doing things, so that we can try to avoid the same (type of) “failure” the next time.
