John Allspaw (Admin)
Hi everyone, I'm John Allspaw, the SVP for technical operations at Etsy. I'd like to provide greater context (www.etsy.com/teams/7722/business-topics/discuss/10630877/) on how we treat errors and investigate the causes of site issues on Etsy, and perhaps dispel some myths.

Q: You have "blameless postmortems" (codeascraft.etsy.com/2012/05/22/blameless-postmortems/)? What does that even mean? It sounds like every time someone makes a mistake, they get off the hook.

A: The notion of a blameless postmortem does not mean that people get "off the hook." In fact, they are very much ON the hook: to give detailed information about how the error came about, and to participate in a team that can prevent the same mistake from happening again. As I mentioned in a Code As Craft blog post (codeascraft.etsy.com/2012/05/22/blameless-postmortems/) about this topic, blaming someone (identifying and making negative judgments about their skills and personality) is very different from holding them accountable.

We are not satisfied with ending an investigation by identifying the "perpetrator" of an error. Hiding from accountability for making a mistake is all but impossible at Etsy, since every change made to the code is recorded in multiple places with who, what (exactly), and when, all for the entire company to see. We hire engineers who feel confident, responsible, and willing to be held solely accountable for any actions they take, whether those actions are successful or not. So "who made this error?" is never a question we have to ask; the answer is a foregone conclusion.

The only way we can be satisfied after an error is made is to learn as much as we can about how it happened, so we can prevent it from happening again. Simply reprimanding someone after they make a mistake is a foolhardy approach, because it gives us no confidence that the mistake won't happen again.

We know our community wants Etsy to be fast and reliable. We get there by building a culture where people want to be held accountable for their mistakes. Our postmortem meetings are sometimes standing-room only. Only by putting time, focus, and effort into preventing mistakes will we remain fast and reliable in the future, and we can't do that if people think the end result of making a mistake is being punished.

Q: But there has to be some fear that not doing your job correctly could lead to punishment, right?

A: We do not believe that motivation by fear is the way a mature organization operates, or that the fear of punishment is how you cause people to act "correctly" in the future. If all we do is yell, we've done nothing preventative because we've attributed the error to some quality that the person has, and not the circumstances they were in at the time they made the error.

If you follow the logic of motivation by fear, it rests on the idea that people can be conditioned to act perfectly by being screamed at. Suffice it to say, that's not what we believe.

Q: You gave an "award" to someone who made the most spectacular mistake? How should that give me confidence in Etsy?

A: As I've said, we are opposed to the traditional view of "human error," which suggests that mistakes and errors are events to be feared, silenced, and skipped over. We don't want mistakes and errors to be "taboo" areas for focus and conversation. On the contrary, we want errors and mistakes to have a very high level of exposure in the company, because only then can we involve many people from different perspectives to determine how to prevent and better respond to issues when they come up in the future.

The award we give supports this intent to raise awareness of errors and mistakes. Obviously, it's not an award to aspire to, and I can assure you that it doesn't cause people to make more or larger mistakes in order to win it.

I welcome any further questions. :)

Best,
John