RailsConf 2020.2 Couch Edition

Paul Zaich

The Sounds of Silence: Lessons from an 18 hour API outage

Sometimes applications behave "normally" by the strict definition of HTTP statuses, but under the surface something is terribly wrong. In 2017, Checkr's most important API endpoint went down for 12 hours without detection. In this talk I'll walk through the incident and how we responded (what went well and what could have gone better), and explore how we've since hardened our systems with simple monitoring patterns.

Paul Zaich

Paul hails from Denver, CO, where he works as an Engineering Manager at Checkr. He's passionate about building technology for the new world of work. In a former life, Paul was a competitive swimmer. He now spends most of his free time on dry land with his wife and three children.

Brought to you by Ruby Central