The Infoq Podcast
Kolton Andrus on Lessons Learnt From Failure Testing at Amazon and Netflix and New Venture Gremlin
- Autor: Vários
- Narrador: Vários
- Editora: Podcast
- Duração: 0:28:31
- Mais informações
Informações:
Sinopse
In this week's podcast, QCon chair Wesley Reisz talks to Kolton Andrus. Andrus is the founder of Gremlin Inc. He was a Chaos Engineer at Netflix, focused on the resilience of the Edge services. He designed and built FIT: Netflix’s failure injection service. Prior, he improved the performance and reliability of the Amazon Retail website. Why listen to this podcast: - Gremlin, Kolton Andrus' new start-up, is focused on providing failure testing as a service. Version 1, currently in closed beta, is focused on infrastructure failures. - Lineage-driven Fault Injection (LDFI) allowed Netflix to dramatically reduce the number of tests they needed to run in order to explore a problem space. - You generally want to run failure tests in production, but you can't start there. Start in developemnt and build up. - Having failure testing at an application level, as Netflix does, so you can have request level fault injection for a specific user or a specific device. - Being able to trace infrastructure with something li