anignoramuss

reliability

1 post

When Your Deployment Rolls Itself Back: Building Autonomous Anomaly Detection

March 22, 2019 · 7 min read · anomaly-detection, machine-learning, deployment, reliability, observability

What happens when you give your deployment system the power to say 'no'? I built an auto-rollback service that detects anomalies in real time and reverses bad deployments. Here's why accuracy is harder than you think.
