DevOps Metrics and Measurement Panel - DevOpsDays Mountain View 2011

talks · 2 min read

This panel brought together some of the sharpest minds in monitoring at the time: Alexis Le-Quoc from Datadog, Laurie Denness from Etsy, Vladimir Vuksan from Ganglia, and Brian Doll from New Relic. The core question we kept circling was straightforward – how do you get developers to actually look at graphs? The answer turned out to be less about better tools and more about access. When developers can instrument their own code and see results immediately, the behavior changes on its own.

Etsy’s approach with statsd was a revelation. By making it trivially easy for any developer to emit a metric – literally one line of code – they removed the bottleneck of operations being the gatekeeper of monitoring. Self-service metrics meant developers stopped asking permission and started measuring what mattered to them. The panel agreed this was the pattern to follow: lower the barrier, increase the adoption.
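The statsd pattern really is that small. The wire format is a plain text `name:value|type` string sent over UDP, fire-and-forget, so an instrumentation call can never slow down or crash the application. A minimal sketch (the metric names here are hypothetical, and a real setup would point at an actual statsd daemon rather than localhost):

```python
import socket

def emit_metric(name, value, metric_type="c", host="127.0.0.1", port=8125):
    """Send a single statsd-style metric over UDP (fire-and-forget)."""
    payload = f"{name}:{value}|{metric_type}"
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload.encode("ascii"), (host, port))
    return payload  # returned only so the example is easy to inspect

# The "one line" a developer drops next to their feature code:
emit_metric("checkout.completed", 1)          # counter increment
emit_metric("search.response_ms", 87, "ms")   # timing sample
```

Because UDP delivery is best-effort, a missing or overloaded metrics server costs you a data point, never a page view, which is exactly why it was safe to hand this to every developer.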

We got into the tension between business metrics and infrastructure metrics. Most monitoring setups at the time were purely technical – CPU, memory, disk. But the real value comes from connecting those signals to business outcomes. Revenue per minute, conversion rates, customer experience scores. The organizations that bridged that gap were the ones getting executive buy-in for monitoring investments.

Data retention sparked a heated debate. How long do you keep detailed metrics? The cost of storage versus the value of historical data for capacity planning and trend analysis. There was no consensus, but the direction was clear: keep more than you think you need, because you can’t go back and re-collect data you’ve thrown away.
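One compromise that came up in systems of that era was rollup: keep raw samples for a short window, then downsample older data to coarser buckets so trends survive even after the detail is gone. A minimal sketch of the idea (the function and bucket size are illustrative, not any particular tool's API):

```python
from collections import defaultdict

def rollup(samples, bucket_seconds=3600):
    """Aggregate (timestamp, value) samples into per-bucket averages.

    samples: iterable of (unix_timestamp, value) pairs.
    Returns {bucket_start_timestamp: average_value}.
    """
    buckets = defaultdict(list)
    for ts, value in samples:
        buckets[ts - (ts % bucket_seconds)].append(value)
    return {start: sum(vals) / len(vals) for start, vals in sorted(buckets.items())}

raw = [(0, 10.0), (600, 20.0), (3600, 30.0), (4200, 50.0)]
rollup(raw)  # two hourly buckets: {0: 15.0, 3600: 40.0}
```

The trade-off is one-way: you can always re-aggregate fine-grained data into coarse buckets, but never recover the spikes an average has smoothed over, which is the panel's argument for erring on the side of keeping more.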

The hardest question was how to measure DevOps adoption itself. You can’t directly observe culture change, so you have to measure the effects indirectly – like measuring a black hole by its gravitational pull. Deployment frequency, lead time, mean time to recovery – these proxy metrics tell you something about collaboration health without trying to quantify culture directly. The panel also surfaced a gap nobody was addressing: most developers and operations people had no training in basic statistics, which meant they were drowning in data they couldn’t properly interpret.
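Those proxy metrics are cheap to compute once you log deploy and incident timestamps. A sketch, assuming simple in-memory records rather than any specific tracking system:

```python
from datetime import datetime, timedelta

def deployment_frequency(deploy_times, window_days=7):
    """Deploys per day over a trailing window ending at the latest deploy."""
    end = max(deploy_times)
    start = end - timedelta(days=window_days)
    recent = [t for t in deploy_times if t > start]
    return len(recent) / window_days

def mean_time_to_recovery(incidents):
    """Average (resolved - opened) across incidents, as a timedelta."""
    durations = [resolved - opened for opened, resolved in incidents]
    return sum(durations, timedelta()) / len(durations)

deploys = [datetime(2011, 6, d) for d in (14, 15, 15, 17, 18, 20)]
incidents = [
    (datetime(2011, 6, 15, 10, 0), datetime(2011, 6, 15, 11, 0)),
    (datetime(2011, 6, 16, 9, 0), datetime(2011, 6, 16, 9, 30)),
]
deployment_frequency(deploys)    # ~0.86 deploys/day over the last week
mean_time_to_recovery(incidents) # 45 minutes
```

The statistics gap the panel flagged matters even here: a mean time to recovery hides outliers badly, and a team without basic statistical literacy will read these numbers with more confidence than they deserve.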

Watch on YouTube — available on the jedi4ever channel

This summary was generated using AI based on the auto-generated transcript.