True observability is achieved when you can understand your application and its state from the data you collect, no matter how bizarre, novel, or unexpected that state is. You should be able to answer any question that pops up while investigating your system, without shipping new code.
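One way to picture "answering any question without shipping new code" is to emit one wide, structured event per unit of work and slice those events by arbitrary fields afterwards. The sketch below is only an illustration under that assumption: `make_event`, `query`, and every field name (`route`, `build`, `duration_ms`, and so on) are hypothetical, and an in-memory list stands in for a real event store.

```python
import time
from collections import Counter


def make_event(**fields):
    """Build one wide, structured event: a flat dict of arbitrary context."""
    return {"timestamp": time.time(), **fields}


def query(events, where=None, group_by=None):
    """Count events matching an ad hoc predicate, optionally grouped by any field."""
    matched = [e for e in events if where is None or where(e)]
    if group_by is None:
        return len(matched)
    return Counter(e.get(group_by) for e in matched)


# Hypothetical events; in practice these would be emitted by the running service.
events = [
    make_event(route="/checkout", status=500, user_id="u1", build="a1f", duration_ms=912),
    make_event(route="/checkout", status=200, user_id="u2", build="9c2", duration_ms=88),
    make_event(route="/checkout", status=500, user_id="u3", build="a1f", duration_ms=1004),
    make_event(route="/search", status=200, user_id="u1", build="a1f", duration_ms=41),
]

# A question nobody planned for: which build do the failing checkouts come from?
failing_by_build = query(
    events,
    where=lambda e: e["route"] == "/checkout" and e["status"] >= 500,
    group_by="build",
)
print(failing_by_build)
```

The point of the sketch is that the question is expressed at query time. Nothing about the `build` breakdown had to be decided when the events were emitted, only that the field was captured.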

If you can only look into expected failure states, or states that someone anticipated in advance, you don’t have true observability but something more akin to monitoring. When you’re

This gets more difficult, and more important, as your system grows in complexity, whether that comes from irreducible complexity, distributed systems, or just a frighteningly interesting application.

Open questions

What to do with PII? If you capture all data so that you can emulate any state, you will undoubtedly capture PII. There are ways of anonymizing expected PII, but it gets harder and harder once you capture dynamic data.
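To make the gap between expected and dynamic PII concrete, here is a minimal sketch of scrubbing an event before it is stored. Everything in it is an assumption for illustration: the `PII_FIELDS` list, the `SECRET_KEY` placeholder, the field names, and the choice of a keyed hash (which is strictly pseudonymization rather than full anonymization). It handles known fields and one recognizable pattern, and deliberately shows a free-text value slipping through.

```python
import hashlib
import hmac
import re

# Hypothetical list of fields we expect to contain PII.
PII_FIELDS = {"email", "name", "ip_address"}
# Placeholder key (assumption); a real key would come from a secret store.
SECRET_KEY = b"replace-me"
# Simple email pattern; it will not catch every valid address.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def pseudonymize(value):
    """Replace a value with a keyed hash so equal inputs still group together."""
    digest = hmac.new(SECRET_KEY, str(value).encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def scrub_event(event):
    """Return a copy of the event with expected PII hashed and emails masked."""
    clean = {}
    for key, value in event.items():
        if key in PII_FIELDS:
            clean[key] = pseudonymize(value)
        elif isinstance(value, str):
            clean[key] = EMAIL_RE.sub("[redacted-email]", value)
        else:
            clean[key] = value
    return clean


event = {
    "email": "ada@example.com",
    "route": "/signup",
    "error": "duplicate account for ada@example.com",
    "note": "customer Ada Lovelace called about her order",
    "status": 409,
}
scrubbed = scrub_event(event)
print(scrubbed)
```

The `email` field and the address inside `error` are covered, but the name inside the free-text `note` is not, which is exactly the dynamic-data problem: the scrubber only knows about PII it was told to expect.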

Because you capture so much data, you start out with a very data-intensive application. What do you do with all that data?

I imagine functional languages like Elixir make it easier to figure out state, because of their immutability and aversion to side effects.

Observability makes LLM-dominated codebases more manageable.

Reference

Majors, Charity, et al. Observability Engineering: Achieving Production Excellence. First edition, O’Reilly, 2022.