Back in February I did a resilient architecture audit for a company with an unusual setup: the system was air-gapped and the developers could not observe the system.

After 9 hours and discussions with 20+ engineers, we came up with a set of ideas that worked around the problem by implementing ownership on-prem, increasing trust, and reducing TTD (time to detect) and TTR (time to resolution).

This is not your typical cloud service where all you have to do is configure the OTel sink. We had to rethink the #observability requirements within military constraints.

#SRE #Operation #SystemDesign #NFR

Reliability Engineering for Air-Gapped Systems
Apr 3 at 9:08 PM