
September 10, 2025

From pipelines to patterns: Observability in the wild

Sameer Mhaisekar

DevRel Engineer, SquaredUp & Microsoft MVP

I recently spoke with Vipul Gupta (Senior Software Engineer at Balena) as part of my Observability in the Wild series. Our discussion reinforced a simple truth: observability is not just about collecting logs, metrics, and traces. It’s about clarity, trust, and collaboration in complex systems.

Watch the interview here

“You can’t improve what you can’t measure — but collecting more data isn’t the point. Making it meaningful is.”

At Balena, Vipul’s team manages hundreds of IoT devices, each with its own embedded Linux OS. That creates a unique challenge: every code change needs to be tested on real hardware using “hardware in the loop” pipelines.

CI/CD at this scale is fragile. Pipelines fail constantly due to dependency changes, hardware flakiness, or transient network issues. GitHub’s native insights weren’t enough to answer critical questions like:

Which workflows are failing most often?
What patterns of failure are repeating?
Where are the biggest performance bottlenecks?

To close those gaps, Vipul’s team built their own observability stack on top of GitHub Actions, using OpenTelemetry, Prometheus, Grafana, and Sentry. This gave them visibility across hundreds of workflows running against fleets of devices.
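The questions above reduce to simple aggregations once run results are exported somewhere queryable. As a minimal sketch, and not Balena's actual implementation, the Python below ranks workflows by failure rate. The record shape, the `workflow` and `conclusion` field names, and the sample workflow names are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical sample records; field names and values are illustrative assumptions.
runs = [
    {"workflow": "flash-device", "conclusion": "failure"},
    {"workflow": "flash-device", "conclusion": "success"},
    {"workflow": "flash-device", "conclusion": "failure"},
    {"workflow": "unit-tests", "conclusion": "success"},
    {"workflow": "unit-tests", "conclusion": "success"},
    {"workflow": "e2e-suite", "conclusion": "failure"},
    {"workflow": "e2e-suite", "conclusion": "success"},
]

def failure_rates(runs):
    """Return (workflow, failure_rate, total_runs) tuples, highest failure rate first."""
    totals = Counter(r["workflow"] for r in runs)
    failures = Counter(r["workflow"] for r in runs if r["conclusion"] == "failure")
    rates = [(name, failures[name] / total, total) for name, total in totals.items()]
    return sorted(rates, key=lambda row: row[1], reverse=True)

for name, rate, total in failure_rates(runs):
    print(f"{name}: {rate:.0%} of {total} runs failed")
```

Sorting by rate rather than raw failure count keeps a rarely run but consistently broken workflow from hiding behind a busy, mostly healthy one.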

“Observability is fireproofing, not firefighting.”

Vipul described a shift that resonated with me: most teams discover observability reactively — when something breaks. But real maturity comes when observability helps teams prevent problems instead of chasing them down.

That means:

Identifying bottlenecks early
Surfacing recurring failure patterns
Equipping teams to act quickly and confidently

As he put it: “Without actual work going into observability, it’s just metrics, just shiny dashboards. Teams need to come together around the data to actually execute on solutions.”

“One dashboard, one [set of] metric. If you can’t understand it in 10 seconds, it’s not doing its job.”

Dashboards were a major focus in our conversation. At Balena, the philosophy is one dashboard, one metric. Simple, reliable, and easy to interpret in seconds.

Take retry counts in GitHub Actions, for example. A single retry doesn’t show up as a “failure,” but over time, retries reveal patterns:

a flaky device in the hardware fleet,
a problematic step in the pipeline,
or a misconfigured resource.

Visualizations like these transform observability from raw data into shared understanding. Teams can trust what they see, act on it, and align around the same view of reality.
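To make the retry example concrete, here is a rough sketch of the aggregation that could sit behind such a dashboard. It assumes each finished job is recorded with the number of attempts it needed; the `step`, `device`, and `attempts` fields and the sample values are hypothetical, not taken from Balena's pipelines.

```python
from collections import defaultdict

# Hypothetical job records: "attempts" is the number of tries a job needed
# before it finished. All field names and values are illustrative assumptions.
jobs = [
    {"step": "flash-image", "device": "device-a", "attempts": 3},
    {"step": "flash-image", "device": "device-b", "attempts": 1},
    {"step": "run-tests", "device": "device-a", "attempts": 2},
    {"step": "run-tests", "device": "device-b", "attempts": 1},
    {"step": "provision", "device": "device-a", "attempts": 1},
]

def retries_by(jobs, key):
    """Sum retries (attempts beyond the first) per value of `key`, highest first."""
    totals = defaultdict(int)
    for job in jobs:
        totals[job[key]] += max(job["attempts"] - 1, 0)
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)

print(retries_by(jobs, "device"))  # which device accumulates retries?
print(retries_by(jobs, "step"))    # which pipeline step accumulates retries?
```

Grouping the same records by device and then by step is what separates a flaky device from a problematic step: in this sample, the retries cluster on one device across several steps.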

[Shameless plug: at SquaredUp, this is exactly what we do. We call it “operationally intelligent dashboards”. Click here to find out more!]

“The real challenge isn’t data collection. It’s building trust in the system so teams act on what they see.”

One recurring theme was the difficulty of establishing a clear source of truth.

For Vipul’s team, GitHub often remains the ultimate ground truth for failures, even as they pipe data into Sentry and Grafana for visualization. The bigger goal is to standardize error reporting on OpenTelemetry so that the observability platform itself can become the authoritative source.

That’s still evolving, but it’s where the industry is heading.

“AI won’t replace human judgment in observability, but it can augment it — cutting through the noise to highlight what really matters.”

AI came up as a natural extension. Observability generates massive, noisy datasets, and humans can only process so much. AI can help by:

Recognizing recurring failure modes across thousands of runs
Filtering out noise to reduce alert fatigue
Predicting bottlenecks before they hit production

Imagine an AI assistant that not only reports a failed workflow but also highlights:

“This has failed 6 times in the last 10 days.”
“The attempted fix last time didn’t work.”
“The team responsible hasn’t triaged it yet.”

That’s the kind of context that turns observability into proactive engineering.
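The first of those messages needs no AI at all, only a windowed count over failure history. The sketch below shows that lookup under stated assumptions: the function name, the workflow name, and the timestamps are hypothetical, and the history is assumed to be available as a list of failure times for one workflow.

```python
from datetime import datetime, timedelta

def recent_failure_summary(failure_times, workflow, now, days=10):
    """Describe how often `workflow` failed in the `days` leading up to `now`."""
    cutoff = now - timedelta(days=days)
    count = sum(1 for t in failure_times if cutoff <= t <= now)
    noun = "time" if count == 1 else "times"
    return f"{workflow} has failed {count} {noun} in the last {days} days."

# Hypothetical failure timestamps for a single workflow.
now = datetime(2025, 9, 10, 12, 0)
failure_times = [now - timedelta(days=d) for d in (0.5, 1, 3, 4, 7, 9, 15, 30)]

print(recent_failure_summary(failure_times, "flash-device", now))
```

Passing `now` in explicitly keeps the function deterministic and easy to test. The other two messages, about a failed fix and missing triage, would need data from issue tracking and ownership records, which this sketch does not attempt.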

“Being perfect is the enemy of good.”

Vipul shared two lessons that stood out:

Be ruthlessly specific about requirements. Build a “shopping list” of what your current ecosystem provides and what you’ll need later. That helps avoid constant reinvention.
Good vs. great observability. Collecting more data doesn’t make observability great. The difference is clarity, trust in dashboards, and enabling teams to act together.

As he put it: “Being perfect is the enemy of good. The immediate value of getting basic observability up and running for the whole team is far more important than chasing 100%.”

Closing Thoughts

Vipul's experience highlights something we hear often: once you have plenty of telemetry, the real challenge becomes making it consumable. Teams need a way to share the right view with the right people — from engineers watching CI/CD pipelines to product leaders looking for trends.

That's where dashboards shine. With SquaredUp, you can simply bypass the issue of "trust in the data", since we bring data straight from GitHub, Azure DevOps, Jenkins, GitLab and more into clean, lightweight dashboards that surface what matters most. No digging, no noise — just the key signals, rolled up and easy to share across teams.

Whether your pipelines are as complex as Balena's or just business-critical in different ways, observability truly becomes actionable with operationally intelligent dashboards.

👉 Explore SquaredUp dashboard examples and use cases


