Slight Reliability joins SquaredUp!

We are thrilled to kick start 2023 with an exciting announcement: Slight Reliability is now a part of SquaredUp! Keep reading to learn how this partnership began, in an exclusive interview snippet with our CEO Richard Benwell and Slight Reliability host Stephen Townshend.

Richard (SquaredUp CEO) & Stephen (Slight Reliability host)

What is Slight Reliability?

Before we dive into the interview, first things first: What is Slight Reliability?

If you are an SRE, chances are you have found yourself wondering at multiple points in your career: What is site reliability engineering really about? We don’t all operate like Google, so how can I make sense of it in my organization? How do I cut through the buzzwords and actually improve the lives of my colleagues and customers?

While navigating these questions, one SRE decided to document his journey and share his learnings. And in so doing, Slight Reliability was born – a learning hub that is steadily evolving into a community.

That SRE is Stephen Townshend, and Slight Reliability is today a learning hub of podcast episodes (and in the near future, blogs!), featuring an array of community guests. Created with the goal of translating difficult concepts into something understandable, approachable and empathetic, Slight Reliability cuts through the crap on all things SRE and Observability.

But let’s hear it from Stephen himself.

Stephen, what is your background and why did you decide to start Slight Reliability?

Stephen: Hi there, my name is Stephen and I live in New Zealand. For thirteen years I worked as a performance engineer, and toward the end of that time I began speaking at events, blogging, and podcasting. When I transitioned from performance engineering to site reliability engineering (SRE) it was like becoming a junior engineer again. Because I was an absolute beginner (in the field of SRE), I adjusted the kind of content I created from “I’m an expert and here’s some useful knowledge” to “I’m learning this stuff from scratch, come and learn with me”.

I started Slight Reliability as a podcast about SRE in March 2022. The intent of the podcast was to share my learning journey and to get guests on the show to share their experiences and knowledge with the community (in the hopes of helping others in a comparable situation). It’s about making sense of the hugely complex field of SRE in a way that everyone can understand. SRE is a rapidly growing field and there are thousands of engineers out there trying to make sense of it right now.

Richard, what drew SquaredUp to Slight Reliability? How did we discover the channel?

Richard: You only have to spend a short time browsing observability and SRE topics – on social media or at events like O11yfest and SLOconf – to see Stephen’s fresh voice stand out. The industry has developed its own echo chamber of best practices and theory around observability and SRE, ranging from the ‘three pillars’ to SLOs, but Stephen has approached these topics with a fresh pair of eyes (and an inquisitive mind). Combined with the decision to publish his SRE journey through ‘learning in the open’, the result is unique.

Some of my favourite Slight Reliability episodes are ‘Bad Observability’, ‘Afailability’ and ‘It’s SLO Going’. You can tell just from these podcast titles that he’s taking an honest look at these topics and – bravely and without hidden agenda – saying what many of us might be thinking.

Here’s a review of one of Stephen’s podcasts that pretty much sums it up:

“I'll say it again, your capacity to distil this very complex topic in such a clear, thought-provoking way is your superpower. I think it is especially hard for those new to this field to understand what the end goal of SRE really is, given the noise around it and the lofty expectations in the job descriptions. Every video you make is a new aha moment for me – I think that makes you my SRE therapist.”

At SquaredUp, we’re taking a different view too. Unlike other observability vendors, we’re focusing entirely on how to use the data, not how to collect the data. We start from the viewpoint “who needs to see what?” and focus on making that happen. That usually means much more than just viewing metrics, logs and traces. It’s data from multiple different tools, across different teams and across business and technical domains. And it needs to be correlated, summarized, and put into context for different people in different roles. That’s what we call “big picture observability”.

I’m thrilled that Stephen decided to bring his Slight Reliability podcast to SquaredUp. We have always connected to our customers and community with transparency and openness. Stephen’s approach is just that, and I couldn’t think of a better fit to help us explore and solve real-world observability challenges together with the community.

Stephen, why did you choose to come on board?

Stephen: I connected with Richard’s energy and ideas right away. When he first introduced himself and SquaredUp I had just undertaken a lengthy reliability benchmarking exercise for a DevOps team at IAG. This involved obtaining data from several different tools spread across the organization. I was looking for a way to pull all that insight into one place, so that teams could track their own reliability easily and independently. This aligned perfectly with what SquaredUp is trying to achieve.

SquaredUp’s perspective of observability is very aligned to my own. A lot of the time I see all the focus being put on the technical side of monitoring. The classic “logs, metrics, and traces” definition. My view is that observability is foremost about understanding customer behaviour and whether we are achieving business outcomes. SquaredUp sees observability the same way.

I had also been drawn to the idea of turning my hobby content creation into part of my daily work, and the role of Developer Advocate is the perfect mix of creative and technical work.

Stephen, what can we expect from Slight Reliability moving forward?

Stephen: Slight Reliability will remain focused on SRE and observability knowledge and experiences, not a place for product marketing. I will continue to release short-form solo content alongside interactive discussions with guests. The topics will remain SRE-focused: observability, incident management, toil, blameless culture, automation and simplification, etc.

I am also going to be working as an SRE on SquaredUp’s SaaS product and sharing that experience with the community. That is going to be an interesting experience because the solution is built using serverless computing. I haven't worked with serverless much before and I am excited to figure out how to operate and observe it.

Further down the road I am hoping to organize some community events. Exactly how that will be implemented has not been decided yet, but one idea I have had is a virtual SRE book club that reads SRE-relevant literature and comes together to discuss the content of each book.

Later in January, Richard will also be coming on the show to talk about our shared view of “bigger picture” observability. He comes from a data analytics and visualization background, and I’m looking forward to chatting with him about this topic. Stay tuned!

Where can I find Slight Reliability?

For all Slight Reliability podcast episodes, check out https://www.youtube.com/@SlightReliability. Subscribe so you don’t miss Richard’s debut!

For upcoming Slight Reliability blog content, watch this space.

And to keep in touch, follow Slight Reliability on Twitter and LinkedIn, or Stephen Townshend on Twitter (@the_kiwi_sre) and LinkedIn.
