Challenges of testing microservices and why we need observability

In this episode of Quality Sense, host Federico Toledo interviews Alon Girmonsky, an accomplished, multi-time entrepreneur and technology leader. He’s a passionate engineer who loves building developer tools and companies.

Today, he’s the CEO and co-founder of UP9, a startup in the field of microservice software reliability. Previously, he founded BlazeMeter, a continuous performance testing platform acquired by CA Technologies.

Alon believes that developers are akin to painters and sculptors and that software is the manifestation of their art. At Abstracta, we’ve been collaborating with Alon in different ways over the past six years because we’re interested in the different projects that he’s led and their contribution to the software quality world.

What’s the Episode About?

  • How Alon chooses new projects to work on or invest in
  • Challenges in testing systems with microservice architectures
  • Tools to help with testing them, automation and observability altogether
  • The story behind UP9 and what it aims to solve
  • And as usual, Federico asks for his advice for testers and his favorite book recommendations

Episode Transcript

(Lightly edited for clarity.)

Federico:

Thank you, Alon, for accepting the invitation to participate in this podcast. For me, it’s an honor to have you here.

Alon:

That’s so nice of you to invite me. I really appreciate it.

Federico:

I’d like to know a little bit more about your life as an investor. Do you work only with software testing companies?

Alon:

So, it’s tough to answer. The answer is no, it’s not about testing. I love being an entrepreneur and I love the excitement of building new products and companies.

When I know of an entrepreneur that I’d love to work with, I feel like a kid in a candy store.

So I have a quick rule of thumb: if I know an entrepreneur that I’d love to work with, but can’t, because I’m usually busy with my own company, I invest or I become a close advisor to the CEO.

I can say that I’m fortunate to have built a very wide network that they can leverage, and I pay it forward by providing access to new entrepreneurs.

Sofia and the Apptim team are a great example of people I greatly appreciate and have known for many years. And I’m a true believer in Apptim.

I feel very lucky to have Sofia ask me to join her on her journey.

Federico:

That’s cool. I’m really happy to have you as an advisor in the challenges we are facing on the Apptim team. That’s great.

The main topic I wanted to address with you in this interview is microservices, because I know that the testing and reliability of microservices is probably one of the main challenges you are trying to contribute to. So I wanted to start with the basics.

Can you give us an introduction to the topic of testing microservices?

Alon:

Sure. I’d love to, and I think that microservices is a topic that is very common. So I’m sure there are people that can probably define it better than myself.

“We [at UP9] are focused very much on where microservices meets testing and quality. Or, what we call maturing to reliability.”

– ALON GIRMONSKY

We’ve been developing software for decades, right? And we’ve also been practicing testing.

There is a very well-defined practice of how you test. So we could all agree that we know how to test a monolithic architecture. We’d all agree that we know how to test the web, right? End-to-end testing, browser automation. It’s pretty deep; it’s table stakes.

But the thing is that with microservices, testing has become exponentially more challenging than what we used to know, and traditional solutions find it challenging to address. The reason being, microservices is a fragmented architecture.

It’s not one big monolith with a few services attached to it. You know, these are equal-weight services, and there can be hundreds, even many hundreds, of them, each service with dozens or more endpoints, right?

In a good microservice architecture, there is independence. And sometimes there are actually dependencies between the services, but there are independent roadmaps. So one microservice can move very fast and another slower, and they may be dependent. So if you have an API contract that you think is most up to date, this contract can change because a new release or a new microservice is happening.
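To make the contract-drift point concrete, here is a minimal sketch of a consumer-side check. It is only an illustration under stated assumptions: the orders service, its field names, and the `contract_violations` helper are all hypothetical and do not come from the interview or from any specific tool.

```python
# Hypothetical contract the consumer team believes is current:
# field name -> expected Python type.
EXPECTED_CONTRACT = {"order_id": str, "total_cents": int, "currency": str}


def contract_violations(payload, contract):
    """Return human-readable mismatches between a payload and a contract."""
    problems = []
    for field, expected_type in contract.items():
        if field not in payload:
            problems.append(f"missing field: {field}")
        elif not isinstance(payload[field], expected_type):
            actual = type(payload[field]).__name__
            problems.append(
                f"{field}: expected {expected_type.__name__}, got {actual}"
            )
    return problems


# Response shape from the older release of the (hypothetical) orders service.
old_release = {"order_id": "A-17", "total_cents": 1999, "currency": "USD"}

# The provider shipped a new release on its own roadmap and renamed a field.
new_release = {"order_id": "A-17", "total": "19.99", "currency": "USD"}

print(contract_violations(old_release, EXPECTED_CONTRACT))
print(contract_violations(new_release, EXPECTED_CONTRACT))
```

The first call reports no problems, while the second flags the field the consumer still depends on. A check like this only helps if it runs against every new provider release, which is the maintenance burden described in this part of the conversation.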

So there are dependencies and independent roadmaps. Also consider that in microservices you usually need an orchestrator like Kubernetes or ECS or Docker Enterprise, right? The idea is for an orchestrator to alleviate the pain of management. But then there is a layer that obstructs people from seeing what’s happening on the inside. So try to test something when you don’t have access to it. For example, with Kubernetes, the cloud native security model doesn’t allow you to access a certain service unless it’s exposed, right?

And couple that with service discovery, where there are no fully qualified domain names anymore and no static IPs. A service can launch somewhere and have its own IP, which is completely virtual. So when you try to test that, it becomes a moving target.
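The moving-target problem can be sketched in a few lines. This is a toy illustration, not a real discovery client: `ServiceRegistry`, the `cart` service, and the IP addresses are all made up for the example. The point is only that a test has to resolve the address at call time rather than hard-code it.

```python
class ServiceRegistry:
    """Toy in-memory stand-in for a service discovery mechanism (hypothetical)."""

    def __init__(self):
        self._addresses = {}

    def register(self, name, host, port):
        # Called whenever an instance starts; overwrites any previous address.
        self._addresses[name] = (host, port)

    def resolve(self, name):
        # Look the address up at call time; raises KeyError for unknown services.
        return self._addresses[name]


def build_url(registry, service, path):
    """Build a URL from whatever address the service has right now."""
    host, port = registry.resolve(service)
    return f"http://{host}:{port}{path}"


registry = ServiceRegistry()
registry.register("cart", "10.0.3.17", 8080)
first = build_url(registry, "cart", "/health")

# The orchestrator reschedules the service; it comes back on a new virtual IP.
registry.register("cart", "10.0.9.42", 8080)
second = build_url(registry, "cart", "/health")

print(first)
print(second)
```

A test that had cached the first URL would now be pointing at nothing, even though the service itself is healthy.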

And again, where this relates to testers and testing, the job has just become far more difficult than it was in the past. And there are no tools to help you. They’re not at least… It’s very tough to move from a monolithic architecture into a fragmented microservices architecture. Does this make sense to you?

Federico:

Yeah. Totally makes sense.

And I guess it’s sort of related to a term that I’ve been hearing a lot lately, which is observability, right? Because it’s not only about how you approach testing, but also about understanding what’s happening when you exercise a microservice or a set of microservices, understanding what is happening in the infrastructure and in the system.

So can you tell us a little bit about the challenges related to observability of these architectures?

Alon:

Yeah. I think that when you talk about microservices or any modern or cloud native tech stack, then you have to think about observability.

I think of observability as sort of the next generation after monitoring. It’s not only to make sure that everything’s alive and game, but, as it sounds, it’s to understand what’s going on; to have observability into something that’s otherwise somewhat removed and that you cannot look at.

When things become so complex, as in a microservice environment, you have to have a system, a software that gives you the full perspective to figure out what’s wrong, to troubleshoot, to give you a root cause analysis.

Observability can come in many shapes and forms. Like, an Ops person would want their observability tool, a DevOps person would want their observability tool, and a developer would want a completely different observability tool. It could be monitoring, it could be tracing, it can be used for numerous use cases.

But one thing is for certain: no one heard of observability 10 years ago.

And suddenly if you don’t have it, it’s not a good place to be.

Federico:

Can you tell me about tools for observability?

Alon:

Unfortunately, you need many tools. So again, I think that if you take a certain term, like observability or reliability (a lot of the “ilities”), every stakeholder in the organization you ask will have a different vision of what it is.

If you ask an Ops person in SRE, what is observability? They’ll have one aspect. If you ask a developer, what is observability? They’ll tell you something completely different.

And if you ask a tester, by the way, what is observability, they should expect something different, although there are no tools currently available to provide observability for testers (other than UP9, of course). Like, for example, APIs and service architecture and API contracts. So you would want to have observability into these if you’re testing microservices, right? But maybe these are of less importance for anyone that is monitoring the application. They care more about signals and health and stuff like that.

Right. So consider testing and testers, right? When they run a test, they obviously want to know the results of the test. And they want to know what happened during that test and how the test affected the entire system while it ran, and all that stuff. So, these are observability topics that may not concern other people that don’t care as much about testing.
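One common way to answer "what happened during that test" is to tag every call a test makes with a correlation ID and then collect the events carrying that ID. The sketch below simulates this in memory. Everything in it is an assumption made for illustration: `EVENT_LOG` stands in for a real telemetry backend, and the `cart`, `payments`, and `orders` services are invented.

```python
import uuid

# Stand-in for a real telemetry backend (assumption for this sketch).
EVENT_LOG = []


def record(test_id, service, event):
    """Append one event, tagged with the ID of the test that caused it."""
    EVENT_LOG.append({"test_id": test_id, "service": service, "event": event})


def checkout(test_id):
    """Simulated flow that touches three hypothetical services."""
    record(test_id, "cart", "cart loaded")
    record(test_id, "payments", "charge authorized")
    record(test_id, "orders", "order created")
    return "ok"


def events_for(test_id):
    """Return (service, event) pairs caused by one specific test run."""
    return [
        (e["service"], e["event"])
        for e in EVENT_LOG
        if e["test_id"] == test_id
    ]


test_id = str(uuid.uuid4())
other_id = str(uuid.uuid4())

# Unrelated traffic that should not show up in this test's view.
record(other_id, "cart", "unrelated traffic")

result = checkout(test_id)
touched = events_for(test_id)

print(result)
print(touched)
```

The test result alone says "ok"; the filtered events show which services the test exercised and in what order, which is the extra view a tester would want from an observability tool.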

So when I originally started UP9 and researched the field, observability surfaced very often.

We found you can’t decouple observability from testing. You have to have observability as part of any testing framework, but observability will mean different things to different people.

Federico:

So what’s your current approach to solve these issues that you’re mentioning with UP9?

Alon:

Good question. I think UP9 is basically an implementation of the idea that when you have a problem, the first step is to understand the problem. You know, I’ll start with observability, but the thing is, everything is connected.

It’s not only about observability, it’s not only about testing.

And the problem is that the solutions are fragmented. You have a solution for this, a solution for that, but who is left to connect all the dots? The people. Right?

So going back, my partners and I have spent the past decade in test automation, first at BlazeMeter, then at CA, and now obviously at UP9, and we’ve come to realize that testing is a bit broken, to put it mildly, and it creates… There’s a huge workload on testers, developers, and engineers in test.

And by the way, it’s common knowledge that testing is the main reason for the slowing down of releases.

Take activities such as test planning, test creation, maintenance, automation, results analysis, all that. When we look at them, we say, “Hey, we want to offload all of these things. We want to provide a system that takes care of the entire thing.”

You cannot do root cause analysis if you don’t have observability, right? You have to spend time building a test to run a test, right? So there’s a lot of heavy lifting to be done.

So, UP9 firstly automates test generation. We treat tests as code: it updates these tests, and these tests are committed to GitHub as any developer or engineer in test would do, right? The test code constantly adapts to your service architecture, right? You always have up-to-date test coverage, and this is all done instantly. So you don’t need to wait for something to happen. All right, so you get complete coverage, but as a machine will do.

So when we look at things holistically: you need to understand the architecture, so we provide observability into the architecture. You have to be able to build tests based on the architecture and see how tests connect, like a certain flow. Why is a certain flow better than another flow, right? You want to see that flow in production or preproduction. You want to see that this is the positive path, right? When you run these tests hundreds of different of… who will analyze these results?

So UP9 analyzes all of the results for you, coming up with conclusions. And all of the time, this should be connected to the service architecture; again, observability. So basically take all these together, and this is UP9 for you.

Federico:

Cool. Really interesting. So tell me, how can I try the tool? Is there any version available I can start trying?

Alon:

Yeah. So we actually released UP9 a couple of months ago, and I’d love you and your listeners to sign up. Simply go to UP9.com. And I promise to personally address any feedback we get. At this time, we’d love to get feedback.

Federico:

That’s amazing. To wrap up this interview, I would like to also ask you a couple of questions related to similar topics.

I truly believe that in order to improve something in your life, in your career, what you have to do is to modify those small things that you do all the time.

And I think that your insights as a successful entrepreneur could be very useful for anyone listening to this.

So, what habits do you have that you can suggest people adopt, or maybe avoid?

Alon:

Yeah, I think what to avoid is far more interesting than what to adopt, although both are interesting, but I would say avoid stagnating.

“A way to drop out of the race is getting too comfortable and no longer innovating. Don’t do that, really embrace change. And most importantly, enjoy the ride.”

– ALON GIRMONSKY

We are now building the next hundred years right there. And especially, by the way, for testers: things are changing, so adapt to change, move fast, move at the speed of software development.

Federico:

Such a challenge, right? This is a very dynamic world, specifically in our industry. And I think I totally follow what you’re mentioning. I understand.

I also see it as a huge challenge that we have to be paying attention to all the movements, all the new waves, the new tools, the new architectures and everything.

And I think this conversation we are sharing goes in this direction. Right. Excellent. Thank you.

What about books? Do you have any book recommendations?

Alon:

Yeah. I love books and I’ve read many. And especially as an entrepreneur, you have to read constantly, because that’s a sure way to grow.

Two books have left their mark on me. One is Steve Jobs’ biography. I think this is a chapter in history that anyone in tech needs to read. And the second one is Ben Horowitz’s “The Hard Thing About Hard Things.”

I also think it’s a must-read for entrepreneurs specifically, but also, you know, for people in tech. By the way, I suggest reading these books at least twice. At least. Yeah.

Federico:

Yeah. It’s true that if you read the same book in different stages of your life, you learn different things, right?

Alon:

Yeah. It’s going from reading something just for the sake of reading, or seeing a movie for entertainment purposes, to… hey, maybe I can learn something from it and actually evolve. So I think reading a book or even hearing it a few times actually does that job.

Federico:

Totally agree. Well, thank you so much for your time. I really appreciate it. I really enjoyed the talk with you.

Alon:

Federico, I really appreciate the invitation to appear on your podcast. I know the impact it can make. So thank you. Thank you for inviting me, and I believe I speak on behalf of QA and test professionals: thank you for your contribution with this podcast initiative!

Federico:

Thank you so much. Bye bye!

