Looking at some of the challenges of test automation and how to pick the best tools and infrastructure
In today’s Quality Sense episode, Federico Toledo sits down for a chat with the founder and CEO of Testim, Oren Rubin, an Israeli entrepreneur who has over 20 years of experience in the software industry. Testim is an innovative product in the test automation domain that seeks to alleviate the problems around flaky tests. Prior to founding Testim, he was the Director of R&D at Applitools, where they use advanced computer vision technologies to solve the challenges of UI Verification.
What to Expect?
- In part one of the interview, the two discuss test automation challenges, particularly how to choose amongst the different test automation tools, highlighting their main differences, especially Selenium, Cypress, Playwright, Puppeteer and so on.
- In the second part, the two will talk about how AI helps test automation, the differences between scriptless testing tools and visual editors, and more.
(Lightly edited for clarity.)
Hello Oren! Nice to talk with you again. How are you doing?
I’m doing great, doing great. Last night, I didn’t sleep a lot, but besides that all is well.
I can imagine that it’s hard to be in this situation with a baby, right?
Indeed, yes. We have a baby. My wife is going back to work and it’s all in one place. But the baby, she’s just a newborn, she’s super small and still cries at night, especially when they do construction outside the building. So yes, let’s just say I didn’t sleep too much. But now after a glass of coffee… Much better now.
I see. You know, I always share the story of how I met you. For me it’s a very interesting story because I was in a meetup here in San Francisco and Angie Jones came to me with you and she said, “Hey, you should meet Oren!”
For me this is an amazing story about how I met you and then we started to discuss and to share a lot of things related to test automation and looked for ways to collaborate. I really love this about the Bay area.
And you know what, this reminds me of my tiny country because something that we typically say is that we’re a small country with only three million people in Uruguay and we all know each other, or at least we have a common friend, you know? And I have a similar feeling here in the Bay area. It’s like you typically know someone that knows someone you just met.
Indeed, indeed. It’s a great community here and I love that. Like what Angie did and she’s like, “Okay, you guys have to meet. You have to meet. Come on, I’ll introduce you.”
Exactly. Well, here we are and I wanted to discuss with you again some of the topics related to test automation because I know you have been researching and working on developing tools for that, right?
For a long time.
So, my first question to start the discussion: from your perspective, what are the biggest challenges in test automation nowadays?
I think that today there’s still some basic challenges that have existed for a long time, but actually have become more acute in the last few years. One of them is the flakiness of tests.
There are more; I think there are a lot. This is the one I’ve been focusing on in the last few years because I’ve seen it myself, with my own eyes and hands, as they say. It’s still a big challenge, and it’s becoming more acute as teams shift left and companies become more agile. People have to do more and more testing because they want to release faster. If you run everything 100 times more often, you see 100 times more flakiness, so it’s becoming very bad.
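As a rough illustration of the arithmetic Oren describes, here is a minimal sketch. The suite size and the 1% flake rate are made-up numbers, and the functions assume each test flakes independently, which real suites may not satisfy.

```python
def expected_flaky_failures(tests: int, runs: int, flake_rate: float) -> float:
    """Expected number of spurious failures across all runs of the suite."""
    return tests * runs * flake_rate


def chance_of_red_build(tests: int, flake_rate: float) -> float:
    """Probability that at least one test flakes in a single run of the suite.

    Assumes tests flake independently of each other (a simplification).
    """
    return 1 - (1 - flake_rate) ** tests


# Hypothetical suite: 200 tests, each flaking 1% of the time.
for runs in (1, 100):
    print(runs, "run(s):", expected_flaky_failures(200, runs, 0.01))
print("chance of a red build:", round(chance_of_red_build(200, 0.01), 3))
```

Running the suite 100 times as often multiplies the expected number of spurious failures by 100, which is the point being made about trust.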
Yeah. Because we want to run the test every day or after every build or after every change we make to the code and if the tests are flaky, for sure they are not going to be as useful as we need them to be. Right?
Yeah. I think you have to trust your tests. When they fail, you want to know that it’s not a problem with the test, it’s a problem with the application. That’s what the tests are for. They’re a safety net, but you have to trust your safety net.
Totally. Yeah. For me, one of the things that is getting harder is defining the set of tools or frameworks that we are going to use. As I see it, there is fragmentation in the market, right? There are many, many tools, many options. That makes it harder to decide which one to use. This is one of the questions that we are frequently asked at Abstracta every time we start a project with a new customer, and I know that you know the market because, as you say, you have been developing tools for test automation. You have been working on that for many years.
So, my question for you is, what are the main options we have today and what are the main advantages of each of them?
I think that we need to first maybe do some differentiation and clarification about when someone says “testing framework.” Because it’s a little bit overloaded with different meanings.
Some people call it “testing framework”… actually I call it “infrastructure.”
What is the infrastructure that I use to perform a click? That can be Selenium, there’s Cypress, there’s Puppeteer, and now there’s Playwright. This is the basic infrastructure that knows what to do when you say, “Click this,” “Click that,” or “Drag and drop.” On the user interface, you want to use something very low level that knows how to perform that. I’ll get to more details in a second, but then there’s the question of whether you want to use a layer on top of that. Some people use their own… I’m guessing the audience probably knows, for example, the page object design pattern and the screenplay pattern. There are different patterns or even frameworks that work above that.
Testim, for example, is above that. We don’t implement the actual click. We use one of those infrastructures. Even there, you can see a big difference… 10 years ago you only had HP Mercury, you had Selenium, and you had Sahi. Those were the infrastructures we had back then that knew how to perform the click.
I think today is different. I’m guessing everyone knows Selenium. It’s literally a W3C standard. That’s the biggest advantage. That means everyone supports it. That means if you need a cloud provider, an execution environment, to get a browser in the cloud… Everyone supports Selenium because it’s the standard. It’s the de facto standard and also literally the standard. But because of the way it works, adopting improvements in Selenium is obviously a bit slower.
And then came Puppeteer, which I’ll explain. It actually started not so much as a test automation infrastructure. It started as something internal… Google said that they needed an API framework for their own internal development against Chrome or, to be more accurate, Chromium. They started building Puppeteer, and they improved things in a way that it doesn’t connect through a grid and a lot of other things in the middle. It connects directly to the browser, it talks to the browser, and the browser can talk back to Puppeteer. It’s what’s called BiDi, a bi-directional channel. Through Puppeteer, you can ask the browser, “Tell me when someone adds this. Tell me when there’s a console log. Tell me when the network is idle.”
Those kinds of things, the tight connection to the browser, made it even more resilient than Selenium, and of course they added so many things: you can check performance and memory, and anything you can do with the DevTools, you can do with Puppeteer. That’s basically because it is the DevTools… they use the same API, it’s called the Chrome DevTools Protocol, and they connect through the same thing as the DevTools, so everything you have there is available through Puppeteer.
And everyone keeps asking me lately about Playwright, which is the new player at the infrastructure level, which basically… I’ll explain what happened. As Google was working for the last two and a half years on Puppeteer, Microsoft decided that they wanted to have their own browser, which is based on Chromium as well. Chromium is the open source project that both Chrome and now Edge are based on.
Actually, most of the team moved. The team that worked on DevTools and Puppeteer moved to Microsoft, they started working and they forked Puppeteer and continued working very fast, adding more improvements, for example, they added Safari support.
So Puppeteer was only Chrome and then Firefox, so when they forked Puppeteer, the Playwright [inaudible 00:11:40] team, the first thing they added was Safari, and they’re adding more capabilities for running in parallel.
So I’m guessing the audience knows that moving between tabs or, for example, iframes in Selenium is something you have to do yourself. Every time, you have to switch to one thing and then switch to another. In Puppeteer, you just have the notion of a page, a tab, so you can have two tabs, hold two references, and call each one. They do all the handling themselves behind the scenes, and they made it much easier to work like that. Playwright improved that even more, and they’re still working on it. And because they’re very close to and very integrated with the browsers, simple operations like click and code injection are, I think, handled ideally. This is, I would say, the best level that you can have when you’re talking about automating a browser.
I guess that because of this, integration is more efficient, the way it’s solved in this infrastructure.
It is more. It’s a lot more. By the way, there are cases where it’s actually less. It’s rare, but think about how it performs a click. How do you perform a click? It’s a mouse down and a mouse up. So it depends on where you run your test, your Puppeteer or Playwright test. If it’s on the same machine as the browser, then sending mouse down and mouse up is local and it’s going to be super fast. But if it’s on a different machine, then you make a round trip for each: “Hey, click this. Mouse down here and mouse up there.” And those would be two round trips.
By the way, it’s the same when you talk about Selenium. Every time you click with Selenium, if you look at their API, the operation is “find an element” and then click. The “find an element” goes all the way to the browser, which could be running on some cloud. You find it, and then you do a click. So even there you’ll see that they try to improve. In the Playwright and Puppeteer API you say, “Click,” and you give it a selector. I don’t want to go and find an element first. They try to reduce those round trips.
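The round-trip point can be made concrete with a toy model. The sketch below does not use Selenium, Puppeteer, or Playwright at all: `FakeRemoteBrowser` and both helper functions are hypothetical names, and the counter only mimics the two call styles described above. What real tools send over the wire is more involved.

```python
from __future__ import annotations


class FakeRemoteBrowser:
    """Hypothetical stand-in for a browser on another machine.

    Every public method counts as one network round trip.
    """

    def __init__(self, dom: dict[str, str]) -> None:
        self.dom = dom                      # selector -> element id
        self.round_trips = 0
        self.clicked: list[str] = []

    # Style 1: locate first, then act on the returned handle.
    def find_element(self, selector: str) -> str:
        self.round_trips += 1
        return self.dom[selector]

    def click_element(self, element_id: str) -> None:
        self.round_trips += 1
        self.clicked.append(element_id)

    # Style 2: a single command that carries the selector with it.
    def click(self, selector: str) -> None:
        self.round_trips += 1
        self.clicked.append(self.dom[selector])


def find_then_click(browser: FakeRemoteBrowser, selector: str) -> None:
    element_id = browser.find_element(selector)   # round trip 1
    browser.click_element(element_id)             # round trip 2


def click_by_selector(browser: FakeRemoteBrowser, selector: str) -> None:
    browser.click(selector)                       # single round trip


a = FakeRemoteBrowser({"#login": "el-1"})
b = FakeRemoteBrowser({"#login": "el-1"})
find_then_click(a, "#login")
click_by_selector(b, "#login")
print(a.round_trips, b.round_trips)
```

When the browser runs in a remote cloud, each extra round trip adds network latency to every step, which is why the shape of the API matters.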
You’re making me think that you need to understand how the tools work in order to define the best way to run your tests. Right? Because, as you say, I would prefer to run the tests on the same machine where the browser is running. Well, this is not always the case, because maybe you want to test in different browsers and you use a service like-
The [inaudible 00:14:30] providers. Yes. I’m guessing that, by the way, very soon they’ll start supporting Puppeteer. Right now they don’t. So that’s the advantage of Selenium: it’s a standard and everyone supports it. But my guess is that we’ll see much more support coming up soon for both Puppeteer and Playwright.
This is why you have ChromeDriver. With Selenium you have something called chromedriver.exe. Why do you have that? What it does is convert the click operation into literally a click from the outside.
chromedriver.exe connects to the Chrome DevTools Protocol and tells it to do a click. Puppeteer connects directly. That’s why you don’t need ChromeDriver anymore. Puppeteer connects directly to the CDP, the Chrome DevTools Protocol, and tells it to perform that click, that mouse down and mouse up.
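For readers who have not seen the protocol, here is a sketch of what “that mouse down and mouse up” can look like as CDP messages. It only builds the JSON; nothing is sent to a browser. `Input.dispatchMouseEvent` with the `mousePressed` and `mouseReleased` types is part of the protocol’s Input domain, while the helper names and the coordinates are assumptions made for illustration.

```python
import json
from itertools import count

_message_ids = count(1)  # CDP commands carry an increasing integer id


def cdp_message(method: str, params: dict) -> str:
    """Serialize one CDP command the way it would travel over the WebSocket."""
    return json.dumps({"id": next(_message_ids), "method": method, "params": params})


def click_messages(x: float, y: float) -> list[str]:
    """Build the mouse-down / mouse-up pair that makes up a left click at (x, y)."""
    common = {"x": x, "y": y, "button": "left", "clickCount": 1}
    return [
        cdp_message("Input.dispatchMouseEvent", {"type": "mousePressed", **common}),
        cdp_message("Input.dispatchMouseEvent", {"type": "mouseReleased", **common}),
    ]


# Hypothetical coordinates of a login button.
for message in click_messages(120, 48):
    print(message)
```

Each message is one trip to the browser, which matches the earlier remark that a click on a remote machine costs two round trips.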
Unless they [Cypress] rethink the entire architecture, they won’t have things like mouse move and hover. So they kind of go back to Selenium wanting the infrastructure. But of course they added things on top, which is more failure analysis.
But on infrastructure… they lack things like trusted events. There are other things. Puppeteer improved the way tabs and frames work… If Puppeteer and Playwright moved forward, Cypress has gone back, because they still can’t support that. So Selenium supports that, not in the best way, but Puppeteer and Playwright support it in an excellent way. Cypress doesn’t support this at all. You can’t have one test span multiple tabs. You can’t have that right now. And frames, one inside another, if they’re not on the same domain or sub-domain, are not supported.
You’re mentioning something that is key in order to decide which tool or which infrastructure you’re going to use, because we need to understand which type of events or things we want to simulate or to run in our tests in order to understand which tool or which infrastructure allows us to do that in our environment.
Yeah. By the way, I can share with the audience, we did a drill down with all the examples and all the other different things. Because we use those infrastructures underneath the hood [of Testim], we know them very intimately. We made a list of all the different things so you can see how you want to decide. And you just say, “This is not important or this is important to me.” And then, you will also see all the listings about what platform or what infrastructure is good for you and which one isn’t.
[Here’s the guide by Testim: Puppeteer, Selenium, Playwright, Cypress – how to choose?]
So this is very important. If someone wants to do IE 11, right now they can’t use Cypress, they can’t use Puppeteer, they can’t use Playwright. So it’s important to know the big things: “Do you need this? Is this a must? How important is it for you?” And then you need to decide on your infrastructure.
And again, all the discussion that we had right now was just on the infrastructure level, and there’s more above that, which is: how do you see the reports? How do you do failure analysis? Do you get the screenshots? Do you get the logs? Do you get the HAR file? The network requests? What do you get on top of that when a test fails?
And I think there’s one more layer even above that. Sometimes it’s not one test that fails. Let’s say 100 tests failed. You want to know, “Do I really need to go over a hundred test runs?” Or maybe the tool can tell me, “Okay, 10 tests failed for one reason and 90 failed for another.” So it’s a hundred failures, but only two issues. You only need to debug two tests.
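One simple way to picture “a hundred failures, but two issues” is to group failures by a normalized error signature. The sketch below is only that, a sketch: the error messages are invented, the normalization rules are assumptions, and it says nothing about how Testim or any other product actually groups failures.

```python
from __future__ import annotations

import re
from collections import defaultdict


def signature(error: str) -> str:
    """Normalize an error message so that similar failures compare equal."""
    sig = re.sub(r"'[^']*'|\"[^\"]*\"", "<str>", error)  # quoted values such as selectors
    sig = re.sub(r"\d+", "<n>", sig)                      # numbers: ids, ports, timings
    return sig.strip().lower()


def group_failures(failures: dict[str, str]) -> dict[str, list[str]]:
    """Map each error signature to the names of the tests that failed with it."""
    groups: defaultdict[str, list[str]] = defaultdict(list)
    for test_name, error in failures.items():
        groups[signature(error)].append(test_name)
    return dict(groups)


# Invented data: 10 infrastructure failures and 90 login failures.
failures = {f"test_{i}": f"Browser session {i} did not start" for i in range(10)}
failures.update(
    {f"test_{i}": f"Timeout {3000 + i}ms waiting for '#login-{i}'" for i in range(10, 100)}
)

for sig, tests in group_failures(failures).items():
    print(len(tests), "tests:", sig)
```

With the groups in hand, debugging one representative test per group is enough to start on each underlying issue.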
So it’s kind of applying software engineering practices like modularization and encapsulation to your test design in order to be more efficient when you analyze the results and the data.
Yeah. I think this is one of the most important things when you talk about testing. Testing, just like anything in software, requires architecture and these are the things that people learn.
There’s the basic level, which is how to use an infrastructure, and people are usually focused on “How do I click and how do I perform that?” But the bigger picture, I think, is what’s super important: “How do you do the architecture of your test suite, what do you test, how do you test, and how do you reuse?” Software is all about reuse. That’s the most important thing. Tooling that helps you with reuse, and that suggests to you how to reuse, saves you a lot of time.
I understand that you’re working in these layers on the top of the infrastructure, adding some support for the reporting and also for suggesting how to apply different good practices, right?
Yeah. I think it’s very important to have proper tooling or even say [inaudible 00:22:18] that helps you understand the results better. There’s two things, I think, one is to understand your tests and the second thing is understanding the results. You can have one test but you can have a lot of results, and you can have the correlation between the results and today all my tests can fail because of bad infrastructure, the browser didn’t open and tomorrow, all my tests can fail because someone changed the log in, in some way and all my tests failed because of somebody else.
I have another question because I know that you are applying some machine learning to testing. The question is:
Is there a real advantage in using artificial intelligence and machine learning for testing?
Lately, many people and companies are talking about it and it’s difficult to know if it’s just a buzzword or if it’s something that is really helpful. So what’s your view on that?
So I think it depends on what you want the AI to do. Where can it help you? There’s different places that AI can help you.
The first way that I think it’s helpful, and I started focusing on that about five years ago with Testim, is with the locators, the stability of the locator for a click.
As I said, you want to perform a click, but all of the infrastructures that we mentioned leave the decision of which element to click to you. It’s on you; you have to decide. So the question is: How much time and energy do you want to invest in knowing which element to click on?
And AI can help with the maintenance. With any changes, the tests break, and everyone knows about it. Every time we change the UI, the tests break, and then you have to find out, okay, what’s the new ID? And so this is one thing, one aspect that AI can help.
Now, today we have AI that can use not one selector, but actually hundreds of thousands of selectors. It does that automatically. And then when something changes, even if you change the class or the text, it can still find that element, because it has thousands of different ways to do so, and the platform can say with high fidelity that this is the right element. And that just saves you a lot of time.
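The idea of relying on many locators at once, instead of a single ID, can be sketched with a simple scoring function. This is an illustrative toy and not Testim’s algorithm: the `Element` class, the attribute names, the equal weighting, and the 0.5 threshold are all assumptions.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Element:
    """Hypothetical snapshot of a DOM element as a bag of attributes."""
    attrs: dict[str, str] = field(default_factory=dict)


def score(recorded: dict[str, str], candidate: Element) -> float:
    """Fraction of the recorded attributes that the candidate still matches."""
    if not recorded:
        return 0.0
    hits = sum(1 for key, value in recorded.items() if candidate.attrs.get(key) == value)
    return hits / len(recorded)


def locate(recorded: dict[str, str], page: list[Element], threshold: float = 0.5) -> Element | None:
    """Return the best-matching element, or None if nothing is convincing enough."""
    best = max(page, key=lambda element: score(recorded, element), default=None)
    if best is None or score(recorded, best) < threshold:
        return None
    return best


# What was captured when the step was recorded.
recorded = {"id": "submit", "class": "btn primary", "text": "Log in",
            "tag": "button", "name": "login"}

# The page after a redesign: the id and class changed, the rest did not.
button = Element({"id": "submit-v2", "class": "btn btn-primary", "text": "Log in",
                  "tag": "button", "name": "login"})
link = Element({"id": "help", "class": "link", "text": "Log in", "tag": "a"})

print(locate(recorded, [link, button]) is button)
```

A single-selector test keyed on the old ID would have failed here, while the combined score still singles out the button.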
It goes back to trust. If your test doesn’t fail every time someone changes the UI, you start trusting your tests more. So, talking about locators and stability, this is one cause of instability that can be eliminated. Other frameworks, and I think even infrastructures, are starting to do more things, the basic things, like actually doing a retry.
You don’t find an element on the screen? Let’s try again a second later. Maybe it’s an AJAX call that takes a few seconds to load. Let’s try again. All those things, like implicit waits, we see more and more.
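The “try again a second later” behavior boils down to polling with a deadline. Here is a minimal sketch of that idea in plain Python. `wait_for` and the fake finder are hypothetical names, and real tools implement their own waiting with more nuance (visibility, stability, and so on).

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def wait_for(find: Callable[[], Optional[T]], timeout: float = 5.0, interval: float = 0.1) -> T:
    """Call `find` repeatedly until it returns something other than None.

    Raises TimeoutError if nothing is found before `timeout` seconds pass.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = find()
        if result is not None:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"element not found within {timeout} seconds")
        time.sleep(interval)


def make_slow_finder(ready_after_calls: int) -> Callable[[], Optional[str]]:
    """Fake lookup that only 'finds' the element after a few attempts."""
    state = {"calls": 0}

    def finder() -> Optional[str]:
        state["calls"] += 1
        return "login-button" if state["calls"] >= ready_after_calls else None

    return finder


print(wait_for(make_slow_finder(3), timeout=1.0, interval=0.01))
```

The deadline keeps a missing element from hanging the test forever, while the polling interval absorbs content that loads a moment late.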
So this is one aspect of AI when you’re talking, okay, locators and stability.
There are other aspects when you’re looking at the results. Can AI help us with looking, analyzing all the results? Say there’s a hundred failures, but there are two issues. Here’s the smallest test I could find. So you can reproduce it as fast as possible or suggest to you things.
For example, AI can help find code duplications. This is something that is hard for a human: go over thousands of tests and find, “Okay, I have duplication. I have a login here, a login here, and a login here. I didn’t reuse that. I need to reuse the same code.”
For a human it’s hard, going over a thousand tests. For machines (AI), they can do that in under a minute. They can go over them and give you a lot of suggestions.
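Even without any machine learning, the mechanical part of this is easy to picture. The sketch below uses plain exact matching, not AI, to find step sequences that appear in more than one test. The test names, step strings, and window length are invented for illustration.

```python
from __future__ import annotations

from collections import defaultdict


def repeated_sequences(tests: dict[str, list[str]], length: int = 3) -> dict[tuple[str, ...], list[str]]:
    """Find runs of `length` consecutive steps that occur in more than one test.

    Returns a mapping from each shared run of steps to the tests containing it.
    """
    seen: defaultdict[tuple[str, ...], set[str]] = defaultdict(set)
    for name, steps in tests.items():
        for start in range(len(steps) - length + 1):
            seen[tuple(steps[start:start + length])].add(name)
    return {seq: sorted(names) for seq, names in seen.items() if len(names) > 1}


# Hypothetical suite in which two tests repeat the same login steps.
suite = {
    "checkout": ["set username", "set password", "click login", "add to cart", "checkout"],
    "profile": ["set username", "set password", "click login", "open profile"],
    "search": ["type query", "press enter"],
}

for steps, owners in repeated_sequences(suite).items():
    print(" -> ".join(steps), "appears in", owners)
```

Each shared sequence is a candidate for extraction into one reusable step, much like the suggestions a linter gives developers.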
So I think we are seeing, and will be seeing, more AI helping there, assisting you and suggesting things. In the software domain, all developers have seen such a thing as a linter: as you develop code, it helps you. I haven’t seen that much of it in the test automation world. And this is where AI can help out.
I see it’s very related to the discussion that we as a community had about 15 years ago when we were raising the issue of test automation taking the place of testers. And actually my favorite view on that is that the tools can help us to increase our possibilities to do a better job. And as you just explained, it’s the same here with AI.
Tools are helping us with the things that they can do better than us, yet we still have our role using those tools, in order to do a better job, right?
I can remember when I started Testim, people always asked me, “Are you trying to…” They thought there weren’t going to be testers in the future, and everyone kept asking. And I’m like, “No. I think there’s always going to be testers.” The whole idea is helping them with the little tasks and actually saving their time so they can focus more on other things. I think you’ll always need humans.
Excellent. I remember we talked about having three layers of test automation in Testim. This is an approach you took that I really like. So can you explain it a little bit more?
I think there’s different levels when you talk about a test, there’s different things. For example, the highest level, first of all, is the business level. I think people don’t do that enough, separating between for example, the business level and then the implementation. But even on the implementation level, I think that people can have levels there.
I’ll explain what I mean by business level first of all. The way I look at a test, I need to see that, first of all, there’s a login, add to cart, checkout… this is how a user sees it. This is the language of the domain, the DSL that the company talks in. For Amazon, someone makes a purchase, so you need to talk in that language. Then you drill down and you say, “Okay, at the implementation level, I need to set the username, I need to set the password, and I need to click login.” This is one level below that.
For them, the infrastructure should be “click” and “set text,” which should be very clear. In their tests, they need to go down to that level. So I think it’s very, very… I like things like the page object design pattern because it actually forces you. It says, “Okay, you have a class Account,” and the class Account has a method called login, which forces you to split between the business logic and the implementation level.
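A minimal sketch of that split, using the class and method names from the conversation. `FakeDriver`, the selectors, and the credentials are hypothetical; a real suite would pass in a Selenium, Puppeteer, or Playwright object instead.

```python
from __future__ import annotations


class FakeDriver:
    """Hypothetical stand-in for a browser driver; it only records actions."""

    def __init__(self) -> None:
        self.log: list[tuple[str, ...]] = []

    def set_text(self, selector: str, value: str) -> None:
        self.log.append(("set_text", selector, value))

    def click(self, selector: str) -> None:
        self.log.append(("click", selector))


class Account:
    """Page object: exposes the business action and hides the selectors."""

    USERNAME = "#username"
    PASSWORD = "#password"
    LOGIN_BUTTON = "#login"

    def __init__(self, driver: FakeDriver) -> None:
        self.driver = driver

    def login(self, username: str, password: str) -> None:
        # Implementation level: set the username, set the password, click login.
        self.driver.set_text(self.USERNAME, username)
        self.driver.set_text(self.PASSWORD, password)
        self.driver.click(self.LOGIN_BUTTON)


# Business level: the test reads in the language of the domain.
driver = FakeDriver()
Account(driver).login("ada", "s3cret")
print(driver.log)
```

If the login form changes, only the `Account` class needs to be updated; the tests that call `login` stay the same.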
And by the way, everyone asks me what I think about things like Cucumber and they all think that I hate it.
And it’s not that. I don’t hate it. I know everyone thinks that way. There are some things that I hate about it and some things that I love about it. Cucumber forces you to distinguish between the two levels you have to talk in.
When you write in English, this is the high level, and you have to implement the actual clicks in your code. And it forces you. This is the thing that I love about it. It forces you. You can’t ignore it and say, “No, I’m going to write it in my test. My test will have ‘click this, click that.’” No, you have to go through the English. And those are things that I-
It forces you to document your test in a certain way, right?
Exactly, exactly. In some places people use, and then that’s another discussion, whether it should be the “Given, When.” It should be how, I call it religion, how do you go by the rules? Do you go 100%, 50%? Do you have to do that or not? But this is where everyone’s fighting about.
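To make the two levels concrete, here is a toy runner in the style of Cucumber. It is not Cucumber’s implementation or API: the `step` decorator, the `run` function, and the scenario text are all invented for illustration. The English lines form the high level, and each one must resolve to a function that does the low-level work.

```python
import re

STEPS = []  # registered (compiled pattern, function) pairs


def step(pattern: str):
    """Register a function as the implementation of an English step."""
    def register(func):
        STEPS.append((re.compile(pattern), func))
        return func
    return register


def run(scenario: str, context: dict) -> None:
    """Execute each line of the scenario through its matching step function."""
    for line in scenario.strip().splitlines():
        if not line.strip():
            continue
        text = re.sub(r"^(Given|When|Then|And|But)\s+", "", line.strip())
        for pattern, func in STEPS:
            match = pattern.fullmatch(text)
            if match:
                func(context, *match.groups())
                break
        else:
            # No way around the English: an unimplemented step is an error.
            raise LookupError(f"no step definition for: {text}")


@step(r'I log in as "(.+)"')
def log_in(context, user):
    context.setdefault("actions", []).append(("set_text", "#username", user))
    context["actions"].append(("click", "#login"))


@step(r'I add "(.+)" to the cart')
def add_to_cart(context, item):
    context.setdefault("cart", []).append(item)


@step(r"the cart has (\d+) items?")
def cart_has(context, expected):
    assert len(context.get("cart", [])) == int(expected)


SCENARIO = """
Given I log in as "ada"
When I add "book" to the cart
Then the cart has 1 item
"""

context = {}
run(SCENARIO, context)
print(context["cart"])
```

The scenario never mentions a selector, and the step functions never mention the business flow, which is the separation being praised here.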
Is this related to scriptless testing tools?
By the way, when people say, “Scriptless” or “Codeless,” there’s something that I… And I’m not sure how to phrase it differently.
I don’t think that you can write a full set of tests without some code. So that’s one thing. And you’ll see that in all different codeless platforms, they’re not really codeless.
Everyone allows you to write some code, and I see that in terms of, as you said, the different levels. 90% of the time, you don’t need to write any code, but you do need to customize. This is more about whether you need to do a recording. Can you record something very fast? Right now, when you have those so-called smart locators, where you have multiple selectors, a recording is not just as good as a selector a developer would write. As I said, it’s even better. It’s much better.
So recording a scenario, like setting a username and password, could be faster and even more stable than coding it. So this is something that I do recommend, but there’s always some time when you need some custom logic, when you need to go down a level. And sometimes you just need to drill down for not to recall the value [inaudible] that you set in a… You want to use a variable. Sometimes you even want to write your own code. So this is what I meant by a drill down: you can go inside the level and say, “I want to write the code. I don’t just want to record that.” So you always have places where you need to have code. That’s why I don’t think of it as a codeless solution. I like to look at it as a visual editor.
This distinction is really important because for me, before talking with you, I always thought that codeless automation tools are good for people who don’t know how to code. And as you are talking now, it’s like, even if I can code my tests, these tools also can help me do a better job, right?
Yes. I think you need to code where you need to. Wherever you need to code, focus your time there. But if you have something that can record the login, and it can save you an hour a day or five hours a day, and do actually a better job there, don’t fight it.
You need to focus your energy and your skills on how to do better test architecture, how to build your tests in a way that makes them more resilient and reusable, how to pass the right parameters, and, when you do end-to-end, how do I add mocks? What do I want to mock in my test? How do I set up the data and the infrastructure so that I start my test in the same way all the time? Those are the things that need a lot of thought. And I think this is what we humans are good at. But a lot of the time, tools, whether it’s recording or not, can give you a lot more value. By the way, when you look at a test that you didn’t write and you can see the screenshot, the entire screenshot, and know what they meant, it’s easier for someone else to understand what the test is.
I look at code-less automation as a visual editor. I think it’s a great tool, even for developers, to save them a lot of time and energy. And they should focus on the things where you do need code. This is where you need to double down and think a lot and make the code as robust as possible.
Man, I love talking with you. Just to try to wrap up this interview, I have some more personal questions I would like to ask. The first one is, how did you get into testing?
Ooh, I loved compilers. I worked as a compiler engineer at a company that had a special language for hardware verification. Cadence acquired it; it’s a very well-known developer platform. I was a compiler engineer, and my users were testers. Those were the people who used it. It was code, but in a special language that added special capabilities for hardware verification. As you know, hardware, before it becomes hardware, starts as a spec, a coded spec, very long. So I got to know more, because if you want to build a language for developers, for test engineers, you need to know more about the domain.
So I got to know a lot about that. And then I moved to another company, a small company called Wix. Well, now they’ve got an IPO and everyone knows them, but when I moved there, it was super small and all the testing was manual. And it took a month to release something because everything was manual. And I came from a world where everything was automated. So I moved to Wix with web and I would make a transition to web and everything was manual. And I just said, “I have to look at all the tools out there to understand how can I do a better job of adding unit tests and end to end?” And this is how I got to find out.
I found out that, no, there’s not a good enough solution at the moment. That was a while, so many years ago. And I decided to pursue and try to help in building more testing tools.
I was the first employee at a company called Applitools that also does visual validation. So nobody did visual validation before that. So that was my first experience, I would say, already starting to build testing tools.
Cool. I remember someone telling me that one great way to improve your skills as a tester is to work for some years in development and a great way to improve your skills as a developer is work for some years as a tester. And it’s great!
I agree with you 100%. One of our architects actually started as a tester. I remember meeting him at Selenium meetups in Israel. And every presenter that came there, he was asking so many amazing questions. Amazed, I thought, “How about he come over to Testim? And not just the automation part, I think he can move to do other things as well.” And obviously, he moved up the ladder super, super fast. And I think everything he does is magic. One of the reasons is that he still thinks sometimes like a tester. He starts with that. A tester thinks like a user actually in a lot of ways. And that’s amazing.
Great. Another question. I really believe that forming good habits is really important in order to achieve your goals. So I’m curious about the habits you have. If you have any habits you want to suggest forming?
I think that it’s very important to know about what you like to do and what you are doing. Where are you spending your time? Where is your focus?
So I know that everyone recommends planning what you’re supposed to do that day or week. In addition, I like to go the other way around. I go to my calendar, and every time I do something, I mark in the calendar what I did. So when I look at the last week, I know where I spent my time. And then, if I’ve decided that I have a goal to spend 10% of my time learning something new, like a new framework, I can make sure that I’ve done it. Have I allocated enough time for myself to actually do that?
And then I can look at the retrospective of myself and look at my calendar and say, “Oh, did I do that?” And I love how in Google Calendar you can add colors. So I do this because I have different types of things. So I add colors and then I can see how well I do. And if I need to allocate more time or less time to change if something doesn’t work, I know how to improve that.
Time management is key for almost anything! One last question: Have any book recommendations?
It depends on where you go with that. On the startup level, one of my favorites was Zero to One. That’s Peter Thiel, to understand at a very, very high level.
And on the personal level, the one I’m reading right now is on Bughouse. It’s a chess variation, a four-player game of chess. There are two boards, and pieces move from board to board. There’s only one book, by some grandmaster, that explains how to improve at it. And I want to improve my Bughouse skills, so.
So, you typically play this game?
Yes. Yes. I’m in love with this game.
Huh. I’m curious, but I haven’t heard about this chess variation.
Oh, so I think now we have to do another podcast only on Bughouse.
Right. Oren, is there anything you would like to invite listeners to do or to access?
So, regarding Testim, especially at this time, we want to do a lot more for the community at this point.
We did a few things. First of all, we offer a huge freemium plan. You can run your tests for free, even with cloud execution. So people can keep running their tests at this time, because I know that right now it’s hard. So we did that.
Another thing for the community, we launched a course that focuses on AI, where it can help.
So we did that so people can have it, and I know a lot of people are looking for work. We saw that the AI certification is very helpful, because people are starting to ask if you know all those things, like how to use those assistants. So we saw that and we gave the course for that.
And there’s a third thing, which is that we’re doing a lot more to contribute to the open source world, and we’ll continue to do a lot more. So you’ll see that we’re actually releasing more capabilities for those as well. So be sure to go to the newsletter and subscribe. I suggest people just sign up so that they get notified when we release. We’re releasing a lot of cool things to the community. So keep your heads up.
Excellent. Excellent. Thank you so much for your time. It was an amazing talk. I really enjoyed it!
Right. It’s always fun talking to you.
Let’s do it in the future again, talk about web test automation challenges, chess, productivity…
Yes. Well, next time we’ll go back from doing it virtually to walking the streets of San Francisco like last time.
Oh yeah. I’d love that. See you Oren. Thank you.