The Importance of Validating the Testing Infrastructure
A lesson to learn the easy way
I’m a firm believer in learning from one’s mistakes. When you make a mistake, and you are truly invested in what you do and strive to do it well, you will naturally want to analyze it so you can learn how to avoid it in the future.
In this post I want to share one such lesson. In more than one client project, the information we had about the system’s infrastructure was incorrect (or didn’t paint the entire picture), so when we analyzed the results of our performance tests, some things just didn’t add up. There were behaviors that we simply couldn’t explain. In these cases, after a lot of research, we found that there was an extra component we were unaware of, or that the network path through which we were accessing a certain component was not as direct as we thought.
The two challenges were quite different, but the solution is one and the same: validate the testing infrastructure.
Sometimes when you are given something to test, some key details may be forgotten, and that’s okay. That’s why, as testers, it’s on us to validate the test infrastructure before diving in. Fortunately, there are several ways to do so:
Ways to Validate Your Testing Infrastructure
Validate components and their versions
Access each node, confirm that its IP address matches the documentation, and check that it actually runs the services it is supposed to. Verify the operating system and its version, as well as the versions of the components (for example, Java, Apache, etc.).
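As a rough illustration of the service check, here is a minimal Python sketch that tests whether each documented port accepts TCP connections. The function names and the shape of the inventory (host mapped to service name and port) are my own assumptions for the example, and an open port only tells you that something is listening there, not which product or version it is.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def check_inventory(inventory, timeout=2.0):
    """Map every (host, service) pair to whether its documented port is reachable.

    `inventory` is a hypothetical structure: {host: {service_name: port}}.
    """
    results = {}
    for host, services in inventory.items():
        for service, port in services.items():
            results[(host, service)] = port_is_open(host, port, timeout)
    return results
```

You could call it with something like check_inventory({"10.0.0.11": {"tomcat": 8080}}), where the address and port are placeholders taken from whatever documentation you were given. Any pair that comes back False is worth a question to the infrastructure team before the first test run.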
Validate initial configurations
In a performance test aimed at finding optimizations, we usually try different configurations and compare their results. For the documented results to be accurate, the starting point has to be known, so it is necessary to review the initial configurations (at least the most relevant ones). For example, the size of each connection pool (in the database or the web server), the minimum and maximum allocated memory (in the case of the JVM), etc.
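To make this concrete, here is a small Python sketch that compares documented settings with observed ones. It assumes you have already captured the Java process command line on the node (for example from the process list); the helper names are hypothetical, and it only looks at the standard -Xms and -Xmx heap flags.

```python
import re


def parse_jvm_heap_flags(command_line):
    """Extract the -Xms / -Xmx values from a Java command line string."""
    flags = {}
    for name, value in re.findall(r"-(Xms|Xmx)(\d+[kKmMgG]?)", command_line):
        # Normalize the unit suffix so "2G" and "2g" compare as equal.
        flags[name] = value.lower()
    return flags


def diff_settings(documented, observed):
    """Return {setting: (documented, observed)} for every value that differs.

    A setting missing on one side shows up with None on that side.
    """
    keys = sorted(set(documented) | set(observed))
    return {
        key: (documented.get(key), observed.get(key))
        for key in keys
        if documented.get(key) != observed.get(key)
    }
```

The same diff_settings helper works for any setting you can read into a dictionary, such as connection pool sizes. An empty result means the baseline matches what you were told.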
Validate connections and network routes
To do so, run a traceroute from each node to the nodes it connects to, and check the network hops. You should also do this from the load-generating machines.
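If you have many nodes, this check can be scripted. The sketch below assumes a Unix-like machine with the traceroute command installed (on Windows the equivalent is tracert, with different output), and the parsing assumes the usual layout in which each hop line starts with the hop number followed by an address. The function names are made up for the example.

```python
import re
import subprocess

# A hop line is assumed to look like: " 1  10.0.0.1  0.412 ms  0.398 ms ..."
HOP_LINE = re.compile(r"^\s*(\d+)\s+(\S+)")


def parse_hops(traceroute_output):
    """Return a list of (hop_number, address) tuples from traceroute output.

    Hops that did not answer appear with "*" as their address.
    """
    hops = []
    for line in traceroute_output.splitlines():
        match = HOP_LINE.match(line)
        if match:
            hops.append((int(match.group(1)), match.group(2)))
    return hops


def trace(host, max_hops=15):
    """Run `traceroute -n -m <max_hops> <host>` and return the parsed hops."""
    completed = subprocess.run(
        ["traceroute", "-n", "-m", str(max_hops), host],
        capture_output=True,
        text=True,
        check=False,
    )
    return parse_hops(completed.stdout)
```

If you expected two machines to sit on the same network segment and trace() returns more than one hop between them, there is something in the path that is not in your diagram.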
I mention this in particular because it was what made us realize one of the problems we ran into. If you are accessing the web server on port 80 and there is supposed to be a Tomcat there, you should check that the Tomcat is actually configured on port 80. In our case it was on port 8080, because they had placed the Tomcat behind an Apache. This is a common practice*, but nobody had mentioned it to us because, as we were later told, “the Apache is lightweight and does not add overhead.” (Seriously!?) That is usually true, but it doesn’t mean that it won’t generate contention if something is configured wrong, as in this case. The number of connections the Apache accepted was not enough for the load, so it queued the requests. That explained what had been puzzling us: JMeter gave us certain response times, while the time-taken values recorded in the Tomcat access log were much smaller.
*Combining Tomcat and Apache has certain advantages. It should allow for greater concurrency management and resource optimization through compression and caching.
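The symptom that tipped us off, client-side times far above what the server logged, is easy to check for early. Here is a minimal sketch, assuming you have already extracted the elapsed times from the JMeter results and the time-taken values from the access log as lists of milliseconds for the same test window. The function names and the 50 ms tolerance are arbitrary choices for illustration, not recommended values.

```python
from statistics import median


def latency_gap_ms(client_elapsed_ms, server_time_taken_ms):
    """Median client-side time minus median server-side time, in milliseconds."""
    return median(client_elapsed_ms) - median(server_time_taken_ms)


def suggests_hidden_component(client_elapsed_ms, server_time_taken_ms,
                              tolerance_ms=50.0):
    """Return True when clients wait much longer than the server reports.

    A large gap means time is being spent somewhere the server log cannot
    see: the network, a proxy, a load balancer, or a queue in front of it.
    """
    gap = latency_gap_ms(client_elapsed_ms, server_time_taken_ms)
    return gap > tolerance_ms
```

A True result does not say what the extra component is, only that the time is going somewhere other than the application server, which is a good moment to go back and ask about the infrastructure.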
Clearly, it took some extra work to understand what was happening. Had we validated the infrastructure at the outset, we would not have had these problems. For the future, in order to provide a better service and not depend on how well the existing infrastructure is known or documented, it’s best to carry out certain validations before starting to analyze a system’s behavior.
The take-home message:
Don’t skip the part where you validate the testing infrastructure, since at the end of the day, we’re seeking information to reduce risk.
Is there anything else that you think is important to add to this list?