{"id":7715,"date":"2016-10-17T22:08:25","date_gmt":"2016-10-17T22:08:25","guid":{"rendered":"http:\/\/www.abstracta.us\/?p=7715"},"modified":"2025-05-05T21:21:05","modified_gmt":"2025-05-05T21:21:05","slug":"shutterfly-masters-continuous-performance-testing","status":"publish","type":"post","link":"https:\/\/abstracta.us\/blog\/performance-testing\/shutterfly-masters-continuous-performance-testing\/","title":{"rendered":"How Shutterfly Masters Continuous Performance Testing"},"content":{"rendered":"<h1><span style=\"font-weight: 400; color: #333333;\">How Shutterfly masters continuous performance testing by &#8220;adjusting the belt tight&#8221;<\/span><\/h1>\n<p><span style=\"font-weight: 400; color: #333333;\">Picture this: you are the owner of an e-commerce website and you want to be sure that its excellent customer experience doesn\u2019t deteriorate over time as new functionality is introduced, and that your customer retention rate remains healthy. For you, one critical pillar for achieving this is having reliable and highly performant web pages. As they say, your competitor is always one click away. <\/span><\/p>\n<p><span style=\"font-weight: 400; color: #333333;\">The same goes for <a href=\"https:\/\/www.shutterfly.com\/\" target=\"_blank\" rel=\"noopener\">Shutterfly<\/a><\/span>, the go-to platform for custom photo creations and gifts (which recently passed the $1 billion mark in revenue), whose performance engineering team tests and monitors its web and application performance on a daily basis. 
Making sure that each user smoothly sails through the entire process from uploading photos, to designing their creation, to checkout is of the utmost importance for the company to remain the market leader.<\/p>\n<p><span style=\"font-weight: 400; color: #333333;\"><img decoding=\"async\" class=\"aligncenter size-full wp-image-7719\" src=\"http:\/\/www.abstracta.us\/wp-content\/uploads\/2016\/10\/Screen-Shot-2016-10-17-at-1.16.54-PM-min-1.png\" alt=\"mug tester day shutterfly\" width=\"1238\" height=\"661\" \/><\/span><\/p>\n<p><span style=\"font-weight: 400; color: #333333;\">Above is a screenshot of a page that lets me preview my custom mug from the Shutterfly website. You can see how its powerful tool allows you to visualize almost any product you can think of with your photo(s) on it! Imagine all the work that goes into making sure this system works perfectly, all the time!<\/span><\/p>\n<p><span style=\"font-weight: 400; color: #333333;\">I have been fortunate enough to work with Shutterfly\u2019s performance engineering team, assisting them in the past few months while learning about their cutting-edge <strong>continuous performance testing<\/strong> method. I\u2019m thoroughly impressed by how this particular client has devised a way to test so that they can discover degradations in performance almost\u00a0immediately!<\/span><\/p>\n<p><span style=\"font-weight: 400; color: #333333;\"><b>Today I\u2019d like to discuss how Shutterfly&#8217;s\u00a0continuous performance testing methodology works so that you may also use it to your advantage. <\/b><\/span><\/p>\n<p><span style=\"font-weight: 400; color: #333333;\">The methodology involves a very different approach than that of having a load simulation for acceptance testing, in which we simulate the whole load scenario with a test infrastructure similar to the production environment. 
This approach does not replace those tests; it complements them and surfaces degradations before the acceptance test. This way, we reduce the risk of having to implement big solutions (aka \u201csolve BIG problems\u201d) just a couple of days before we planned to deliver the new version of the system.<\/span><\/p>\n<p><span style=\"font-weight: 400; color: #333333;\">Essentially, the goal is to be able to detect the exact moment when someone enters a line of code that impairs the performance of the system. The methodology I will explain lets us detect that change automatically and as soon as possible, and it leaves a record of roughly when the change was introduced, which makes it easier to track down.<\/span><\/p>\n<p><span style=\"font-weight: 400; color: #333333;\">So, how does it work?<\/span><\/p>\n<h2><span class=\"ez-toc-section\" id=\"Methodology\"><\/span><strong><span style=\"color: #00b674;\">Methodology<\/span><\/strong><span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p><span style=\"font-weight: 400; color: #333333;\">First off, a precondition that must be met is to perform tests frequently. Continuous integration would be best, running different sets of tests at different times: basic regression tests before each commit and a full regression at least 2 or 3 times a week. You must have load scenarios (acceptance testing) before putting a new version in production and tests ready for each fix. In Shutterfly\u2019s case, engineers check in code around the clock and the software is updated in the test environment daily.<\/span><\/p>\n<p><span style=\"font-weight: 400; color: #333333;\">Secondly, we must make sure that we have an <\/span><b>exclusive environment<\/b><span style=\"font-weight: 400;\"> for tests. 
With an exclusive test environment, the results will be more or less predictable. They won\u2019t be skewed when, for example, another person runs something else at the same time, which would cause response times to soar, generate false positives, and waste a whole lot of time. <\/span><\/p>\n<p><span style=\"font-weight: 400; color: #333333;\">Another key aspect is that the tests should have acceptance criteria (assertions) that are as tight as possible, so that at the slightest system regression, before there is any negative impact, some validation fails and flags the problem. These assertions should cover both response times and throughput. <\/span><\/p>\n<p><span style=\"font-weight: 400; color: #333333;\"><strong>I like to call this &#8220;adjusting the belt tight.&#8221;<\/strong><\/span><\/p>\n<p><span style=\"font-weight: 400; color: #333333;\">Now I\u2019ll explain how to create, maintain and use this \u201cbelt\u201d in performance tests.<\/span><\/p>\n<p><span style=\"color: #333333;\"><span style=\"font-weight: 400;\">Imagine we have a new test up and ready for stress testing the login. What we will do is see just how much we can \u201ctighten the belt\u201d. That is, how far can we increase the load on the login while keeping an acceptable service level?<\/span><\/span><\/p>\n<p><span style=\"font-weight: 400; color: #333333;\">Say we run the first test with 100 concurrent users and, as a result, there are no crashes, response times are below 100 ms (at least at the 95th percentile), and the throughput is 50 TPS. Then we run the test with 200 virtual users and, again, there are zero crashes, response times are at 115 ms, and the throughput is 75 TPS. <\/span><\/p>\n<p><span style=\"font-weight: 400; color: #333333;\">Great, it\u2019s scaling. 
<\/span><\/p>\n<p><span style=\"font-weight: 400; color: #333333;\">If we continue along this path, we will at some point reach a load at which throughput no longer increases. We will also start getting errors (exceeding 1%, for example), which indicates that we are saturating the server. From then on, response times are likely to increase significantly, because some process, connection, or other resource somewhere in the system\u2019s architecture has become a bottleneck.<\/span><\/p>\n<p><span style=\"font-weight: 400; color: #333333;\">Following this scenario, imagine we get to 350 concurrent users with a throughput of 150 TPS, 130 ms response times, and 0% errors. If we move up to 400 virtual users and the throughput is still about 150 TPS, and with 450 users it drops below 150 TPS, we have found the saturation point.<\/span><\/p>\n<p><span style=\"color: #333333;\"><span style=\"font-weight: 400;\">There is a concept called the &#8220;<\/span><span style=\"color: #00b674;\"><a href=\"http:\/\/www.informit.com\/articles\/article.aspx?p=391645\" target=\"_blank\" rel=\"noopener\"><span style=\"font-weight: 400;\">knee<\/span><\/a><\/span><span style=\"font-weight: 400;\">,&#8221; the point at which performance \u201cdegrades ungracefully.\u201d This is what we encounter in this type of testing, as illustrated in the graphic below. TPS is expected to increase as we add concurrent users; when it doesn\u2019t, we are exceeding the system\u2019s capacity. 
<\/span><\/span><\/p>\n<p><span style=\"color: #333333;\"><span style=\"font-weight: 400;\"><span style=\"color: #333333;\"><img decoding=\"async\" class=\"aligncenter size-full wp-image-7747\" src=\"http:\/\/www.abstracta.us\/wp-content\/uploads\/2016\/10\/Screen-Shot-2016-10-18-at-3.26.02-PM.png\" alt=\"Screen Shot 2016-10-18 at 3.26.02 PM\" width=\"737\" height=\"439\" \/><\/span><br \/>\n<span style=\"color: #333333;\">This is the basic methodology for finding the \u201cknee\u201d when doing stress testing, when we want to know how much our servers can scale under the current configuration.<\/span> <\/span><span style=\"font-weight: 400;\"><br \/>\n<\/span><\/span><\/p>\n<p><span style=\"color: #333333;\"><span style=\"font-weight: 400;\">The test we then schedule to run frequently is the one with 350 users. We expect an error rate below 1% and response times below 130 * 1.2 = 156 ms (which gives us a 20% margin). Last but not least, we assert the throughput, verifying that we still reach 150 TPS. This is how we detect, right away, when something degrades performance. <\/span><\/span><\/p>\n<p><span style=\"font-weight: 400; color: #333333;\">It is a good idea to monitor the test for a couple of weeks to confirm that this analysis holds and that the test is stable. It is also important to review these values from time to time, in case the performance of the service improves, in which case we\u2019d need to tighten the belt even further! In other words, the assertion values always need to be kept up to date. 
<\/span><\/p>\n<h2><span class=\"ez-toc-section\" id=\"Tools\"><\/span><strong><span style=\"color: #00b674;\">Tools<\/span><\/strong><span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p><span style=\"font-weight: 400; color: #333333;\">What tools are needed to accomplish this?<\/span><\/p>\n<p><span style=\"color: #333333;\"><span style=\"font-weight: 400;\">At Shutterfly, we are using\u00a0<\/span><span style=\"color: #00b674;\"><a href=\"http:\/\/gatling.io\/\" target=\"_blank\" rel=\"noopener\"><span style=\"font-weight: 400;\">Gatling<\/span><\/a><\/span><span style=\"font-weight: 400;\"> +\u00a0<\/span><span style=\"color: #00b674;\"><a href=\"https:\/\/jenkins.io\/\" target=\"_blank\" rel=\"noopener\"><span style=\"font-weight: 400;\">Jenkins<\/span><\/a><\/span><span style=\"font-weight: 400;\"> +\u00a0<\/span><span style=\"color: #00b674;\"><a href=\"http:\/\/docs.grafana.org\/datasources\/graphite\/\" target=\"_blank\" rel=\"noopener\"><span style=\"font-weight: 400;\">Graphite<\/span><\/a><\/span><span style=\"font-weight: 400;\"> +\u00a0<\/span><span style=\"color: #00b674;\"><a href=\"https:\/\/www.appdynamics.com\/\" target=\"_blank\" rel=\"noopener\"><span style=\"font-weight: 400;\">AppDynamics<\/span><\/a><\/span><span style=\"font-weight: 400;\">. <\/span><\/span><\/p>\n<p><span style=\"font-weight: 400; color: #333333;\">I did not mention tools earlier because the same strategy could be implemented with others. The important components are: a performance testing tool (for load simulation) + a continuous integration engine + storage and visualization of monitoring data + a tool that gives deeper insight into the system\u2019s behavior for analyzing problems. 
<\/span><\/p>\n<h2><span class=\"ez-toc-section\" id=\"Results\"><\/span><strong><span style=\"color: #00b674;\">Results<\/span><\/strong><span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p><span style=\"font-weight: 400; color: #333333;\">The day someone enters a line of code that worsens the response time by 20%, or harms the scalability of the login, it will be detected almost immediately. This is the beauty of continuous performance testing. We can quickly catch and eliminate any significant degradation of system performance, making sure that the end user never suffers from slowness when a new line of code is introduced. <\/span><\/p>\n<p><span style=\"font-weight: 400; color: #333333;\">Another advantage is that it\u2019s not necessary to have infrastructure similar to that of production. The baseline is how the system is working today. If we consider it to be good, we can then find the \u201cknee\u201d in our test environment and work with that, without letting it degrade.<\/span><\/p>\n<p><span style=\"font-weight: 400; color: #333333;\">Where cost prohibits the pre-production test environment from having the same scale and performance as production, you can use a scaled-down, lower-performance system and still detect performance regressions. In Shutterfly\u2019s case, one side benefit of using a scaled-down version of production is that we can easily generate all of the necessary load from a single machine, which makes reporting the results much simpler. Otherwise, we would need a huge test infrastructure just to generate the load!<\/span><\/p>\n<p><span style=\"font-weight: 400; color: #333333;\">What do you think of this method? 
How have you managed to continuously test performance?\u00a0<\/span><\/p>\n<p><em><span style=\"font-weight: 400; color: #333333;\">A special thanks to <span style=\"color: #00b674;\"><a href=\"https:\/\/www.linkedin.com\/in\/melissachawla\">Melissa Chawla<\/a><\/span> and <\/span><span style=\"font-weight: 400; color: #00b674;\"><a href=\"https:\/\/www.linkedin.com\/in\/fredberinger\" target=\"_blank\" rel=\"noopener\">Fred Beringer<\/a><\/span><span style=\"font-weight: 400; color: #333333;\"> for your support and feedback, helping me to write this post!<\/span><\/em><\/p>\n<p>&nbsp;<\/p>\n<p>Want to learn even more about Shutterfly&#8217;s continuous performance testing practices? <strong>Watch this webinar recording!<\/strong><\/p>\n<p>&nbsp;<\/p>\n<p><iframe src=\"https:\/\/www.youtube.com\/embed\/5wO9FlhodcM\" width=\"560\" height=\"315\" frameborder=\"0\" allowfullscreen=\"allowfullscreen\"><\/iframe><\/p>\n","protected":false},"excerpt":{"rendered":"<p>How Shutterfly masters continuous performance testing by &#8220;adjusting the belt tight&#8221; Picture this, you are the owner of an e-commerce website and you want to be sure that its excellent customer experience doesn\u2019t deteriorate over time with the introduction of new functionalities and that 
your&#8230;<\/p>\n","protected":false},"author":5,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[32],"tags":[114,50,645],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v14.0.2 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>How Shutterfly Masters Continuous Performance Testing | Abstracta<\/title>\n<meta name=\"description\" content=\"A critical pillar for achieving a great UX is having highly performant web pages. Today I\u2019ll discuss how Shutterfly masters continuous performance testing.\" \/>\n<meta name=\"robots\" content=\"index, follow\" \/>\n<meta name=\"googlebot\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<meta name=\"bingbot\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/abstracta.us\/blog\/performance-testing\/shutterfly-masters-continuous-performance-testing\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"How Shutterfly Masters Continuous Performance Testing | Abstracta\" \/>\n<meta property=\"og:description\" content=\"A critical pillar for achieving a great UX is having highly performant web pages. 
Today I\u2019ll discuss how Shutterfly masters continuous performance testing.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/abstracta.us\/blog\/performance-testing\/shutterfly-masters-continuous-performance-testing\/\" \/>\n<meta property=\"og:site_name\" content=\"Blog about AI-powered quality engineering for teams building complex software | Abstracta\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/AbstractaQA\/\" \/>\n<meta property=\"article:published_time\" content=\"2016-10-17T22:08:25+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2025-05-05T21:21:05+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/abstracta.us\/wp-content\/uploads\/2016\/10\/shutterfly2-min-1.png\" \/>\n\t<meta property=\"og:image:width\" content=\"420\" \/>\n\t<meta property=\"og:image:height\" content=\"236\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@fltoledo\" \/>\n<meta name=\"twitter:site\" content=\"@AbstractaUS\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"WebSite\",\"@id\":\"https:\/\/abstracta.us\/blog\/#website\",\"url\":\"https:\/\/abstracta.us\/blog\/\",\"name\":\"Blog about AI-powered quality engineering for teams building complex software | Abstracta\",\"description\":\"AI-powered quality engineering\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":\"https:\/\/abstracta.us\/blog\/?s={search_term_string}\",\"query-input\":\"required 
name=search_term_string\"}],\"inLanguage\":\"en-US\"},{\"@type\":\"ImageObject\",\"@id\":\"https:\/\/abstracta.us\/blog\/performance-testing\/shutterfly-masters-continuous-performance-testing\/#primaryimage\",\"inLanguage\":\"en-US\",\"url\":\"http:\/\/www.abstracta.us\/wp-content\/uploads\/2016\/10\/Screen-Shot-2016-10-17-at-1.16.54-PM-min-1.png\"},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/abstracta.us\/blog\/performance-testing\/shutterfly-masters-continuous-performance-testing\/#webpage\",\"url\":\"https:\/\/abstracta.us\/blog\/performance-testing\/shutterfly-masters-continuous-performance-testing\/\",\"name\":\"How Shutterfly Masters Continuous Performance Testing | Abstracta\",\"isPartOf\":{\"@id\":\"https:\/\/abstracta.us\/blog\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/abstracta.us\/blog\/performance-testing\/shutterfly-masters-continuous-performance-testing\/#primaryimage\"},\"datePublished\":\"2016-10-17T22:08:25+00:00\",\"dateModified\":\"2025-05-05T21:21:05+00:00\",\"author\":{\"@id\":\"https:\/\/abstracta.us\/blog\/#\/schema\/person\/7421e539de0357d3adb0c69ed469a1c2\"},\"description\":\"A critical pillar for achieving a great UX is having highly performant web pages. 
Today I\\u2019ll discuss how Shutterfly masters continuous performance testing.\",\"breadcrumb\":{\"@id\":\"https:\/\/abstracta.us\/blog\/performance-testing\/shutterfly-masters-continuous-performance-testing\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/abstracta.us\/blog\/performance-testing\/shutterfly-masters-continuous-performance-testing\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/abstracta.us\/blog\/performance-testing\/shutterfly-masters-continuous-performance-testing\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"item\":{\"@type\":\"WebPage\",\"@id\":\"https:\/\/abstracta.us\/blog\/\",\"url\":\"https:\/\/abstracta.us\/blog\/\",\"name\":\"Home\"}},{\"@type\":\"ListItem\",\"position\":2,\"item\":{\"@type\":\"WebPage\",\"@id\":\"https:\/\/abstracta.us\/blog\/performance-testing\/\",\"url\":\"https:\/\/abstracta.us\/blog\/performance-testing\/\",\"name\":\"Performance Testing\"}},{\"@type\":\"ListItem\",\"position\":3,\"item\":{\"@type\":\"WebPage\",\"@id\":\"https:\/\/abstracta.us\/blog\/performance-testing\/shutterfly-masters-continuous-performance-testing\/\",\"url\":\"https:\/\/abstracta.us\/blog\/performance-testing\/shutterfly-masters-continuous-performance-testing\/\",\"name\":\"How Shutterfly Masters Continuous Performance Testing\"}}]},{\"@type\":[\"Person\"],\"@id\":\"https:\/\/abstracta.us\/blog\/#\/schema\/person\/7421e539de0357d3adb0c69ed469a1c2\",\"name\":\"Federico Toledo, Chief Quality Officer at Abstracta\",\"image\":{\"@type\":\"ImageObject\",\"@id\":\"https:\/\/abstracta.us\/blog\/#personlogo\",\"inLanguage\":\"en-US\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/6de7ec6536c4028b5c02ad4ec1b9af0d?s=96&d=blank&r=g\",\"caption\":\"Federico Toledo, Chief Quality Officer at Abstracta\"},\"description\":\"Co-founder and COO of Abstracta\",\"sameAs\":[\"https:\/\/twitter.com\/fltoledo\"]}]}<\/script>\n<!-- \/ Yoast SEO plugin. 
-->","_links":{"self":[{"href":"https:\/\/abstracta.us\/blog\/wp-json\/wp\/v2\/posts\/7715"}],"collection":[{"href":"https:\/\/abstracta.us\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/abstracta.us\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/abstracta.us\/blog\/wp-json\/wp\/v2\/users\/5"}],"replies":[{"embeddable":true,"href":"https:\/\/abstracta.us\/blog\/wp-json\/wp\/v2\/comments?post=7715"}],"version-history":[{"count":14,"href":"https:\/\/abstracta.us\/blog\/wp-json\/wp\/v2\/posts\/7715\/revisions"}],"predecessor-version":[{"id":16037,"href":"https:\/\/abstracta.us\/blog\/wp-json\/wp\/v2\/posts\/7715\/revisions\/16037"}],"wp:attachment":[{"href":"https:\/\/abstracta.us\/blog\/wp-json\/wp\/v2\/media?parent=7715"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/abstracta.us\/blog\/wp-json\/wp\/v2\/categories?post=7715"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/abstracta.us\/blog\/wp-json\/wp\/v2\/tags?post=7715"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}