In the world of performance testing there is a heavy focus on the practice of load testing. This requires building complex automated test suites which simulate load on our services. But load testing is one of the most expensive, complicated, and time-consuming activities you can do. It also generates substantial technical debt.
Load testing has its time and place, but it's not the only way to measure performance. There are simpler and faster ways to give us some information about the performance of our services early on. The "single user performance analysis" discussed in this article is one of them.
All you will need to follow along is a Chromium based web browser and its built in Developer Tools.
Using Chrome Developer Tools
Aside from Firefox (and Safari, which uses WebKit), most browsers at the time of writing are Chromium based. This means they all have fairly consistent Developer Tools built in. I'm using the Brave browser, which is Chromium based, on an M1 Mac. You will have a nearly identical experience on Windows, Linux, and with other Chromium browsers.
In both Chrome and Brave you can access Developer Tools by going to the menu (three lines or three dots in the top right hand corner) and going to More Tools > Developer Tools.
By default Developer Tools will appear as a pane docked inside the window of whatever page you are on. I prefer to have Developer Tools in a separate window. To do this, use the shortcut Command + Shift + D (or Control + Shift + D on Windows) while Developer Tools has focus.
NOTE: Although Developer Tools will move to a separate window it remains linked to the specific tab you were viewing when you opened it.
Network Traffic Analysis in Developer Tools
For this demonstration we are going to focus specifically on the Network tab, so switch to that. Here is what it shows us:
- A list of all the requests made from the browser to the app, in chronological order
- The status code, response size (in KB), and response time of each request
- A timeline view (called the "Waterfall") of when each request started and ended
- The ability to select a request and inspect its request and response headers
Tips and Tricks for Developer Tools
Before we continue, I have some tips for using Chrome Developer Tools:
- I recommend switching on "Preserve log" (top left) so that recording continues even if the interaction you are capturing involves multiple redirects or switches to other domains. If you don't enable it, the request list may be cleared periodically, and you'll lose some of what you intended to capture.
- You may also find it helpful to toggle recording (the circle/square icon in the top left) so that you only record when you are ready to capture an interaction. Otherwise you might capture extra requests that are not part of the user interaction you care about.
- I am going to use Chrome Developer Tools natively for this demonstration, but you can also export your recording (right click and go to "Save all as HAR with content") for analysis in a third party tool. Some third party tools I have used for this purpose (or instead of Developer Tools) include the paid tool Fiddler (I used the classic version on Windows for many years) and Charles Web Proxy.
- Think about what you are trying to measure. If you want to understand what happens when a new user comes along to your web app then you will want to clear your browser cache before taking a recording (otherwise it may use resources cached in the browser).
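The HAR export mentioned above is just JSON, so it's easy to analyze programmatically. As a rough sketch, here's a small Python helper (the function name and file name are my own) that lists the slowest requests in a capture, using the log.entries[].time field from the HAR format:

```python
import json

def slowest_requests(har, top_n=5):
    """Return (elapsed_ms, url) pairs for the slowest requests in a HAR capture."""
    entries = har["log"]["entries"]
    timed = [(entry["time"], entry["request"]["url"]) for entry in entries]
    return sorted(timed, reverse=True)[:top_n]

# Hypothetical file name; export yours via "Save all as HAR with content":
# with open("capture.har") as f:
#     for ms, url in slowest_requests(json.load(f)):
#         print(f"{ms:8.0f} ms  {url}")
```

This is handy when a capture contains dozens of requests and you want a quick shortlist of candidates to investigate.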
Measuring Response Time
Go ahead and capture some HTTP(S) traffic. I am going to analyze https://squaredup.com/ so feel free to do the same. I would recommend capturing traffic for a single page or user interaction, rather than multiple steps in a customer journey (for now). In other words, we want to know what happens between the time we click on a button or link until it has finished processing and presenting the next page.
The first thing you might want to know is how long this page or user interaction is taking. There are a few different angles we may want to investigate.
How long did it take the entire page to load?
This is how long it took to retrieve all content (including images and stylesheets). The page often appears fully loaded much sooner than this, so be careful how you interpret this number. In fact I would go as far as to say this is not a good measure of customer experience (keep reading for an alternative way to measure this).
You can see the total request time in two places. Firstly, down the bottom of the screen there is a label "Finish" which tells you the total time spent handling all requests. In my case it took 22.61 seconds in total to load every request included at squaredup.com:
Be careful with the above number: I found it unreliable when an interaction I was trying to capture followed multiple redirects to other domains (for example, Microsoft authentication). Alternatively, you can also see this visually in the timeline up the top:
How long until the page was interactive?
Despite taking more than 20 seconds to process every request, from a user perspective the page felt fully loaded in less than 5 seconds.
This is a pretty good indicator of customer experience: once the load event fires, the page should be interactive. Of course, it always depends on the specifics of your website or application.
Aside from the blue line in the timeline above you can find the load time down the bottom. In my recording it took 4.63 seconds to fully load the page:
How long did an individual request take?
Viewing the SquaredUp homepage involved requesting 89 different objects. Sometimes we may want to look at a specific request or resource of concern, such as an image file which was taking several seconds to download over a low bandwidth connection.
The "Time" column shows the elapsed time for each request. The waterfall view visualizes a breakdown of this time:
If you hover over a bar in the waterfall view it provides additional information. In the screenshot below, we waited 3.99 seconds for server processing and spent 633 milliseconds transferring data over the network.
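If you export the capture as a HAR file, the same breakdown is available in each entry's timings object (fields such as blocked, dns, connect, send, wait, and receive, all in milliseconds, with -1 meaning a phase did not apply). A quick sketch, with a helper name of my own choosing:

```python
def waterfall_breakdown(entry):
    """Split one HAR entry's elapsed time into the phases shown in the waterfall.

    Only phases that actually occurred (value >= 0) are returned."""
    phases = ("blocked", "dns", "connect", "send", "wait", "receive")
    return {p: t for p in phases if (t := entry["timings"].get(p, -1)) >= 0}

# "wait" is time spent waiting for the server's first byte (server processing);
# "receive" is time spent downloading the response body over the network.
```

This lets you separate "the server was slow" from "the network was slow" across many requests at once, rather than hovering over bars one at a time.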
Keep in mind that here we are recording one sample. Depending on how stable your application is, response times may vary wildly and under different conditions. Make sure you capture multiple recordings to validate what you find, and even then, you may want to further validate your observations through load testing or by monitoring what happens in production before making an expensive decision.
Checking Response Size & Compression
The size of the data required to render a web app is another dimension to look at. This determines how long it takes to download, and is an especially critical factor on low bandwidth or unreliable networks (like a flaky mobile network).
You can see the overall data transferred down the bottom of the page. "Transferred" tells you the total amount of data sent over the network. "Resources" is the total size of the page content once it has been decompressed (or retrieved from the browser cache).
It is common (and good) practice to compress the resources you send over the internet. There is no "golden rule" for what a good size is for a web page, but I've read that between 1MB and 3MB is around about the sweet spot.
Check that all large resources are being compressed. If not, this could be an easy win to improve performance.
To check if a resource is being compressed click the little icon of the resource on the left hand side of each row. This brings up a new pane which allows you to inspect the request and response headers:
Look for a response header called Content-Encoding. Compressed resources have this header, and its value shows the algorithm used. In the screenshot above, "br" tells us that the server used the Brotli compression algorithm.
If there is no Content-Encoding response header, look into whether compression is enabled. It is also possible that compression is happening at a lower layer and the headers have not been set correctly.
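To spot uncompressed resources across a whole capture, you can script the same check. Below is a small sketch (the helper name is mine) that maps a response's Content-Encoding header to a human-readable algorithm name, returning None when no compression was applied:

```python
def compression_of(headers):
    """Report which compression algorithm a response used, or None if uncompressed."""
    algorithms = {"br": "Brotli", "gzip": "gzip", "deflate": "deflate", "zstd": "Zstandard"}
    # Header names are case-insensitive, so normalise before looking one up.
    normalised = {k.lower(): v for k, v in headers.items()}
    encoding = normalised.get("content-encoding")
    return algorithms.get(encoding, encoding) if encoding else None
```

Run this over the response headers of your largest resources; any large text-based resource (HTML, CSS, JavaScript, JSON) that comes back None is a candidate for an easy win.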
Analyzing System Behavior and Parallelism
The waterfall view is a great way to visualize the flow of requests and responses between our browser and the server.
More generally, it helps you understand how the server is responding. For example, I captured the requests required to take me to the SquaredUp login screen.
The screenshot below shows some interesting behavior. The initial request to https://app.squaredup.com/ was "canceled" after 1.08 seconds. If that continued to occur over multiple recordings I might go and speak to the product teams to figure out whether we can shave that second off the response times.
Requests in Parallel
Another aspect we can see above is the level of parallelism. HTML documents reference additional resources (images, stylesheets, scripts, and icons), and downloading these in parallel speeds up the overall load time.
By default, Chromium browsers make up to six requests in parallel per domain (over HTTP/1.1; HTTP/2 can multiplex many requests over a single connection).
Investigate any long chains of requests made in sequence. Is it possible to introduce parallelism to improve performance? This is not always possible, because some requests depend on the output of others, such as shopping cart requests that must complete before presenting a payment screen.
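One way to reason about whether parallelism would help is to simulate the schedule. The sketch below (the names are my own) estimates the wall-clock time to fetch a set of resources under a per-domain connection limit, assuming each connection picks up the next pending request as soon as it is free:

```python
import heapq

def total_download_time(durations_ms, max_parallel=6):
    """Estimate wall-clock time to fetch resources with a connection limit."""
    # Track the time at which each connection becomes free.
    workers = [0.0] * min(max_parallel, len(durations_ms))
    heapq.heapify(workers)
    for duration in durations_ms:
        free_at = heapq.heappop(workers)             # earliest available connection
        heapq.heappush(workers, free_at + duration)  # busy until free_at + duration
    return max(workers)

# Ten 1-second downloads: strictly sequential vs the browser default of six.
# total_download_time([1000] * 10, max_parallel=1) -> 10000.0 ms
# total_download_time([1000] * 10, max_parallel=6) -> 2000.0 ms
```

The simulation is deliberately naive (no dependencies, no bandwidth sharing), but it gives a feel for how much a long sequential chain costs compared to parallel downloads.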
High Latency + Many Requests = Slow Response Time
It's also worth mentioning the relationship between latency and the number of requests your app makes. If your app is "chatty" and makes a lot of requests, that increases the risk of network latency impacting the user experience. For example, consider an interaction that makes 10 requests in sequence:
- A user with 10ms latency to the server would wait 100ms in total for network packets to travel to and from the server.
- A user with 250ms latency (here in New Zealand this is not uncommon when talking to US or EU hosted services) would wait 2.5 seconds for network latency alone, before we even consider the processing time of the requests.
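The arithmetic above is simple but worth making explicit, because it scales linearly with both factors:

```python
def latency_wait_ms(n_requests, round_trip_ms):
    """Total time spent purely on network round trips for sequential requests,
    ignoring server processing and data transfer time."""
    return n_requests * round_trip_ms

# The two cases from the text:
# latency_wait_ms(10, 10)  -> 100 (ms)
# latency_wait_ms(10, 250) -> 2500 (ms, i.e. 2.5 seconds)
```

Halving either the number of sequential requests or the round-trip time halves this cost, which is why both batching requests and serving users from a nearby region help.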
Client Side Caching
Web browsers have the ability to cache resources (images, stylesheets, scripts, etc) so that when you return to a web page or app that you've used before you don't necessarily have to retrieve the resources a second time.
The Cache-Control header specifies how client side caching should be handled. If a resource's response has no Cache-Control header, the browser will typically retrieve it from the server each time. Using this header, the server can specify how long the resource remains "fresh" before it needs to be re-retrieved from the server.
The browser behaves differently depending on the freshness of the resource, and how the server specified it should be handled. Two common situations are...
- The server specifies that a resource can remain fresh for an hour using the response header Cache-Control: max-age=3600. If the user visits the same page within the hour, their browser will not make any request to the server; it will retrieve the resource directly from the browser cache. Developer Tools in any Chromium based browser will report a response code of "200 (from memory cache)" or "200 (from disk cache)"... which is a little misleading because, as I said, no request went to the server.
- Alternatively, if the same user returned to the page after that 1 hour freshness period, or if the response headers specified that the browser should always validate a resource using Cache-Control: no-cache, then the browser will make a request to the server to check whether the resource is still fresh. If it is, the server will return an HTTP 304 Not Modified response, which lets the browser know it can use the file in its cache. If not, the full resource will be returned with an HTTP 200 response code.
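Putting those two situations together, a browser's freshness decision can be sketched roughly like this (a deliberate simplification of the real HTTP caching rules; the function name and return values are my own):

```python
import time

def cache_decision(cache_control, stored_at, now=None):
    """Rough sketch of how a browser might treat a cached resource.

    Returns "use-cache" (serve from memory/disk, no request),
    "revalidate" (conditional request, hoping for 304 Not Modified),
    or "fetch" (no caching information, retrieve from the server)."""
    now = now if now is not None else time.time()
    if not cache_control:
        return "fetch"
    directives = [d.strip() for d in cache_control.lower().split(",")]
    if "no-cache" in directives:
        return "revalidate"
    for directive in directives:
        if directive.startswith("max-age="):
            max_age = int(directive.split("=", 1)[1])
            return "use-cache" if now - stored_at < max_age else "revalidate"
    return "fetch"
```

Real browsers consider many more directives (no-store, private, must-revalidate, the Expires and ETag headers, and so on), but this captures the two common paths described above.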
Checking if Caching is Working
Using this knowledge we can check how much client side caching is happening by:
- Clearing the cache and recording a web page or user interaction
- Recording the same traffic again, this time without clearing the cache
This gives you a comparison of what happens when a new user requests this page or service versus a returning one. Just make sure you don't tick the "Disable cache" checkbox up the top like I initially did for the second recording...
The second recording shows the resources retrieved from the browser cache. Take note of any resources that are not cached but should be:
You can also compare the "Transferred" figure down the bottom with and without caching to see how much of a difference it made. Remember how the SquaredUp homepage transferred 2.4MB of data before? With client-side caching in effect we only needed to retrieve 953KB of data:
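From those two numbers you can quantify the benefit of caching. A one-line sketch (the helper name is mine):

```python
def cache_savings_pct(transferred_cold_kb, transferred_warm_kb):
    """Percentage of network transfer avoided thanks to the browser cache."""
    return 100 * (transferred_cold_kb - transferred_warm_kb) / transferred_cold_kb

# Using the figures from the SquaredUp homepage capture (2.4MB cold, 953KB warm),
# roughly 60% of the data did not need to be re-downloaded.
```

Tracking this percentage over time is a cheap way to notice when a deploy accidentally breaks caching for a large resource.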
When should we turn to load testing?
If you want to know how your systems or services will perform (and how reliable they are) under various workloads, then load testing is a good way to go. For example, simulating what we expect to happen on a retail website on Black Friday. In other words, load testing allows us to test the capacity of our services. This is especially valuable when we have a product or service that is not yet live in production, or is in production but is not yet experiencing the levels of load that we expect in the future.
Another situation that may warrant load testing is where you are seeing inconsistent response times for your user actions. Perhaps the first time you record a page it takes 3 seconds, but the next time it takes 20 seconds, and the time after that 8 seconds. In this situation load testing could help us understand the system behavior. Is it random? Are there bands of response times (which commonly occur when a timeout/retry is happening at some layer of our service)? Generating many samples in a load test will give you a view of this behavior and a higher level of statistical confidence in the findings.
You don't need to build complex load test suites to start measuring performance. You can learn quite a lot by simply opening up Developer Tools and capturing your user interactions. It's also something anyone can do, without specialist experience in load testing concepts and tools.
I'm a fan of checklists, so here's a few questions you might want to ask when doing this kind of single user analysis:
- How long is each customer interaction taking?
- Are there any particularly slow requests?
- How much data are my users downloading on each step of the customer journey?
- Is compression enabled for all resources?
- Is there an appropriate client-side caching strategy in place for each request?
- Are requests occurring sequentially or in parallel? And is this as intended?
We've only scratched the surface here. We could take this further by diving into the "Lighthouse" tab of Chrome Developer Tools which expands our focus to client side performance and user experience, but that topic deserves its own article.
So, go ahead and give it a go for yourself. What have you got to lose?