(Editor’s Note: The following advice is from a SearchSoftwareQuality article. The original, with links to further material, is available at http://searchsoftwarequality.techtarget.com/tip/Application-performance-metrics-are-what-to-test-not-how-to-test)

Metrics provide fuel for crucial conversations and give software development managers and project managers a no-nonsense view of how well their applications perform in reality. On their own, metrics are just visual representations of data. By analyzing them, however, software developers can identify patterns, trends and issues that affect applications’ performance in production.

Deciding what is most important to measure is difficult; so is developing guidelines for understanding the numbers. The performance requirements for most software development projects are loose at best. It is vital to find metrics that improve the customers’ experience and enable engineers to diagnose and fix issues. In this article, we’ll cover which measures make valuable application performance metrics and what the numbers mean.

Defining requirements, creating tests and monitoring results

Metrics are more than just wall decorations, or at least they should be. Metrics on personal productivity and everyday tasks may be beneficial to the individual in question, or to that individual’s personnel manager. However, they have no bearing on the application the individual works on and are useless to the users of the application. Better metrics reveal how an application performs in reality: at what point it fails, how frequently that happens, and what can be done to remove the problem and its negative performance impact.

The first thing to find is a performance requirement. It can be as simple as “Page X loads in less than five seconds with 100 concurrent users.” Or it can be more complicated and written in a story format. For example, “When a user loads 5 items in their checkout cart, they can complete checkout in less than 1 minute,” or, “200 unique authenticated concurrent users can simultaneously request the same MRI result on patient A and receive that document in their secure depository in less than 60 seconds.” The complexity of the requirement determines what information is needed.
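A simple requirement like the first one can be written down as data and checked against measured results. The following is only a minimal sketch: the class name `PerformanceRequirement`, its fields and the `is_met` rule (every observed response must beat the limit) are assumptions made for illustration, not part of any particular tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PerformanceRequirement:
    """One measurable requirement, e.g. 'Page X loads in under 5 s with 100 users'."""
    name: str
    concurrent_users: int
    max_seconds: float

    def is_met(self, observed_seconds: list[float]) -> bool:
        # Assumed rule: the requirement holds only if there is data and
        # every observed response time is below the limit.
        return bool(observed_seconds) and max(observed_seconds) < self.max_seconds


# The first example from the text: Page X, 100 concurrent users, under 5 seconds.
page_x = PerformanceRequirement("Page X load", concurrent_users=100, max_seconds=5.0)
```

A stricter or looser rule, such as judging a percentile instead of the slowest response, may suit a given project better; the point is that the requirement states a number that test results can be compared against.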

Personally, as a Web service and Web application user, I want speed, accuracy and security, regardless of what exactly an application provides.

After requirements are settled, the next step is to create a performance test suite. To be executed effectively, performance tests need to be automated. Coded tests are best, particularly ones that can be adjusted to simulate different numbers of simultaneous system users executing the basic functionality of an application.
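As a rough illustration of a coded test that can simulate a chosen number of simultaneous users, here is a sketch using Python’s standard thread pool. `fake_transaction` is a hypothetical stand-in for whatever the application actually does, and `run_concurrent_users` is a name invented for this example; a dedicated load testing tool would normally replace both.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_transaction(user_id: int) -> bool:
    """Hypothetical stand-in for one user's request; replace with a real call."""
    time.sleep(0.01)
    return True


def run_concurrent_users(transaction, users: int) -> list[tuple[bool, float]]:
    """Run `transaction` once per simulated user, with one thread per user."""
    def timed(user_id: int) -> tuple[bool, float]:
        start = time.perf_counter()
        try:
            ok = bool(transaction(user_id))
        except Exception:
            ok = False  # an exception counts as a failed transaction
        return ok, time.perf_counter() - start

    with ThreadPoolExecutor(max_workers=users) as pool:
        return list(pool.map(timed, range(users)))


# Each result is a (succeeded, elapsed_seconds) pair for one simulated user.
results = run_concurrent_users(fake_transaction, users=20)
```

Changing the `users` argument is all it takes to rerun the same test at a different load, which is the property that makes coded tests practical here.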

Next, plan who is going to monitor and record the test results. Application performance testing is about the application, but it’s also about how the back-end systems perform when the application is used. How will the system respond when 500 users try to place orders simultaneously? Use performance testing to determine how many complete transactions users can make. Project managers will need to line up resources in the testing, engineering and systems or operations groups to determine the best tests to execute and how to monitor and record results so they can be analyzed and used.

Start with stress and load it up

You don’t hear this often: Start with stress. Stress testing gives a failure point to start with. Stress testing also helps determine an application infrastructure’s breaking point and assists in exposing traffic bottlenecks. In order to find the breaking point, load the system until it breaks. Most systems include an internal Web server, an external Web server, database system(s) and possibly one or more messaging system(s) in between.

In order to determine the effect of stress or load tests, monitor CPU use, memory use and throughput per second for each piece of system hardware. As the number of concurrent users, or “load,” on the application increases, track how the system performs. Start with a stepped approach that raises the load in stages and runs the performance tests at each stage.

For example, execute application performance tests with 50 concurrent users (threads, messages, whatever it is the application does) and record the results. Then move up to 100, 150, 200, 300, 400, 800 and then 1000. Keep going until the application fails. The point at which something in the system fails is the breaking point. Plan to generate measurements every 2 to 5 minutes, and track the results over at least a one-hour period.
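The stepped approach can be sketched as a loop that stops at the first failing load level. This is an illustration under assumptions: `run_step` is a hypothetical callable that runs the test at a given load and returns the fraction of failed transactions, and `simulated_system` is a made-up stand-in that begins failing above 350 users.

```python
STEPS = [50, 100, 150, 200, 300, 400, 800, 1000]  # user counts from the text


def find_breaking_point(run_step, steps=STEPS, max_error_rate=0.0):
    """Run each load step in order; return the first step that fails, else None.

    `run_step(users)` is a hypothetical callable that executes the test at that
    load and returns the fraction of failed transactions (0.0 to 1.0).
    """
    history = []
    for users in steps:
        error_rate = run_step(users)
        history.append((users, error_rate))
        if error_rate > max_error_rate:
            return users, history
    return None, history


# Stand-in system that starts failing above 350 concurrent users.
def simulated_system(users):
    return 0.0 if users <= 350 else 0.25


breaking_point, history = find_breaking_point(simulated_system)
```

If the function returns `None`, the system survived every step in the list, and the list needs higher values before a breaking point can be reported.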

In addition, capture all application logs. Logging in general during performance testing tends to slightly hinder performance, but it is invaluable in pinpointing errors or issues. Measure frequently to generate an accurate metric of how the system performs and determine where it runs into performance issues under stress or load. With accurate test data generated, useful metrics can be determined.

Most useful metrics

Let’s review the most useful metrics, or those values that can be discussed and then used to improve applications’ performance quality.

Let’s start by reviewing the number of successful transactions and the response time. Response time is the time it takes a user to complete a transaction in our application. The more complete transactions an application can handle, the better; depending on how the successful transaction rate is calculated, higher numbers are typically better.
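To make these two values concrete, here is a small sketch that summarizes one test run. The input format, a list of `(succeeded, seconds)` pairs, and the `summarize` function are assumptions for illustration; the sample numbers are made up.

```python
import statistics


def summarize(samples):
    """Summarize (succeeded, seconds) pairs from one test run."""
    # Response times are reported for successful transactions only.
    ok_times = [seconds for succeeded, seconds in samples if succeeded]
    total = len(samples)
    return {
        "successful": len(ok_times),
        "success_rate": len(ok_times) / total if total else 0.0,
        "mean_seconds": statistics.mean(ok_times) if ok_times else None,
        "max_seconds": max(ok_times) if ok_times else None,
    }


# Made-up run: three successful transactions and one failure.
samples = [(True, 1.0), (True, 2.0), (False, 9.0), (True, 3.0)]
summary = summarize(samples)
```

Whether failed transactions should count toward response time is a choice each team has to make; this sketch leaves them out so that a slow failure does not distort the timing of successful work.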

Another useful metric is the number of page views together with how long the pages take to load. It’s a double whammy. Review the test data to determine how many pages can be opened and viewed, including the time it takes the pages to load. For this metric, a high number of page views with a short loading time is best. They go together: to get a high number of page views, pages have to load quickly.

Throughput is another valuable metric. The throughput value should increase with the load. It will show whether the system can or cannot handle an increasing number of concurrent users. The higher the value of throughput, the more an application can handle.
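One way to read throughput against load is to compute it for each load step and look for the first step where it stops rising. The sketch below assumes throughput means completed transactions per second; the function names and the measurements are invented for illustration.

```python
def throughput(completed: int, seconds: float) -> float:
    """Completed transactions per second over a measurement window."""
    return completed / seconds


def first_plateau(points):
    """Return the first load level where throughput fails to rise, else None.

    `points` is a list of (concurrent_users, throughput) in increasing load order.
    """
    for (_, prev_tps), (users, tps) in zip(points, points[1:]):
        if tps <= prev_tps:
            return users
    return None


# Made-up measurements: transactions completed in a 60-second window per step.
measured = [(50, 3000), (100, 6000), (150, 8400), (200, 8400), (300, 7200)]
points = [(users, throughput(done, 60.0)) for users, done in measured]
```

In this made-up data, throughput rises through 150 users and then flattens at 200, which suggests the system has stopped keeping up with the added load somewhere between those two steps.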

Performance problems

A common point of failure is the database connection. Database errors show up as wildly ranging values or results that start high and steadily decrease. The more frequently an application writes to or reads from a database, the more likely it is to be affected.

The two most common database errors evident under stress and load testing conditions are database connection pool issues and slow SQL queries. These types of failures are typically written to the log files captured during testing. A database connection pool issue is generally experienced by users as a frustrating inability to connect to the application. The page may load quickly, but nothing happens after that.

A slow query means the application doesn’t give the necessary data quickly enough. Again, the page may load, but under a load or stress testing condition, users don’t get the data returned from the application in a timely manner.
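Because these failures usually land in the captured logs, a quick count of matching lines can show which of the two problems dominates a test run. The sketch below is only an illustration: the regular expressions and the sample log lines are hypothetical, and the real message text depends on the database driver, connection pool and logging setup in use.

```python
import re

# Hypothetical patterns; real messages depend on the driver, pool and logger in use.
PATTERNS = {
    "connection_pool": re.compile(r"pool.*(exhausted|timeout)", re.IGNORECASE),
    "slow_query": re.compile(r"slow query", re.IGNORECASE),
}


def count_db_issues(log_lines):
    """Count log lines that match each category of database problem."""
    counts = {name: 0 for name in PATTERNS}
    for line in log_lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


# Made-up log lines for demonstration only.
sample_log = [
    "12:00:01 INFO request served in 120 ms",
    "12:00:02 ERROR connection pool exhausted after 30 s",
    "12:00:03 WARN slow query: 4200 ms",
    "12:00:04 ERROR Pool checkout timeout",
]
counts = count_db_issues(sample_log)
```

Comparing these counts across load steps can help tie a drop in successful transactions to a specific database problem.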

Application performance metrics are invaluable when they’re meaningful. Measuring how applications perform under stress and load conditions is imperative to ensure end users have positive experiences. Measure the what — the database connection, the server response time and how quickly the application serves customers the information they’re asking for.

Use values that mean something to the application and the end users. Measuring what’s important to improving the end-user experience is what to test for. Enabling software engineers to fix issues by generating valuable metrics also serves to improve end-users’ results, because we can find issues and fix them. In the end, what metrics to test and measure are infinitely more important than how they are tested.