Jon's Blog: Service Levels

Each month we chart the response-based service levels for the application. We create and review three suites of charts: Last Month, Historical (18-month view), and Detailed (an interesting week). As discussed in the blog, we review these charts with a focus on the outliers, looking for worrisome trends or outright trouble. We also report more extensive data in tables and review those as well. This page presents only the charts, since charts are more interesting.

Historical Transactions and Response Distribution

The area portion of the chart shows transactions by day. The red, green, blue, and violet lines represent the percentage of transactions that completed within 2 seconds, within 5 seconds, within 10 seconds, and in more than 10 seconds, respectively. The blue and violet lines show that response began to degrade in April.
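
For readers who want to reproduce this kind of chart, here is a minimal sketch of how the daily percentages might be computed. It is not the code behind the charts on this page. It assumes each transaction is available as a (day, response_seconds) pair, and it treats the 2, 5, and 10 second lines as cumulative "within" thresholds, which is one reading of the description above. The function name and the record layout are made up for illustration.

```python
from collections import defaultdict

# Thresholds in seconds, matching the 2, 5, and 10 second lines on the chart.
THRESHOLDS = (2.0, 5.0, 10.0)


def daily_distribution(records):
    """Summarize (day, response_seconds) pairs per day.

    Returns {day: {"transactions": n, "within_2s": pct, "within_5s": pct,
                   "within_10s": pct, "over_10s": pct}}.
    The "within" percentages are cumulative (an assumption, see above).
    """
    by_day = defaultdict(list)
    for day, seconds in records:
        by_day[day].append(seconds)

    result = {}
    for day, times in sorted(by_day.items()):
        total = len(times)
        row = {"transactions": total}  # the "area" portion of the chart
        for limit in THRESHOLDS:
            within = sum(t <= limit for t in times)
            row[f"within_{limit:g}s"] = 100.0 * within / total
        over = sum(t > THRESHOLDS[-1] for t in times)
        row["over_10s"] = 100.0 * over / total
        result[day] = row
    return result


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    sample = [("day1", 1.0), ("day1", 4.0), ("day1", 8.0), ("day1", 12.0)]
    print(daily_distribution(sample))
```

Plotting the transaction count as an area and the four percentages as lines against the day gives the shape of chart described above.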

Historical Response By Authorizer

This chart shows the average response received from each back end.  It begins to show us that the (mythical) bank1 is giving us slower response than before.
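
The per-authorizer average is a simple group-and-divide. The sketch below assumes each transaction is recorded as an (authorizer, response_seconds) pair. That layout, the function name, and the second authorizer in the sample ("bank2") are hypothetical, not taken from the real system.

```python
from collections import defaultdict


def average_response_by_authorizer(records):
    """Return {authorizer: mean response in seconds} for (authorizer, seconds) pairs."""
    totals = defaultdict(lambda: [0.0, 0])  # authorizer -> [sum of seconds, count]
    for authorizer, seconds in records:
        totals[authorizer][0] += seconds
        totals[authorizer][1] += 1
    return {name: total / count for name, (total, count) in totals.items()}


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    sample = [("bank1", 6.0), ("bank1", 8.0), ("bank2", 1.0)]
    print(average_response_by_authorizer(sample))
```

Running this once per day (or per month) and plotting one line per authorizer gives a chart of the kind described here. An average can hide a few very slow transactions, which is one reason to read it alongside the distribution chart.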

Detailed Transactions and Response Distribution

Drilling down to 30-minute interval data makes the problem more evident.  We can see:

Detailed Response By Authorizer

This chart confirms that the mythical bank1 is the culprit.  Further research identified a bottleneck in the legacy communications trunk between the client and bank1. The client upgraded the trunk and the problem went away.
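
The 30-minute drill-down by authorizer can be sketched in the same spirit. This assumes each transaction carries a timestamp, an authorizer name, and a response time in seconds. The record layout and function names are illustrative assumptions, not the actual reporting code.

```python
from collections import defaultdict
from datetime import datetime


def half_hour_bucket(ts):
    """Round a datetime down to the start of its 30-minute interval."""
    return ts.replace(minute=ts.minute - ts.minute % 30, second=0, microsecond=0)


def interval_average_by_authorizer(records):
    """Return {(interval_start, authorizer): mean response in seconds}.

    records is an iterable of (timestamp, authorizer, response_seconds).
    """
    sums = defaultdict(lambda: [0.0, 0])  # key -> [sum of seconds, count]
    for ts, authorizer, seconds in records:
        key = (half_hour_bucket(ts), authorizer)
        sums[key][0] += seconds
        sums[key][1] += 1
    return {key: total / count for key, (total, count) in sums.items()}


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    sample = [
        (datetime(2000, 1, 1, 9, 10), "bank1", 2.0),
        (datetime(2000, 1, 1, 9, 25), "bank1", 4.0),
        (datetime(2000, 1, 1, 9, 40), "bank1", 9.0),
    ]
    for (start, name), avg in sorted(interval_average_by_authorizer(sample).items()):
        print(start, name, avg)
```

At this resolution a slowdown that is confined to one authorizer and to certain times of day stands out far more clearly than it does in daily averages.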