What do these metrics and numbers mean?

This documentation acts as a glossary for each metric and number in the different parts of Tideways.

Monitoring Chart

The monitoring chart is displayed on the service overview page, the transaction details page and the history page.

Metric Name

Description

95% Percentile

The response time of the application, calculated as a statistical percentile. It means that 95% of all responses are faster than the given number and only 5% are slower. Looking at response time using the 95% percentile instead of the average or median is better, because percentiles better account for the outliers that are very common in response times. Response time is measured in either milliseconds (ms) or seconds (s).

Max Memory

The maximum PHP memory consumed by any request in the selected time interval. Compare this metric with the PHP memory_limit setting, which defines how much memory a single PHP request may use before it is aborted. Max memory is measured in kilobytes (KB) or megabytes (MB).

Requests

How many PHP requests were handled by the application in the currently active time frame.

Failures, Failure Rate

The number and percentage of PHP requests that failed for one of three reasons: a fatal error due to an uncaught exception, an explicitly returned HTTP status code of 500 or higher, or a response slower than the response time target.

Events

How many notable events such as new exceptions, releases or notifications happened in the currently active time frame. These events are visible in the chart as vertical lines with a flag on top.
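To illustrate why the 95% percentile says more than the average or median, here is a minimal Python sketch with made-up response times. The nearest-rank method used here is one common way to compute a percentile; it is an assumption for illustration and not necessarily the method Tideways uses.

```python
import math
import statistics


def percentile_nearest_rank(values, pct):
    """Return the smallest value such that at least pct% of values are <= it."""
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[max(rank, 1) - 1]


# Hypothetical response times in milliseconds: 18 fast requests, 2 slow outliers.
response_times_ms = [100] * 18 + [2000, 3000]

average = statistics.mean(response_times_ms)
median = statistics.median(response_times_ms)
p95 = percentile_nearest_rank(response_times_ms, 95)

print(f"average={average} ms, median={median} ms, p95={p95} ms")
```

In this sample the median hides the slow requests entirely, and the average lands on a value that no request actually had. The 95% percentile reports a response time that the slowest users really experienced.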

Tooltip in Monitoring Chart

The monitoring chart tooltip is shown when hovering over the primary chart on the service overview page, the transaction details page and the history page.

Metric Name

Description

95% Percentile

See definition above

Max Memory

See definition above

Requests

See definition above

Failures

The number of PHP requests that failed for one of three reasons: a fatal error due to an uncaught exception, an explicitly returned HTTP status code of 500 or higher, or a response slower than the response time target.

Average Total

The average response time of all requests in the time period the tooltip covers. It serves as a reference point for the average latencies that follow, which cover the downstream layers such as databases and external services.

SQL

The average time of a request spent performing SQL queries in the time period the tooltip covers.

HTTP

The average time of a request spent performing HTTP requests against other internal or external HTTP APIs in the time period the tooltip covers. This includes cURL, file_get_contents and stream-based APIs for performing HTTP requests.

Memcache

The average time of a request spent performing Memcache operations in the time period the tooltip covers.

Redis

The average time of a request spent performing Redis operations in the time period the tooltip covers.

Elasticsearch

The average time of a request spent performing Elasticsearch operations in the time period the tooltip covers.

MongoDB

The average time of a request spent performing MongoDB operations in the time period the tooltip covers.

File I/O

The average time of a request spent performing file operations such as copy, move, is_dir, fopen, fread and others in the time period the tooltip covers. Only operations performed on the file:// stream are counted towards this metric.

Compiling

The average time of a request spent compiling PHP scripts to bytecode in the time period the tooltip covers. When using Opcache this metric should be 0 ms or very close to it, because PHP scripts need to be compiled only on deployments.

Autoloading

The average time of a request spent in the PHP autoloader in the time period the tooltip covers. Autoloading the PHP scripts necessary to run all code usually takes a significant amount of time and can be a bottleneck to look out for. This timer does not include time spent on compiling or File I/O that may happen during autoloading; it refers entirely to computation time.
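The Failures count in this tooltip combines three independent conditions. The following minimal Python sketch shows that classification on hypothetical request records. The field names (`duration_ms`, `status_code`, `fatal_error`) and the 500 ms response time target are illustrative assumptions, not Tideways data structures.

```python
def is_failure(request, response_time_target_ms):
    """A request fails if any one of the three documented conditions holds."""
    return (
        request["fatal_error"]                               # fatal error / uncaught exception
        or request["status_code"] >= 500                     # explicit HTTP 5xx response
        or request["duration_ms"] > response_time_target_ms  # slower than the target
    )


def failure_stats(requests, response_time_target_ms):
    """Return (number of failures, failure rate in percent)."""
    failures = sum(1 for r in requests if is_failure(r, response_time_target_ms))
    rate = 100 * failures / len(requests) if requests else 0.0
    return failures, rate


# Hypothetical requests within one time interval.
requests = [
    {"duration_ms": 120, "status_code": 200, "fatal_error": False},  # ok
    {"duration_ms": 80, "status_code": 500, "fatal_error": False},   # fails: status code
    {"duration_ms": 950, "status_code": 200, "fatal_error": False},  # fails: too slow
    {"duration_ms": 60, "status_code": 200, "fatal_error": True},    # fails: fatal error
]

failures, failure_rate = failure_stats(requests, response_time_target_ms=500)
print(f"{failures} failures, failure rate {failure_rate}%")
```

Note that a request can count as a failure even when it returns a successful status code, simply by exceeding the response time target.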

Transactions List

The transactions list is shown on the application/service monitoring screen and on the history screen.

Metric Name

Description

Typical Response

The average response time of the transaction.

Problem Response

This metric is an approximation of the 95% percentile of the transaction for the selected time period. It is not the actual 95% percentile, because computing the exact percentile value would be too slow to query. The exact 95% percentile of a transaction can be viewed on the transaction details page.

Memory

The maximum PHP memory consumed by any request of the transaction in the selected time interval. Compare this metric with the PHP memory_limit setting, which defines how much memory a single PHP request may use before it is aborted. Max memory is measured in kilobytes (KB) or megabytes (MB).

Failures

The number of PHP requests that failed for one of three reasons: a fatal error due to an uncaught exception, an explicitly returned HTTP status code of 500 or higher, or a response slower than the response time target.

Impact

The impact of a transaction expresses how much total time is spent in a transaction compared to all other transactions in the same application/service. This number is influenced by a combination of the number of requests a transaction handles and the response time of these requests. A value of 50% means that half of the processing time in this application is spent in this single transaction.
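The Impact metric can be pictured as each transaction's share of the total processing time. The sketch below assumes that a transaction's total time is its request count multiplied by its average response time. That formula is an assumption for illustration, and the transaction names and numbers are made up.

```python
def impact_percentages(transactions):
    """Map each transaction name to its share of total processing time in percent.

    `transactions` maps name -> (request_count, average_response_ms).
    """
    total_time = {
        name: count * avg_ms for name, (count, avg_ms) in transactions.items()
    }
    grand_total = sum(total_time.values())
    if grand_total == 0:
        return {name: 0.0 for name in transactions}
    return {name: 100 * t / grand_total for name, t in total_time.items()}


# Hypothetical transactions: (number of requests, average response time in ms).
transactions = {
    "checkout": (1_000, 300),  # few requests, but slow: 300,000 ms in total
    "homepage": (10_000, 20),  # many requests, but fast: 200,000 ms in total
    "search": (2_000, 50),     # 100,000 ms in total
}

impact = impact_percentages(transactions)
for name, share in impact.items():
    print(f"{name}: {share:.1f}% impact")
```

Here the checkout transaction has 50% impact even though it handles the fewest requests, which is why a high-impact transaction is often the most rewarding one to optimize.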

Trace Header/Summary

The Trace Header/Summary is shown for each entry in the traces list page and in the header of the trace details page.

Metric Name

Description

SQL

The time spent performing SQL queries during the trace.

HTTP

The time spent performing HTTP requests against other internal or external HTTP APIs during the trace. This includes cURL, file_get_contents and stream-based APIs for performing HTTP requests.

Memcache

The time spent performing Memcache operations during the trace.

Redis

The time spent performing Redis operations during the trace.

Elasticsearch

The time spent performing Elasticsearch operations during the trace.

MongoDB

The time spent performing MongoDB operations during the trace.

File I/O

The time spent performing file operations such as copy, move, is_dir, fopen, fread and others during the trace. Only operations performed on the file:// stream are counted towards this metric.

Compiling

The time spent compiling PHP scripts to bytecode during the trace. When using Opcache this metric should be 0 ms or very close to it, because PHP scripts need to be compiled only on deployments.

Autoloading

The time spent in the PHP autoloader during the trace. Autoloading the PHP scripts necessary to run all code usually takes a significant amount of time and can be a bottleneck to look out for. This timer also includes the time spent on compiling and File I/O performed during autoloading. This differs from "Autoloading" in monitoring data, where both are excluded from the number.

Unaccounted Wait

The remaining I/O wait time recorded in the trace that cannot be attributed to the known time spent in SQL, HTTP, File I/O and other layers.
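As a rough illustration of Unaccounted Wait, the sketch below subtracts the known layer timings from the total I/O wait of a trace. How Tideways measures total I/O wait is not described here, so the `io_wait_ms` input, the example numbers and the clamping at zero are all assumptions.

```python
def unaccounted_wait_ms(io_wait_ms, layer_times_ms):
    """Return the I/O wait that is not explained by the known layers."""
    known = sum(layer_times_ms.values())
    # Clamp at zero so that measurement noise never yields a negative remainder.
    return max(io_wait_ms - known, 0)


# Hypothetical trace: 200 ms of I/O wait, of which 165 ms is attributed to layers.
layers = {"SQL": 40, "HTTP": 120, "File I/O": 5}
remaining = unaccounted_wait_ms(io_wait_ms=200, layer_times_ms=layers)
print(f"Unaccounted wait: {remaining} ms")
```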

Callgraph

The callgraph is a screen inside the trace details screen. It contains a panel with a table of all function calls, a panel with details about the currently highlighted function call, and a visual callgraph.

Metric Name

Description

Latency

Measures the cost of a function in terms of time: the duration until the function is completed.

Memory

Measures the cost of a function in terms of memory: how much memory the function uses.

Total Time

The total duration of a function, counting its own time plus the time of all child function calls.

Self Time

The duration of a function, counting only its own time and excluding calls to child functions.

Calls

How often a function was called during the callgraph’s recording.

Children

All functions that are called directly by a given function.

Parents

All functions that directly called the given function.

Folded Calls

A list of calls that were hidden and folded into another function call, because their duration is similar and they call each other in a long chain.
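The relationship between Total Time and Self Time can be sketched on a small call tree. The example below assumes each call records its total (inclusive) time and derives self time by subtracting the total time of its direct children. The function names and timings are hypothetical.

```python
def self_time(call):
    """Self time = total time minus the total time of all direct children."""
    return call["total_ms"] - sum(child["total_ms"] for child in call["children"])


# Hypothetical call tree with inclusive timings in milliseconds.
call_tree = {
    "name": "handleRequest",
    "total_ms": 100,
    "children": [
        {
            "name": "loadUser",
            "total_ms": 30,
            "children": [
                {"name": "PDO::query", "total_ms": 25, "children": []},
            ],
        },
        {"name": "render", "total_ms": 50, "children": []},
    ],
}


def walk(call, depth=0):
    """Print total and self time for every call in the tree."""
    print(f"{'  ' * depth}{call['name']}: total={call['total_ms']} ms, "
          f"self={self_time(call)} ms")
    for child in call["children"]:
        walk(child, depth + 1)


walk(call_tree)
```

A function with a high total time but a low self time, such as `handleRequest` here, is usually not the bottleneck itself; the time is spent in its children.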
