Anonymized SQL Statements

The Timeline Profiler shows every SQL statement executed inside a PHP application in great detail. Because SQL strings often contain personal data and other secrets, Tideways anonymizes them so that none of your application's data is sent off your servers.

How the SQL Anonymizer works

The SQL statement anonymizer is located in the tideways-daemon, a component that runs on your infrastructure. There, every SQL statement transmitted by the PHP extension is parsed and all values are anonymized. For example, consider a query of the following kind:

SELECT
    t0.application_id AS application_id_1,
    t0.name AS name_2,
    t0.type AS type_3
FROM service t0
WHERE t0.application_id = 1 AND t0.type = 'php'

All string and number values are replaced by question marks in the following way:

SELECT
  t0.application_id AS application_id_1,
  t0.name AS name_2,
  [..]
FROM
  service t0
WHERE
  t0.application_id = ?
  AND t0.type = ?
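
The value replacement shown above can be approximated with a short Python sketch. This is only an illustration, not the daemon's implementation: the daemon parses the statement, whereas this sketch uses a regular expression, and the name anonymize_sql is hypothetical.

```python
import re

# Hypothetical sketch: matches single-quoted string literals and standalone
# numbers. Digits inside identifiers such as t0 or name_2 are left alone.
_LITERAL = re.compile(
    r"""
    '(?:[^'\\]|\\.|'')*'            # single-quoted string, '' or \' escapes
    | (?<![\w.])\d+(?:\.\d+)?\b     # integer or decimal, not part of a name
    """,
    re.VERBOSE,
)


def anonymize_sql(sql: str) -> str:
    """Replace every string and number literal with a question mark."""
    return _LITERAL.sub("?", sql)


if __name__ == "__main__":
    print(anonymize_sql("WHERE t0.application_id = 1 AND t0.type = 'php'"))
```

A regular expression is enough to show the idea, but it does not understand comments, double-quoted strings, or dialect-specific literals, which is one reason a real anonymizer would parse the statement instead.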

In addition, parts of the SQL statement are shortened to reduce the size of the payload and make the statement easier to grasp as a whole. The following reductions are performed:

  • The list of columns in the SELECT part is reduced to at most 2 columns. If the SQL column list is very large, this step already happens in the PHP extension and not in the daemon.

  • Subselects are reduced to (…).

  • Long lists of JOIN parts are reduced to the first few, and the rest is replaced by (…).
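
As an illustration of the first reduction, the following Python sketch keeps the first two top-level columns of a SELECT list and replaces the rest with the [..] placeholder used in the example output earlier. It is a simplified assumption of how such a reduction could work, not the daemon's code. The function name shorten_select_columns is hypothetical, and the sketch ignores commas or parentheses inside string literals.

```python
import re

_SELECT = re.compile(r"\s*SELECT\s+", re.IGNORECASE)
_FROM = re.compile(r"\s+FROM\b", re.IGNORECASE)


def shorten_select_columns(sql: str, keep: int = 2) -> str:
    """Keep the first `keep` top-level SELECT columns, replace the rest with [..]."""
    head = _SELECT.match(sql)
    if head is None:
        return sql  # not a SELECT statement, leave it untouched
    columns, current, depth = [], [], 0
    i = head.end()
    while i < len(sql):
        ch = sql[i]
        if depth == 0 and _FROM.match(sql, i):
            break  # reached the top-level FROM clause
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if ch == "," and depth == 0:
            # A comma outside parentheses ends the current column.
            columns.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
        i += 1
    columns.append("".join(current).strip())
    if len(columns) <= keep:
        return sql
    return "SELECT " + ", ".join(columns[:keep] + ["[..]"]) + sql[i:]


if __name__ == "__main__":
    print(shorten_select_columns("SELECT a, b, c, d FROM t WHERE x = ?"))
```

Tracking the parenthesis depth keeps function calls such as COUNT(a, b) together as one column and prevents a FROM inside a subselect from being mistaken for the end of the column list.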

SQL Truncation in the PHP Extension

While SQL anonymization is mostly performed in the daemon component of the Tideways stack, a small part may be done in the PHP extension. When an SQL statement is longer than 4000 characters, the extension truncates it to at most 4000 characters to keep the tracing payload from growing too large.

There are two strategies for truncating the SQL statements:

  1. Up until PHP Extension 5.5, and since then whenever the INI setting tideways.features.sql_smart_truncate=0 is used, only the first 4000 characters of the SQL statement are kept.

  2. Starting with PHP Extension 5.5, tideways.features.sql_smart_truncate=1 is the default. It activates a smart truncation for SELECT statements that shortens the column list, so that for statements above 4000 characters the structurally more important FROM, JOIN and WHERE clauses are kept instead of the SELECT clause.
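
The difference between the two strategies can be sketched in Python. This is a hedged illustration only: the extension's actual algorithm is not documented here, so the way the column list is cut, the [..] marker, and the function names truncate_simple and truncate_smart are all assumptions. The sketch also treats the first FROM keyword as the end of the column list.

```python
import re

MAX_SQL_LENGTH = 4000  # limit stated in the text
MARKER = " [..]"       # assumed placeholder for the dropped columns

_FROM = re.compile(r"\sFROM\s", re.IGNORECASE)
_SELECT = re.compile(r"\s*SELECT\s", re.IGNORECASE)


def truncate_simple(sql: str, limit: int = MAX_SQL_LENGTH) -> str:
    """Strategy 1: keep only the first `limit` characters."""
    return sql[:limit]


def truncate_smart(sql: str, limit: int = MAX_SQL_LENGTH) -> str:
    """Strategy 2 (assumed behavior): cut the column list, keep FROM onwards."""
    if len(sql) <= limit:
        return sql
    from_match = _FROM.search(sql)
    if _SELECT.match(sql) is None or from_match is None:
        return truncate_simple(sql, limit)  # not a plain SELECT statement
    tail = sql[from_match.start():]  # FROM, JOIN and WHERE clauses
    room = limit - len(tail) - len(MARKER)
    if room < len("SELECT "):
        return truncate_simple(sql, limit)  # the tail alone is too long
    return sql[:room] + MARKER + tail


if __name__ == "__main__":
    columns = ", ".join(f"c{i}" for i in range(2000))
    sql = f"SELECT {columns} FROM t WHERE id = ?"
    print(truncate_simple(sql)[-20:])  # ends somewhere inside the column list
    print(truncate_smart(sql)[-20:])   # still ends with the WHERE clause
```

With the simple strategy, a statement with a very long column list loses its FROM and WHERE clauses entirely. The smart variant spends the 4000-character budget on the clauses that identify the query.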
