The Tideways XHProf Extension
tideways_xhprof is a hierarchical Profiler for PHP, forked from the XHProf Extension originally developed by Facebook.
This PHP extension is a complete, modernized open-source rewrite of the original XHProf extension, with a new core data structure and specifically optimized for PHP 7. The result is an XHProf data-format compatible extension with a much reduced overhead in the critical path that you are profiling.
The code for this extension is extracted from the main Tideways extension, as we are moving to a new extension with an incompatible data format.
We are committed to providing support for this extension and porting it to as many platforms as possible.
The public API is not compatible with previous XHProf extensions and forks, as function names are different. Only the data format is compatible.
This repository now contains an extension named tideways_xhprof, which contains only the XHProf-related (callgraph) profiler functionality. The previous tideways extension contained this functionality together with other functionality used in our Software as a Service.
If you want to use the SaaS, the current approach is to install the extension using the pre-compiled binaries and packages from our Downloads page.
PHP >= 7.0
OS: Linux, MacOS, Windows (Download DLLs)
You can install the extension from source:
    git clone git@github.com:tideways/php-xhprof-extension.git
    cd php-xhprof-extension
    phpize
    ./configure
    make
    sudo make install
Configure the extension to load with this PHP INI directive:

    extension=tideways_xhprof.so
Restart Apache or PHP-FPM for this change to take effect.
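To confirm that the extension was picked up after the restart, a quick check like the following can help. It is a minimal sketch that relies only on PHP's built-in extension_loaded() and function_exists(); the helper name profiler_is_available() is hypothetical and not part of the extension. Keep in mind that the CLI and the web SAPI may read different INI files, so run the check in the environment you intend to profile.

```php
<?php
// Hypothetical helper: reports whether the tideways_xhprof extension is usable.
function profiler_is_available(): bool
{
    return extension_loaded('tideways_xhprof')
        && function_exists('tideways_xhprof_enable');
}

echo profiler_is_available()
    ? "tideways_xhprof is loaded\n"
    : "tideways_xhprof is NOT loaded; check the INI directive and restart\n";
```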
The API is not compatible with previous XHProf extensions and forks; only the data format is compatible:
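    <?php

    // Start profiling (wall clock time only).
    tideways_xhprof_enable();

    my_application();

    // Stop profiling and store the serialized result in the temp directory.
    file_put_contents(
        sys_get_temp_dir() . DIRECTORY_SEPARATOR . uniqid() . '.myapplication.xhprof',
        serialize(tideways_xhprof_disable())
    );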
By default, only wall clock time is measured. You can enable additional metrics by passing the $flags bitmask to tideways_xhprof_enable():
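    <?php

    // Start profiling and also collect memory and CPU metrics.
    tideways_xhprof_enable(TIDEWAYS_XHPROF_FLAGS_MEMORY | TIDEWAYS_XHPROF_FLAGS_CPU);

    my_application();

    // Stop profiling and store the serialized result in the temp directory.
    file_put_contents(
        sys_get_temp_dir() . DIRECTORY_SEPARATOR . uniqid() . '.myapplication.xhprof',
        serialize(tideways_xhprof_disable())
    );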
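Reading a stored profile back is the reverse of the snippets above. The following is a minimal sketch, assuming the file was written with serialize() as shown; the helper name load_xhprof_profile() is hypothetical, and the sample data is hand-written rather than the result of a real profiling run.

```php
<?php
// Hypothetical helper: loads a profile that was stored with serialize().
function load_xhprof_profile(string $path): array
{
    $raw = file_get_contents($path);
    if ($raw === false) {
        throw new RuntimeException("Cannot read profile: $path");
    }
    // A profile holds only arrays and scalars, so object creation is disabled.
    $data = unserialize($raw, ['allowed_classes' => false]);
    if (!is_array($data)) {
        throw new RuntimeException("Not a valid serialized profile: $path");
    }
    return $data;
}

// Usage with a tiny hand-written sample instead of a real profiling run.
$path = tempnam(sys_get_temp_dir(), 'xhprof');
file_put_contents($path, serialize(['main()' => ['wt' => 1200, 'ct' => 1]]));
$profile = load_xhprof_profile($path);
unlink($path);

echo $profile['main()']['wt'], "\n"; // prints 1200
```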
The XHProf data format records performance data for each parent ⇒ child function call that was made between the calls to tideways_xhprof_enable() and tideways_xhprof_disable().
It is formatted as an array whose keys are the parent and child function names concatenated with ==>, and whose values are arrays with 2 to 5 entries:
- wt: The summed wall time of all calls of this parent ==> child function pair.
- ct: The number of calls between this parent ==> child function pair.
- cpu: The CPU cycle time of all calls of this parent ==> child function pair.
- mu: The sum of the increase in memory_get_usage for this parent ==> child function pair.
- pmu: The sum of the increase in memory_get_peak_usage for this parent ==> child function pair.
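The key structure described above can be processed with plain PHP. The sketch below assumes the separator between parent and child is the literal string ==> and sums wt and ct per called function across all of its callers. The sample array is made up for illustration, and the helper name inclusive_by_function() is hypothetical.

```php
<?php
// Hand-written sample in the XHProf format; the numbers are made up.
$profile = [
    'main()'       => ['wt' => 1000, 'ct' => 1],
    'main()==>foo' => ['wt' => 700,  'ct' => 2],
    'main()==>bar' => ['wt' => 200,  'ct' => 1],
    'foo==>bar'    => ['wt' => 300,  'ct' => 4],
];

// Sums wt and ct per called (child) function across all of its callers.
function inclusive_by_function(array $profile): array
{
    $totals = [];
    foreach ($profile as $key => $metrics) {
        // "parent==>child" splits into two parts; "main()" has no parent.
        $parts = explode('==>', $key, 2);
        $child = $parts[1] ?? $parts[0];
        $totals[$child]['wt'] = ($totals[$child]['wt'] ?? 0) + $metrics['wt'];
        $totals[$child]['ct'] = ($totals[$child]['ct'] ?? 0) + $metrics['ct'];
    }
    // Functions with the highest summed wall time come first.
    uasort($totals, function (array $a, array $b): int {
        return $b['wt'] <=> $a['wt'];
    });
    return $totals;
}

foreach (inclusive_by_function($profile) as $name => $m) {
    printf("%-8s wt=%d ct=%d\n", $name, $m['wt'], $m['ct']);
}
```

Summing across callers gives a per-function view, whereas the raw format keeps each caller and callee pair separate so that a call graph can be reconstructed.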
When the TIDEWAYS_XHPROF_FLAGS_MEMORY_ALLOC flag is set, the following additional values are set:
- mem.na: The sum of the number of all allocations in this function.
- mem.nf: The sum of the number of all frees in this function.
- mem.aa: The amount of allocated memory.
If TIDEWAYS_XHPROF_FLAGS_MEMORY_ALLOC_AS_MU is set, TIDEWAYS_XHPROF_FLAGS_MEMORY_ALLOC is activated and, if TIDEWAYS_XHPROF_FLAGS_MEMORY_MU is not set, mem.aa is additionally returned in mu.
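The allocation counters lend themselves to simple derived indicators. The sketch below works on a made-up entry and computes the difference between allocations and frees as well as the average allocation size. It assumes that mem.aa is reported in bytes, which the text above does not state; the helper name allocation_summary() is hypothetical.

```php
<?php
// Hypothetical helper: derives two rough indicators from the mem.* values of one entry.
function allocation_summary(array $metrics): array
{
    $allocs = $metrics['mem.na'] ?? 0;
    $frees  = $metrics['mem.nf'] ?? 0;
    $amount = $metrics['mem.aa'] ?? 0; // assumed to be bytes
    return [
        'allocs_minus_frees' => $allocs - $frees,
        'avg_allocation'     => $allocs > 0 ? intdiv($amount, $allocs) : 0,
    ];
}

// Made-up entry as it might look with the allocation flag set.
$entry = ['wt' => 850, 'ct' => 1, 'mem.na' => 120, 'mem.nf' => 95, 'mem.aa' => 65536];
$summary = allocation_summary($entry);

printf(
    "allocations - frees: %d, average allocation: %d\n",
    $summary['allocs_minus_frees'],
    $summary['avg_allocation']
);
```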
There is a "magic" function call named "main()" that represents the entry point into profiling. The wall time of this entry covers the full timeframe during which the profiler ran.
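Because "main()" covers the whole run, its wall time can serve as the denominator when expressing other entries as a share of the total. The following is a minimal sketch on made-up data, assuming the ==> key separator of the XHProf format; the helper name share_of_total() is hypothetical.

```php
<?php
// Hand-written sample; the numbers are made up.
$profile = [
    'main()'       => ['wt' => 1000, 'ct' => 1],
    'main()==>foo' => ['wt' => 700,  'ct' => 2],
    'main()==>bar' => ['wt' => 200,  'ct' => 1],
];

// Returns the percentage of total wall time spent in each direct child of main().
function share_of_total(array $profile): array
{
    $prefix = 'main()==>';
    $total  = $profile['main()']['wt'] ?? 0;
    $shares = [];
    foreach ($profile as $key => $metrics) {
        if ($total > 0 && strpos($key, $prefix) === 0) {
            $child = substr($key, strlen($prefix));
            $shares[$child] = round(100 * $metrics['wt'] / $total, 1);
        }
    }
    return $shares;
}

foreach (share_of_total($profile) as $name => $percent) {
    printf("%s: %.1f%%\n", $name, $percent);
}
```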
Any profiler needs timer functions to calculate the duration of a function call, and the tideways_xhprof extension is no different.
On Linux you can collect timing information through various means.
The classic and simplest one is the function gettimeofday, which PHP uses when you call microtime().
This function is slower than other mechanisms that the kernel provides.
- clock_gettime(CLOCK_MONOTONIC)
This returns a monotonically increasing number (not a timestamp) at very high precision and is much faster than gettimeofday(). It is the preferred and recommended API to get high precision timestamps. On Xen based virtualizations (such as AWS) this call is much slower than on bare-metal or other virtualizations (Blog post).
- TSC (Time Stamp Counter) API
This is accessible in C using inline assembler. It was the timing API that the original XHProf extension used and it is generally very fast. However, depending on the make and generation of the CPU, it might not be synchronized between cores. On modern CPUs it is usually safe to use without having to pin the current process to a specific CPU.
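The difference between a wall-clock timer and a monotonic timer can also be observed from PHP userland. The sketch below times the same work with microtime(), which is based on wall-clock time, and with hrtime(), which returns a monotonic high-resolution value and is only available from PHP 7.3. It illustrates the two kinds of clocks and is not how the extension measures time internally; the helper name time_with_both_clocks() is hypothetical.

```php
<?php
// Times a callable with both clocks and returns the durations in microseconds.
function time_with_both_clocks(callable $work): array
{
    $hasHrtime = function_exists('hrtime'); // hrtime() requires PHP >= 7.3

    $wallStart = microtime(true);                  // seconds as float, wall clock
    $monoStart = $hasHrtime ? hrtime(true) : null; // nanoseconds, monotonic
    $work();
    $monoEnd = $hasHrtime ? hrtime(true) : null;
    $wallEnd = microtime(true);

    return [
        'microtime_us' => ($wallEnd - $wallStart) * 1e6,
        'hrtime_us'    => $hasHrtime ? ($monoEnd - $monoStart) / 1e3 : null,
    ];
}

$result = time_with_both_clocks(function () {
    usleep(2000); // stand-in for real work
});

printf("microtime: %.0f us\n", $result['microtime_us']);
if ($result['hrtime_us'] !== null) {
    printf("hrtime:    %.0f us\n", $result['hrtime_us']);
}
```

The wall-clock value can jump if the system time is adjusted during the measurement, which is why a monotonic source is the safer choice for measuring durations.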
Tideways on Linux defaults to using