Contributing to PHP: How to Fix Bugs in the PHP Core

Previously, we covered contributing to PHP’s documentation. This time, we will look at how to get involved with PHP’s core by walking through the workflow for fixing a simple bug.

Since submitting new features to PHP has already been explained pretty well, we will not be covering that here. Also, this article does not seek to teach PHP’s internals. For more information on that, please see my previous posts on adding features to PHP.

Resolving Bugs

Fixing bugs in the core is a great way to become familiar with PHP’s internals. It requires only a basic knowledge of C and is an easy way to help improve PHP. Before doing so, however, you need a basic understanding of PHP’s version management.

PHP Version Management Lifecycle

PHP minor versions follow a yearly release cycle, and each minor version has 3 years of support. The first 2 years provide “active support” for general bug fixes, and the final year provides “security support” for security fixes only. After the 3-year cycle has ended, support for that PHP version is dropped.
The currently supported PHP versions can be seen on the php.net website. At the time of writing, PHP 5.5 is in security support, and PHP 5.6 and 7 are in active support.
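The lifecycle above can be sketched as a small lookup. This is only an illustration: the names support_phase, phase_for_age, and phase_name are made up for this example, and counting in whole years is a simplification of the real release calendar.

```c
/* Support phases as described in the lifecycle above. */
typedef enum { ACTIVE_SUPPORT, SECURITY_SUPPORT, UNSUPPORTED } support_phase;

/* Hypothetical helper: maps whole years elapsed since a minor version's
 * release to its support phase (2 years active, then 1 year security).
 * Assumption: anything younger than 2 years counts as active support. */
support_phase phase_for_age(int years_since_release)
{
    if (years_since_release < 2) {
        return ACTIVE_SUPPORT;      /* general bug fixes accepted */
    }
    if (years_since_release < 3) {
        return SECURITY_SUPPORT;    /* security fixes only */
    }
    return UNSUPPORTED;             /* the 3-year cycle has ended */
}

/* Human-readable label for a phase. */
const char *phase_name(support_phase phase)
{
    switch (phase) {
    case ACTIVE_SUPPORT:   return "active support";
    case SECURITY_SUPPORT: return "security support";
    default:               return "unsupported";
    }
}
```

Under these assumptions, a version released two full years ago maps to security support, which matches where PHP 5.5 sits at the time of writing.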

Fixing a Bug

To demonstrate a basic workflow, let’s resolve bug #71635 from bugs.php.net. The bug report states that there is a segfault when invoking DatePeriod::getEndDate() when no end date is available. So the first thing we will want to do is confirm its validity.
For bugs that look trivial (with little or no environmental setup requirements), we can begin by quickly checking whether the bug can be reproduced in 3v4l. (3v4l is a tool that runs a snippet of code on hundreds of PHP versions.) This shows us all of the affected PHP versions at once, which makes it easy to find out whether older, still supported versions of PHP are affected. As we can see, PHP exits with a segfault for all versions 5.6.5 through to 7.0.4.
[Screenshot: 3v4l results for the reproduction script]
Regardless of whether the bug can be replicated in 3v4l or not, we’re going to need to replicate it locally before we can go about fixing it. For this, you’ll need to fork php/php-src and clone your fork locally. If you already did this some time ago, you may need to update your clone, along with retrieving all of the latest tagged releases (with git remote update). The commands below assume that the original php/php-src repository is configured as a remote named upstream.
We’re going to work on the PHP 5.6 branch since that’s the lowest version of PHP affected by this bug (whilst still being actively supported). (Had this bug affected PHP 5.5, we would still ignore this version and work against PHP 5.6 due to this bug not being security related.) The standard workflow for submitting bug fixes is to target the fix to the lowest affected (whilst still supported) PHP version. One of the php/php-src developers will then merge the fix upwards as necessary.
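The targeting rule above can be expressed as a short selection routine. This is a sketch under stated assumptions: branch_info, branch_support, and pick_target_branch are hypothetical names invented for this example, the branch list must be sorted oldest first, and the branch name strings are only labels.

```c
#include <stddef.h>

typedef enum { ACTIVE, SECURITY_ONLY, END_OF_LIFE } branch_support;

typedef struct {
    const char    *name;      /* label for the branch, e.g. "PHP-5.6" */
    branch_support support;   /* current support phase of the branch */
    int            affected;  /* non-zero if the bug reproduces there */
} branch_info;

/* Returns the name of the lowest affected branch that may still receive
 * this kind of fix, or NULL if there is none.
 * Assumes `branches` is sorted from oldest to newest. */
const char *pick_target_branch(const branch_info *branches, size_t count,
                               int is_security_bug)
{
    for (size_t i = 0; i < count; i++) {
        const branch_info *b = &branches[i];

        if (!b->affected || b->support == END_OF_LIFE) {
            continue;   /* nothing to fix here, or no longer supported */
        }
        if (b->support == SECURITY_ONLY && !is_security_bug) {
            continue;   /* ordinary bug fixes skip security-only branches */
        }
        return b->name; /* first match is the lowest eligible branch */
    }
    return NULL;
}
```

With 5.5 in security support and 5.6 and 7.0 in active support, a non-security bug affecting 5.6 and 7.0 resolves to the 5.6 branch. It would still do so if 5.5 were affected as well.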
So let’s check out a copy of the PHP 5.6 branch to work in (branch names in php/php-src are uppercase, e.g. PHP-5.6):
git checkout -b fix-dateperiod-segfault upstream/PHP-5.6
We then build PHP and attempt to reproduce the segfault locally by creating a file (say segfault.php) with the following code:
<?php
$period = new DatePeriod(new DateTimeImmutable("now"), new DateInterval("P2Y4DT6H8M"), 2);
var_dump($period->getEndDate());
We then run segfault.php with the freshly built PHP binary:
sapi/cli/php -n segfault.php
(The -n flag means that a php.ini file will not be used for configuration. This is particularly handy to use if you have custom extensions loaded into your default php.ini file, since it will prevent a load of errors from popping up each time you execute a file with a local PHP binary.)
Once we have confirmed that we can trigger this locally, we can create a test for it. Let’s call this test file bug71635.phpt and place it in the ext/date/tests/ folder with the following contents:
--TEST--
Bug #71635 (segfault in DatePeriod::getEndDate() when no end date has been set)
--FILE--
<?php
date_default_timezone_set('UTC');
$period = new DatePeriod(new DateTimeImmutable("now"), new DateInterval("P2Y4DT6H8M"), 2);

var_dump($period->getEndDate());
?>
--EXPECT--
NULL

Running that single test shows that it does not pass:
make test TESTS=ext/date/tests/bug71635.phpt
We now run a debugger of our choice on the segfault.php file that we created earlier. (I use LLDB because that’s what Mac OS X now bundles, but GDB is a similar debugger with overlapping commands.)
lldb sapi/cli/php segfault.php
(The -n flag has not been used this time, since it seems to mess with lldb, most likely because lldb tries to parse it as one of its own options.)
Now that we’re in the LLDB debugger, we type run to execute the file. It should show where in the code the segfault occurred:
[Screenshot: LLDB output at the point of the segfault]
Whilst the first frame doesn’t seem to show us anything overly meaningful (unless you program in asm), we can see that the program stopped because of an EXC_BAD_ACCESS. It also showed us that the pointer address it attempted to manipulate was 0x0, so we can see that we have a null pointer access.
Using the bt command shows us the backtrace of the segfault (every frame leading up to the segfault). Looking at frame #1 (by entering frame select 1), we are back into C code and can see the line causing the problem:
[Screenshot: the offending line shown in frame #1]
From this, we can infer that the culprit is dpobj->end evaluating to null, and thus attempting to dereference it causes the segfault. So, we place a check above this to see if dpobj->end is a null pointer, and if so, simply return from the function (doing this as early as possible):
PHP_METHOD(DatePeriod, getEndDate)
{
        php_period_obj   *dpobj;
        php_date_obj     *dateobj;

        if (zend_parse_parameters_none() == FAILURE) {
                return;
        }

        dpobj = (php_period_obj *)zend_object_store_get_object(getThis() TSRMLS_CC);
+
+        if (!dpobj->end) {
+                return;
+        }

        php_date_instantiate(dpobj->start_ce, return_value TSRMLS_CC);
        dateobj = (php_date_obj *)zend_object_store_get_object(return_value TSRMLS_CC);
        dateobj->time = timelib_time_ctor();

        *dateobj->time = *dpobj->end;
        if (dpobj->end->tz_abbr) {
                dateobj->time->tz_abbr = strdup(dpobj->end->tz_abbr);
        }
        if (dpobj->end->tz_info) {
                dateobj->time->tz_info = dpobj->end->tz_info;
        }
}
(Returning early from a method implicitly makes it return null, as many internal PHP functions do on failure. This is because the return_value variable, which is accessible in any internal function definition, holds the function’s actual return value and defaults to null.)
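To see the guard in isolation, here is a stripped-down, standalone sketch of the same pattern. It does not use PHP’s real internal types: mock_time, mock_period, mock_value, and mock_get_end_date are hypothetical stand-ins, and the is_null flag merely imitates return_value defaulting to null.

```c
#include <stddef.h>

/* Simplified stand-ins for PHP's internal structures (hypothetical). */
typedef struct {
    long timestamp;
} mock_time;

typedef struct {
    mock_time *start;
    mock_time *end;     /* NULL when no end date is available */
} mock_period;

typedef struct {
    int  is_null;       /* 1 imitates return_value's default of null */
    long timestamp;
} mock_value;

/* Mirrors the shape of the fixed method: check the pointer first,
 * and only then dereference it. The caller is expected to pass a
 * return_value that has already been initialised to "null". */
void mock_get_end_date(const mock_period *period, mock_value *return_value)
{
    if (!period->end) {
        return;         /* leave return_value untouched, i.e. null */
    }

    /* Safe: period->end is known to be non-NULL from here on. */
    return_value->is_null = 0;
    return_value->timestamp = period->end->timestamp;
}
```

Without the early return, the read of period->end->timestamp would dereference a null pointer, which is the same class of failure the debugger reported.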
So let’s build PHP and run our test again:
make test TESTS=ext/date/tests/bug71635.phpt
It should now pass! Now we can simply commit the updated file and the corresponding bug test, and then submit a PR against the 5.6 branch of php/php-src.

Conclusion

This article has demonstrated a simple workflow for resolving bugs in the core. Fixing bugs is a great starting point for getting involved with PHP’s internals, and it requires very little knowledge of C.
Bug fixing also serves as a nice series of small programming challenges for those who are bored of the algorithmic-based challenges found at Project Euler and similar websites. And with over 5,000 open bug reports, there’s certainly no shortage of bugs to tackle!
Reviewed by JohnBlogger on 4:47 PM
