16 Jun

The Evolution of PHP under Apache

Over the sixteen years we’ve been doing this web hosting thing, we’ve seen a multitude of changes in the way webservers run and, more notably, in how they run scripts written in languages such as PHP.
In the beginning: PHP as CGI

In the beginning, running a PHP script was a fairly simple affair.  The Apache webserver of the day used a pool of pre-forked child processes to handle requests: as each web request came into the server, it would be allocated to an idle child, that child would process the one request, and if a PHP script was involved, a separate CGI process would be spawned to run the PHP interpreter and render the scripting part of the page.  A whole slew of settings in your Apache configuration would boil down to “PHP files located in /home/bob/public_html/ get run under the ‘bob’ user account”.
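
To make that concrete, here’s a rough sketch of what a CGI-era setup looked like.  Action, AddHandler, and SuexecUserGroup are real Apache directives (the suexec spelling shown is the Apache 2.x one), but the paths and the ‘bob’ virtual host are purely illustrative:

    # Run .php files through an external CGI php binary
    Action application/x-httpd-php /cgi-bin/php
    AddHandler application/x-httpd-php .php

    # Per-site: suexec runs bob's CGI processes as the 'bob' account
    <VirtualHost *:80>
        ServerName www.example.com
        DocumentRoot /home/bob/public_html
        SuexecUserGroup bob bob
    </VirtualHost>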

This worked, but it was anything but fast.  As time went on, PHP became more popular and the sites it powered became more complex, and performance became a big issue.  In order to keep things efficient and cost-effective, a better way had to be found.

Enter mod_php

Along came mod_php, which sort of up-ended the entire apple cart compared to how we all thought of PHP.   mod_php was an actual module linked into Apache itself, so your PHP scripts were now interpreted by the webserver directly.  No longer did Apache have to ‘spawn out’ another process to render your PHP content.  This was fast.  Then OpCode caching became a thing, with add-ons such as the APC module: now you could allocate some RAM to persistently cache compiled PHP code (and possibly other data) between requests.  Things got extremely fast, and there was much rejoicing on the part of web hosts everywhere as CPU usage decreased across the board.
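
For flavor, this is roughly what wiring that up looked like.  It’s a hedged sketch, since module names and APC settings varied by PHP version; the values below are illustrative:

    # httpd.conf: PHP interpreter embedded directly in Apache
    LoadModule php5_module modules/libphp5.so
    AddHandler application/x-httpd-php .php

    ; php.ini: APC opcode cache, persisted in shared memory
    extension = apc.so
    apc.enabled = 1
    apc.shm_size = 64M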

The downside: this meant all PHP scripts needed to run as the same user that Apache ran as.  Everyone’s scripts ran as “nobody”, or possibly “apache”.  Having all the PHP scripts on a server hosting multiple clients and websites run under the same security user account was (although I don’t believe anyone initially saw it coming at the time) a huge problem.  The industry started to see a rise in the number of defaced and hijacked sites, and it quickly became evident that the “everything runs as nobody” model was to blame.

Malicious actors quickly built PHP scripts that, when run, would scour a server, for instance by looking for any *.php files in /home/*/public_html/, and every time one found a PHP file it could write to, it would inject whatever malicious code it desired in order to spread.  Since all the PHP scripts on a server ran as the same user, that user account would, by default, have access to all the scripts on the server.   It only took finding one exploitable script on one website from which to launch their script for a malicious actor to infect (or delete) every PHP script on the server.  This, of course, was not a good situation: every script and client account on your server was only as secure as the most out-of-date, un-patched script on the server!  Something else would need to come along…

(Side Rant:  If, in 2019, you find a web host that is still suggesting that you chown your scripts to “nobody:nobody” or “apache:apache”, be very afraid.  The reasons for this fear should be evident from the last paragraph, but I feel the need to re-iterate the point, because doing so is, well, just a huge mistake, security-wise!)
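
If you want to check your own site, something like this (with ‘bob’ standing in for your own account and paths) will show who owns your scripts, and reclaim them if they’ve been chowned to the webserver user:

    # Scripts should belong to the account, not the webserver user
    ls -l /home/bob/public_html/*.php

    # Fix ownership if it has been set to nobody/apache
    chown -R bob:bob /home/bob/public_html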

SuPHP to the rescue

Then along came SuPHP, which combined most (but not all) of the performance benefits of mod_php with the flexibility of scripts once again being able to run as individual user accounts.   It was something of a balanced trade-off between “fast” and “secure”.
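
Configuration-wise, suPHP slotted in as just another Apache handler.  Here’s a minimal sketch along the lines of the stock mod_suphp examples; exact handler names varied between builds, so treat the specifics as illustrative:

    <IfModule mod_suphp.c>
        suPHP_Engine on
        AddHandler application/x-httpd-suphp .php
        suPHP_AddHandler application/x-httpd-suphp
    </IfModule>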

Apache Gains Workers, But They’re Useless To Us

Along with other great advances to the Apache webserver, it eventually gained the ability to have ‘Workers’ instead of just pre-forked child processes.  While child processes worked fine for years, the ‘every connection has a dedicated child process that feeds it’ model carries a fair amount of overhead.

So the Apache team implemented the ‘Worker’ MPM module.  The server still forks a group of child processes in advance, but each child process is capable of serving multiple http connections via Threads.
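
The knobs involved look something like this illustrative Apache 2.4 snippet (the numbers are ballpark defaults, not a tuning recommendation):

    <IfModule mpm_worker_module>
        # Each child serves ThreadsPerChild connections concurrently;
        # MaxRequestWorkers caps the total simultaneous connections.
        StartServers            4
        MinSpareThreads        25
        MaxSpareThreads        75
        ThreadsPerChild        25
        MaxRequestWorkers     400
    </IfModule>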

This brought a sizeable boost in performance on hosting servers (notably a decrease in the memory needed to service a given number of HTTP connections), but it had one big drawback… it didn’t work with PHP, because mod_php (and many popular PHP extensions) could not be trusted to be thread-safe.  So it was no good to us for quite some time.

mod_lsapi:  The best of all worlds?

With the recent migration of our servers over to the CloudLinux platform, we’ve gained the ability to leverage the lsapi Apache/PHP interface to achieve a modern ‘best of all worlds’ approach (a rough configuration sketch follows the list):

  • Apache runs with the mpm_worker module, allowing one child process to handle multiple connections via threads.
  • lsapi interfaces those workers to PHP via PHP-FPM whenever a PHP file needs to be processed.
  • PHP-FPM gives us flexibility, security, and performance all in one.
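
On the Apache side, enabling mod_lsapi boils down to loading the module and routing .php files to the lsphp handler.  The directives below follow the CloudLinux mod_lsapi documentation as best I recall it, so take this as an illustrative sketch rather than our exact configuration:

    LoadModule lsapi_module modules/mod_lsapi.so

    <IfModule lsapi_module>
        lsapi_engine on
        AddType application/x-httpd-lsphp .php
    </IfModule>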

Wait, what’s PHP-FPM?

PHP-FPM is sort of a ‘php server’, if you will.  It is responsible for maintaining a pool of PHP processes that your PHP code can be handed to for processing.

The flexibility comes in that we can control the version of PHP on a per-site level, which is how we can provide users the ability to control which version of PHP their code runs against.

On the security front, each user has their own self-contained PHP server process providing a ‘pool’ of PHP processes at their disposal, each running as that user, so there are no worries about ‘nobody’ permissions or users being able to access each other’s scripts.

For performance, well, we ‘persist’ each user’s pool for a set amount of ‘idle time’ after the last PHP script was processed.  This means that while the first PHP request for a given client site needs to (quickly) spin up a PHP process, that process will live on after the request is finished (up to our idle timeout).  If a second request comes in before that timeout ends, it is serviced by the same process, saving start-up time and, more importantly, allowing us to once again benefit from the performance boost of PHP OpCode caching.
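
A per-user pool definition captures all three of these points in one place.  The ‘bob’ pool, socket path, and numbers below are illustrative rather than our production values, but the option names are standard PHP-FPM; different PHP versions simply run their pools under different PHP-FPM masters, which is where the per-site version flexibility comes from:

    ; /etc/php-fpm.d/bob.conf (illustrative per-user pool)
    [bob]
    ; security: every process in this pool runs as bob
    user = bob
    group = bob
    listen = /var/run/php-fpm/bob.sock

    ; performance: spawn workers on demand, then keep each one
    ; alive for a while after its last request finishes
    pm = ondemand
    pm.max_children = 10
    pm.process_idle_timeout = 30s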

The ability to use a PHP pool, when combined with the per-user resource scheduling available to us in CloudLinux, really opens things up, possibility-wise, allowing us to fairly distribute resources while maintaining performance across all users on a given server.

For instance, here’s the CPU and RAM utilization over a given day for a typical user on one of our shared Linux webhosting servers:

Now, this user’s site got a pretty normal, consistent level of traffic throughout the day.  It of course had some spikes here and there, but overall, a pretty even-keeled distribution of traffic.   You’ll note that the memory usage is pretty even throughout the day, even though CPU jumped a bit more here and there.  This is because even though the client’s site may have needed more processing time, the standard PHP process pool for them was already running, and simply kept going, handling the visitor load.

Now, let’s see what happens when things get… busy.  This customer had a far more interesting day:

Now, this graph is a bit of an anomaly.  Not enough to cause a panic, but, well, they were dealing with a slight DDoS attack, in the form of someone attempting to brute-force user logins to their forum.   The traffic was pretty steady throughout the day, but with large spikes at random times along the way.

While CPU utilization got pretty intense at times, RAM utilization stayed pretty consistent (most of the time).   There were a couple of times throughout the day when the system determined it was unable to provide all the CPU resources demanded by the attacker, so things did get throttled a bit here and there with the ebb and flow of the attacker’s traffic.  But the server’s ability to maintain a single, persistent pool of PHP processes to handle the requests, complete with OpCode caching and all the other benefits of reusing the same process for multiple requests, allowed the overall impact to memory to stay pretty consistent throughout the day.

The important thing in this scenario is that, with the persistent PHP processes and OpCode caching available between requests, the overall load on the server generated by this attack was minimized.  The system did its best to maintain service to the client’s website throughout the ‘bursty’ attack periods, and no other client sites hosted on the same server saw any impact at all.

This is a huge change from the days of ‘PHP as CGI’, and we’re confident that Apache Workers + mod_lsapi + php-fpm gives us just the right combination of security, flexibility, and performance going forward to best serve our clients.