An overview of apache mpms and php server apis
Apache + Php
When setting up an apache+php server to run your php applications, there are lots of configuration parameters to deal with. The most important is the php server api (sapi), which determines exactly how apache is going to run the php scripts. Along with the php sapi, the other important thing to select is the apache mpm (multi-processing module). The mpm determines how apache deals with requests in general.
Multiple options are available for both the mpm and the sapi, and each has its own pros and cons. Some configurations are easy to set up and manage but are low on performance, whereas others perform well but are difficult to set up and maintain. With so much variety available, server owners often find it difficult to decide which one to select.
To name a few mpms, there are prefork, worker, itk etc. On the php sapi side there are 3 basic options: mod_php, cgi and fastcgi. Apart from these there are a few more components, like suphp and suexec, that can modify the behaviour of the standard sapis.
[apache <=> some mpm] <=> some sapi <=> php interpreter/process
When it comes to deciding a particular kind of configuration for the setup, there are many things to consider.
1. How many resources the server has. We may need a setup that can work with limited ram and cpu.
2. What security options we need. If we are setting up multiple websites/domains on the same server through apache vhosts, then we need to run the php scripts of each domain with its own separate user/uid for proper directory access and privileges.
3. How much performance is needed. Is the traffic on the website high or low?
Sometimes the decision is also influenced by how easy the setup needs to be to maintain. Some sysadmins might not want a complex system that is difficult to maintain in the long run, and might be willing to settle for an alternative configuration that is easier to configure and has a gui tool like Cpanel or Directadmin.
We shall also be talking about Cpanel later in the post.
The first thing to understand is the apache mpm. MPMs, or multi-processing modules, are the part of the apache code that handles incoming requests, so different mpms handle requests differently. It is important to note that although these are called modules, in apache 2.2 and earlier they are not loadable modules like the rest. Rather, the mpm is compiled right into the apache binary. (From apache 2.4 onwards an mpm can also be built as a loadable module.) Either way, you can have only 1 mpm active at a time, and you must have at least 1.
Check the apache documentation on mpms to learn more.
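To see which mpm your own apache is running, the version output can be filtered. The snippet below is only a small sketch: mpm_name is a helper name made up here, and the sample text fed to it imitates what `apachectl -V` prints on a typical install, so the snippet runs even on a machine without apache.

```shell
# Print the mpm name found in `apachectl -V` style output read from stdin.
# mpm_name is a hypothetical helper, not an apache tool.
mpm_name() {
    # The relevant line looks like "Server MPM:     Prefork".
    awk -F: '/^Server MPM/ { gsub(/[ \t]/, "", $2); print tolower($2) }'
}

# Demo with canned sample text (an assumption of what the output looks like):
printf 'Server version: Apache/2.2.22 (Ubuntu)\nServer MPM:     Prefork\n' | mpm_name
# prints: prefork
```

On a real server you would run `apachectl -V | mpm_name` instead (on some distributions the command is apache2ctl or httpd).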
1. mpm Prefork
This is the oldest and most basic mpm that comes with apache, and it is compatible with every php server api. It implements a non-threaded, forking server that spawns processes and keeps them alive to serve requests, so each process handles 1 request at a time. Apache keeps processes ready beforehand so that incoming client requests don't have to wait for a process to be forked. The same process is then reused for further requests.
The documentation explains its working as follows
A single control process is responsible for launching child processes which listen for connections and serve them when they arrive. Apache always tries to maintain several spare or idle server processes, which stand ready to serve incoming requests. In this way, clients do not need to wait for a new child processes to be forked before their requests can be served.
Due to the process-spawning nature of prefork, it is also the slowest mpm and consumes the most resources. However, because it does not use threads, code that is not thread-safe runs safely under it, which makes it the only standard mpm that can be used with the mod_php sapi. We shall talk more about this later.
2. mpm Worker
Next comes the worker mpm. It addresses the limitations of its predecessor prefork. It creates multiple processes like prefork, but also creates multiple threads inside each process, and each of these threads can handle a request. Since threads share the resources of their process, this mpm is resource friendly. In fact this is the recommended mpm in all cases, except when you are forced to use mod_php.
The documentation describes its working as
A single control process (the parent) is responsible for launching child processes. Each child process creates a fixed number of server threads as specified in the ThreadsPerChild directive, as well as a listener thread which listens for connections and passes them to a server thread for processing when they arrive.
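To get a feel for the difference, here is a rough back-of-the-envelope sketch of how many processes each mpm needs for the same number of simultaneous requests. The function names and the numbers (150 requests, 25 threads per child) are only illustrative assumptions, and spare processes and the listener thread are ignored.

```shell
# Processes needed to serve a given number of simultaneous requests.

# prefork: one process per request.
prefork_procs() {
    echo "$1"
}

# worker: each process serves threads_per_child requests,
# so divide and round up.
worker_procs() {
    requests=$1
    threads_per_child=$2
    echo $(( (requests + threads_per_child - 1) / threads_per_child ))
}

prefork_procs 150       # prints 150
worker_procs 150 25     # prints 6
```

Here threads_per_child plays the role of the ThreadsPerChild directive mentioned in the quote above.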
Mpm worker can be used with any php sapi such as cgi, suphp, fastcgi or fpm (but not mod_php).
Many php extensions are not thread safe, so if you attempt to run mpm worker with mod_php the apache server may crash from time to time. On ubuntu, for example, installing mod_php automatically removes mpm worker.
3. Mpm Itk
This is another mpm that is similar to prefork, but has the ability to run separate vhosts (sites) with a separate uid/username. It is non threaded.
It is described as
The ITK Multi-Processing Module (MPM) works in about the same way as the classical "prefork" module (that is, without threads), except that it allows you to constrain each individual vhost to a particular system user. This allows you to run several different web sites on a single server without worrying that they will be able to read each others' files. This is a third-party MPM that is not included in the normal Apache httpd.
So this is another mpm that can work with any php sapi, including mod_php. But since it is not threaded, it is not recommended. It is also not an official apache mpm.
Worker is the best performing mpm since it uses threads. The only php sapi it is incompatible with is mod_php.
Php Server APIs (SAPI)
Now it's time to talk about the php server apis. The server api determines how apache connects itself to the php interpreter to execute php files and get the output. As with mpms, there are many options here too, and we shall look into each one and discuss its pros and cons.
There are modules or components available for apache that can be loaded on demand to provide various features. For example mod_php is the module that enables apache to load the php interpreter directly and run php code. The mod_fastcgi module enables apache to speak the fastcgi protocol to fastcgi-aware applications; the php-cgi binary, for example, is fastcgi enabled.
1. Mod PHP (a.k.a Apache 2.0 Handler or DSO)
This is the simplest php handler and is installed by default on an ubuntu system, for example, when you install php from synaptic or with apt-get.
$ sudo apt-get install apache2 php5
ModPHP is an apache module that loads the php interpreter directly and runs php code inside apache. It does not load php as an external process using the php-cgi binary for example. Therefore the php interpreter and the apache process become one. Now when further apache processes are forked out (prefork) each apache process has the php interpreter inside itself.
The flow of control is somewhat like this
apache process (php interpreter) <==> execute php code
With mod php all your php code will run as expected without running into any mysterious error. All your environment variables (those inside $_SERVER) will be setup properly. Basically apache turns into a php interpreter directly.
This has its own benefits and drawbacks. Let's see what those are.
- Mod PHP works only with a non threaded mpm like prefork. Due to this apache's threading capabilities are totally lost.
- Php scripts run as a single user like "www-data" or "nobody". So if you have multiple sites, it is not possible to make each vhost run with its own separate user/uid. A script in 1 vhost can read the files of other vhosts. This is a security concern if you are selling shared hosting.
The only exception, where mod_php can be used with a separate uid for each vhost, is mpm-itk, which implements user/uid switching right inside apache. It can be useful, but it has the limitation that it does not reuse processes and spawns a new process for every single request, making it even slower than prefork.
2. CGI - Common Gateway Interface
Next comes CGI, the Common Gateway Interface. This is a neutral protocol that can be used to connect any webserver with any language interpreter. Here we are connecting apache with php. It is implemented inside apache through mod_cgi.
With cgi, php execution is separated from the apache environment. Apache instead calls an external program like "php-cgi" (php-cgi.exe on windows), gives it the input and takes back the output. So a separate php process is created.
Apache (any mpm) <=========> /usr/bin/php-cgi binary [multiple instances per mpm process/thread]
Now with cgi, we can use any apache mpm, that is prefork or worker. So here we can harness the threading capacity of apache. The preferred mpm with cgi is mpm worker.
One drawback of cgi is that since no php process/interpreter is actually persistent, it is not possible to use an opcode cache like APC. Also, cgi creates a php process only upon request and closes it after the request is processed. So there is the overhead of starting a new php process on every request. This is overcome in fastcgi, where the php process is kept persistent to be reused.
And phpinfo() will report the Server API as CGI/FastCGI. This is because the php-cgi binary is being used to execute the php code.
Some more theory
CGI inside apache is made available through mod_cgi and mod_cgid. Both are by default built inside apache. So you do not need to install them separately. mod_cgid creates a separate unix-socket based daemon/service to process cgi requests. To find out the cgi daemon use the netstat command
$ netstat -a -n | grep 'cgisock'
unix  2      [ ACC ]     STREAM     LISTENING     299135   /var/run/apache2/cgisock.10009
$ sudo apachectl -M
......
cgi_module (shared)
cgid_module (shared)
......
The suphp variation
With multiple vhosts on the same server, it is desirable to run each vhost with its own username/uid. With cgi this is easy to do using suphp. Suphp becomes the php handler and executes php as cgi.
suPHP is a tool for executing PHP scripts with the permissions of their owners. It consists of an Apache module (mod_suphp) and a setuid root binary (suphp) that is called by the Apache module to change the uid of the process executing the PHP interpreter.
To set the user and group of a particular vhost add suphp settings like this
<VirtualHost 126.96.36.199:80>
    ......
    <IfModule mod_suphp.c>
        suPHP_UserGroup mike mike
    </IfModule>
    ......
</VirtualHost>
Then the php scripts of that particular vhost are run with the given user/group permissions. Suphp implements additional security policies of its own like files must not be writable by group etc.
CGI with suexec
Suexec is an apache feature that can be used to run cgi scripts with specific user/group privileges. It is more generic than suphp (which is only for php). Suexec can be used with fastcgi/fcgid as well.
Technically speaking suexec is a wrapper: it wraps the process creation to change the uid and gid. So with suexec the process creation looks like this
Apache <=> mod_cgid <=> suexec(create process with uid/gid)
To enable suexec for cgi, just add the SuexecUserGroup directive in the relevant vhost and specify the correct username and groupname.
3. FastCGI
FastCGI is like cgi but can keep multiple processes alive and reuse them. It is an enhancement over cgi. Apache has modules that enable it to speak the fastcgi protocol with fastcgi-aware applications like php-cgi. These are mod_fastcgi and mod_fcgid. Apart from these, php also has its own fastcgi process manager called fpm that can be used with mod_fastcgi. We shall talk about it shortly.
Since fastcgi keeps processes alive, it consumes more resources in the idle state compared to cgi. However, fastcgi reuses php processes, so it is more efficient than cgi when it comes to processing concurrent requests.
It is well understood that when using php + fastcgi (mod_fastcgi or mod_fcgid), the fastcgi module creates and manages multiple php processes and reuses them efficiently. However, a lesser known fact is that a php process created by fastcgi is itself capable of creating further child php processes and managing them.
So if fastcgi created 10 php processes, then each of these 10 processes can create 4 child php processes, giving us a total of 10 + 10*4 = 50 processes. Lots of tutorials on fastcgi mention setting up a wrapper script which contains a setting for "PHP_FCGI_CHILDREN". This setting is what makes php fork those child processes; it does not affect the number of php processes started by fastcgi itself.
So process management can happen at 2 levels. 1st at fastcgi, 2nd inside php.
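The arithmetic above, as a quick sketch. The commented lines show the kind of wrapper script those tutorials use; the /usr/bin/php-cgi path and the value 4 are assumptions for illustration, and total_php_procs is just a helper name made up here.

```shell
# A typical wrapper script handed to the fastcgi module looks roughly
# like the commented lines below (path and value are assumptions):
#
#   #!/bin/sh
#   PHP_FCGI_CHILDREN=4
#   export PHP_FCGI_CHILDREN
#   exec /usr/bin/php-cgi

# Total php processes = processes started by the fastcgi module
# plus the children each of them forks.
total_php_procs() {
    parents=$1
    children_each=$2
    echo $(( parents + parents * children_each ))
}

total_php_procs 10 4    # prints 50
total_php_procs 10 0    # prints 10 (php child management disabled)
```

With the children count at 0, all process management stays at the first level, inside the fastcgi module.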
Check if the modules are loaded or not using the apachectl command.
$ sudo apachectl -M
.....
fastcgi_module (shared)
fcgid_module (shared)
.....
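If you want to check for one particular module from a script, the module list can be filtered. This is a minimal sketch: has_module is a made-up helper name, and the sample lines imitate the `apachectl -M` output shown above so that the snippet runs without apache installed.

```shell
# Succeed if the module named in $1 appears in `apachectl -M` style
# output read from stdin. has_module is a hypothetical helper.
has_module() {
    grep -q "^[[:space:]]*$1[[:space:]]"
}

# Demo with canned sample output:
printf ' fastcgi_module (shared)\n fcgid_module (shared)\n' | has_module fcgid_module \
    && echo "fcgid is loaded"
```

On a real server the call would be `sudo apachectl -M | has_module fcgid_module`.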
If fastcgi needs to execute processes with separate uid/gid, then suexec can be used in conjunction.
The simplest way to set up apache to use fastcgi to talk to php is mod_fastcgi. Mod_fastcgi is the original fastcgi implementation for apache. It has a built-in process manager that manages a pool of php processes.
Apache <=> mod_fastcgi <=> internal process manager <=> pool of php processes (php-cgi binary)
Mod_fastcgi also supports external process servers like php-fpm. When using an external process manager, mod_fastcgi just acts as a proxy that relays the request to the application server and does not do any process management.
According to the docs
FastCGI applications under mod_fastcgi are defined as one of three types: static, dynamic, or external. They're configured using the FastCgiServer, FastCgiConfig, and FastCgiExternalServer directives respectively. ... FastCGI static and dynamic applications are spawned and managed by the FastCGI Process Manager, fcgi-pm. The process manager is spawned by Apache at server initialization. External applications are presumed to be started and managed independently.
So with FastCgiExternalServer it is possible to specify an external process-managing server that communicates with mod_fastcgi over a socket connection.
Mod_fcgid is a newer alternative to mod_fastcgi and is considered slightly better. It has its own internal process management like mod_fastcgi, but does not support an external process server like php-fpm. So you cannot use mod_fcgid and php-fpm together. For the rest, it does the same thing as mod_fastcgi.
However there is 1 major technical difference between mod_fastcgi and mod_fcgid: mod_fcgid cannot do multiplexing.
Now what does that mean? Recall the concept of php doing internal process management that we talked about a few lines above. This is what we would expect
Apache <=> fastcgi handler <= managing => 10 php processes <= each process managing 4 processes each => 40 child processes
For the above policy to work, the fastcgi handler must send more than 1 request at a time to the same php process in the 10 process set. mod_fcgid cannot do this. It forwards only 1 request at a time to each process. So having child processes is useless and creates unnecessary problems.
Therefore when using mod_fcgid, ensure that php is not creating and managing child processes. The apache documentation on mod_fcgid mentions this very clearly.
PHP child process management (PHP_FCGI_CHILDREN) should always be disabled with mod_fcgid, which will only route one request at a time to application processes it has spawned; thus, any child processes created by PHP will not be used effectively. (Additionally, the PHP child processes may not be terminated properly.) By default, and with the environment variable setting PHP_FCGI_CHILDREN=0, PHP child process management is disabled. The popular APC opcode cache for PHP cannot share a cache between PHP FastCGI processes unless PHP manages the child processes. Thus, the effectiveness of the cache is limited with mod_fcgid; concurrent PHP requests will use different opcode caches.
A user on a mailing list says
PHP forks because you tell it to. If you ever set PHP_FCGI_CHILDREN to anything, PHP will fork *that many* processes right away regardless of how many requests you receive. mod_fcgid doesn't know anything about this, so you're essentially spawning processes at two levels (mod_fcgid and PHP itself). My suggestion is to yank our PHP_FCGI_CHILDREN completely. Just leave the process management to mod_fcgid.
mod_fastcgi with php-fpm
Php-FPM is the newer way of running php with fastcgi. FPM stands for fastcgi process manager; it is a separate process manager that manages multiple php processes and reuses them. Of the two modules above it can be used only with mod_fastcgi, not with mod_fcgid.
Php-fpm is a process manager implemented inside php itself, so you do not need the extra process management done by mod_fastcgi.
Php-fpm achieves the highest level of separation between apache and php. The two are separated by a socket-based connection, like 2 parties sitting at different locations and talking over a phone line. This has its own benefits.
Apache (any mpm) <=> mod_fastcgi <===== [ Socket communication ] =====> Php-FPM <=> Pool of php processes
If anything goes wrong with php, it does not affect apache in any way; apache just finds php missing and reports a 500 Internal Server Error to the user.
In order to use php-fpm with mod_fastcgi we specify the php-fpm socket as an external server. Then mod_fastcgi just acts as a proxy to forward the request to php-fpm and does not do any kind of process management (since it has got nothing to manage).
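As a rough idea of what this looks like in practice, the sketch below prints an apache snippet of the kind commonly used for mod_fastcgi + php-fpm. The handler name php-fcgi, the /usr/lib/cgi-bin/php-fcgi path and the socket path are placeholders picked for illustration, fastcgi_fpm_conf is a made-up helper name, and the Action directive assumes mod_actions is enabled.

```shell
# Print an apache snippet that hands .php requests to php-fpm through
# mod_fastcgi. $1 is the php-fpm socket path. The php-fcgi file does not
# have to exist on disk; it only gives mod_fastcgi a name to map to the
# external server.
fastcgi_fpm_conf() {
    socket=$1
    cat <<EOF
<IfModule mod_fastcgi.c>
    AddHandler php-fcgi .php
    Action php-fcgi /php-fcgi
    Alias /php-fcgi /usr/lib/cgi-bin/php-fcgi
    FastCgiExternalServer /usr/lib/cgi-bin/php-fcgi -socket ${socket}
</IfModule>
EOF
}

# Assumed socket path; use whatever your php-fpm pool listens on.
fastcgi_fpm_conf /var/run/php-fpm.sock
```

The socket passed to -socket has to be the same one that php-fpm is configured to listen on.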
This is the most recommended way to use php with fastcgi. It works fine on servers with minimal resources. I tried it on an amazon server with 512mb ram and the performance was good.
mod_proxy_fcgi + php-fpm
And this is the latest one. As mentioned earlier, when using mod_fastcgi + php-fpm, mod_fastcgi does not do any process management and just becomes a request-relaying proxy between apache and fpm. Mod_proxy_fcgi is the logical conclusion of this technique: it is just a fastcgi-speaking proxy that enables apache to talk to fpm.
You can think of it as mod_fastcgi minus its process manager. This is the ideal and perfect way to run php with fastcgi and has the best performance. But it can only be used with newer versions of apache, that is version 2.4 onwards.
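For reference, a mod_proxy_fcgi setup usually boils down to a single ProxyPassMatch line. The sketch below prints one; it assumes php-fpm is listening on a tcp port (127.0.0.1:9000 here) and that the document root is /var/www/html, both of which you would replace with your own values. proxy_fcgi_conf is a made-up helper name.

```shell
# Print an apache 2.4 snippet that forwards .php requests to php-fpm
# through mod_proxy_fcgi. $1 is the document root, $2 is host:port of fpm.
proxy_fcgi_conf() {
    docroot=$1
    backend=$2
    cat <<EOF
<IfModule mod_proxy_fcgi.c>
    ProxyPassMatch ^/(.*\.php(/.*)?)\$ fcgi://${backend}${docroot}/\$1
</IfModule>
EOF
}

# Both values below are assumptions for illustration.
proxy_fcgi_conf /var/www/html 127.0.0.1:9000
```

The printed block can be pasted into the relevant vhost. Both mod_proxy and mod_proxy_fcgi need to be enabled for it to take effect.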
Php-fpm supports running different vhosts with different uids. But setting this up requires some configuration.
Php-fpm manages a pool of php processes to serve requests. Now, it can manage multiple separate pools each running with a different uid/gid. Multiple pools can be created by creating fpm configuration files.
Each pool will have a separate socket (mapped to a unix file). And then each apache vhost has to be configured to use its own separate socket. In this way each vhost runs with its own uid.
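A pool file for one user could look something like what the sketch below prints. The socket path, the www-data owner of the socket (the usual apache user on debian/ubuntu) and the pm numbers are all assumptions to be adjusted, and fpm_pool_conf is a made-up helper name.

```shell
# Print a php-fpm pool definition that runs php as the user given in $1
# and listens on its own unix socket.
fpm_pool_conf() {
    user=$1
    cat <<EOF
[${user}]
user = ${user}
group = ${user}
listen = /var/run/php-fpm-${user}.sock
listen.owner = www-data
listen.group = www-data
pm = dynamic
pm.max_children = 5
pm.start_servers = 2
pm.min_spare_servers = 1
pm.max_spare_servers = 3
EOF
}

fpm_pool_conf mike
```

The output would be saved as one file per user in the directory your fpm installation includes pool files from, and the vhost of that user is then pointed at the matching socket.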
Setting this up requires quite some effort when running many sites/vhosts on a server. However, if you are running just a few websites of your own, then it is a do-once-and-forget thing.
Moreover, having multiple pools of php processes significantly increases resource usage, especially RAM. If one pool has 5 processes at any time, each taking around 4MB, the total is a minimum of 20MB for 1 pool.
If there are 5 such pools it would require 100MB in its idle state and more when traffic rises on any single website. So consider this carefully when setting up multiple pools of fpm.
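The same estimate as a tiny calculation you can rerun with your own numbers. fpm_idle_mb is a made-up helper name, and the figure of 4 MB per process is the rough guess used above, not a measurement.

```shell
# Idle memory estimate in MB: pools * processes per pool * MB per process.
fpm_idle_mb() {
    pools=$1
    procs_per_pool=$2
    mb_per_proc=$3
    echo $(( pools * procs_per_pool * mb_per_proc ))
}

fpm_idle_mb 1 5 4    # prints 20
fpm_idle_mb 5 5 4    # prints 100
```

Real php processes often grow well beyond their idle size once they have served a few requests, so treat the result as a lower bound.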
Using multiple php server apis
It is possible to use multiple php server apis in different vhosts at the same time on a server. That is, if you are hosting multiple websites on apache on the same server, each can use a different php sapi. One could run on suphp, another on plain mod_fastcgi, and another on mod_fastcgi+php-fpm. You just need to install the correct modules, enable them and put the relevant directives in the vhost configuration block.
Which setup to choose
So now it's time to choose the correct apache+php configuration/setup.
All the mpms and sapis mentioned above are in use, but the most common combinations are:
1. prefork + mod_php
2. worker + suphp (cgi)
3. worker + mod_fastcgi (with suexec optionally to separate vhost uid)
4. worker + mod_fcgid (with suexec optionally to separate vhost uid)
5. worker + mod_fastcgi + php-fpm
The CGI setup loads a php process only upon request, so it consumes less memory in the idle state. But creating the process on demand introduces delay, so it is a poor fit for high traffic websites or a large number of concurrent requests.
Fastcgi maintains a pool of php processes so consumes more memory when idle. But it can process requests faster because there is no overhead of creating processes at request time.
Fastcgi is not a good fit for a server with very limited hardware. At least 512 MB of ram is recommended to run it on CentOS, for example.
Mod_php cannot use a threaded mpm, so it is forced to use the process-based prefork mpm, which is itself slow. Moreover, its inability to run php scripts with separate uids and correct permissions causes both usability and security issues.
In shared hosting environments the suphp/cgi setup is very commonly seen because it requires minimal configuration and maintenance, and gui tools like cpanel support it very well. And most importantly it gets enough security through suphp.
However if you have your own dedicated server with ample resources then consider fastcgi. Php-FPM is becoming the most preferred and recommended way to deploy php with fastcgi. Php-fpm has been bundled with php recently (as of version 5.3.3) so is waiting to be adopted widely.
What about APC
APC, the Alternative PHP Cache, is a caching extension for php that provides opcode and data caching features. Opcode caching keeps a cache of compiled php code so that the php code does not have to be compiled again and again. This gives a considerable gain in the total execution time of php code.
However apc has its own limitations with various kinds of apache/php setups. Apc provides 2 things: the opcode cache and the user cache (to store application data). For both to work correctly, 2 things are necessary
1. Php must stay persistent to keep apc up.
2. There should be only 1 apc cache.
By design, apc creates a separate cache for each php process. If multiple unrelated processes are involved (like in fcgid or plain mod_fastcgi), then the user cache cannot be used reliably. And the php process/interpreter must stay persistent for the opcode cache to be reusable (as with mod_php).
In mod_php+prefork, all apache processes have a copy of the php interpreter with them, and since all apache processes are forked out of 1 parent process, the same apc cache is shared across all of them. This effectively gives 1 single apc cache. Hence apc works perfectly.
With cgi on the other hand, an external php process is created by launching the php-cgi binary, and the process dies after processing the request. So the cache cannot stay alive and is not usable. The data is lost immediately after the request is over.
Now coming to fastcgi. It is commonly believed that having persistent php process means apc cache is available and usable. Since fastcgi keeps php processes persistent, apc should work as expected. But since fastcgi spawns separate php processes, each gets its own apc cache. This has some effects.
Since fastcgi reuses the php processes, the opcode cache should work as expected. But remember that fastcgi destroys a process after certain number of requests. Then the cache is gone.
Since each php process has its own separate cache, the memory used for caching is multiplied by the number of processes.
And most importantly the user/data cache (apc_fetch/apc_store functions) is not reliable. One request would store some data using apc_store in 1 process's apc cache and the next request would get processed by another php process which would not have the data in the user cache.
So with fastcgi, apc is there but kind of broken.
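The user cache problem can be pictured with a plain shell analogy. This is not apc code, just a sketch of the idea that a value stored in one process's memory is invisible to another process; the variable name and the stored value are made up.

```shell
cache=""    # stands in for one process's private user cache

# "Request 1" is served by a different process (a subshell is a forked
# copy), so the store only changes that process's private copy.
( cache="cart=3 items"; echo "process A stored: $cache" )

# "Request 2" lands in the original process, which never saw the store.
echo "process B fetched: '${cache}'"
# prints: process B fetched: ''
```

The second line of output shows an empty value, which is the kind of cache miss the second request runs into.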
But the story does not end here. With php-fpm apc cache becomes usable once again. Fpm has 1 managing php process that forks out child php processes. Hence the entire group shares the same apc cache. This makes the opcode cache and user cache fully usable.