Parse MVC style urls in PHP

If you have used CodeIgniter or another PHP MVC framework, you will have seen urls like

www.yoursite.com/index.php/class/method/param1/param2?a=b&c=d

or

www.yoursite.com/class/method/param1/param2?a=b&c=d

if you use mod_rewrite to route the request to index.php.

The .htaccess rules could look like this :

<IfModule mod_rewrite.c>
    # Start the rewrite engine
    RewriteEngine on
    # If the requested path is not an existing file or directory, rewrite it
    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteCond %{REQUEST_FILENAME} !-d
    RewriteRule ^(.*)$ index.php/$1 [L]
</IfModule>

Urls like these have to be parsed inside PHP to extract the class name, method name and parameters.
Which $_SERVER variables are available for this depends on the Server API (SAPI) PHP is running under.
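
To see which situation applies on a given server, a quick check like the following can help. It is only a sketch: the list of keys is the set discussed in this article, and the variable names are made up for illustration.

```php
<?php
// php_sapi_name() returns the Server API name, for example
// "apache2handler", "cgi", "cgi-fcgi" or "cli".
$sapi = php_sapi_name();

// Collect which of the path related keys this server actually provides.
$keys = array('PATH_INFO', 'ORIG_PATH_INFO', 'REQUEST_URI', 'SCRIPT_NAME');
$available = array();
foreach ($keys as $key) {
    if (isset($_SERVER[$key])) {
        $available[] = $key;
    }
}

echo 'SAPI: ' . $sapi . "\n";
echo 'Available: ' . implode(', ', $available) . "\n";
```

Running this once per hosting environment shows quickly whether PATH_INFO can be relied on there.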

1. apache2handler

When PHP is running with apache2handler and the url is :

www.yoursite.com/index.php/class/method/param?a=b&c=d

the contents of $_SERVER could be :

Array
(
    [HTTP_HOST] => localhost
    [HTTP_CONNECTION] => keep-alive
    [HTTP_USER_AGENT] => Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/534.30 (KHTML, like Gecko) Ubuntu/11.04 Chromium/12.0.742.112 Chrome/12.0.742.112 Safari/534.30
    [HTTP_ACCEPT] => text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
    [HTTP_ACCEPT_ENCODING] => gzip,deflate,sdch
    [HTTP_ACCEPT_LANGUAGE] => en-US,en;q=0.8
    [HTTP_ACCEPT_CHARSET] => UTF-8,*;q=0.5
    [HTTP_COOKIE] => PHPSESSID=kh02qm9ulc0e9daldl1pe2mbv3
    [PATH] => /usr/local/bin:/usr/bin:/bin
    [SERVER_SIGNATURE] => <address>Apache/2.2.17 (Ubuntu) Server at localhost Port 80</address> 
 
    [SERVER_SOFTWARE] => Apache/2.2.17 (Ubuntu)
    [SERVER_NAME] => localhost
    [SERVER_ADDR] => 127.0.0.1
    [SERVER_PORT] => 80
    [REMOTE_ADDR] => 127.0.0.1
    [DOCUMENT_ROOT] => /var/www
    [SERVER_ADMIN] => [email protected]
    [SCRIPT_FILENAME] => /var/www/mvc/index.php
    [REMOTE_PORT] => 50600
    [GATEWAY_INTERFACE] => CGI/1.1
    [SERVER_PROTOCOL] => HTTP/1.1
    [REQUEST_METHOD] => GET
    [QUERY_STRING] => a=b&c=d
    [REQUEST_URI] => /mvc/index.php/class/method/param?a=b&c=d
    [SCRIPT_NAME] => /mvc/index.php
    [PATH_INFO] => /class/method/param
    [PATH_TRANSLATED] => /var/www/class/method/param
    [PHP_SELF] => /mvc/index.php/class/method/param
    [REQUEST_TIME] => 1320063355
)

In the above format the following can be used
1. PATH_INFO
2. REQUEST_URI + SCRIPT_NAME

With a url like this :

www.yoursite.com/class/method/param?a=b&c=d

the contents of $_SERVER could be :

Array
(
    [REDIRECT_STATUS] => 200
    [HTTP_HOST] => localhost
    [HTTP_CONNECTION] => keep-alive
    [HTTP_CACHE_CONTROL] => max-age=0
    [HTTP_USER_AGENT] => Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/534.30 (KHTML, like Gecko) Ubuntu/11.04 Chromium/12.0.742.112 Chrome/12.0.742.112 Safari/534.30
    [HTTP_ACCEPT] => text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
    [HTTP_ACCEPT_ENCODING] => gzip,deflate,sdch
    [HTTP_ACCEPT_LANGUAGE] => en-US,en;q=0.8
    [HTTP_ACCEPT_CHARSET] => UTF-8,*;q=0.5
    [HTTP_COOKIE] => PHPSESSID=kh02qm9ulc0e9daldl1pe2mbv3
    [PATH] => /usr/local/bin:/usr/bin:/bin
    [SERVER_SIGNATURE] => <address>Apache/2.2.17 (Ubuntu) Server at localhost Port 80</address> 
 
    [SERVER_SOFTWARE] => Apache/2.2.17 (Ubuntu)
    [SERVER_NAME] => localhost
    [SERVER_ADDR] => 127.0.0.1
    [SERVER_PORT] => 80
    [REMOTE_ADDR] => 127.0.0.1
    [DOCUMENT_ROOT] => /var/www
    [SERVER_ADMIN] => [email protected]
    [SCRIPT_FILENAME] => /var/www/mvc/index.php
    [REMOTE_PORT] => 38492
    [REDIRECT_QUERY_STRING] => a=b&c=d
    [REDIRECT_URL] => /mvc/class/method/param
    [GATEWAY_INTERFACE] => CGI/1.1
    [SERVER_PROTOCOL] => HTTP/1.1
    [REQUEST_METHOD] => GET
    [QUERY_STRING] => a=b&c=d
    [REQUEST_URI] => /mvc/class/method/param?a=b&c=d
    [SCRIPT_NAME] => /mvc/index.php
    [PATH_INFO] => /class/method/param
    [PATH_TRANSLATED] => /var/www/class/method/param
    [PHP_SELF] => /mvc/index.php/class/method/param
    [REQUEST_TIME] => 1320063541
)

In the above format the following can be used
1. PATH_INFO
2. REQUEST_URI + SCRIPT_NAME

2. CGI

When PHP is running with CGI and the url is :

www.yoursite.com/index.php/class/method/param?a=b&c=d

the contents of $_SERVER could be :

Array
(
    [DOCUMENT_ROOT] => /home/projects/public_html
    [GATEWAY_INTERFACE] => CGI/1.1
    [HTTP_ACCEPT] => text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
    [HTTP_ACCEPT_CHARSET] => UTF-8,*;q=0.5
    [HTTP_ACCEPT_ENCODING] => gzip,deflate,sdch
    [HTTP_ACCEPT_LANGUAGE] => en-US,en;q=0.8
    [HTTP_CACHE_CONTROL] => max-age=0
    [HTTP_CONNECTION] => keep-alive
    [HTTP_COOKIE] => __utmz=179618234.1309856897.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none); __utma=179618234.703966342.1309856897.1309856897.1309856897.1
    [HTTP_HOST] => www.yoursite.com
    [HTTP_USER_AGENT] => Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/534.30 (KHTML, like Gecko) Ubuntu/11.04 Chromium/12.0.742.112 Chrome/12.0.742.112 Safari/534.30
    [PATH] => /bin:/usr/bin
    [PATH_INFO] => /class/method/param
    [PATH_TRANSLATED] => /home/projects/public_html/mvc/index.php
    [QUERY_STRING] => a=b&c=d
    [REDIRECT_STATUS] => 200
    [REMOTE_ADDR] => 59.93.205.192
    [REMOTE_PORT] => 60102
    [REQUEST_METHOD] => GET
    [REQUEST_URI] => /mvc/index.php/class/method/param?a=b&c=d
    [SCRIPT_FILENAME] => /home/projects/public_html/mvc/index.php
    [SCRIPT_NAME] => /mvc/index.php
    [SERVER_ADDR] => 64.131.72.23
    [SERVER_ADMIN] => [email protected]
    [SERVER_NAME] => www.yoursite.com
    [SERVER_PORT] => 80
    [SERVER_PROTOCOL] => HTTP/1.1
    [SERVER_SIGNATURE] => 
    [SERVER_SOFTWARE] => Apache
    [UNIQUE_ID] => Tq6V40CDSBcAAA5LtMgAAAAC
    [PHP_SELF] => /mvc/index.php/class/method/param
    [REQUEST_TIME] => 1320064483
    [argv] => Array
        (
            [0] => a=b&c=d
        )
 
    [argc] => 1
)

In the above format the following can be used
1. PATH_INFO
2. REQUEST_URI + SCRIPT_NAME

With a url like this :

www.yoursite.com/class/method/param?a=b&c=d

the contents of $_SERVER could be :

Array
(
    [DOCUMENT_ROOT] => /home/projects/public_html
    [GATEWAY_INTERFACE] => CGI/1.1
    [HTTP_ACCEPT] => text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
    [HTTP_ACCEPT_CHARSET] => UTF-8,*;q=0.5
    [HTTP_ACCEPT_ENCODING] => gzip,deflate,sdch
    [HTTP_ACCEPT_LANGUAGE] => en-US,en;q=0.8
    [HTTP_CACHE_CONTROL] => max-age=0
    [HTTP_CONNECTION] => keep-alive
    [HTTP_COOKIE] => __utmz=179618234.1309856897.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none); __utma=179618234.703966342.1309856897.1309856897.1309856897.1
    [HTTP_HOST] => www.yoursite.com
    [HTTP_USER_AGENT] => Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/534.30 (KHTML, like Gecko) Ubuntu/11.04 Chromium/12.0.742.112 Chrome/12.0.742.112 Safari/534.30
    [PATH] => /bin:/usr/bin
    [QUERY_STRING] => a=b&c=d
    [REDIRECT_QUERY_STRING] => a=b&c=d
    [REDIRECT_STATUS] => 200
    [REDIRECT_UNIQUE_ID] => Tq6WCUCDSBcAAA5jtfAAAAAI
    [REDIRECT_URL] => /mvc/class/method/param
    [REMOTE_ADDR] => 59.93.205.192
    [REMOTE_PORT] => 60104
    [REQUEST_METHOD] => GET
    [REQUEST_URI] => /mvc/class/method/param?a=b&c=d
    [SCRIPT_FILENAME] => /home/projects/public_html/mvc/index.php
    [SCRIPT_NAME] => /mvc/index.php
    [SERVER_ADDR] => 64.131.72.23
    [SERVER_ADMIN] => [email protected]
    [SERVER_NAME] => www.yoursite.com
    [SERVER_PORT] => 80
    [SERVER_PROTOCOL] => HTTP/1.1
    [SERVER_SIGNATURE] => 
    [SERVER_SOFTWARE] => Apache
    [UNIQUE_ID] => Tq6WCUCDSBcAAA5jtfAAAAAI
    [ORIG_PATH_INFO] => /class/method/param
    [ORIG_PATH_TRANSLATED] => /home/projects/public_html/mvc/index.php
    [PHP_SELF] => /mvc/index.php
    [REQUEST_TIME] => 1320064521
    [argv] => Array
        (
            [0] => a=b&c=d
        )
 
    [argc] => 1
)

In the above format the following can be used
1. ORIG_PATH_INFO
2. REQUEST_URI + SCRIPT_NAME
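
Since the path shows up under PATH_INFO in some dumps and under ORIG_PATH_INFO in others, code that relies on it has to check both. A minimal sketch, assuming a hypothetical helper named detect_path_info that takes a $_SERVER-like array:

```php
<?php
/**
 * Hypothetical helper: return the path info from a $_SERVER-like array,
 * or null when neither key is present.
 */
function detect_path_info(array $server)
{
    foreach (array('PATH_INFO', 'ORIG_PATH_INFO') as $key) {
        if (isset($server[$key]) && $server[$key] !== '') {
            return $server[$key];
        }
    }
    return null;
}

// Values copied from the CGI dump for the rewritten url
$cgi = array(
    'ORIG_PATH_INFO' => '/class/method/param',
    'REQUEST_URI'    => '/mvc/class/method/param?a=b&c=d',
);

echo detect_path_info($cgi), "\n";
```

When the helper returns null, neither key is set and the path has to be taken from REQUEST_URI instead.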

3. CGI/FastCGI

FastCGI setups are known to have trouble with urls like :

www.yoursite.com/index.php/class/method/param?a=b&c=d

The page might simply say :
No input file specified.

If the request does go through, the contents of $_SERVER could be :


Array
(
    [PATH] => /bin:/usr/bin:/sbin:/usr/sbin:/usr/local/bin:/usr/local/sbin
    [PWD] => /usr/local/cpanel/cgi-sys
    [SHLVL] => 0
    [FCGI_ROLE] => RESPONDER
    [UNIQUE_ID] => Tq6bN0FiYKIAAH8iRYMAAAAY
    [HTTP_HOST] => 108.co.in
    [HTTP_CONNECTION] => close
    [HTTP_CACHE_CONTROL] => max-age=0
    [HTTP_USER_AGENT] => Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/534.30 (KHTML, like Gecko) Ubuntu/11.04 Chromium/12.0.742.112 Chrome/12.0.742.112 Safari/534.30
    [HTTP_ACCEPT] => text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
    [HTTP_ACCEPT_ENCODING] => gzip,deflate,sdch
    [HTTP_ACCEPT_LANGUAGE] => en-US,en;q=0.8
    [HTTP_ACCEPT_CHARSET] => UTF-8,*;q=0.5
    [SERVER_SIGNATURE] => 
    [SERVER_SOFTWARE] => Apache
    [SERVER_NAME] => 108.co.in
    [SERVER_ADDR] => 65.98.96.162
    [SERVER_PORT] => 80
    [REMOTE_ADDR] => 59.93.205.192
    [DOCUMENT_ROOT] => /home/name108/public_html
    [SERVER_ADMIN] => [email protected]
    [SCRIPT_FILENAME] => /home/name108/public_html/mvc/index.php
    [REMOTE_PORT] => 46355
    [GATEWAY_INTERFACE] => CGI/1.1
    [SERVER_PROTOCOL] => HTTP/1.1
    [REQUEST_METHOD] => GET
    [QUERY_STRING] => a=b&c=d
    [REQUEST_URI] => /mvc/index.php/class/method/param?a=b&c=d
    [SCRIPT_NAME] => /mvc/index.php
    [PATH_INFO] => /class/method/param
    [PATH_TRANSLATED] => /home/name108/public_html/class/method/param
    [PHP_SELF] => /mvc/index.php/class/method/param
    [REQUEST_TIME] => 1320065847
    [argv] => Array
        (
            [0] => a=b&c=d
        )
 
    [argc] => 1
)

In the above format the following can be used
1. PATH_INFO
2. REQUEST_URI + SCRIPT_NAME

If the page says "No input file specified." then use a url like this instead :

www.yoursite.com/index.php?/class/method/param?a=b&c=d

the contents of $_SERVER could be :

Array
(
    [PATH] => /bin:/usr/bin:/sbin:/usr/sbin:/usr/local/bin:/usr/local/sbin
    [PWD] => /usr/local/cpanel/cgi-sys
    [SHLVL] => 0
    [FCGI_ROLE] => RESPONDER
    [UNIQUE_ID] => Tq6b00FiYKIAAAGjPXAAAAAS
    [HTTP_HOST] => 108.co.in
    [HTTP_CONNECTION] => close
    [HTTP_CACHE_CONTROL] => max-age=0
    [HTTP_USER_AGENT] => Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/534.30 (KHTML, like Gecko) Ubuntu/11.04 Chromium/12.0.742.112 Chrome/12.0.742.112 Safari/534.30
    [HTTP_ACCEPT] => text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
    [HTTP_ACCEPT_ENCODING] => gzip,deflate,sdch
    [HTTP_ACCEPT_LANGUAGE] => en-US,en;q=0.8
    [HTTP_ACCEPT_CHARSET] => UTF-8,*;q=0.5
    [SERVER_SIGNATURE] => 
    [SERVER_SOFTWARE] => Apache
    [SERVER_NAME] => 108.co.in
    [SERVER_ADDR] => 65.98.96.162
    [SERVER_PORT] => 80
    [REMOTE_ADDR] => 59.93.205.192
    [DOCUMENT_ROOT] => /home/name108/public_html
    [SERVER_ADMIN] => [email protected]
    [SCRIPT_FILENAME] => /home/name108/public_html/mvc/index.php
    [REMOTE_PORT] => 46432
    [GATEWAY_INTERFACE] => CGI/1.1
    [SERVER_PROTOCOL] => HTTP/1.1
    [REQUEST_METHOD] => GET
    [QUERY_STRING] => /class/method/param?a=b&c=d
    [REQUEST_URI] => /mvc/index.php?/class/method/param?a=b&c=d
    [SCRIPT_NAME] => /mvc/index.php
    [PHP_SELF] => /mvc/index.php
    [REQUEST_TIME] => 1320066003
    [argv] => Array
        (
            [0] => /class/method/param?a=b&c=d
        )
 
    [argc] => 1
)

In the above format the following can be used
1. REQUEST_URI + SCRIPT_NAME

A rewritten url like :

www.yoursite.com/class/method/param?a=b&c=d

might also say :
No input file specified.

In that case the mod_rewrite rule can be changed to pass the path after a ? :

<IfModule mod_rewrite.c>
    # Start the rewrite engine
    RewriteEngine on
    # If the requested path is not an existing file or directory, rewrite it
    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteCond %{REQUEST_FILENAME} !-d
    RewriteRule ^(.*)$ index.php?/$1 [L]
</IfModule>

With this rule the contents of $_SERVER could be :

Array
(
    [PATH] => /bin:/usr/bin:/sbin:/usr/sbin:/usr/local/bin:/usr/local/sbin
    [PWD] => /usr/local/cpanel/cgi-sys
    [SHLVL] => 0
    [FCGI_ROLE] => RESPONDER
    [REDIRECT_UNIQUE_ID] => Tq6hOUFiYKIAAAbWHZgAAAAW
    [REDIRECT_STATUS] => 200
    [UNIQUE_ID] => Tq6hOUFiYKIAAAbWHZgAAAAW
    [HTTP_HOST] => 108.co.in
    [HTTP_CONNECTION] => close
    [HTTP_CACHE_CONTROL] => max-age=0
    [HTTP_USER_AGENT] => Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/534.30 (KHTML, like Gecko) Ubuntu/11.04 Chromium/12.0.742.112 Chrome/12.0.742.112 Safari/534.30
    [HTTP_ACCEPT] => text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
    [HTTP_ACCEPT_ENCODING] => gzip,deflate,sdch
    [HTTP_ACCEPT_LANGUAGE] => en-US,en;q=0.8
    [HTTP_ACCEPT_CHARSET] => UTF-8,*;q=0.5
    [SERVER_SIGNATURE] => 
    [SERVER_SOFTWARE] => Apache
    [SERVER_NAME] => 108.co.in
    [SERVER_ADDR] => 65.98.96.162
    [SERVER_PORT] => 80
    [REMOTE_ADDR] => 59.93.205.192
    [DOCUMENT_ROOT] => /home/name108/public_html
    [SERVER_ADMIN] => [email protected]
    [SCRIPT_FILENAME] => /home/name108/public_html/mvc/index.php
    [REMOTE_PORT] => 45872
    [REDIRECT_QUERY_STRING] => /class/method/param
    [REDIRECT_URL] => /mvc/class/method/param
    [GATEWAY_INTERFACE] => CGI/1.1
    [SERVER_PROTOCOL] => HTTP/1.1
    [REQUEST_METHOD] => GET
    [QUERY_STRING] => /class/method/param
    [REQUEST_URI] => /mvc/class/method/param?a=b&c=d
    [SCRIPT_NAME] => /mvc/index.php
    [PHP_SELF] => /mvc/index.php
    [REQUEST_TIME] => 1320067385
    [argv] => Array
        (
            [0] => /class/method/param
        )
 
    [argc] => 1
)

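As the dump above shows, QUERY_STRING loses the actual GET parameters (a=b&c=d) because the rewrite rule replaces it with the path. They are still present in REQUEST_URI.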

In the above format the following can be used
1. REQUEST_URI + SCRIPT_NAME
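
Because QUERY_STRING cannot be trusted with these urls, the GET parameters can be read from REQUEST_URI instead. A rough sketch: get_params_from_request_uri is a hypothetical helper name, and splitting at the last ? is an assumption that holds for the url shapes shown in this article.

```php
<?php
/**
 * Hypothetical helper: read the GET parameters from REQUEST_URI.
 * Splits at the LAST "?" so that urls such as
 * /mvc/index.php?/class/method/param?a=b&c=d are handled too.
 */
function get_params_from_request_uri($requestUri)
{
    $pos = strrpos($requestUri, '?');
    if ($pos === false) {
        return array();
    }

    $query = (string) substr($requestUri, $pos + 1);

    // If the last "?" belongs to the "?/" prefix, there are no real parameters.
    if ($query === '' || $query[0] === '/') {
        return array();
    }

    parse_str($query, $params);
    return $params;
}

print_r(get_params_from_request_uri('/mvc/class/method/param?a=b&c=d'));
```

The result can be assigned to $_GET early in index.php so the rest of the application sees the parameters the visitor actually sent.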

With a url like :

www.yoursite.com/?/class/method/param?a=b&c=d

the contents of $_SERVER could be :

Array
(
    [PATH] => /bin:/usr/bin:/sbin:/usr/sbin:/usr/local/bin:/usr/local/sbin
    [PWD] => /usr/local/cpanel/cgi-sys
    [SHLVL] => 0
    [FCGI_ROLE] => RESPONDER
    [UNIQUE_ID] => Tq6cPEFiYKIAAAIoVYIAAAAW
    [HTTP_HOST] => 108.co.in
    [HTTP_CONNECTION] => close
    [HTTP_CACHE_CONTROL] => max-age=0
    [HTTP_USER_AGENT] => Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/534.30 (KHTML, like Gecko) Ubuntu/11.04 Chromium/12.0.742.112 Chrome/12.0.742.112 Safari/534.30
    [HTTP_ACCEPT] => text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
    [HTTP_ACCEPT_ENCODING] => gzip,deflate,sdch
    [HTTP_ACCEPT_LANGUAGE] => en-US,en;q=0.8
    [HTTP_ACCEPT_CHARSET] => UTF-8,*;q=0.5
    [SERVER_SIGNATURE] => 
    [SERVER_SOFTWARE] => Apache
    [SERVER_NAME] => 108.co.in
    [SERVER_ADDR] => 65.98.96.162
    [SERVER_PORT] => 80
    [REMOTE_ADDR] => 59.93.205.192
    [DOCUMENT_ROOT] => /home/name108/public_html
    [SERVER_ADMIN] => [email protected]
    [SCRIPT_FILENAME] => /home/name108/public_html/mvc/index.php
    [REMOTE_PORT] => 46446
    [GATEWAY_INTERFACE] => CGI/1.1
    [SERVER_PROTOCOL] => HTTP/1.1
    [REQUEST_METHOD] => GET
    [QUERY_STRING] => /class/method/param?a=b&c=d
    [REQUEST_URI] => /mvc/?/class/method/param?a=b&c=d
    [SCRIPT_NAME] => /mvc/index.php
    [PHP_SELF] => /mvc/index.php
    [REQUEST_TIME] => 1320066108
    [argv] => Array
        (
            [0] => /class/method/param?a=b&c=d
        )
 
    [argc] => 1
)

In the above format the following can be used
1. REQUEST_URI + SCRIPT_NAME

Parsing in PHP

When using PATH_INFO or ORIG_PATH_INFO , the code is pretty straightforward :

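//Use ORIG_PATH_INFO when PATH_INFO is not set
$uri = isset($_SERVER['PATH_INFO']) ? $_SERVER['PATH_INFO'] : $_SERVER['ORIG_PATH_INFO'];
//The value starts with a / so trim it before splitting
$segments = explode('/' , trim($uri , '/'));

$class = $segments[0];
$method = isset($segments[1]) ? $segments[1] : '';

//Everything after the class and method is a parameter
$parameters = array_slice( $segments , 2);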
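
One detail worth spelling out: PATH_INFO begins with a slash, so a plain explode() yields an empty first segment. The sketch below trims the slashes first and fills in defaults for short urls. The function name split_segments and the defaults 'home' and 'index' are assumptions for illustration, not something a particular framework prescribes.

```php
<?php
/**
 * Hypothetical helper: split a path such as "/class/method/param1/param2"
 * into class, method and parameters.
 */
function split_segments($path, $defaultClass = 'home', $defaultMethod = 'index')
{
    // Remove leading and trailing slashes so no empty segments appear at the ends.
    $path = trim($path, '/');
    $segments = ($path === '') ? array() : explode('/', $path);

    return array(
        'class'      => isset($segments[0]) ? $segments[0] : $defaultClass,
        'method'     => isset($segments[1]) ? $segments[1] : $defaultMethod,
        // Everything after class and method is a parameter.
        'parameters' => array_slice($segments, 2),
    );
}

print_r(split_segments('/class/method/param1/param2'));
```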

When using REQUEST_URI + SCRIPT_NAME it needs a bit more code :

$uri = $_SERVER['REQUEST_URI'];

//Check if the SCRIPT_NAME appears in the REQUEST_URI as a whole
if (strpos($uri, $_SERVER['SCRIPT_NAME']) === 0)
{
  //Take the part that appears after the script_name  
  $uri = substr($uri, strlen($_SERVER['SCRIPT_NAME']));
}

//Check if the directory name of SCRIPT_NAME appears in the REQUEST_URI
elseif (strpos($uri, dirname($_SERVER['SCRIPT_NAME'])) === 0)
{
 //Take the part that appears after the directory name
 $uri = substr($uri, strlen(dirname($_SERVER['SCRIPT_NAME'])));
}

/*
  Test for urls like index.php?/class/method/param?a=b&c=d
  These kind of urls are used for fastcgi and godaddy servers!!
  Also on nginx
*/
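//Drop leading slashes so that urls like /mvc/?/class/method are handled too
$uri = ltrim($uri, '/');
if (strncmp($uri, '?/', 2) === 0)
{
  //Remove the first ?/
  $uri = substr($uri, 2);
}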

/*
  Now split at the ?
  The first part will be uri
  The Second part will be actual query string
*/
$parts = preg_split('#\?#', $uri, 2);
$uri = $parts[0];
if (isset($parts[1]))
{
    $_SERVER['QUERY_STRING'] = $parts[1];
    parse_str($_SERVER['QUERY_STRING'], $_GET);
}
else
{
    $_SERVER['QUERY_STRING'] = '';
    $_GET = array();
}

//uri is ready

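//Trim slashes so the first segment is not empty
$segments = explode('/' , trim($uri , '/'));

$class = $segments[0];
$method = isset($segments[1]) ? $segments[1] : '';

//Everything after the class and method is a parameter
$parameters = array_slice( $segments , 2);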

The above method uses REQUEST_URI together with SCRIPT_NAME. The idea is to remove SCRIPT_NAME, or its directory, from the start of REQUEST_URI; what is left is the path info followed by the query string. Since ? can appear twice in such urls (once as the ?/ prefix and once before the real query string), the remainder is split at the ? and QUERY_STRING and $_GET are rebuilt from the second part.
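
The same steps can be wrapped in a function that leaves $_SERVER and $_GET untouched, which makes it easy to try against the url shapes from the dumps. This is a sketch only: the name parse_request_uri and the array(path, params) return shape are assumptions for illustration.

```php
<?php
/**
 * Hypothetical helper: derive the path and the real GET parameters
 * from REQUEST_URI and SCRIPT_NAME. Returns array(path, params).
 */
function parse_request_uri($requestUri, $scriptName)
{
    $uri = $requestUri;
    // Directory of the front controller, '' when it sits in the document root.
    $dir = rtrim(dirname($scriptName), '/\\');

    if (strpos($uri, $scriptName) === 0) {
        // The url contains index.php explicitly.
        $uri = (string) substr($uri, strlen($scriptName));
    } elseif ($dir !== '' && strpos($uri, $dir) === 0) {
        // The url was rewritten, only the directory is present.
        $uri = (string) substr($uri, strlen($dir));
    }

    // Drop leading slashes, then the "?/" prefix if there is one.
    $uri = ltrim($uri, '/');
    if (strncmp($uri, '?/', 2) === 0) {
        $uri = (string) substr($uri, 2);
    }

    // The first part is the path, the second part (if any) the real query string.
    $parts = explode('?', $uri, 2);
    $params = array();
    if (isset($parts[1])) {
        parse_str($parts[1], $params);
    }

    return array(trim($parts[0], '/'), $params);
}

// The four url shapes from the dumps in this article
$uris = array(
    '/mvc/index.php/class/method/param?a=b&c=d',
    '/mvc/class/method/param?a=b&c=d',
    '/mvc/index.php?/class/method/param?a=b&c=d',
    '/mvc/?/class/method/param?a=b&c=d',
);
foreach ($uris as $u) {
    list($path, $params) = parse_request_uri($u, '/mvc/index.php');
    echo $u . ' => ' . $path . ' ' . http_build_query($params) . "\n";
}
```

Returning the values instead of overwriting the superglobals is a design choice for easier testing; the caller can still assign the parameters to $_GET if the rest of the application expects them there.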

The REQUEST_URI + SCRIPT_NAME method works in all the cases shown above, whereas PATH_INFO / ORIG_PATH_INFO are available only in some of them, depending on the PHP handler being used.

References :
1. http://httpd.apache.org/docs/2.0/mod/core.html#acceptpathinfo

Last Updated On : 9th November 2011
