If you have used CodeIgniter or another MVC PHP framework, you may be familiar with URLs like
www.yoursite.com/index.php/class/method/param1/param2?a=b&c=d
or
www.yoursite.com/class/method/param1/param2?a=b&c=d
if you use mod_rewrite to add the index.php internally.
The .htaccess code could look like this:
<IfModule mod_rewrite.c>
    #Start the rewrite engine
    RewriteEngine on

    #If requested thing is not a file or directory, then rewrite
    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteCond %{REQUEST_FILENAME} !-d
    RewriteRule ^(.*)$ index.php/$1 [L]
</IfModule>
URLs like the above need to be parsed inside PHP to extract the class name, method name and parameters.
The parsing approach can differ depending on the Server API (SAPI) that PHP is running under.
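To find out which SAPI a given server uses, PHP's built-in php_sapi_name() can be called. The snippet below is a minimal sketch; is_cgi_sapi() is a hypothetical helper name introduced only for illustration, and it assumes that the CGI-style handlers all contain "cgi" in their SAPI name.

```php
<?php
// Report which Server API (SAPI) this PHP process is running under.
// Typical values include "apache2handler", "cgi", "cgi-fcgi" and "cli".
$sapi = php_sapi_name();

// Hypothetical helper (not part of PHP): true when the SAPI name
// contains "cgi", which is assumed here to mean a CGI/FastCGI handler.
function is_cgi_sapi($sapi)
{
    return strpos($sapi, 'cgi') !== false;
}

echo $sapi, is_cgi_sapi($sapi) ? " (CGI/FastCGI)\n" : " (not CGI)\n";
```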
1. apache2handler
When PHP is running with apache2handler and the URL is:
www.yoursite.com/index.php/class/method/param?a=b&c=d
the contents of $_SERVER could be:
Array
(
    [HTTP_HOST] => localhost
    [HTTP_CONNECTION] => keep-alive
    [HTTP_USER_AGENT] => Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/534.30 (KHTML, like Gecko) Ubuntu/11.04 Chromium/12.0.742.112 Chrome/12.0.742.112 Safari/534.30
    [HTTP_ACCEPT] => text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
    [HTTP_ACCEPT_ENCODING] => gzip,deflate,sdch
    [HTTP_ACCEPT_LANGUAGE] => en-US,en;q=0.8
    [HTTP_ACCEPT_CHARSET] => UTF-8,*;q=0.5
    [HTTP_COOKIE] => PHPSESSID=kh02qm9ulc0e9daldl1pe2mbv3
    [PATH] => /usr/local/bin:/usr/bin:/bin
    [SERVER_SIGNATURE] => <address>Apache/2.2.17 (Ubuntu) Server at localhost Port 80</address>
    [SERVER_SOFTWARE] => Apache/2.2.17 (Ubuntu)
    [SERVER_NAME] => localhost
    [SERVER_ADDR] => 127.0.0.1
    [SERVER_PORT] => 80
    [REMOTE_ADDR] => 127.0.0.1
    [DOCUMENT_ROOT] => /var/www
    [SERVER_ADMIN] => [email protected]
    [SCRIPT_FILENAME] => /var/www/mvc/index.php
    [REMOTE_PORT] => 50600
    [GATEWAY_INTERFACE] => CGI/1.1
    [SERVER_PROTOCOL] => HTTP/1.1
    [REQUEST_METHOD] => GET
    [QUERY_STRING] => a=b&c=d
    [REQUEST_URI] => /mvc/index.php/class/method/param?a=b&c=d
    [SCRIPT_NAME] => /mvc/index.php
    [PATH_INFO] => /class/method/param
    [PATH_TRANSLATED] => /var/www/class/method/param
    [PHP_SELF] => /mvc/index.php/class/method/param
    [REQUEST_TIME] => 1320063355
)
With this setup, either of the following can be used to extract the path:
1. PATH_INFO
2. REQUEST_URI + SCRIPT_NAME
With a URL like this:
www.yoursite.com/class/method/param?a=b&c=d
the contents of $_SERVER could be:
Array
(
    [REDIRECT_STATUS] => 200
    [HTTP_HOST] => localhost
    [HTTP_CONNECTION] => keep-alive
    [HTTP_CACHE_CONTROL] => max-age=0
    [HTTP_USER_AGENT] => Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/534.30 (KHTML, like Gecko) Ubuntu/11.04 Chromium/12.0.742.112 Chrome/12.0.742.112 Safari/534.30
    [HTTP_ACCEPT] => text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
    [HTTP_ACCEPT_ENCODING] => gzip,deflate,sdch
    [HTTP_ACCEPT_LANGUAGE] => en-US,en;q=0.8
    [HTTP_ACCEPT_CHARSET] => UTF-8,*;q=0.5
    [HTTP_COOKIE] => PHPSESSID=kh02qm9ulc0e9daldl1pe2mbv3
    [PATH] => /usr/local/bin:/usr/bin:/bin
    [SERVER_SIGNATURE] => <address>Apache/2.2.17 (Ubuntu) Server at localhost Port 80</address>
    [SERVER_SOFTWARE] => Apache/2.2.17 (Ubuntu)
    [SERVER_NAME] => localhost
    [SERVER_ADDR] => 127.0.0.1
    [SERVER_PORT] => 80
    [REMOTE_ADDR] => 127.0.0.1
    [DOCUMENT_ROOT] => /var/www
    [SERVER_ADMIN] => [email protected]
    [SCRIPT_FILENAME] => /var/www/mvc/index.php
    [REMOTE_PORT] => 38492
    [REDIRECT_QUERY_STRING] => a=b&c=d
    [REDIRECT_URL] => /mvc/class/method/param
    [GATEWAY_INTERFACE] => CGI/1.1
    [SERVER_PROTOCOL] => HTTP/1.1
    [REQUEST_METHOD] => GET
    [QUERY_STRING] => a=b&c=d
    [REQUEST_URI] => /mvc/class/method/param?a=b&c=d
    [SCRIPT_NAME] => /mvc/index.php
    [PATH_INFO] => /class/method/param
    [PATH_TRANSLATED] => /var/www/class/method/param
    [PHP_SELF] => /mvc/index.php/class/method/param
    [REQUEST_TIME] => 1320063541
)
With this setup, either of the following can be used to extract the path:
1. PATH_INFO
2. REQUEST_URI + SCRIPT_NAME
2. CGI
When PHP is running with CGI and the URL is:
www.yoursite.com/index.php/class/method/param?a=b&c=d
the contents of $_SERVER could be:
Array
(
    [DOCUMENT_ROOT] => /home/projects/public_html
    [GATEWAY_INTERFACE] => CGI/1.1
    [HTTP_ACCEPT] => text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
    [HTTP_ACCEPT_CHARSET] => UTF-8,*;q=0.5
    [HTTP_ACCEPT_ENCODING] => gzip,deflate,sdch
    [HTTP_ACCEPT_LANGUAGE] => en-US,en;q=0.8
    [HTTP_CACHE_CONTROL] => max-age=0
    [HTTP_CONNECTION] => keep-alive
    [HTTP_COOKIE] => __utmz=179618234.1309856897.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none); __utma=179618234.703966342.1309856897.1309856897.1309856897.1
    [HTTP_HOST] => www.yoursite.com
    [HTTP_USER_AGENT] => Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/534.30 (KHTML, like Gecko) Ubuntu/11.04 Chromium/12.0.742.112 Chrome/12.0.742.112 Safari/534.30
    [PATH] => /bin:/usr/bin
    [PATH_INFO] => /class/method/param
    [PATH_TRANSLATED] => /home/projects/public_html/mvc/index.php
    [QUERY_STRING] => a=b&c=d
    [REDIRECT_STATUS] => 200
    [REMOTE_ADDR] => 59.93.205.192
    [REMOTE_PORT] => 60102
    [REQUEST_METHOD] => GET
    [REQUEST_URI] => /mvc/index.php/class/method/param?a=b&c=d
    [SCRIPT_FILENAME] => /home/projects/public_html/mvc/index.php
    [SCRIPT_NAME] => /mvc/index.php
    [SERVER_ADDR] => 64.131.72.23
    [SERVER_ADMIN] => [email protected]
    [SERVER_NAME] => www.yoursite.com
    [SERVER_PORT] => 80
    [SERVER_PROTOCOL] => HTTP/1.1
    [SERVER_SIGNATURE] =>
    [SERVER_SOFTWARE] => Apache
    [UNIQUE_ID] => Tq6V40CDSBcAAA5LtMgAAAAC
    [PHP_SELF] => /mvc/index.php/class/method/param
    [REQUEST_TIME] => 1320064483
    [argv] => Array
        (
            [0] => a=b&c=d
        )

    [argc] => 1
)
With this setup, either of the following can be used to extract the path:
1. PATH_INFO
2. REQUEST_URI + SCRIPT_NAME
With a URL like this:
www.yoursite.com/class/method/param?a=b&c=d
the contents of $_SERVER could be:
Array
(
    [DOCUMENT_ROOT] => /home/projects/public_html
    [GATEWAY_INTERFACE] => CGI/1.1
    [HTTP_ACCEPT] => text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
    [HTTP_ACCEPT_CHARSET] => UTF-8,*;q=0.5
    [HTTP_ACCEPT_ENCODING] => gzip,deflate,sdch
    [HTTP_ACCEPT_LANGUAGE] => en-US,en;q=0.8
    [HTTP_CACHE_CONTROL] => max-age=0
    [HTTP_CONNECTION] => keep-alive
    [HTTP_COOKIE] => __utmz=179618234.1309856897.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none); __utma=179618234.703966342.1309856897.1309856897.1309856897.1
    [HTTP_HOST] => www.yoursite.com
    [HTTP_USER_AGENT] => Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/534.30 (KHTML, like Gecko) Ubuntu/11.04 Chromium/12.0.742.112 Chrome/12.0.742.112 Safari/534.30
    [PATH] => /bin:/usr/bin
    [QUERY_STRING] => a=b&c=d
    [REDIRECT_QUERY_STRING] => a=b&c=d
    [REDIRECT_STATUS] => 200
    [REDIRECT_UNIQUE_ID] => Tq6WCUCDSBcAAA5jtfAAAAAI
    [REDIRECT_URL] => /mvc/class/method/param
    [REMOTE_ADDR] => 59.93.205.192
    [REMOTE_PORT] => 60104
    [REQUEST_METHOD] => GET
    [REQUEST_URI] => /mvc/class/method/param?a=b&c=d
    [SCRIPT_FILENAME] => /home/projects/public_html/mvc/index.php
    [SCRIPT_NAME] => /mvc/index.php
    [SERVER_ADDR] => 64.131.72.23
    [SERVER_ADMIN] => [email protected]
    [SERVER_NAME] => www.yoursite.com
    [SERVER_PORT] => 80
    [SERVER_PROTOCOL] => HTTP/1.1
    [SERVER_SIGNATURE] =>
    [SERVER_SOFTWARE] => Apache
    [UNIQUE_ID] => Tq6WCUCDSBcAAA5jtfAAAAAI
    [ORIG_PATH_INFO] => /class/method/param
    [ORIG_PATH_TRANSLATED] => /home/projects/public_html/mvc/index.php
    [PHP_SELF] => /mvc/index.php
    [REQUEST_TIME] => 1320064521
    [argv] => Array
        (
            [0] => a=b&c=d
        )

    [argc] => 1
)
With this setup PATH_INFO is not set, so either of the following can be used to extract the path:
1. ORIG_PATH_INFO
2. REQUEST_URI + SCRIPT_NAME
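Since the path shows up under PATH_INFO in some setups and under ORIG_PATH_INFO in others, a small lookup can try both. This is only a sketch: get_path_info() is a hypothetical helper name, and it takes the $_SERVER array as an argument so it can be tried against sample data.

```php
<?php
// Hypothetical helper: return the path info from a $_SERVER-like array,
// preferring PATH_INFO and falling back to ORIG_PATH_INFO.
function get_path_info(array $server)
{
    foreach (array('PATH_INFO', 'ORIG_PATH_INFO') as $key) {
        if (isset($server[$key]) && $server[$key] !== '') {
            return $server[$key];
        }
    }

    // Neither key is set; the caller has to fall back to
    // REQUEST_URI + SCRIPT_NAME instead.
    return null;
}

// Example with the keys seen in the rewritten CGI request.
$sample = array('ORIG_PATH_INFO' => '/class/method/param');
echo get_path_info($sample), "\n";
```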
3. CGI/FastCGI
FastCGI is known to be problematic with URLs like:
www.yoursite.com/index.php/class/method/param?a=b&c=d
The page might simply say:
No input file specified.
If the request does go through,
the contents of $_SERVER could be:
Array
(
    [PATH] => /bin:/usr/bin:/sbin:/usr/sbin:/usr/local/bin:/usr/local/sbin
    [PWD] => /usr/local/cpanel/cgi-sys
    [SHLVL] => 0
    [FCGI_ROLE] => RESPONDER
    [UNIQUE_ID] => Tq6bN0FiYKIAAH8iRYMAAAAY
    [HTTP_HOST] => 108.co.in
    [HTTP_CONNECTION] => close
    [HTTP_CACHE_CONTROL] => max-age=0
    [HTTP_USER_AGENT] => Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/534.30 (KHTML, like Gecko) Ubuntu/11.04 Chromium/12.0.742.112 Chrome/12.0.742.112 Safari/534.30
    [HTTP_ACCEPT] => text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
    [HTTP_ACCEPT_ENCODING] => gzip,deflate,sdch
    [HTTP_ACCEPT_LANGUAGE] => en-US,en;q=0.8
    [HTTP_ACCEPT_CHARSET] => UTF-8,*;q=0.5
    [SERVER_SIGNATURE] =>
    [SERVER_SOFTWARE] => Apache
    [SERVER_NAME] => 108.co.in
    [SERVER_ADDR] => 65.98.96.162
    [SERVER_PORT] => 80
    [REMOTE_ADDR] => 59.93.205.192
    [DOCUMENT_ROOT] => /home/name108/public_html
    [SERVER_ADMIN] => [email protected]
    [SCRIPT_FILENAME] => /home/name108/public_html/mvc/index.php
    [REMOTE_PORT] => 46355
    [GATEWAY_INTERFACE] => CGI/1.1
    [SERVER_PROTOCOL] => HTTP/1.1
    [REQUEST_METHOD] => GET
    [QUERY_STRING] => a=b&c=d
    [REQUEST_URI] => /mvc/index.php/class/method/param?a=b&c=d
    [SCRIPT_NAME] => /mvc/index.php
    [PATH_INFO] => /class/method/param
    [PATH_TRANSLATED] => /home/name108/public_html/class/method/param
    [PHP_SELF] => /mvc/index.php/class/method/param
    [REQUEST_TIME] => 1320065847
    [argv] => Array
        (
            [0] => a=b&c=d
        )

    [argc] => 1
)
With this setup, either of the following can be used to extract the path:
1. PATH_INFO
2. REQUEST_URI + SCRIPT_NAME
If the page says: No input file specified.
then use a URL like:
www.yoursite.com/index.php?/class/method/param?a=b&c=d
The contents of $_SERVER could then be:
Array
(
    [PATH] => /bin:/usr/bin:/sbin:/usr/sbin:/usr/local/bin:/usr/local/sbin
    [PWD] => /usr/local/cpanel/cgi-sys
    [SHLVL] => 0
    [FCGI_ROLE] => RESPONDER
    [UNIQUE_ID] => Tq6b00FiYKIAAAGjPXAAAAAS
    [HTTP_HOST] => 108.co.in
    [HTTP_CONNECTION] => close
    [HTTP_CACHE_CONTROL] => max-age=0
    [HTTP_USER_AGENT] => Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/534.30 (KHTML, like Gecko) Ubuntu/11.04 Chromium/12.0.742.112 Chrome/12.0.742.112 Safari/534.30
    [HTTP_ACCEPT] => text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
    [HTTP_ACCEPT_ENCODING] => gzip,deflate,sdch
    [HTTP_ACCEPT_LANGUAGE] => en-US,en;q=0.8
    [HTTP_ACCEPT_CHARSET] => UTF-8,*;q=0.5
    [SERVER_SIGNATURE] =>
    [SERVER_SOFTWARE] => Apache
    [SERVER_NAME] => 108.co.in
    [SERVER_ADDR] => 65.98.96.162
    [SERVER_PORT] => 80
    [REMOTE_ADDR] => 59.93.205.192
    [DOCUMENT_ROOT] => /home/name108/public_html
    [SERVER_ADMIN] => [email protected]
    [SCRIPT_FILENAME] => /home/name108/public_html/mvc/index.php
    [REMOTE_PORT] => 46432
    [GATEWAY_INTERFACE] => CGI/1.1
    [SERVER_PROTOCOL] => HTTP/1.1
    [REQUEST_METHOD] => GET
    [QUERY_STRING] => /class/method/param?a=b&c=d
    [REQUEST_URI] => /mvc/index.php?/class/method/param?a=b&c=d
    [SCRIPT_NAME] => /mvc/index.php
    [PHP_SELF] => /mvc/index.php
    [REQUEST_TIME] => 1320066003
    [argv] => Array
        (
            [0] => /class/method/param?a=b&c=d
        )

    [argc] => 1
)
With this setup PATH_INFO is not available, so only the following can be used:
1. REQUEST_URI + SCRIPT_NAME
A rewritten URL like:
www.yoursite.com/class/method/param?a=b&c=d
might also say:
No input file specified.
So the mod_rewrite rule can be modified like this:
<IfModule mod_rewrite.c>
    #Start the rewrite engine
    RewriteEngine on

    #If requested thing is not a file or directory, then rewrite
    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteCond %{REQUEST_FILENAME} !-d
    RewriteRule ^(.*)$ index.php?/$1 [L]
</IfModule>
With this rule, the contents of $_SERVER could be:
Array
(
    [PATH] => /bin:/usr/bin:/sbin:/usr/sbin:/usr/local/bin:/usr/local/sbin
    [PWD] => /usr/local/cpanel/cgi-sys
    [SHLVL] => 0
    [FCGI_ROLE] => RESPONDER
    [REDIRECT_UNIQUE_ID] => Tq6hOUFiYKIAAAbWHZgAAAAW
    [REDIRECT_STATUS] => 200
    [UNIQUE_ID] => Tq6hOUFiYKIAAAbWHZgAAAAW
    [HTTP_HOST] => 108.co.in
    [HTTP_CONNECTION] => close
    [HTTP_CACHE_CONTROL] => max-age=0
    [HTTP_USER_AGENT] => Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/534.30 (KHTML, like Gecko) Ubuntu/11.04 Chromium/12.0.742.112 Chrome/12.0.742.112 Safari/534.30
    [HTTP_ACCEPT] => text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
    [HTTP_ACCEPT_ENCODING] => gzip,deflate,sdch
    [HTTP_ACCEPT_LANGUAGE] => en-US,en;q=0.8
    [HTTP_ACCEPT_CHARSET] => UTF-8,*;q=0.5
    [SERVER_SIGNATURE] =>
    [SERVER_SOFTWARE] => Apache
    [SERVER_NAME] => 108.co.in
    [SERVER_ADDR] => 65.98.96.162
    [SERVER_PORT] => 80
    [REMOTE_ADDR] => 59.93.205.192
    [DOCUMENT_ROOT] => /home/name108/public_html
    [SERVER_ADMIN] => [email protected]
    [SCRIPT_FILENAME] => /home/name108/public_html/mvc/index.php
    [REMOTE_PORT] => 45872
    [REDIRECT_QUERY_STRING] => /class/method/param
    [REDIRECT_URL] => /mvc/class/method/param
    [GATEWAY_INTERFACE] => CGI/1.1
    [SERVER_PROTOCOL] => HTTP/1.1
    [REQUEST_METHOD] => GET
    [QUERY_STRING] => /class/method/param
    [REQUEST_URI] => /mvc/class/method/param?a=b&c=d
    [SCRIPT_NAME] => /mvc/index.php
    [PHP_SELF] => /mvc/index.php
    [REQUEST_TIME] => 1320067385
    [argv] => Array
        (
            [0] => /class/method/param
        )

    [argc] => 1
)
As we can see above, QUERY_STRING loses the actual GET parameters: the query string written in the rule replaces the original one, because the rule does not use the QSA flag.
With this setup, only the following can be used:
1. REQUEST_URI + SCRIPT_NAME
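When QUERY_STRING holds the rewritten path, the original GET parameters are still present at the end of REQUEST_URI. Below is a rough sketch of recovering them; query_from_request_uri() is a hypothetical helper name, and it assumes that the last ? in REQUEST_URI starts the real query string, which would not hold if a parameter value itself contained an unencoded ?.

```php
<?php
// Hypothetical helper: recover the GET parameters from REQUEST_URI when
// QUERY_STRING has been replaced by the rewritten path.
function query_from_request_uri($requestUri)
{
    // Assumption: the last "?" starts the real query string.
    $pos = strrpos($requestUri, '?');
    if ($pos === false) {
        return array();
    }

    $query = (string) substr($requestUri, $pos + 1);

    // A "query" starting with "/" is the path from "index.php?/...",
    // not a list of parameters.
    if ($query === '' || $query[0] === '/') {
        return array();
    }

    parse_str($query, $params);
    return $params;
}

print_r(query_from_request_uri('/mvc/class/method/param?a=b&c=d'));
```

The result could then be assigned to $_GET if the rest of the application expects the parameters there.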
With a URL like:
www.yoursite.com/?/class/method/param?a=b&c=d
the contents of $_SERVER could be:
Array
(
    [PATH] => /bin:/usr/bin:/sbin:/usr/sbin:/usr/local/bin:/usr/local/sbin
    [PWD] => /usr/local/cpanel/cgi-sys
    [SHLVL] => 0
    [FCGI_ROLE] => RESPONDER
    [UNIQUE_ID] => Tq6cPEFiYKIAAAIoVYIAAAAW
    [HTTP_HOST] => 108.co.in
    [HTTP_CONNECTION] => close
    [HTTP_CACHE_CONTROL] => max-age=0
    [HTTP_USER_AGENT] => Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/534.30 (KHTML, like Gecko) Ubuntu/11.04 Chromium/12.0.742.112 Chrome/12.0.742.112 Safari/534.30
    [HTTP_ACCEPT] => text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
    [HTTP_ACCEPT_ENCODING] => gzip,deflate,sdch
    [HTTP_ACCEPT_LANGUAGE] => en-US,en;q=0.8
    [HTTP_ACCEPT_CHARSET] => UTF-8,*;q=0.5
    [SERVER_SIGNATURE] =>
    [SERVER_SOFTWARE] => Apache
    [SERVER_NAME] => 108.co.in
    [SERVER_ADDR] => 65.98.96.162
    [SERVER_PORT] => 80
    [REMOTE_ADDR] => 59.93.205.192
    [DOCUMENT_ROOT] => /home/name108/public_html
    [SERVER_ADMIN] => [email protected]
    [SCRIPT_FILENAME] => /home/name108/public_html/mvc/index.php
    [REMOTE_PORT] => 46446
    [GATEWAY_INTERFACE] => CGI/1.1
    [SERVER_PROTOCOL] => HTTP/1.1
    [REQUEST_METHOD] => GET
    [QUERY_STRING] => /class/method/param?a=b&c=d
    [REQUEST_URI] => /mvc/?/class/method/param?a=b&c=d
    [SCRIPT_NAME] => /mvc/index.php
    [PHP_SELF] => /mvc/index.php
    [REQUEST_TIME] => 1320066108
    [argv] => Array
        (
            [0] => /class/method/param?a=b&c=d
        )

    [argc] => 1
)
With this setup, only the following can be used:
1. REQUEST_URI + SCRIPT_NAME
Parsing in PHP
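When using PATH_INFO or ORIG_PATH_INFO, the code is pretty straightforward:

    // Use $_SERVER['ORIG_PATH_INFO'] instead where PATH_INFO is not set
    $uri = $_SERVER['PATH_INFO'];

    // Trim the slashes so the first segment is not an empty string
    $segments = explode('/', trim($uri, '/'));

    $class = $segments[0];
    $method = isset($segments[1]) ? $segments[1] : '';

    // Everything after the method name is a parameter
    $parameters = array_slice($segments, 2);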
When using REQUEST_URI + SCRIPT_NAME, it needs a bit of extra coding:

    $uri = $_SERVER['REQUEST_URI'];

    // Check if the SCRIPT_NAME appears at the start of the REQUEST_URI as a whole
    if (strpos($uri, $_SERVER['SCRIPT_NAME']) === 0) {
        // Take the part that appears after the script name
        $uri = substr($uri, strlen($_SERVER['SCRIPT_NAME']));
    }
    // Check if the directory name of SCRIPT_NAME appears at the start of the REQUEST_URI
    elseif (strpos($uri, dirname($_SERVER['SCRIPT_NAME'])) === 0) {
        // Take the part that appears after the directory name
        $uri = substr($uri, strlen(dirname($_SERVER['SCRIPT_NAME'])));
    }

    /*
        Test for URLs like index.php?/class/method/param?a=b&c=d
        or /?/class/method/param?a=b&c=d
        These kinds of URLs are used for FastCGI and GoDaddy servers, and also on nginx
    */
    if (preg_match('#^/?[?]/#', $uri, $match)) {
        // Remove the leading ?/ (or /?/)
        $uri = substr($uri, strlen($match[0]));
    }

    /*
        Now split at the first remaining ?
        The first part will be the URI
        The second part will be the actual query string
    */
    $parts = explode('?', $uri, 2);
    $uri = $parts[0];

    if (isset($parts[1])) {
        $_SERVER['QUERY_STRING'] = $parts[1];
        parse_str($_SERVER['QUERY_STRING'], $_GET);
    } else {
        $_SERVER['QUERY_STRING'] = '';
        $_GET = array();
    }

    // The URI is ready; trim the slashes so the first segment is not empty
    $segments = explode('/', trim($uri, '/'));

    $class = $segments[0];
    $method = isset($segments[1]) ? $segments[1] : '';
    $parameters = array_slice($segments, 2);
The above method uses REQUEST_URI together with SCRIPT_NAME. The approach is to remove the SCRIPT_NAME, or its directory, from the start of REQUEST_URI; what is left behind is the path info along with the query string. With this method a ? might appear twice, so the string has to be split at the right ? and the contents of $_GET rebuilt accordingly.
The REQUEST_URI + SCRIPT_NAME method works in all of the cases shown above, whereas PATH_INFO/ORIG_PATH_INFO is available only in some of them, depending on the PHP handler being used.
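To check that claim against the REQUEST_URI values from the dumps above, the approach can be wrapped in a function that takes the two values as arguments. This is a minimal sketch rather than a finished router: uri_segments() is a hypothetical name, it ignores the query string, and it assumes SCRIPT_NAME is /mvc/index.php as in the sample requests.

```php
<?php
// Hypothetical helper: reduce REQUEST_URI + SCRIPT_NAME to path segments.
function uri_segments($requestUri, $scriptName)
{
    $uri = $requestUri;
    // Directory of the script, without a trailing slash ("" at the web root).
    $dir = rtrim(dirname($scriptName), '/\\');

    if (strpos($uri, $scriptName) === 0) {
        // URL contains index.php: keep what follows the script name.
        $uri = substr($uri, strlen($scriptName));
    } elseif ($dir !== '' && strpos($uri, $dir) === 0) {
        // URL without index.php: keep what follows the directory.
        $uri = substr($uri, strlen($dir));
    }

    // Drop the "?/" or "/?/" prefix used by the FastCGI-style URLs.
    if (preg_match('#^/?[?]/#', $uri, $match)) {
        $uri = substr($uri, strlen($match[0]));
    }

    // Keep only the part before the real query string.
    $parts = explode('?', $uri, 2);
    $path = trim($parts[0], '/');

    return $path === '' ? array() : explode('/', $path);
}

$script = '/mvc/index.php';
$urls = array(
    '/mvc/index.php/class/method/param?a=b&c=d',
    '/mvc/class/method/param?a=b&c=d',
    '/mvc/index.php?/class/method/param?a=b&c=d',
    '/mvc/?/class/method/param?a=b&c=d',
);

foreach ($urls as $url) {
    echo $url, ' => ', implode(', ', uri_segments($url, $script)), "\n";
}
```

Each of the four URL forms should come out as the same three segments: class, method and param.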
References :
1. http://httpd.apache.org/docs/2.0/mod/core.html#acceptpathinfo