PHP – How to fetch gzipped content over HTTP with file_get_contents

By | January 13, 2023

The file_get_contents function is often used to quickly fetch an HTTP URL or other resource. Usage is very simple and looks like this:

$content = file_get_contents('http://www.google.com/');

However, file_get_contents does not ask for compressed content by default. It sends no Accept-Encoding header, so the server replies with the uncompressed body. Most websites can serve compressed content if asked to do so in the HTTP request headers. Compressing the content saves bandwidth and speeds up the transfer.
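To get a rough feel for the saving, the sketch below compresses a block of repetitive markup locally with gzencode, which produces the same gzip format a server sends. The sample markup is made up for illustration, and real pages will compress by varying amounts.

```php
<?php
// Hypothetical sample: repetitive markup, similar in spirit to a real HTML page
$html = str_repeat("<div class=\"item\"><a href=\"/page\">Link text</a></div>\n", 200);

// gzencode produces gzip-format data, as used with Content-Encoding: gzip
$compressed = gzencode($html);

// Compare the size before and after compression
printf("plain: %d bytes, gzipped: %d bytes\n", strlen($html), strlen($compressed));
```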

So the trick to getting compressed content with file_get_contents is to send an HTTP header that asks the remote server for compressed content. The compressed response then has to be uncompressed to recover the original form. Here is a quick function to do that:

function get_url($url)
{
	// A User-Agent is needed; some websites such as google.com won't send gzipped content without one
	$opts = array(
		'http' => array(
			'method' => "GET",
			// Only gzip is requested, because gzip is the only encoding decoded below
			'header' => "Accept-Language: en-US,en;q=0.8\r\n" .
						"Accept-Encoding: gzip\r\n" .
						"Accept-Charset: UTF-8,*;q=0.5\r\n" .
						"User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:19.0) Gecko/20100101 Firefox/19.0 FirePHP/0.4\r\n"
		)
	);

	$context = stream_context_create($opts);
	$content = file_get_contents($url, false, $context);

	// On failure file_get_contents returns false and there is nothing to decode
	if ($content === false)
	{
		return false;
	}

	// If the HTTP response headers say the content is gzipped, uncompress it.
	// PHP fills in $http_response_header in this scope after the request.
	foreach ($http_response_header as $h)
	{
		if (stripos($h, 'content-encoding:') === 0 && stripos($h, 'gzip') !== false)
		{
			// gzdecode (PHP 5.4+) parses the gzip header and trailer itself
			$content = gzdecode($content);
			break;
		}
	}

	return $content;
}

echo get_url('http://www.google.com/');

The function first sends the "Accept-Encoding" header in the request. Then, if the server replies with gzip-encoded content, it inflates the content back to its original form.
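Both steps can be tried without touching the network. The following is a minimal sketch, not a definitive implementation: is_gzip_response and decode_body are hypothetical helper names, and the header arrays are made-up samples in the shape PHP stores in $http_response_header. It uses gzdecode (available since PHP 5.4), which parses the gzip header and trailer itself.

```php
<?php
// Hypothetical helper: true if any header line is a Content-Encoding header naming gzip
function is_gzip_response(array $headers)
{
	foreach ($headers as $h)
	{
		if (stripos($h, 'content-encoding:') === 0 && stripos($h, 'gzip') !== false)
		{
			return true;
		}
	}
	return false;
}

// Hypothetical helper: undo gzip only when the headers say the body is gzipped
function decode_body($body, array $headers)
{
	return is_gzip_response($headers) ? gzdecode($body) : $body;
}

// Made-up sample responses; no network request is made
$gzip_headers  = array('HTTP/1.1 200 OK', 'Content-Encoding: gzip');
$plain_headers = array('HTTP/1.1 200 OK', 'Content-Type: text/html');

// gzencode stands in for a server that compressed the body
echo decode_body(gzencode('hello world'), $gzip_headers), "\n";
echo decode_body('hello world', $plain_headers), "\n";
```

Both calls print the same plain string, which shows that a body is only decoded when the response headers announce gzip.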

About Silver Moon

A Tech Enthusiast, Blogger, Linux Fan and a Software Developer. Writes about computer hardware, Linux and open source software, and coding in Python, PHP and JavaScript.

3 Comments

  1. Fedge

    $http_response_header doesn’t get defined in your script. You’re missing most of your code here.

    You should test some code and then update this.
