Disqus Comments Importer Script in PHP

Disqus is a great commenting platform that does its job really well. Many people use it these days because writing your own commenting system is tedious, hard work, whereas a hosted solution like Disqus or Livefyre makes commenting a breeze with features such as social logins, multi-level nesting, easy replies, real-time commenting, easy integration, an intuitive admin panel for managing comments, email notifications, and moderation directly via email.

From their website -

DISQUS is a comments platform that helps you build an active community from your website's audience. It has awesome features, powerful tools, and it's easy to install.

Recently I moved from Disqus to my own custom commenting system for several reasons, such as more control over comment notifications to authors and showing comment counts anywhere easily. So I had to figure out a way to move about a hundred comments from Disqus to my own database. Googling did not turn up a good script for this, and GitHub search did not yield anything either. So I decided to give it a shot. It took some time, as I had to do it with PHP's DOM extension (http://php.net/dom) and had no previous experience with it, but overall it wasn't more than a day's task and most of it was a breeze. I would like to share the code here just in case it helps others!

First of all, check out this link: http://docs.disqus.com/developers/export/. It shows the XML export format and has other valuable information too.

Exporting the Comments

Exporting the comments is really easy. Log in to your Disqus account and click your site in the site list. You will then see the Tools tab in the menu; click that, and finally click the Export tab.

Disqus Admin Panel

Code in Action


function import() {
 date_default_timezone_set('GMT');

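 $res = mysqli_connect('localhost', 'my_user', 'my_password', 'my_db');
 if ( !$res ) die('Could not connect to the database: ' . mysqli_connect_error());

 $file = '/path/to/disqus/export.xml';
 $doc = new DOMDocument();
 if ( !$doc->load($file) ) die('Could not load ' . $file);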
 // echo $doc->saveXML();

 $thread_list = array();
 $threads = $doc->getElementsByTagName('thread');
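 foreach( $threads as $thread ) {
  // A <thread> nested inside a <post> is only a reference and has no <link>
  $link = $thread->getElementsByTagName('link')->item(0);
  if ( !$link ) continue;

  $comment = array();
  $comment['thread_id'] = $thread->getAttribute('dsq:id');
  $comment['url'] = $link->nodeValue;
  $comment['url_parts'] = explode('/', $comment['url']);

  // Skip permalinks too short to look like http://example.com/item/[item_id]
  if ( count($comment['url_parts']) < 5 ) continue;
  if ( $comment['url_parts'][3] != 'item' ) continue;

  $comment['item_id'] = (int) $comment['url_parts'][4];
  $thread_list[$comment['thread_id']] = $comment;
 }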

 // var_dump($thread_list);

 $post_list = array();
 $posts = $doc->getElementsByTagName('post');
 $i = 1;
 foreach( $posts as $post ) {
  $comment = array();
  $comment['id'] = $i;
  $comment['comment_id'] = $post->getAttribute('dsq:id');

  if ( in_array($comment['comment_id'], array('386444814', '389840029', '410354950', '412405419')) ) continue;

  // Normalize line endings in the message body from \n to \r\n
  $comment['comment'] = str_replace("\n", "\r\n", $post->getElementsByTagName('message')->item(0)->nodeValue);
  $comment['ip_address'] = $post->getElementsByTagName('ipAddress')->item(0)->nodeValue;
  $comment['created_at'] = strtotime( str_replace(array('T', 'Z'), array(' ', ''), $post->getElementsByTagName('createdAt')->item(0)->nodeValue) );

  $comment['email'] = $post->getElementsByTagName('author')->item(0)->getElementsByTagName('email')->item(0)->nodeValue;
  $comment['name'] = $post->getElementsByTagName('author')->item(0)->getElementsByTagName('name')->item(0)->nodeValue;

  $comment['thread_id'] = $post->getElementsByTagName('thread')->item(0)->getAttribute('dsq:id');
  if ( $post->getElementsByTagName('parent')->item(0) ) {
   $comment['d_parent_id'] = $post->getElementsByTagName('parent')->item(0)->getAttribute('dsq:id');
   $comment['parent_id'] = $post_list[$comment['d_parent_id']]['id'];
  }

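  // Threads that were skipped above (non-item URLs) have no entry in $thread_list
  $comment['item_id'] = isset($thread_list[$comment['thread_id']]) ? $thread_list[$comment['thread_id']]['item_id'] : 0;
  if ( !$comment['item_id'] ) continue;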

  $post_list[$comment['comment_id']] = $comment;
  ++$i;
 }

 // var_dump($post_list);

 foreach($post_list as $post) {
  $name = mysqli_real_escape_string($res, $post['name']);
  $email = mysqli_real_escape_string($res, $post['email']);
  // 'url' and 'depth' are never set in the loop above, so they fall back to defaults
  $url = mysqli_real_escape_string($res, isset($post['url']) ? $post['url'] : '');
  $comment = mysqli_real_escape_string($res, $post['comment']);
  // Top-level comments have no parent_id; store 0 for them
  $parent_id = isset($post['parent_id']) ? (int) $post['parent_id'] : 0;
  $depth = isset($post['depth']) ? (int) $post['depth'] : 0;
  $ip_address = mysqli_real_escape_string($res, $post['ip_address']);

  $sql = "INSERT INTO comment SET item_id = $post[item_id], name = '$name', email = '$email', url = '$url', comment = '$comment', ip_address = INET_ATON('$ip_address'), parent_id = $parent_id, depth = $depth, created_at = FROM_UNIXTIME($post[created_at])";
  // echo $sql;
  if ( !mysqli_query($res, $sql) ) echo 'Insert failed: ' . mysqli_error($res) . "\n";
 }
}







The Disqus export file is essentially divided into two sections. First comes the XML for the threads, and then the XML for the comments (called posts), each of which belongs to one thread. So in this code we first build a big array of the threads and then a big array of the comments. After that it's pretty easy to insert the comment data into the database.
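To make the two-pass idea concrete, here is a minimal sketch that runs against a tiny inline XML string instead of a real export. The fragment is hypothetical and heavily simplified: it only contains the handful of elements the script reads, and the IDs, URL and messages are made up for illustration.

```php
<?php
// Hypothetical, simplified export fragment (not real Disqus data).
$xml = <<<'XML'
<disqus xmlns="http://disqus.com" xmlns:dsq="http://disqus.com/disqus-internals">
  <thread dsq:id="100">
    <link>http://example.com/item/7</link>
  </thread>
  <post dsq:id="200">
    <message>First comment</message>
    <thread dsq:id="100"/>
  </post>
  <post dsq:id="201">
    <message>A reply</message>
    <thread dsq:id="100"/>
    <parent dsq:id="200"/>
  </post>
</disqus>
XML;

$doc = new DOMDocument();
$doc->loadXML($xml);

// Pass 1: thread ID => permalink. A <thread> nested inside a <post> is only
// a reference and has no <link>, so it is skipped.
$threads = array();
foreach ($doc->getElementsByTagName('thread') as $thread) {
    $link = $thread->getElementsByTagName('link')->item(0);
    if ($link === null) {
        continue;
    }
    $threads[$thread->getAttribute('dsq:id')] = $link->nodeValue;
}

// Pass 2: attach each post to the permalink of its thread.
$posts = array();
foreach ($doc->getElementsByTagName('post') as $post) {
    $threadId = $post->getElementsByTagName('thread')->item(0)->getAttribute('dsq:id');
    $posts[$post->getAttribute('dsq:id')] = array(
        'message' => $post->getElementsByTagName('message')->item(0)->nodeValue,
        'url'     => isset($threads[$threadId]) ? $threads[$threadId] : null,
    );
}
```

After both passes, every entry in $posts knows the permalink of the page it was written on, which is all the later steps need.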

There are some confusing spots in the code, so let's go through them.

if ( $comment['url_parts'][3] != 'item' ) continue; - You will notice this piece of code. My website has two different types of resources: items and posts. So I have URLs like http://example.com/item/[item_id] and http://example.com/post/[post_id]. With this check I make sure that I only fetch the comments that belong to my items and not to my posts. You can run the same code again for posts and keep a comment_type column in your comment table.

$comment['url_parts'] = explode('/', $comment['url']); - Disqus links your comments to your permalinks (the item or post URL). Exploding the URL on '/' puts the resource type at index 3 and the ID at index 4 of the resulting array, so reading index 4 was the simplest way for me to obtain the item IDs.
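If your permalinks are less uniform than mine, a slightly more defensive variant of the same idea might look like the sketch below. The helper name parse_permalink is made up for this example, and it assumes URLs of the form /[type]/[numeric id]; adjust it to your own URL scheme.

```php
<?php
// Hypothetical helper: split a permalink such as http://example.com/item/42
// into its resource type and numeric ID. Returns null when the path does not
// look like /<type>/<id>.
function parse_permalink($url)
{
    $path = parse_url($url, PHP_URL_PATH);
    if (!is_string($path)) {
        return null; // no path component, or a malformed URL
    }
    $parts = explode('/', trim($path, '/'));
    if (count($parts) < 2 || !ctype_digit($parts[1])) {
        return null;
    }
    return array('type' => $parts[0], 'id' => (int) $parts[1]);
}

$info = parse_permalink('http://example.com/item/42');
if ($info !== null && $info['type'] === 'item') {
    echo "item #{$info['id']}\n";
}
```

Because the type comes back alongside the ID, the same helper could feed a comment_type column instead of requiring a second run for posts.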


if ( $post->getElementsByTagName('parent')->item(0) ) {
$comment['d_parent_id'] = $post->getElementsByTagName('parent')->item(0)->getAttribute('dsq:id');
$comment['parent_id'] = $post_list[$comment['d_parent_id']]['id'];
}

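Every non-top-level Disqus comment in the XML export has a parent element that carries the Disqus ID of its parent comment. If that element exists, we store the Disqus ID under the d_parent_id key, and under parent_id we store the ID that the parent comment has in our own system. The lookup in $post_list relies on the parent comment appearing before its replies in the export. Hope that makes sense.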
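Here is the ID mapping in isolation, as a rough sketch on made-up data. The input array and the variable names are hypothetical. The depth calculation is my own addition for illustration, since the script above reads a depth value but never fills it in; deriving it from the parent's depth is one possible approach, not something Disqus provides.

```php
<?php
// Hypothetical input: Disqus post ID => Disqus parent ID (null for top
// level), listed in export order so that a parent precedes its replies.
$disqusParents = array(
    '200' => null,
    '201' => '200',
    '202' => '201',
    '203' => '999', // parent not present, e.g. it was filtered out earlier
);

$local = array(); // Disqus ID => array('id', 'parent_id', 'depth')
$nextId = 1;
foreach ($disqusParents as $disqusId => $disqusParentId) {
    // Default: treat the comment as top level.
    $row = array('id' => $nextId, 'parent_id' => 0, 'depth' => 0);
    if ($disqusParentId !== null && isset($local[$disqusParentId])) {
        // Translate the Disqus parent ID into our own ID and nest one deeper.
        $row['parent_id'] = $local[$disqusParentId]['id'];
        $row['depth'] = $local[$disqusParentId]['depth'] + 1;
    }
    $local[$disqusId] = $row;
    $nextId++;
}
```

A reply whose parent is missing falls back to a top-level comment here, which seemed safer to me than pointing at a row that does not exist.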

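$comment['id'] = $i; - This counter mirrors the auto-increment (primary key) value the row will get in the database table, and it is the value stored in parent_id for comments that have a parent. Since the INSERT does not set id explicitly, the two only line up if the table's auto-increment counter starts at 1 when the import runs (for example, on a freshly created table).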

I really hope the code proves useful to people looking for a script to import Disqus comments into their own system. Let's discuss more in the comments :)

Last Updated On : 25th January 2012
