Writing a Script to Validate HTML/CSS of a Webpage in Ruby and PHP

In this post, I am going to write HTML and CSS validation scripts in Ruby and PHP. The W3C HTML and CSS Validators check the document found at the URL supplied to them. It is always a good idea to write completely valid HTML and CSS: valid documents tend to render more consistently across browsers and are easier to debug and maintain.

The W3C HTML Validator checks the markup validity of web documents written in HTML, XHTML, etc., while the W3C CSS Validator does the same for CSS. Both are programs that check whether your documents conform to web standards.

The following scripts send a GET request to the W3C validator URLs, http://validator.w3.org/check?uri={URL} and http://jigsaw.w3.org/css-validator/validator?uri={URL}, and read the validation result from the response headers.
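
To make the {URL} placeholder concrete, here is a minimal sketch of how the request URL can be assembled. The helper name validator_request_url and the VALIDATOR_ENDPOINTS constant are my own for illustration; the only assumption is that the page URL has to be percent-encoded before it goes into the query string.

```ruby
require 'uri'

# Hypothetical lookup table (illustration only): the two endpoints named above.
VALIDATOR_ENDPOINTS = {
  'html' => 'http://validator.w3.org/check',
  'css'  => 'http://jigsaw.w3.org/css-validator/validator'
}.freeze

# Hypothetical helper: builds the validator request URL for a page.
def validator_request_url(page_url, type)
  endpoint = VALIDATOR_ENDPOINTS.fetch(type.downcase)
  # Percent-encode the page URL so it is safe inside a query string.
  "#{endpoint}?uri=#{URI.encode_www_form_component(page_url)}"
end

puts validator_request_url('http://binarytides.com', 'html')
# prints http://validator.w3.org/check?uri=http%3A%2F%2Fbinarytides.com
```

An unknown type raises a KeyError here, which seemed safer to me than silently requesting the wrong URL.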

# HTML and CSS Validators
require 'net/http'

def validate_helper(url, type)
  # URI.escape was removed in Ruby 3.0; encode the page URL as a query value instead
  encoded = URI.encode_www_form_component(url)
  url = "http://validator.w3.org/check?uri=#{encoded}" if type.downcase == 'html'
  url = "http://jigsaw.w3.org/css-validator/validator?uri=#{encoded}" if type.downcase == 'css'
  headers = Net::HTTP.get_response(URI.parse(url)).to_hash
  # p headers

  # to_hash lowercases header names and wraps each value in an array
  status = headers.fetch('x-w3c-validator-status', []).first
  if status.nil?
    puts "#{type.upcase} : #{url} : no validator status received"
    return
  end

  if ['invalid', 'valid'].include? status.downcase
    error_count = headers.fetch('x-w3c-validator-errors', [0]).first
    warning_count = headers.fetch('x-w3c-validator-warnings', [0]).first
    puts "#{type.upcase} : #{url} : #{error_count} error(s), #{warning_count} warning(s)"
  end
end

def validate(url)
  # Skip nil or blank input
  return if url.nil? || url.strip.empty?

  validate_helper url, 'html'
  validate_helper url, 'css'
end

validate 'http://binarytides.com'
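
The header-reading step can also be tried without touching the network. Below is a minimal sketch, assuming a hash shaped like the one Net::HTTPResponse#to_hash returns (lowercase header names, each value an array of strings). The helper name summarize_validator_headers and the sample values are made up for illustration and are not a captured validator response.

```ruby
# Hypothetical helper (illustration only): turns a header hash into a
# small summary, or nil when the status header is missing.
def summarize_validator_headers(headers)
  status = headers.fetch('x-w3c-validator-status', []).first
  return nil if status.nil? # no status header: nothing to report

  {
    status:   status.downcase,
    # Missing count headers are treated as zero.
    errors:   headers.fetch('x-w3c-validator-errors', ['0']).first.to_i,
    warnings: headers.fetch('x-w3c-validator-warnings', ['0']).first.to_i
  }
end

# Made-up sample headers, not real validator output.
sample = {
  'x-w3c-validator-status'   => ['Invalid'],
  'x-w3c-validator-errors'   => ['12'],
  'x-w3c-validator-warnings' => ['3']
}

p summarize_validator_headers(sample)
p summarize_validator_headers({})
```

Returning nil for a missing status header keeps the caller from calling a method on nil when the validator does not answer as expected.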

Below is the PHP equivalent of the Ruby code above:

<?php
# HTML and CSS Validators

function validate_helper($url, $type) {
  if( $type == 'html' ) $url = "http://validator.w3.org/check?uri=" . urlencode($url);
  if( $type == 'css' ) $url = "http://jigsaw.w3.org/css-validator/validator?uri=" . urlencode($url);

  // get_headers() returns false if the request fails (for example, on a timeout)
  $headers = get_headers($url, 1);
  // var_dump($headers);

  if( $headers === false || !isset($headers['X-W3C-Validator-Status']) ) {
    echo strtoupper($type) . " : $url : no validator status received" . PHP_EOL;
    return;
  }

  $status = strtolower( $headers['X-W3C-Validator-Status'] );

  if( in_array($status, array('valid', 'invalid')) ) {
    // Missing count headers are treated as zero
    $default = array('X-W3C-Validator-Errors' => 0, 'X-W3C-Validator-Warnings' => 0);

    $res = array_merge( $default, $headers );
    echo strtoupper($type) . " : $url : {$res['X-W3C-Validator-Errors']} error(s), {$res['X-W3C-Validator-Warnings']} warning(s)" . PHP_EOL;
  }
}

function validate($url) {
  if( !isset($url) || strlen($url) < 1 ) {
    return;
  }

  validate_helper($url, 'html');
  validate_helper($url, 'css');
}

validate('http://binarytides.com');

While the Ruby code uses Net::HTTP.get_response to fetch the headers sent by the validators in response to our HTTP request, the PHP code uses get_headers.


Last Updated On : 10th November 2011

5 Comments

  • Worked for markup validation but didn't work with CSS. Please help. I am getting this error:
    Message: get_headers(http://jigsaw.w3.org/css-validator/validator?uri=http%3A%2F%2Fwww.tatvic.com%2F) [function.get-headers]: failed to open stream: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.

    Filename: controllers/w3c.php

    for the line $headers = get_headers($url, 1);

  • Thanks, superb one. Tried a lot for this.

  • I want to add an input form for entering the URL instead of using validate('http://binarytides.com');
    Can you show us how?

    • You could just send a POST request to validate.php (that contains the code) from a simple form.
      So something like this in the HTML –

      <form method="post" action="validate.php">
      <input type="text" name="url">
      <input type="submit" name="submit" value="Validate">
      </form>

      The form would then submit to http://your-domain.com/validate.php

      Then instead of validate('http://binarytides.com'); you can do validate($_POST['url']);

      Cheers

  • Nice tutorial…. thanks !!!
