PHP – Fix “Input is not proper UTF-8, indicate encoding” error when loading XML

By | March 26, 2013

When loading XML in PHP through simplexml_load_string or the DOMDocument class, an error like this sometimes pops up:

Warning: DOMDocument::loadXML(): Input is not proper UTF-8, indicate encoding !
OR
Warning: simplexml_load_string(): Entity: line 93: parser error : Input is not proper UTF-8, indicate encoding !

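The error occurs when the XML contains bytes that are not valid UTF-8 and the document does not declare a different encoding, so the parser assumes UTF-8 and fails. This commonly happens with content that was saved as ISO-8859-1 (Latin-1). The fix is quite simple: convert the entire XML string to UTF-8 first and then load it.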

$xml = simplexml_load_string( utf8_encode($rss) );

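The utf8_encode function converts a string from ISO-8859-1 (Latin-1) to UTF-8, so every byte ends up as a valid UTF-8 sequence and the XML becomes parseable by SimpleXML or DOMDocument. It assumes the input really is Latin-1: a string that is already UTF-8 gets double-encoded, and text in another encoding may come out with the wrong characters. utf8_encode is deprecated as of PHP 8.2; mb_convert_encoding($rss, 'UTF-8', 'ISO-8859-1') does the same conversion.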
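To avoid double-encoding input that is already valid UTF-8, the conversion can be made conditional. The following is only a minimal sketch, not a definitive implementation: the helper name xml_to_utf8 and the sample $rss string are made up for illustration, ISO-8859-1 as the source encoding is an assumption about the feed, and the mbstring extension is assumed to be installed.

```php
<?php
// Hypothetical helper (not part of PHP or of the original post).
// Returns $xml as UTF-8; $assumedSource is a guess about the real
// encoding of the input and should be adjusted to match the feed.
function xml_to_utf8(string $xml, string $assumedSource = 'ISO-8859-1'): string
{
    // Already valid UTF-8: return untouched to avoid double-encoding.
    if (mb_check_encoding($xml, 'UTF-8')) {
        return $xml;
    }
    // Otherwise reinterpret the bytes as the assumed source encoding.
    return mb_convert_encoding($xml, 'UTF-8', $assumedSource);
}

// Illustrative input: "café" with é stored as the single Latin-1 byte 0xE9.
$rss = "<item><title>caf\xE9</title></item>";

// Collect parser errors instead of emitting PHP warnings.
libxml_use_internal_errors(true);

$xml = simplexml_load_string(xml_to_utf8($rss));

if ($xml === false) {
    // Print whatever the parser complained about.
    foreach (libxml_get_errors() as $error) {
        echo trim($error->message), "\n";
    }
} else {
    echo $xml->title, "\n";
}
```

The same converted string can be passed to DOMDocument::loadXML(). If the feed is in some other encoding, pass that name as the second argument instead of relying on the Latin-1 default.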

About Silver Moon

A Tech Enthusiast, Blogger, Linux Fan and a Software Developer. Writes about Computer hardware, Linux and Open Source software and coding in Python, Php and Javascript.

8 Comments

  1. AP

    The “straight answer” that worked for me was encoding the string passed to simplexml_load_string:
    simplexml_load_string(utf8_encode($xml));

    I was already encoding in the API call, which had fixed an issue before, but adding it here as well fixed my issue.

    1. Guilherme Akio Sakae

      Hi Gina,

      Here’s my line of code: simplexml_load_string(utf8_encode(strip_tags($_product->getDescription())));

      In my case the product description contains some invalid chars and some HTML tags, so I have to use the strip_tags function to remove the tags first, and then I can use the other two functions to properly encode and load the string.

      Hope this helps you.
