~ overflow ~

json decode fails on non utf-8

by z3n on Mar.16, 2010, under Coding, Tips & Hints

Problem:

When a non-UTF-8 string is sent as JSON, json_decode() fails.

Solution:

PHP's json_decode() only accepts UTF-8 input. Since I'm using strings with accents (áéíóú...), they arrive as ISO-8859-1 and the decode fails. In my experience the client-side script will not send UTF-8, not even if you force it, so the best solution is to convert the encoding of the JSON string before decoding it. You may also want to encode your string as plain characters (I use base64) to avoid issues with IE.

Code would look like this:

$json = json_decode(iconv('ISO-8859-1', 'UTF-8', base64_decode($input)), true);

If you're working with a different charset, just change ISO-8859-1 to match. Remember that if you're working with multibyte characters, such as Japanese or Chinese, you will need to use the mb functions instead.
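
If you want to see where a bad payload fails, the same three steps can be wrapped with error checks. This is only a sketch: the function name decode_legacy_json and its $fromCharset parameter are made up for illustration, and the type declarations assume PHP 7 or later.

```php
<?php
// Hypothetical helper (not part of the original one-liner): decodes a
// base64-wrapped JSON payload whose text is in a known non-UTF-8 charset.
// Returns the decoded array, or null if any step fails.
function decode_legacy_json(string $input, string $fromCharset = 'ISO-8859-1')
{
    // (1) undo the base64 transport encoding; strict mode rejects bad input
    $raw = base64_decode($input, true);
    if ($raw === false) {
        return null;
    }

    // (2) convert to UTF-8, which is what json_decode() expects
    //     iconv argument order: from, to, string
    $utf8 = iconv($fromCharset, 'UTF-8', $raw);
    if ($utf8 === false) {
        return null;
    }

    // (3) decode into an associative array and check for JSON errors
    $data = json_decode($utf8, true);
    return json_last_error() === JSON_ERROR_NONE ? $data : null;
}

// Usage: "\xE9" is the single ISO-8859-1 byte for é
$payload = base64_encode("{\"name\":\"caf\xE9\"}");
$result = decode_legacy_json($payload);
// $result['name'] now holds the UTF-8 string "café"
```

Checking json_last_error() is what tells a genuinely empty result apart from a failed decode, since json_decode() returns null in both cases.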

Sources:

Pablo Viquez (A solution pretty much like mine but for sending data instead)



2 comments for this entry:
  1. MC

    I'm trying to convert an array into JSON using json_encode.

    The code is as follows:

    // code is coming

    $marker1 = array('name' => 'name1', 'id' => 1);
    $marker2 = array('name' => '中文', 'id' => 2);

    // prepare the array
    $markers = array('markers' => array($marker1, $marker2));

    // convert to json
    $output = json_encode($markers);

    // set header for json document
    if(!headers_sent())
    {
    header('Content-Type: application/json; charset=utf-8', true, 200);
    }

    print($output);

    As a result,
    everything is perfect except the Chinese characters.
    They become something like

    \u00e4\u00b8\u00ad

    What should I do to make it correct?

    Regards,
    MC

  2. z3n

    Hi,

    For ISO-8859-1 I use the iconv function like this:

    $x=json_decode( // (3) decode json
    iconv( // (2) convert accents
    'ISO-8859-1', 'UTF-8', base64_decode($_POST['d']) // (1) decode base64
    )
    ,true);

    where $_POST['d'] is the value sent. Since you're dealing with multibyte characters, you will need mb_convert_encoding($x, "UTF-8", "ISO-8859-1") instead of iconv (the argument order there is string, target encoding, source encoding). Note that your charset isn't ISO-8859-1; look up the charset code page to find the right one to use as the source encoding.

    - z3n
