Fork me on GitHub
Current releases are 2.1.1 (2 Aug 2014) and 2.0.18.13 (5 Aug 2014)
Ready for 2.1? Find out if your favorite plugins are compatible.
The 2.0 branch will no longer receive updates after Dec 2014.

Character encoding problem after moving from Vanilla 1 to Vanilla 2

npnp
edited January 2011 in Vanilla 1.0 Help
I migrated my forum from Vanilla 1 to Vanilla 2. I used the vanilla2export.php to move the data over. Previous non-ASCII data is no longer displaying correctly. It showed fine in Vanilla 1.

My tables are UTF-8. I can see the correct data when using an SQL client in UTF-8 mode. The data shows up the same way in the Vanilla 2 tables as it did in the Vanilla 1 tables. Additionally, the non-forum part of my site also does UTF-8 correctly.

When I add a new comment that includes non-ASCII characters, it shows up correctly on the forum, but incorrectly in the SQL client.

My pages are being with HTTP headers specifying UTF-8. I've seen this page, which says that there are various settings for character encoding. It's for Vanilla 1, but looking at my config files, I see that I have no character-encoding-related settings in config.php, and config-default.php has:

$Configuration['Database']['CharacterEncoding'] = 'utf8';
$Configuration['Database']['ExtendedProperties']['Collate'] = 'utf8_unicode_ci';
$Configuration['Garden']['Charset'] = 'utf-8';


What do I need to do to make this work right?

Comments

  • After some more digging, I found that despite the fact the tables are UTF-8, the Vanilla 1 connections were being made in latin1. The comments from Vanilla 2 are encoded correctly.

    My MySQL client was connecting with the DB's default encoding, which was latin1. If I ran "set names utf8" then I saw Vanilla 2's stuff showing up correctly, and Vanilla 1's stuff incorrectly. My non-Vanilla code also had the same problem.

    To fix, I made this PHP script to convert the old data. The IDs used represent the cut off between the Vanilla 1 data and the Vanilla 2 data.

    <?php
    # make iconv shut up
    #error_reporting(E_WARNING);

    header( 'Content-Type: text/html; charset=UTF-8' );

    $db = mysql_connect('localhost', 'root', 'password');
    mysql_query("USE mydb");
    mysql_query("SET NAMES latin1");
    #$result = mysql_query("SELECT DiscussionID, Name FROM GDN_Discussion WHERE DiscussionID <= 24460");
    $result = mysql_query("SELECT DiscussionID, Body FROM GDN_Discussion WHERE DiscussionID <= 24460");
    #$result = mysql_query("SELECT CommentID, Body FROM GDN_Comment WHERE CommentID <= 50053");
    $stuff = array();
    while ($row = mysql_fetch_row($result)) {
    $stuff[$row[0]] = $row[1];
    }
    mysql_query("SET NAMES utf8");
    #$result = mysql_query("SELECT DiscussionID, Name FROM GDN_Discussion WHERE DiscussionID <= 24460");
    $result = mysql_query("SELECT DiscussionID, Body FROM GDN_Discussion WHERE DiscussionID <= 24460");
    #$result = mysql_query("SELECT CommentID, Body FROM GDN_Comment WHERE CommentID <= 50053");
    while ($row = mysql_fetch_row($result)) {
    if ($stuff[$row[0]] == $row[1]) {
    unset($stuff[$row[0]]);
    #echo $row[0].'<br>';
    }
    }

    foreach ($stuff as $key => $value) {
    $value = iconv('UTF-8', 'UTF-8//IGNORE', $value);
    # echo $key.' "'.htmlspecialchars($value).'"<br>';
    # mysql_query(sprintf("UPDATE GDN_Discussion SET Name = '%s' WHERE DiscussionID = %s", mysql_real_escape_string($value), mysql_real_escape_string($key)));
    mysql_query(sprintf("UPDATE GDN_Discussion SET Body = '%s' WHERE DiscussionID = %s", mysql_real_escape_string($value), mysql_real_escape_string($key)));
    # mysql_query(sprintf("UPDATE GDN_Comment SET Body = '%s' WHERE CommentID = %s", mysql_real_escape_string($value), mysql_real_escape_string($key)));
    }
    mysql_close($db);
    ?>
Sign In or Register to comment.