MantisBT - ATutor
View Issue Details
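ID: 0004107
Project: ATutor
Category: UTF-8
View Status: public
Date Submitted: 2010-01-11 09:29
Last Update: 2010-09-22 11:52
Reporter: harris
Assigned To: harris
Priority: normal
Severity: major
Reproducibility: always
Status: closed
Resolution: fixed
Fixed in Version: 2.0
Affects Version: SVN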
0004107: UTF-8 connection problem
Detailed report here:
http://www.atutor.ca/view/12/19399/1.html

Fix:
1. Convert the existing content back to utf8 using a utf8 connection.
2. Add a check in mysql_connect.inc.php that inspects the connection; if any of the following is NOT UTF-8, change it to UTF-8:
character_set_client (charset of the statements the client sends)
character_set_connection (charset the server translates the received statements into)
collation_connection (collation that goes with character_set_connection)
character_set_results (charset of the results the server returns)
http://dev.mysql.com/doc/refman/5.0/en/charset-connection.html
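
The four variables above can be inspected from the command line before touching any PHP. This is only a sketch: the helper name `non_utf8_vars` is hypothetical, and the commented-out `mysql` call reports the CLI client's own session, not ATutor's PHP connection, so it shows server and client defaults rather than what ATutor actually negotiates. Values starting with "utf8" (including utf8mb4 and collations such as utf8_general_ci) are treated as acceptable.

```shell
#!/bin/sh
# Reads tab-separated "name<TAB>value" rows on stdin and prints the names
# whose value does not start with "utf8".
non_utf8_vars() {
    awk -F'\t' '$2 !~ /^utf8/ { print $1 }'
}

# Feeding it from a live server (needs credentials, so left commented out):
# mysql -N -B -u root -p -e "SHOW VARIABLES WHERE Variable_name IN
#   ('character_set_client','character_set_connection',
#    'collation_connection','character_set_results')" | non_utf8_vars
```

An empty result means all four settings are already UTF-8 for that session.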
No tags attached.
parent of 0004137 (closed, harris): Russian characters not rendered correctly
Issue History
2010-01-11 09:29  harris  New Issue
2010-01-11 09:29  harris  Affects version => SVN
2010-01-11 11:15  harris  Note Added: 0004020
2010-01-12 00:17  tasmi   Note Added: 0004021
2010-01-12 04:52  harris  Note Added: 0004022
2010-01-12 04:59  harris  Note Added: 0004023
2010-01-12 05:30  tasmi   Note Added: 0004024
2010-01-12 05:55  harris  Note Edited: 0004020
2010-03-01 10:57  harris  Relationship added: parent of 0004137
2010-05-04 07:15  harris  Note Added: 0004228
2010-07-06 07:56  harris  Status: new => resolved
2010-07-06 07:56  harris  Fixed in Version => 2.0
2010-07-06 07:56  harris  Resolution: open => fixed
2010-07-06 07:56  harris  Assigned To => harris
2010-07-06 07:56  harris  Note Added: 0004406
2010-09-22 11:52  greg    Status: resolved => closed

Notes
(0004020)
harris   
2010-01-11 11:15   
(edited on: 2010-01-12 05:55)
Simple fix (needs shell/admin access)

The following exports all the data using a latin1 connection.
> mysqldump -u root -p <atutor_db_name> --default-character-set="latin1" > atutor.sql

Open atutor.sql and replace
/*!40101 SET NAMES latin1 */;
with
/*!40101 SET NAMES utf8 */;

Then import the database back in using a utf8 connection.
> mysql -u root -p <atutor_db_name> --default-character-set="utf8" < atutor.sql
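
The manual edit of atutor.sql can also be scripted. A minimal sketch, assuming the dump contains the header line exactly as shown above; the function name `retarget_dump` and the output file name in the usage comment are hypothetical:

```shell
#!/bin/sh
# Rewrite only the SET NAMES header comment written by mysqldump.
# Table definitions and row data are left untouched.
# Usage (file names are illustrative):
#   retarget_dump atutor.sql > atutor.utf8.sql
retarget_dump() {
    sed 's|/\*!40101 SET NAMES latin1 \*/;|/*!40101 SET NAMES utf8 */;|' "$1"
}
```

Writing to a new file keeps the original dump as a backup in case the import has to be repeated.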

(0004021)
tasmi   
2010-01-12 00:17   
If the MySQL server was not properly configured as UTF-8 during installation, some tables might have latin1 as their character set (those that do not specify the charset in the CREATE TABLE statement), so I think it is safer to replace all "latin1" with "utf8".
(0004022)
harris   
2010-01-12 04:52   
Good point, but that might create another problem. Simply changing latin1 to utf8 in the table structure will only change the table definition; the existing content will not be converted. These tables might need to be converted separately.
(0004023)
harris   
2010-01-12 04:59   
Forgot to mention this: after all these steps, open include/lib/mysql_connect.inc.php

Change line 16:
    $db = @mysql_connect(DB_HOST . ':' . DB_PORT, DB_USER, DB_PASSWORD);

to
    $db = @mysql_connect(DB_HOST . ':' . DB_PORT, DB_USER, DB_PASSWORD);
    mysql_query("SET NAMES 'utf8'", $db);
(0004024)
tasmi   
2010-01-12 05:30   
I've just tested your solution on my db, which has a lot of encoding mismatches (db, tables, and fields are defined as UTF-8 but the content is really encoded in latin1), and it works perfectly.

Looking at the file generated by mysqldump, it's saved as ASCII and the special chars are "badly encoded" (é, ó, ñ, etc.), but the import using utf8 as the default charset works fine! Now the database is fully UTF-8 (db, tables, fields, and content).
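
One plausible explanation for why the "badly encoded" dump imports cleanly, assuming the application had been sending UTF-8 bytes over a latin1 connection into utf8 columns: the server stored each byte as a separate latin1 character, and dumping over latin1 reverses exactly that step. The sketch below imitates both conversions with iconv on the character "é"; the helper `hex` and the variable names are illustrative only.

```shell
#!/bin/sh
# Hex-dump stdin as one contiguous string, e.g. "c3a9".
hex() { od -An -tx1 | tr -d ' \n'; }

# "é" in UTF-8 is the two bytes C3 A9 (written in octal for portability).
utf8_e_acute='\303\251'

# Step 1: the two bytes are misread as two latin1 characters and re-encoded
# as UTF-8, which is how the value ends up double-encoded in the table.
stored=$(printf "$utf8_e_acute" | iconv -f ISO-8859-1 -t UTF-8 | hex)

# Step 2: a latin1 dump converts utf8 -> latin1, undoing the extra layer and
# writing the original UTF-8 bytes into the file.
dumped=$(printf "$utf8_e_acute" | iconv -f ISO-8859-1 -t UTF-8 \
    | iconv -f UTF-8 -t ISO-8859-1 | hex)

echo "stored: $stored"
echo "dumped: $dumped"
```

This prints `stored: c383c2a9` and `dumped: c3a9`. The dumped bytes are valid UTF-8 that merely look wrong in a latin1 viewer, so declaring the file as utf8 on import loads them correctly.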

It works both when replacing only SET NAMES and when replacing all "latin1" (the new social tables don't indicate the charset in the CREATE TABLE statement, so I had them in latin1 by default).

Replacing only SET NAMES will convert all the content, but any tables or fields defined as latin1 will remain as they are. Replacing all "latin1" ensures that not only the content but also the structure will be UTF-8.
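
The "replace all latin1" variant can be sketched as a single global substitution. The function name `retarget_dump_all` is hypothetical, and the blanket replacement is an assumption about how the edit was done; it would also rewrite any row data that happens to contain the literal text "latin1".

```shell
#!/bin/sh
# Replace every occurrence of "latin1" in the dump: the SET NAMES header,
# CHARSET clauses in CREATE TABLE, and collation names such as
# latin1_swedish_ci. Caveat: data rows containing "latin1" change too.
# Usage (file names are illustrative):
#   retarget_dump_all atutor.sql > atutor.utf8.sql
retarget_dump_all() {
    sed 's/latin1/utf8/g' "$1"
}
```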

Anyway, replacing only SET NAMES will successfully convert all the content in the db :)

One tip:
If you are using PHP5, it is recommended to use the mysql_set_charset() function instead of mysql_query('SET NAMES...')
http://php.net/manual/en/function.mysql-set-charset.php
(0004228)
harris   
2010-05-04 07:15   
This should be fixed in 2.0.
(0004406)
harris   
2010-07-06 07:56   
svn: 10065

As of 2.0, all fresh installs will set the MySQL connection to utf8 to avoid the problem. For upgrades, the system will not set the utf8 connection, because existing data would be corrupted.

If an old system needs to use the utf8 connection, its data has to be converted manually using either my way or tasmi's way.