1. Work on 5.0
long blob binary for all files (Sylvie)
Improve detection (Nyloth)
Admin panel -> new panel "server" (Nyloth)
Migration script
2. Goal
Check UTF-8 usage in Tiki.
Show that the current code on branches 4 and 5 and on trunk is not correct, as also
explained here.
The most difficult part is to show that, despite what you see in your
web browser, the data in the database is not stored correctly.
3. Parameter check
First we need to check that our configuration is fully UTF-8 before running the tests.
3.1. Shell
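For the shell side, one possible check is to look at the character map of the current locale. The sketch below assumes the POSIX `locale` utility is available; `is_utf8_charmap` is a hypothetical helper name used only for illustration, not something from Tiki.

```shell
#!/bin/sh
# Hypothetical helper: succeed when the given charmap name denotes UTF-8.
is_utf8_charmap() {
    case "$1" in
        UTF-8|utf-8|UTF8|utf8) return 0 ;;
        *) return 1 ;;
    esac
}

# `locale charmap` prints the character encoding of the current locale.
current=$(locale charmap 2>/dev/null)

if is_utf8_charmap "$current"; then
    echo "shell locale uses UTF-8"
else
    echo "shell locale uses '${current:-unknown}', not UTF-8"
fi
```

If the second message is printed, text typed or pasted into the terminal may not reach MySQL as UTF-8, which would distort the tests.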
3.2. MySQL
4. Testcase
This table shows the different situations and the test results.
Tiki Version | MySQL Connector | Database structure | Test | Visual Result | Database Result
3.X (ACTUAL) | ADODB | UTF-8 | Create page named 让弗朗索瓦 | OK | OK
3.X | PDO with UTF-8 param | UTF-8 | Create page named 让弗朗索瓦 | OK | OK
3.X | PDO without UTF-8 param | UTF-8 | Create page named 让弗朗索瓦 | OK | Double-encoded UTF-8
4.X | ADODB | UTF-8 | Create page named 让弗朗索瓦 | OK | OK
4.X | PDO with UTF-8 param | UTF-8 | Create page named 让弗朗索瓦 | OK | OK
4.X (ACTUAL) | PDO without UTF-8 param | UTF-8 | Create page named 让弗朗索瓦 | OK | Double-encoded UTF-8
5.X | ADODB | UTF-8 | Create page named 让弗朗索瓦 | OK | OK
5.X | PDO with UTF-8 param | UTF-8 | Create page named 让弗朗索瓦 | OK | OK
5.X (ACTUAL) | PDO without UTF-8 param | UTF-8 | Create page named 让弗朗索瓦 | OK | Double-encoded UTF-8
(*) Rows marked (ACTUAL) correspond to the current code on SVN.
As you can see, the current situation is not good at all. Tiki websites running version 4 or 5 will have problems with UTF-8 data in the
database.
The data will be double-encoded in the database because:
- Tiki's PHP code works with UTF-8 content (user input, for example) and queries the database using this UTF-8 content.
- PDO, the new abstraction layer, does not know that the content is already UTF-8 and sends it to MySQL without declaring it as UTF-8.
- MySQL receives this data from PDO and assumes it is not UTF-8. It therefore wrongly converts it to UTF-8 once more, because the underlying database structure (database and tables) is UTF-8.
As a result, double-encoded UTF-8 data is stored in the database.
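The effect can be reproduced outside Tiki with a small sketch. It assumes the POSIX tools `printf`, `od`, `tr` and `iconv` are available, and it uses ISO-8859-1 as a stand-in for the non-UTF-8 character set MySQL assumes for the connection (MySQL's latin1 is really cp1252, which agrees with ISO-8859-1 for the three bytes used here). `hex` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: print stdin as a lowercase hex string.
hex() { od -An -tx1 | tr -d ' \n'; }

# UTF-8 bytes of the first character of the test page name (让 = E8 AE A9),
# written as octal escapes so the script does not depend on its own encoding.
original=$(printf '\350\256\251' | hex)

# Simulate the server treating those bytes as ISO-8859-1 and converting
# them to UTF-8 for storage: every byte becomes its own UTF-8 character.
stored=$(printf '\350\256\251' | iconv -f ISO-8859-1 -t UTF-8 | hex)

echo "sent by Tiki: $original"
echo "stored in DB: $stored"
```

Three bytes go in and six bytes are stored. When the page is read back over the same kind of connection, the reverse conversion is applied, which is why the browser still shows the right characters even though the stored data is wrong.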
5. Solution(s)
There are several situations:
- Modify /db/tiki-db-pdo.php
Index: db/tiki-db-pdo.php
===================================================================
--- db/tiki-db-pdo.php (revision 27261)
+++ db/tiki-db-pdo.php (working copy)
@@ -29,6 +29,8 @@
 try {
     //$dbTiki = new PDO("$db_tiki:host=$host_tiki;dbname=$dbs_tiki", $user_tiki, $pass_tiki);
     $dbTiki = new PDO("$db_tiki:$db_hoststring;dbname=$dbs_tiki", $user_tiki, $pass_tiki);
+    if ($dbTiki->getAttribute(PDO::ATTR_DRIVER_NAME) == 'mysql')
+        $dbTiki->exec("SET CHARACTER SET utf8");
     $dbTiki->setAttribute(PDO::ATTR_CASE,PDO::CASE_NATURAL);
     $dbTiki->setAttribute(PDO::ATTR_ERRMODE,PDO::ERRMODE_WARNING);
     $dbTiki->setAttribute(PDO::ATTR_ORACLE_NULLS,PDO::NULL_EMPTY_STRING);
- Make a MySQL backup in latin1 with mysqldump or phpMyAdmin
mysqldump --default-character-set=latin1 -uuser_name -ppassword -h host db_name > dump.sql
- Recreate the database in UTF-8
mysql -uuser_name -ppassword -h host db_name < dump.sql
- Modify /db/tiki-db-pdo.php
Index: db/tiki-db-pdo.php
===================================================================
--- db/tiki-db-pdo.php (revision 27261)
+++ db/tiki-db-pdo.php (working copy)
@@ -29,6 +29,8 @@
 try {
     //$dbTiki = new PDO("$db_tiki:host=$host_tiki;dbname=$dbs_tiki", $user_tiki, $pass_tiki);
     $dbTiki = new PDO("$db_tiki:$db_hoststring;dbname=$dbs_tiki", $user_tiki, $pass_tiki);
+    if ($dbTiki->getAttribute(PDO::ATTR_DRIVER_NAME) == 'mysql')
+        $dbTiki->exec("SET CHARACTER SET utf8");
     $dbTiki->setAttribute(PDO::ATTR_CASE,PDO::CASE_NATURAL);
     $dbTiki->setAttribute(PDO::ATTR_ERRMODE,PDO::ERRMODE_WARNING);
     $dbTiki->setAttribute(PDO::ATTR_ORACLE_NULLS,PDO::NULL_EMPTY_STRING);
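A likely reason the latin1 dump helps is that converting the double-encoded data from UTF-8 to latin1 undoes exactly one layer of encoding, leaving correct UTF-8 bytes in the dump file. The sketch below simulates that conversion with `iconv`, assuming ISO-8859-1 as a stand-in for MySQL's latin1 and the double-encoded bytes of 让 as input; `hex` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: print stdin as a lowercase hex string.
hex() { od -An -tx1 | tr -d ' \n'; }

# Double-encoded form of 让 as it would sit in the table (C3 A8 C2 AE C2 A9).
# A latin1 dump converts the stored UTF-8 to latin1, simulated here with iconv.
dumped=$(printf '\303\250\302\256\302\251' | iconv -f UTF-8 -t ISO-8859-1 | hex)

echo "bytes in dump.sql: $dumped"
```

The result is the original three-byte UTF-8 sequence, so the dump can be loaded into a database that is read and written through a connection correctly declared as UTF-8.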
Related links