I am trying to migrate some tables from one MySQL database to another, but I hit this error:
ERROR 1062 (23000) at line 108: Duplicate entry 'außer' for key 'PRIMARY'
To find out why, I ran the following on the target database:
mysql> select 'außer' = 'auser';
+--------------------+
| 'außer' = 'auser' |
+--------------------+
| 1 |
+--------------------+
1 row in set (0.07 sec)
As you can see, MySQL treats the two strings as equal. I then checked the configuration variables:
mysql> show variables like 'coll%';
+----------------------+-----------------+
| Variable_name | Value |
+----------------------+-----------------+
| collation_connection | utf8_general_ci |
| collation_database | utf8_general_ci |
| collation_server | utf8_general_ci |
+----------------------+-----------------+
mysql> show variables like 'character%';
+--------------------------+------------------------------------------+
| Variable_name | Value |
+--------------------------+------------------------------------------+
| character_set_client | utf8 |
| character_set_connection | utf8 |
| character_set_database | utf8 |
| character_set_filesystem | binary |
| character_set_results | utf8 |
| character_set_server | utf8 |
| character_set_system | utf8 |
| character_sets_dir | /rdsdbbin/mysql-5.5.8.R1/share/charsets/ |
+--------------------------+------------------------------------------+
Then I went back to the source database and tried the same thing:
mysql> select 'außer' = 'auser';
+--------------------+
| 'außer' = 'auser' |
+--------------------+
| 0 |
+--------------------+
1 row in set (0.00 sec)
mysql> show variables like 'coll%';
+----------------------+-----------------+
| Variable_name | Value |
+----------------------+-----------------+
| collation_connection | utf8_general_ci |
| collation_database | utf8_general_ci |
| collation_server | utf8_general_ci |
+----------------------+-----------------+
3 rows in set (0.00 sec)
mysql> show variables like 'haracter%';
Empty set (0.00 sec)
mysql> show variables like 'character%';
+--------------------------+----------------------------+
| Variable_name | Value |
+--------------------------+----------------------------+
| character_set_client | utf8 |
| character_set_connection | utf8 |
| character_set_database | utf8 |
| character_set_filesystem | binary |
| character_set_results | utf8 |
| character_set_server | utf8 |
| character_set_system | utf8 |
| character_sets_dir | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
8 rows in set (0.00 sec)
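The source MySQL version is 5.0.77 and the migration target is 5.5.8. I don't understand how this can happen. Why do the two servers compare strings differently, and how can I fix it? Thanks.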
Answer 1
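According to http://dev.mysql.com/doc/refman/5.5/en/charset-unicode-sets.html, this appears to be the correct behavior:
utf8_general_ci also is satisfactory for both German and French, except that “ß” is equal to “s”, and not to “ss”. If this is acceptable for your application, you should use utf8_general_ci because it is faster. Otherwise, use utf8_unicode_ci because it is more accurate.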
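
One way to check this from the mysql client is to force a collation on the comparison with a COLLATE clause. The sketch below assumes the connection character set is utf8, as in the variable listings in the question. The table name `words` in the last statement is hypothetical and only illustrates how a table might be switched to another collation before the data is loaded.

```sql
-- Assumes the connection character set is utf8 (see the question's variables).

-- utf8_unicode_ci treats 'ß' like 'ss', so this should return 0 ...
SELECT 'außer' = 'auser' COLLATE utf8_unicode_ci;

-- ... and this should return 1.
SELECT 'außer' = 'ausser' COLLATE utf8_unicode_ci;

-- utf8_bin compares character codes, so 'ß' and 's' differ; should return 0.
SELECT 'außer' = 'auser' COLLATE utf8_bin;

-- Hypothetical table name `words`: convert its text columns to the new collation.
ALTER TABLE words CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
```

Note that under utf8_unicode_ci the keys 'außer' and 'ausser' would collide instead. If the primary key must keep every spelling distinct, utf8_bin is the stricter choice, at the cost of case-sensitive comparison.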