Skip to content

How to Convert UTF8 to UTF8MB4

UTF8MB4

Since MySQL UTF8 supports only up to 3-byte characters, so we can't store 4-byte characters in UTF8 columns. This will be a problem for serving visitors who come from multi-byte language (1 to 4 bytes) countries.

For more flexible and extendable in the future, MySQL DBA should consider to convert an entire database from UTF8 to UTF8MB4 in order to accommodate up to 4 bytes characters.

The very first step is to set MySQL server configuration.

MySQL Server Configuration

According to the official document:
MySQL :: MySQL 5.7 Reference Manual :: 10.1.5 Configuring the Character Set and Collation for Applications
Specify character settings at server startup. To select a character set and collation at server startup, use the --character-set-server and --collation-server options. For example, to specify the options in an option file, include these lines:
[mysqld]
character-set-server=utf8
collation-server=utf8_general_ci

These settings apply server-wide and apply as the defaults for databases created by any application, and for tables created in those databases.
We should follow the manual to add two lines in the file.

[root@test ~]# vi /etc/my.cnf
...
[mysqld]
...
# Startup MySQL server with utf8mb4
character-set-server = utf8mb4
collation-server = utf8mb4_unicode_ci

[client]
default-character-set = utf8mb4

[mysql]
default-character-set = utf8mb4
...

Please restart mysqld service for further configurations.

[root@test ~]# service mysqld restart
Stopping mysqld:                                           [  OK  ]
Starting mysqld:                                           [  OK  ]

MySQL Clients Configuration

You have to SET NAMES for the connections to utf8mb4. Take PHP code snippet as an example:

...
$options = array(PDO::ATTR_PERSISTENT => true, PDO::MYSQL_ATTR_INIT_COMMAND => 'SET NAMES utf8mb4 COLLATE utf8mb4_unicode_ci', PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION);
...

Database Character Set Conversion.

You can use mysql command-line to execute the conversion.

[root@test ~]# mysql -u root -p -D db1 -e "alter database `db1` character set = utf8mb4 collate = utf8mb4_unicode_ci"
Enter password:

Drop all foreign key constrains before altering tables.

Before you execute the sql file, you have to make sure all the reference constrains are all dropped. Otherwise, you will get ERROR 1832 on string type column foreign constrains. For example.

ERROR 1832 (HY000): Cannot change column 'facebook_url': used in a foreign key constraint 'urls_fk'

All Tables Character Set Conversion.

It would be a hard work to identify all tables and do the conversion one by one when there're thousands of tables in the database. Next, I introduce a smarter way to output all executable sql statements to a file by one line command.

[root@test ~]# mysql -u root -p -D db1 -s -e "select concat('alter table ', table_name, ' convert to character set utf8mb4 collate utf8mb4_unicode_ci;') from information_schema.tables where table_schema = 'db1'" > utf8mb4.sql
Enter password:

Let's see the content of the sql file:

[root@test ~]# cat utf8mb4.sql
alter table table1 convert to character set utf8mb4 collate utf8mb4_unicode_ci;
alter table table2 convert to character set utf8mb4 collate utf8mb4_unicode_ci;
...

Let's do the actual conversion on all tables.

[root@test ~]# mysql -u root -p -D db1 < utf8mb4.sql
Enter password:

During the process, you might see ERROR 1071 at line xxx. To solve the problem, you may refer to this post: How to Resolve ERROR 1071 (42000): Specified key was too long; max key length is 767 bytes

Recreate all dropped reference constrains.

Leave a Reply

Your email address will not be published. Required fields are marked *