How to convert an entire database to a new character set collation

This post talks about how you can convert an entire database to a specific collection.

Why you might want to change the collation of all tables

One of the most common reasons for changing many tables collations is to support a newly expanded character set. Many databases have been created with a standard ASCII character set, latin1. This means that if you want to start supporting Unicode characters you will need to change the collation.

Changing a single tables character set collation

To change a single table to a different you can run the following command

ALTER TABLE tablename CONVERT TO CHARACTER SET <character set> COLLATE <collaction>;

This will update the table to the new character set and collation. In doing so this will apply appropriate conversion of data types from one type to another.

For example to convert a table to utf8_unicode_ci you can run the following command.

ALTER TABLE tablename CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;

Before performing this operation you will want to back up the tables you are modifying. This is because when a conversion between character sets is made there may be some lost data. This especially applies if you are converting from a highly expressive character set to a less expressive one. For example if you are converting from a utf8 character set to a ascii one many of the higher valued characters will be converted or lost.

This may cause you to lose some of the content stored in the database. It is recommended if converting to a less expressive collection to manually convert it. This will mean you are in control of how the data is converted before applying the database change.

Changing all tables collation

To change all tables to a specific collation you can use the below snippet to create the queries you need.

SELECT CONCAT('ALTER TABLE `', TABLE_NAME,
'` CONVERT TO CHARACTER SET <character set> COLLATE <collation>;') AS mySQL
FROM INFORMATION_SCHEMA.TABLES
WHERE TABLE_SCHEMA= "schema name"
AND TABLE_TYPE="BASE TABLE"
limit 10000

This was taken from an answer on stack overflow and creates the statements needed to convert all tables. Once the queries have been created you can decide which tables you want to convert if you don’t want to convert all of them.

Using these queries you will hopefully be able to convert the table or tables to a new character set and collation.

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.