Work with MySQL character set and collation

  Peter        2012-06-17 07:07:28       11,416        0    

For non-English websites, they often have to deal with character set and collation if they want to store data to and read data from databases with other languages. Character set tells the database which kind of character encoding scheme to use to store or read data, collation can be simply understood as a subset of character set, it tells the database how to sort data.

We talk about working with character set and collation of MySQL today.  In MySQL, if we want to store Chinese, Japanese or other languages other than English, we may need to set the relative character set for the database, tables and columns. Also, when we connect to MySQL. we may need to set the character set for the connection. Now I summarize some commands used  to see what are the character set and collation of our database and how to change them as needed. On command prompt window, we need to log in to the mysql client with the mysql -u [username] -p command first.

Now we may want to check some variables about character set and collation for our database client and server, for example, connection character set. We can type following commands:

SHOW VARIABLES LIKE 'char%';
SHOW VARIABLES LIKE 'collation%';

The command will give us some information like
+--------------------------+---------------------------------------------------------+
| Variable_name            | Value                                                   |
+--------------------------+---------------------------------------------------------+
| character_set_client     | latin1                                                  |
| character_set_connection | latin1                                                  |
| character_set_database   | latin1                                                  |
| character_set_filesystem | binary                                                  |
| character_set_results    | latin1                                                  |
| character_set_server     | latin1                                                  |
| character_set_system     | utf8                                                    |
| character_sets_dir       | C:\Program Files\MySQL\MySQL Server 5.1\share\charsets\ |
+--------------------------+---------------------------------------------------------+

We can easily understand that the character set we are using for the database engine. also we can change these variables by using

SET variable_name=value  /* SET character_set_connection=utf8; */

Next come to the database character set and collation, we run

SHOW CREATE DATABASE database_name

We can find our default character set in the comment of the output. If we want to change the character set and collation of the database, we run

ALTER DATABASE database_name CHARACTER SET charset_name COLLATE collation_name

We can also set the character set and collation when we create the new database

CREATE DATABASE database_name CHARACTER SET charset_name COLLATE collation_name

For database tables, the commands are similar, we run

SHOW CREATE TABLE table_name

At the end of the output, we may find the DEFAULT CHARSET or COLLATE, if we want to change them, we run

ALTER TABLE table_name CONVERT TO CHARACTER SET charset_name COLLATE collation_name

we can also set the character set and collation when we create a table, we run

CREATE TABLE table_name (column_list) CHARACTER SET charset_name COLLATE collation_name

For columns, we need to run

SHOW FULL COLUMNS IN table_name

the third column is the collation. We can change them with

ALTER TABLE table_name MODIFY col_name data_type CHARACTER SET charset_name COLLATE collation_name

By knowing all the commands above, you may be able to handle MySQL character set and collation. If you use programming languages to connect to MySQL to store and read data, you may also need to set the character encoding scheme in relative languages such as PHP.

Finally one tip for you: If you store Chinese or other non-English data in MySQL database, sometimes you may find they are displayed as question marks in the command console. You can have a try to export the data to an external sql file and open the sql file with a text editor, you may be surprised that you can see your Chinese again.  This means your data are stored properly but somehow the command console cannot display them correctly.

MYSQL  CHARACTER SET  COLLATION  CHINESE  QUESTION MARK 

       

  RELATED


  0 COMMENT


No comment for this article.



  RANDOM FUN

So what is NOR?