Nov 21, 2013

f Comment

MySQL: How Do I Replace Non-ASCII Characters with ASCII Characters?

Amazon Let's say you have strange invisible characters in your MySQL database. How do you replace them with other characters such as a space? In other words, how do you substitute an ASCII character for a non-ASCII character in pure MySQL commands?

My MySQL server version is 5.5.20-log MySQL Community Server (GPL)

Step 1: Locate the non-ASCII character

Let's find records with non-ASCII characters in them. These characters are often invisible or abnormal when you select them in the MySQL client.

Let's assume table name is entry, column in question is english_translation, primary key column is entry_id. Here's the MySQL command to locate the non-ASCII character.
mysql> SELECT entry_id FROM entry WHERE NOT HEX(english_translation) REGEXP '^([0-7][0-9A-F])*$' limit 1;
+----------+
| entry_id |
+----------+
|     4008 |
+----------+
1 row in set (0.00 sec)
As you can see the record with entry_id 4008 has this problem. Read on to see how to solve the next piece of the puzzle.

Step 2: Identify the non-ASCII character

Let's see the hex representation of the non-ASCII character. Here's the MySQL command to identify the non-ASCII character.
mysql> SELECT english_translation, hex(english_translation) from entry where ent
ry_id=4008;
+----------------------+---------------------------------------------------------+
| english_translation  | hex(english_translation)                                |
+----------------------+---------------------------------------------------------+
| overlook ? neglect ? | 6F7665726C6F6F6B20C2A0206E65676C65637420C2A020          |
+--------------------------------+-----------------------------------------------+
1 row in set (0.00 sec)
You can see that C2A0 is the culprit here. 20 is hex for space. For some reason C2A0 is surrounded by two spaces when it is supposed to be three sequential spaces. Read on to see how we replace the non-ASCII character with an ASCII character.

Step 3: Replace the non-ASCII character with an ASCII character

Let's say I want to replace C2A0 with a space. I will replace C2A0 with 20 in the hex representation and then un-hex it to get the original ASCII representation.

Here's the MySQL command.
mysql> update entry set english_translation = unhex(replace(hex(english_translation),'C2A0','20')) where entry_id = 4008;
Query OK, 1 row affected (0.00 sec)
Rows matched: 1  Changed: 1  Warnings: 0
Let's check our converted result with the following MySQL command.
mysql> select english_translation from entry where entry_id = 4008;
+--------------------------------+
| english_translation            |
+--------------------------------+
| overlook   neglect             |
+--------------------------------+
1 row in set (0.00 sec)
Looks like we've fixed our problem. Questions? Let me know!

Obviously you can replace the non-ASCII character with any ASCII character you want. Simply find the hex representation of that ASCII character and use it in the MySQL command.

Related Article

MySQL Fixing Strange Invisible UTF-8 Characters in SECONDS!
Please leave a comment here!
One Minute Information - by Michael Wen
ADVERTISING WITH US - Direct your advertising requests to Michael