How to Use MySQL Regular Expressions
Introduction
MySQL provides a powerful way to perform complex text-based searches through the regular expressions function REGEXP. This string matching algorithm expresses a search through a sequence of symbols and characters to find a pattern in a text.
You can use regular expressions to match, locate, and manage text-based records in a MySQL database, saving a lot of time when working with complex searches in a huge database.
Apart from searching, you can also use REGEXP with a combination of other MySQL functions to replace and rearrange fields in a database table. You can also extract substrings from long text, split a string into human-readable tokens and verify/validate the structure of a string to check if it meets your set format.
This guide illustrates the power of the MySQL REGEXP function using examples. After completing it, you'll be able to implement different combinations of regular expressions in any MySQL database to achieve your desired results.
Prerequisites
To complete this tutorial, you need the following:
1. Create a Test Database
SSH to your server and log in to MySQL as root.
$ sudo mysql -u root -p
Key in your MySQL root password and press Enter to continue. Then, once you get the mysql>
prompt, type the command below to create a test database.
mysql> CREATE DATABASE regex_db;
Output.
Query OK, 1 row affected (0.01 sec)
Switch to the database.
mysql> USE regex_db;
Output.
Database changed
Next, create a customers
table.
mysql> CREATE table customers (
customer_id BIGINT NOT NULL AUTO_INCREMENT PRIMARY KEY,
first_name VARCHAR(50),
last_name VARCHAR(50),
email VARCHAR(255)
) ENGINE = InnoDB;
Output.
Query OK, 0 rows affected (0.04 sec)
Insert some records into the customers
table.
mysql> INSERT INTO customers (first_name, last_name, email) VALUES ('JANE', 'SMITH', 'jane@example.com');
INSERT INTO customers (first_name, last_name, email) VALUES ('MARY', 'ROE', '-');
INSERT INTO customers (first_name, last_name, email) VALUES ('JOHN', 'DOE', 'john_doe@example.com');
INSERT INTO customers (first_name, last_name, email) VALUES ('MARY', 'FREDRICK', '');
INSERT INTO customers (first_name, last_name, email) VALUES ('BANNIE', 'ISAAC', 'bannie_isaac@example.com');
Output.
...
Query OK, 1 row affected (0.01 sec)
Execute the SELECT
statement below to verify the records.
mysql> SELECT
customer_id,
first_name,
last_name,
email
FROM customers;
You should see the following list of customers.
+-------------+------------+-----------+--------------------------+
| customer_id | first_name | last_name | email |
+-------------+------------+-----------+--------------------------+
| 1 | JANE | SMITH | jane@example.com |
| 2 | MARY | ROE | - |
| 3 | JOHN | DOE | john_doe@example.com |
| 4 | MARY | FREDRICK | |
| 5 | BANNIE | ISAAC | bannie_isaac@example.com |
+-------------+------------+-----------+--------------------------+
5 rows in set (0.00 sec)
With the sample database and table in place, you'll now test the different regular expressions supported by MySQL.
2. Match the Beginning of a String
To match a record beginning with a certain pattern, use the Caret character ^
. For instance, to match all first_name's
that start with the letter J
, execute the command below.
mysql> SELECT
customer_id,
first_name,
last_name
FROM customers
WHERE first_name REGEXP '^J';
You should see a list of customers matching your pattern, as shown below.
+-------------+------------+-----------+
| customer_id | first_name | last_name |
+-------------+------------+-----------+
| 1 | JANE | SMITH |
| 3 | JOHN | DOE |
+-------------+------------+-----------+
2 rows in set (0.00 sec)
While this is for demonstration purposes, you can use the above example in a real-life scenario. For instance, if you're in the manufacturing industry, you can use this approach to select all batch numbers that begin with certain characters from a database to recall defective items.
In the next step, you'll see how you can match the end of a string.
3. Match End of a String
Use the Dollar $
character to match the position of a string right after the last character. For instance, to match all last_name's
ending with the letters OE
, run the statement below.
mysql> SELECT
customer_id,
first_name,
last_name
FROM customers
WHERE last_name REGEXP 'OE$';
Only MARY ROE's
AND JOHN DOE's
names end with the letters OE
and they should be listed as follows.
+-------------+------------+-----------+
| customer_id | first_name | last_name |
+-------------+------------+-----------+
| 2 | MARY | ROE |
| 3 | JOHN | DOE |
+-------------+------------+-----------+
2 rows in set (0.00 sec)
You can use this approach, for instance, to retrieve all customers' records from a specific zip code if their address information is stored in a single column.
To test this, add address
and zip
columns to the customers
table.
mysql> ALTER TABLE customers
ADD address VARCHAR(255)
AFTER email ,
ADD zip VARCHAR(20)
AFTER address;
Output.
Query OK, 0 rows affected (0.01 sec)
Records: 0 Duplicates: 0 Warnings: 0
Next, update the customer's address information.
mysql> UPDATE customers SET address ='123 FIRST RD. TOWN 1 AZ 8659' WHERE customer_id = '1';
UPDATE customers SET address ='456 2ND RD. TOWN 2 AZ 12345' WHERE customer_id = '2';
UPDATE customers SET address ='789 3RD RD. TOWN 3 AZ 8659' WHERE customer_id = '3';
UPDATE customers SET address ='555 FIRST RD. TOWN 1 AZ 8659' WHERE customer_id = '4';
UPDATE customers SET address ='987 FIFTH RD TOWN 2 1414' WHERE customer_id = '5';
Output.
...
Rows matched: 1 Changed: 1 Warnings: 0
Make sure the table is updated.
mysql> SELECT * FROM customers;
Output
+-------------+------------+-----------+--------------------------+------------------------------+------+
| customer_id | first_name | last_name | email | address | zip |
+-------------+------------+-----------+--------------------------+------------------------------+------+
| 1 | JANE | SMITH | jane@example.com | 123 FIRST RD. TOWN 1 AZ 8659 | NULL |
| 2 | MARY | ROE | - | 456 2ND RD. TOWN 2 AZ 12345 | NULL |
| 3 | JOHN | DOE | john_doe@example.com | 789 3RD RD. TOWN 3 AZ 8659 | NULL |
| 4 | MARY | FREDRICK | | 555 FIRST RD. TOWN 1 AZ 8659 | NULL |
| 5 | BANNIE | ISAAC | bannie_isaac@example.com | 987 FIFTH RD TOWN 2 1414 | NULL |
+-------------+------------+-----------+--------------------------+------------------------------+------+
5 rows in set (0.00 sec)
As you can see, the address details, including the street, town, state, and zip, are held in one column. For example, to get all customers from the zip code 8659
, execute the command below.
mysql> SELECT
customer_id,
first_name,
last_name,
address,
zip
FROM customers
WHERE address REGEXP '8659$';
Output.
+-------------+------------+-----------+------------------------------+------+
| customer_id | first_name | last_name | address | zip |
+-------------+------------+-----------+------------------------------+------+
| 1 | JANE | SMITH | 123 FIRST RD. TOWN 1 AZ 8659 | NULL |
| 3 | JOHN | DOE | 789 3RD RD. TOWN 3 AZ 8659 | NULL |
| 4 | MARY | FREDRICK | 555 FIRST RD. TOWN 1 AZ 8659 | NULL |
+-------------+------------+-----------+------------------------------+------+
3 rows in set (0.00 sec)
You can also use MySQL REGEXP
results in an UPDATE
statement to edit records. For instance, in this case, you can now update the customer's zip
code column to the correct value using the statement below.
mysql> UPDATE customers
SET zip = '8659'
WHERE address REGEXP '8659$';
Output.
Rows matched: 3 Changed: 3 Warnings: 0
Confirm the new changes.
mysql> SELECT
*
FROM customers;
Output
+-------------+------------+-----------+--------------------------+------------------------------+------+
| customer_id | first_name | last_name | email | address | zip |
+-------------+------------+-----------+--------------------------+------------------------------+------+
| 1 | JANE | SMITH | jane@example.com | 123 FIRST RD. TOWN 1 AZ 8659 | 8659 |
| 2 | MARY | ROE | - | 456 2ND RD. TOWN 2 AZ 12345 | NULL |
| 3 | JOHN | DOE | john_doe@example.com | 789 3RD RD. TOWN 3 AZ 8659 | 8659 |
| 4 | MARY | FREDRICK | | 555 FIRST RD. TOWN 1 AZ 8659 | 8659 |
| 5 | BANNIE | ISAAC | bannie_isaac@example.com | 987 FIFTH RD TOWN 2 1414 | NULL |
+-------------+------------+-----------+--------------------------+------------------------------+------+
5 rows in set (0.00 sec)
As you can see, the MySQL REGEXP
function is essential when it comes to finding and rearranging text-based column values. In the next step, you'll search for values that match characters enclosed with square brackets.
4. Match Characters Between Square Brackets
You can match any character listed between the square brackets in MySQL using the regular expression. For instance, to list all first_name's
containing the character H
or Y
, use the syntax below.
mysql> SELECT
customer_id,
first_name,
last_name
FROM customers
WHERE first_name REGEXP '[HY]';
Output.
+-------------+------------+-----------+
| customer_id | first_name | last_name |
+-------------+------------+-----------+
| 2 | MARY | ROE |
| 3 | JOHN | DOE |
| 4 | MARY | FREDRICK |
+-------------+------------+-----------+
3 rows in set (0.00 sec)
Similarly, use the command below to locate all last_name's
that contain either the characters R
or D
followed by OE
.
mysql> SELECT
customer_id,
first_name,
last_name
FROM customers
WHERE last_name REGEXP '[RD]OE';
Output.
+-------------+------------+-----------+
| customer_id | first_name | last_name |
+-------------+------------+-----------+
| 2 | MARY | ROE |
| 3 | JOHN | DOE |
+-------------+------------+-----------+
2 rows in set (0.00 sec)
The two searches above are helpful when you're not sure about the exact name you're looking for in a database, but you at least happen to know some of the characters it contains. Next, you'll search a range of characters.
5. Match a Range of Characters
You can use a dash between a range of characters to locate a string using REGEXP
. For instance, to retrieve all first_name's
containing letters a-d
, use the syntax below.
mysql> SELECT
customer_id,
first_name,
last_name
FROM customers
WHERE first_name REGEXP '[a-d]';
Output.
+-------------+------------+-----------+
| customer_id | first_name | last_name |
+-------------+------------+-----------+
| 1 | JANE | SMITH |
| 2 | MARY | ROE |
| 4 | MARY | FREDRICK |
| 5 | BANNIE | ISAAC |
+-------------+------------+-----------+
4 rows in set (0.00 sec)
To match any zip
codes that contain numbers 0-9
, run the SQL command below.
mysql> SELECT
customer_id,
first_name,
last_name,
zip
FROM customers
WHERE zip REGEXP '[0-9]';
Output.
+-------------+------------+-----------+------+
| customer_id | first_name | last_name | zip |
+-------------+------------+-----------+------+
| 1 | JANE | SMITH | 8659 |
| 3 | JOHN | DOE | 8659 |
| 4 | MARY | FREDRICK | 8659 |
+-------------+------------+-----------+------+
3 rows in set (0.00 sec)
In a production environment, you can use the above approaches to validate records. For instance, if you'd like to send text messages to your customers, you can filter records that only contain numeric phone numbers. Next, you'll match substrings.
6. Match a Substring
You can match a substring at any position in MySQL using regular expressions. For instance, find all names that contain the substring AN
in that order by running the query below.
mysql> SELECT
customer_id,
first_name,
last_name
FROM customers
WHERE first_name REGEXP 'AN';
Output.
+-------------+------------+-----------+
| customer_id | first_name | last_name |
+-------------+------------+-----------+
| 1 | JANE | SMITH |
| 5 | BANNIE | ISAAC |
+-------------+------------+-----------+
2 rows in set (0.00 sec)
Next, find a substring at a specific position by using the syntax below.
mysql> SELECT
customer_id,
first_name,
last_name
FROM customers
WHERE first_name REGEXP '^..NE';
Output.
+-------------+------------+-----------+
| customer_id | first_name | last_name |
+-------------+------------+-----------+
| 1 | JANE | SMITH |
+-------------+------------+-----------+
1 row in set (0.00 sec)
In most cases, you'll use substring to locate and replace erroneously entered data in MySQL. Next, you'll see how to use the character classes.
7. Match POSIX Character Classes
The MySQL REGEXP
clause supports searching by character classes. This means you can search and return records that contain either alphabetical or numerical characters or even both.
For instance, use the syntax below to look for alphanumeric characters in the first_name
field.
mysql> SELECT
customer_id,
first_name,
last_name
FROM customers
WHERE first_name REGEXP '[:alnum:]';
Since all first_name's
contain alphanumeric characters, they should be listed as follows.
+-------------+------------+-----------+
| customer_id | first_name | last_name |
+-------------+------------+-----------+
| 1 | JANE | SMITH |
| 2 | MARY | ROE |
| 3 | JOHN | DOE |
| 4 | MARY | FREDRICK |
| 5 | BANNIE | ISAAC |
+-------------+------------+-----------+
5 rows in set (0.01 sec)
To search the alphabetical characters only, use the syntax below.
mysql> SELECT
customer_id,
first_name,
last_name
FROM customers
WHERE first_name REGEXP '[:alpha:]' ;
Output.
+-------------+------------+-----------+
| customer_id | first_name | last_name |
+-------------+------------+-----------+
| 1 | JANE | SMITH |
| 2 | MARY | ROE |
| 3 | JOHN | DOE |
| 4 | MARY | FREDRICK |
| 5 | BANNIE | ISAAC |
+-------------+------------+-----------+
5 rows in set (0.00 sec)
To search digits only, for instance, to locate zip
codes with numeric characters, run the SQL command below.
mysql> SELECT
customer_id,
first_name,
last_name,
zip
FROM customers
WHERE zip REGEXP '[:digit:]';
Output.
+-------------+------------+-----------+------+
| customer_id | first_name | last_name | zip |
+-------------+------------+-----------+------+
| 1 | JANE | SMITH | 8659 |
| 3 | JOHN | DOE | 8659 |
| 4 | MARY | FREDRICK | 8659 |
+-------------+------------+-----------+------+
3 rows in set (0.00 sec)
Although this is not a definitive list of all MySQL regular expressions, it should help you grasp the basic concepts.
Conclusion
In this guide, you've created a sample database and a table. You've also tested the MySQL REGEXP function using different combinations. Use the knowledge in this guide when you want to perform complex searches in your next MySQL project.