Suppose we have a table named TEAMS like this:
mysql> select * from teams;
+----+-------------------+
| id | name |
+----+-------------------+
| 1 | New York Yankees |
| 2 | Seattle Mariners |
| 3 | Chicago White Sox |
| 4 | New York Mets |
+----+-------------------+
4 rows in set (0.00 sec)
As we can see, at most 3 words in column NAME. And we'd like to split strings into individual words as every rows. Now, we have two questions to answer.
- How can we split the strings sequentially?
- How can we stack individual words as rows?
My solution is combined with a string function substring_index() and a set operator UNION to stack the same table 3 times.
mysql> select id, substring_index(substring_index(name, ' ', 1), ' ', -1) name from teams
-> union
-> select id, substring_index(substring_index(name, ' ', 2), ' ', -1) name from teams
-> union
-> select id, substring_index(substring_index(name, ' ', 3), ' ', -1) name from teams;
+----+----------+
| id | name |
+----+----------+
| 1 | New |
| 2 | Seattle |
| 3 | Chicago |
| 4 | New |
| 1 | York |
| 2 | Mariners |
| 3 | White |
| 4 | York |
| 1 | Yankees |
| 3 | Sox |
| 4 | Mets |
+----+----------+
11 rows in set (0.00 sec)
Furthermore, if you want an ordered list, you can append order by clause to the statement.
mysql> select id, substring_index(substring_index(name, ' ', 1), ' ', -1) name from teams
-> union
-> select id, substring_index(substring_index(name, ' ', 2), ' ', -1) name from teams
-> union
-> select id, substring_index(substring_index(name, ' ', 3), ' ', -1) name from teams
-> order by 1, 2;
+----+----------+
| id | name |
+----+----------+
| 1 | New |
| 1 | Yankees |
| 1 | York |
| 2 | Mariners |
| 2 | Seattle |
| 3 | Chicago |
| 3 | Sox |
| 3 | White |
| 4 | Mets |
| 4 | New |
| 4 | York |
+----+----------+
11 rows in set (0.01 sec)
If you just want a distinct list, you can remove column id:
mysql> select substring_index(substring_index(name, ' ', 1), ' ', -1) name from teams
-> union
-> select substring_index(substring_index(name, ' ', 2), ' ', -1) name from teams
-> union
-> select substring_index(substring_index(name, ' ', 3), ' ', -1) name from teams
-> order by 1;
+----------+
| name |
+----------+
| Chicago |
| Mariners |
| Mets |
| New |
| Seattle |
| Sox |
| White |
| Yankees |
| York |
+----------+
9 rows in set (0.01 sec)
This approach is pretty scalable. In similar, you can handle N-word string in the same manner by stacking the same table N-times with UNION. Moreover, you can change the delimiter from a white space into a pipe, a comma, a dot, or a colon, etc.