ERROR 1139 (42000)
In the following example, we'd like to select the city name which are regular-expressly like (rlike or regexp) the string 'New York+', but the statement failed due to ERROR 1139.
mysql> select * from regions where city rlike concat('(^| )', 'New York+', '[ ,.!?]+');
ERROR 1139 (42000): Got error 'repetition-operator operand invalid' from regexp
After looking up the official documentation, we get the information about special characters as following: MySQL :: MySQL 8.0 Reference Manual :: 9.1.1 String Literals.
Within a string, certain sequences have special meaning unless the NO_BACKSLASH_ESCAPES SQL mode is enabled. Each of these sequences begins with a backslash (“”), known as the escape character. MySQL recognizes the escape sequences shown in Table 9.1, “Special Character Escape Sequences”. For all other escape sequences, backslash is ignored
...
Table 9.1 Special Character Escape Sequences
Escape Sequence Character Represented by Sequence
\0 An ASCII NUL (0x00) character
\' A single quote (“'”) character
\" A double quote (“"”) character
\b A backspace character
\n A newline (linefeed) character
\r A carriage return character
\t A tab character
\Z ASCII 26 (Control+Z); see note following the table
\\ A backslash (“”) character
\% A “%” character; see note following the table
\_ A “_” character; see note following the table
That is, MySQL does not treat plus signs (+) or question mark (?) as special characters, thus they are just normal text and stored in tables without specially treated. But from regular expression's perspective (e.g. rlike or regexp), they are special, very special!
The only way that you can do is to escape them manually by adding a backslash preceding every confusing character which are not mentioned in the above. Sounds great, let's think how to escape. How?
The only working function is replace(). Here is an example that seems to be the right way to correct the problem, but it doesn't.
mysql> select * from regions where city rlike concat('(^| )', replace('New York+', '+', +'), '[ ,.!?]+');
ERROR 1139 (42000): Got error 'repetition-operator operand invalid' from regexp
You can see that we replaced the '+' character with '+' to prevent the error, but it failed to do so.
The fact is that you have to escape your backslash first, then MySQL engine can recognize your clear intention of escaping.
To make the backslash survive through the string recognition, you should double the backslash.
mysql> select * from regions where city rlike concat('(^| )', replace('New York+', '+', \+'), '[ ,.!?]+');
This time, it works.
Please note that, the real key concepts to this problem are as:
- Manually escape every confusing character ignored by MySQL for regular expression clauses in statements.
- Add double backslashes to indicate MySQL engine to escape the following character.