正则表达式无法匹配手机号

某用户询问,查询语句中使用以下正则表达式时无法匹配到手机号

select mobile from ods_fwglypt_fwglypt_explainers 
where regex_match(mobile,'^1((34[0-8])|(8\d{2})|(([35][0-35-9]|4[579]|66|7[35678]|9[1389])\d{1}))\d{7}$');

  1. 在SinoDB中使用正则表达式匹配数字时,不能使用\d,要使用[0-9],所以以上正则表达式需要改成
regex_match(mobile,'^1((34[0-8])|(8[0-9]{2})|(([35][0-35-9]|4[579]|66|7[35678]|9[1389])[0-9]{1}))[0-9]{7}$');
  1. 测试发现,当存储手机号的字段为字符型时,在zh_cn.utf8库下使用正则表达式无法匹配到手机号,在数据库默认字符集en_US.8859-1下,可以匹配到手机号。当存储手机号的字段为数值型时则没有这种问题。
  • zh_cn.utf8字符集下测试结果:

[informix@vm84145 ~]$ env |grep LOCALE
CLIENT_LOCALE=zh_cn.utf8
DB_LOCALE=zh_cn.utf8
[informix@vm84145 ~]$ dbaccess testdb -

Database selected.

> select * from t56;


         c1                   c2 c3           

          1          15980271817 15980271222 
          1          13580271817 13580271122 
          1               135802 13580271123 
          1          13580242347 13580271123 
          7          15980271817 15980271817 
          8          11111111111 15980271817 

6 row(s) retrieved.

> info columns for t56;

Column name          Type                                    Nulls

c1                   integer                                 yes
c2                   bigint                                  yes
c3                   varchar(12,0)                           yes

> select * from t56 where regex_match(c2,'^1((34[0-8])|(8[0-9]{2})|(([35][0-35-9]|4[579]|66|7[35678]|9[1389])[0-9]{1}))[0-9]{7}$');


         c1                   c2 c3           

          1          15980271817 15980271222 
          1          13580271817 13580271122 
          1          13580242347 13580271123 
          7          15980271817 15980271817 

4 row(s) retrieved.

> select * from t56 where regex_match(c3,'^1((34[0-8])|(8[0-9]{2})|(([35][0-35-9]|4[579]|66|7[35678]|9[1389])[0-9]{1}))[0-9]{7}$');


         c1                   c2 c3           


No rows found.


  • en_US.8859-1字符集下测试结果:

[informix@vm84145 ~]$ source setLocale.env 
[informix@vm84145 ~]$ env |grep LOCALE
CLIENT_LOCALE=en_US.8859-1
DB_LOCALE=en_US.8859-1
[informix@vm84145 ~]$ rlwrap dbaccess testdb3 -

Database selected.

> select * from t56;


         c1                   c2 c3           

          1          15980271817 15980271817 
          1          13580271817 15980271817 
          1          13580271817 15980aaaad7 
          1          13580271817 15987       

4 row(s) retrieved.

> info columns for t56;


Column name          Type                                    Nulls

c1                   integer                                 yes
c2                   bigint                                  yes
c3                   varchar(12,0)                           yes
> select * from t56 where regex_match(c2,'^1((34[0-8])|(8[0-9]{2})|(([35][0-35-9]|4[579]|66|7[35678]|9[1389])[0-9]{1}))[0-9]{7}$');


         c1                   c2 c3           

          1          15980271817 15980271817 
          1          13580271817 15980271817 
          1          13580271817 15980aaaad7 
          1          13580271817 15987       

4 row(s) retrieved.

> select * from t56 where regex_match(c3,'^1((34[0-8])|(8[0-9]{2})|(([35][0-35-9]|4[579]|66|7[35678]|9[1389])[0-9]{1}))[0-9]{7}$');


         c1                   c2 c3           

          1          15980271817 15980271817 
          1          13580271817 15980271817 

2 row(s) retrieved.

>