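The culprit is:
\w+([-']\w+)*
The \w+ matches digits as well as letters, and because this alternative contains no ., it matches only the 3 in 3.14. Reorder the alternatives slightly so that
\$?\d+(\.\d+)?%?
comes before the regex part above (so that a match is attempted on the number format first):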
(?x)([A-Z]\.)+|\$?\d+(\.\d+)?%?|\w+([-']\w+)*|[+/\-@&*]
Or, in expanded form:
pattern = r'''(?x) # set flag to allow verbose regexps
([A-Z]\.)+ # abbreviations, e.g. U.S.A.
| \$?\d+(\.\d+)?%? # numbers, incl. currency and percentages
| \w+([-']\w+)* # words w/ optional internal hyphens/apostrophe
| [+/\-@&*] # special characters with meanings
'''
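
To see the effect of the reordering, here is a minimal sketch using only the standard-library re module. The helper name tokenize, the sample sentence, and the exact layout of OLD_PATTERN (word alternative placed before the number alternative) are assumptions made for illustration; only the reordered pattern comes from the answer above. It uses re.finditer and takes group(0) because the pattern contains capturing groups, which would make re.findall return tuples of groups rather than whole matches.

```python
import re

# Assumed original order: the word alternative comes before the number
# alternative, so \w+ grabs the "3" of "3.14" first.
OLD_PATTERN = r'''(?x)
      ([A-Z]\.)+          # abbreviations, e.g. U.S.A.
    | \w+([-']\w+)*       # words w/ optional internal hyphens/apostrophe
    | \$?\d+(\.\d+)?%?    # numbers, incl. currency and percentages
    | [+/\-@&*]           # special characters with meanings
'''

# Reordered pattern from the answer: numbers are tried before words.
NEW_PATTERN = r'''(?x)
      ([A-Z]\.)+          # abbreviations, e.g. U.S.A.
    | \$?\d+(\.\d+)?%?    # numbers, incl. currency and percentages
    | \w+([-']\w+)*       # words w/ optional internal hyphens/apostrophe
    | [+/\-@&*]           # special characters with meanings
'''


def tokenize(pattern, text):
    """Return the full text of every match of pattern in text (hypothetical helper)."""
    return [m.group(0) for m in re.finditer(pattern, text)]


if __name__ == "__main__":
    sample = "Pi is about 3.14 in the U.S.A."
    print(tokenize(OLD_PATTERN, sample))
    print(tokenize(NEW_PATTERN, sample))
```

Alternation in a regex is tried left to right at each position, and the first alternative that matches wins, which is why the order of the branches matters here and not just their contents.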