Handbook of Commonly Used Regular Expressions

This morning, I saw that a former colleague shared a link to a collection of commonly used regular expressions compiled by “Efficient Operations and Maintenance”. It seems quite comprehensive and can cover most of the scenarios in general work.
Here is the link: https://mp.weixin.qq.com/s/o3SnwUlm5YXRrJ5ESzd0Bw
The following is some of the content:

Expressions for Validating Characters

Chinese characters: ^[\u4e00-\u9fa5]{0,}$

English letters and numbers: ^[A-Za-z0-9]+$ or ^[A-Za-z0-9]{4,40}$

All characters with a length of 3-20: ^.{3,20}$

Strings composed of 26 English letters: ^[A-Za-z]+$

Strings composed of 26 uppercase English letters: ^[A-Z]+$

Strings composed of 26 lowercase English letters: ^[a-z]+$

Strings composed of numbers and 26 English letters: ^[A-Za-z0-9]+$

Strings composed of numbers, 26 English letters, or underscores: ^\w+$ or ^\w{3,20}$

Chinese, English, numbers, including underscores: ^[\u4E00-\u9FA5A-Za-z0-9_]+$

Chinese, English, numbers, but not including symbols like underscores: ^[\u4E00-\u9FA5A-Za-z0-9]+$ or ^[\u4E00-\u9FA5A-Za-z0-9]{2,20}$

Characters other than % & ' , ; = ? $ and " (the leading ^ negates the class): [^%&',;=?$\x22]+

Characters other than the double quote (\x22 is "): [^\x22]+

Others
.* matches any characters except \n
/[\u4E00-\u9FA5]/ Chinese characters
/[\uFF00-\uFFFF]/ Full-width symbols
/[\u0000-\u00FF]/ Half-width symbols
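
As a rough illustration of how the character-validation patterns above behave, here is a minimal Python sketch. The helper name is_valid and the sample strings are made up for this example. It uses re.fullmatch instead of writing ^ and $ by hand, and it shows one thing worth knowing: in Python 3, \w matches Unicode word characters (including Chinese characters) unless the re.ASCII flag is given.

```python
import re

# Patterns from the list above, written without the ^...$ anchors because
# re.fullmatch already requires the whole string to match.
CHINESE = r"[\u4e00-\u9fa5]+"
ALNUM = r"[A-Za-z0-9]+"
WORD = r"\w{3,20}"
MIXED = r"[\u4E00-\u9FA5A-Za-z0-9_]+"


def is_valid(pattern: str, text: str, flags: int = 0) -> bool:
    """Hypothetical helper: True only if the entire string matches."""
    return re.fullmatch(pattern, text, flags) is not None


if __name__ == "__main__":
    print(is_valid(CHINESE, "正则表达式"))        # True
    print(is_valid(ALNUM, "abc_123"))            # False: underscore not allowed
    print(is_valid(MIXED, "用户_01"))             # True
    # Without re.ASCII, \w also accepts Chinese characters.
    print(is_valid(WORD, "用户名"))               # True
    print(is_valid(WORD, "用户名", re.ASCII))     # False
```

If the intent is strictly "numbers, 26 English letters, or underscores", passing re.ASCII (or spelling out [A-Za-z0-9_]) is the safer choice in Python.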

Expressions for Special Requirements

Email address: ^\w+([-+.]\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*$

Domain name: [a-zA-Z0-9][-a-zA-Z0-9]{0,62}(\.[a-zA-Z0-9][-a-zA-Z0-9]{0,62})+\.?

Internet URL: [a-zA-Z]+://[^\s]* or ^http://([\w-]+\.)+[\w-]+(/[\w./?%&=-]*)?$

Mobile phone number: ^(13[0-9]|14[57]|15[0-35-9]|18[0-35-9])\d{8}$

Telephone number (“XXX-XXXXXXX”, “XXXX-XXXXXXXX”, “XXX-XXXXXXXX”, “XXXXXXX”, and “XXXXXXXX”): ^(\d{3,4}-)?\d{7,8}$

Domestic telephone number (0511-4405222, 021-87888822): \d{3}-\d{8}|\d{4}-\d{7}

ID number (15-digit or 18-digit numbers): ^(\d{15}|\d{18})$

Short ID number (ending with numbers or the letter x): ^[0-9]{7,18}[xX]?$ or ^(\d{8,18}|[0-9x]{8,18}|[0-9X]{8,18})$

Valid account name (starts with a letter, 5-16 characters, letters, numbers, and underscores allowed): ^[a-zA-Z][a-zA-Z0-9_]{4,15}$

Password (starts with a letter, length between 6 and 18, can only contain letters, numbers, and underscores): ^[a-zA-Z]\w{5,17}$

Strong password (must contain uppercase letters, lowercase letters, and numbers, no special characters, length between 8 and 10): ^(?=.*\d)(?=.*[a-z])(?=.*[A-Z])[a-zA-Z0-9]{8,10}$

Date format: ^\d{4}-\d{1,2}-\d{1,2}$

12 months of a year (01-09 and 1-12): ^(0?[1-9]|1[0-2])$

31 days of a month (01-09 and 1-31): ^((0?[1-9])|((1|2)[0-9])|30|31)$
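
A few of the patterns in this section are easy to get subtly wrong, so here is a small Python sketch that exercises three of them. The helper name matches and all sample inputs are invented for illustration. The patterns are written with escaped dots and explicit * quantifiers, which is an assumption about what the original list intended. The sketch also shows why an alternation needs to be grouped before it is anchored.

```python
import re

EMAIL = r"\w+([-+.]\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*"
ID_NUMBER = r"\d{15}|\d{18}"
# Each lookahead requires one character kind somewhere in the string;
# the final class limits the string to letters and digits, length 8-10.
STRONG_PASSWORD = r"(?=.*\d)(?=.*[a-z])(?=.*[A-Z])[a-zA-Z0-9]{8,10}"


def matches(pattern: str, text: str) -> bool:
    """Hypothetical helper: True only if the entire string matches."""
    return re.fullmatch(pattern, text) is not None


if __name__ == "__main__":
    print(matches(EMAIL, "first.last@example.com"))   # True
    print(matches(EMAIL, "user@example"))             # False: no dot part
    print(matches(STRONG_PASSWORD, "Passw0rd1"))      # True
    print(matches(STRONG_PASSWORD, "Passw0rd!"))      # False: special character
    print(matches(ID_NUMBER, "1" * 18))               # True

    # Pitfall: ^ and $ bind tighter than |, so the ungrouped form accepts
    # any string that merely starts with 15 digits.
    bad = "123456789012345abc"
    print(re.search(r"^\d{15}|\d{18}$", bad) is not None)    # True
    print(re.search(r"^(\d{15}|\d{18})$", bad) is not None)  # False
```

These checks only validate the shape of the input. They say nothing about whether an address or ID number actually exists.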

Money Input Format

We can accept four forms of money representation: “10000.00” and “10,000.00”, plus “10000” and “10,000” without the cents. Start with: ^[1-9][0-9]*$

This matches any number that does not start with 0, which also means that a lone “0” fails. So we use the following form: ^(0|[1-9][0-9]*)$

This matches a 0 or a number that does not start with 0. We can also allow a minus sign at the beginning: ^(0|-?[1-9][0-9]*)$

This matches a 0 or a possibly negative number that does not start with 0. Now let the user start with 0, and remove the minus sign because money can’t be negative. Next we add the optional decimal part: ^[0-9]+(\.[0-9]+)?$

Note that at least one digit must follow the decimal point, so “10.” fails, but “10” and “10.2” pass. To require exactly two decimal digits: ^[0-9]+(\.[0-9]{2})?$

This stipulates that there must be two digits after the decimal point. If you think this is too strict, you can do this: ^[0-9]+(\.[0-9]{1,2})?$

This allows the user to write only one decimal digit. Now we should consider the commas in the numbers. We can do this: ^[0-9]{1,3}(,[0-9]{3})*(\.[0-9]{1,2})?$

This is 1 to 3 digits, followed by any number of groups of a comma plus 3 digits. To make the commas optional instead of mandatory: ^([0-9]+|[0-9]{1,3}(,[0-9]{3})*)(\.[0-9]{1,2})?$
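
To check the last pattern of this walk-through against the four accepted forms, here is a minimal Python sketch. The function name is_money and the test strings are made up for the example. The decimal point is written as \. on the assumption that a literal dot was intended; the last two lines show what an unescaped dot would let through.

```python
import re

# Plain digits, or 1-3 digits followed by comma-separated groups of 3,
# then an optional decimal part with 1 or 2 digits.
MONEY = re.compile(r"([0-9]+|[0-9]{1,3}(,[0-9]{3})*)(\.[0-9]{1,2})?")


def is_money(text: str) -> bool:
    """Hypothetical helper: True only if the entire string is a money amount."""
    return MONEY.fullmatch(text) is not None


if __name__ == "__main__":
    for sample in ["10000.00", "10,000.00", "10000", "10,000"]:
        print(sample, is_money(sample))   # all True
    for sample in ["10.", "10,00", "1,0000", "10.234"]:
        print(sample, is_money(sample))   # all False

    # An unescaped dot matches any character, so "10x5" would pass.
    print(re.fullmatch(r"[0-9]+(.[0-9]{1,2})?", "10x5") is not None)   # True
    print(is_money("10x5"))                                            # False
```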


The following is the content updated on June 5, 2023:
It seems that with the emergence of ChatGPT, we almost don’t need to spend a lot of time learning this kind of knowledge that requires memorization. A general understanding is enough, and we can leave the rest to ChatGPT.