Below are some useful regular expressions for matching C-like primitive data values.
Before presenting them, I'll introduce a few named patterns:
/* universal character name */
UCN (\\u[0-9a-fA-F]{4}|\\U[0-9a-fA-F]{8})
/* exponent part of a floating point number */
EXP ([Ee][-+]?[0-9]+)
/* length (and signedness) suffix of an integer number */
ILEN ([Uu](L|l|LL|ll)?|(L|l|LL|ll)[Uu]?)
With these named patterns as building blocks, we can describe complete literals.
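To make the {NAME} notation concrete, here is a minimal Python sketch of how a named pattern can be substituted into a rule. The DEFINITIONS table and the expand helper are hypothetical names introduced for illustration. The sketch assumes the rule uses only syntax that flex and Python's re module share, and it wraps each expansion in parentheses, as flex normally does.

```python
import re

# The three named patterns, copied verbatim from the definitions above.
DEFINITIONS = {
    "UCN": r"(\\u[0-9a-fA-F]{4}|\\U[0-9a-fA-F]{8})",
    "EXP": r"([Ee][-+]?[0-9]+)",
    "ILEN": r"([Uu](L|l|LL|ll)?|(L|l|LL|ll)[Uu]?)",
}

def expand(pattern):
    """Replace each {NAME} reference with its parenthesized definition.

    Only upper-case names are treated as references, so repetition
    counts such as {4} or {1,3} are left untouched.
    """
    return re.sub(
        r"\{([A-Z]+)\}",
        lambda m: "(" + DEFINITIONS[m.group(1)] + ")",
        pattern,
    )

if __name__ == "__main__":
    print(expand("0[0-7]*{ILEN}?"))
```

The expanded string can be passed straight to re.fullmatch, which is a convenient way to try a rule on sample input before putting it into a scanner.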
For integer numbers:
/* integer in octal form */
0[0-7]*{ILEN}?
/* integer in decimal form */
[1-9][0-9]*{ILEN}?
/* integer in hexadecimal form */
0[Xx][0-9a-fA-F]+{ILEN}?
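As a quick check of the three integer rules, the following Python sketch transcribes them to the re module and classifies a candidate literal. The names OCTAL, DECIMAL, HEX and classify_integer are hypothetical. Non-capturing groups are used in place of plain parentheses, which should not change what is matched.

```python
import re

# Integer suffix, transcribed from the ILEN named pattern.
ILEN = r"(?:[Uu](?:L|l|LL|ll)?|(?:L|l|LL|ll)[Uu]?)"

# One compiled pattern per integer form (hypothetical names).
OCTAL = re.compile(rf"0[0-7]*{ILEN}?")
DECIMAL = re.compile(rf"[1-9][0-9]*{ILEN}?")
HEX = re.compile(rf"0[Xx][0-9a-fA-F]+{ILEN}?")

def classify_integer(text):
    """Return 'hex', 'octal' or 'decimal', or None if text is not
    entirely one integer literal."""
    for name, pattern in (("hex", HEX), ("octal", OCTAL), ("decimal", DECIMAL)):
        if pattern.fullmatch(text):
            return name
    return None

if __name__ == "__main__":
    for sample in ("0", "017", "42u", "0x1FuL", "08"):
        print(sample, classify_integer(sample))
```

Note that a lone 0 is classified as octal, because the decimal rule requires a leading digit from 1 to 9.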
For floating point numbers:
/* floating point number in decimal form */
([0-9]*\.[0-9]+|[0-9]+\.){EXP}?[flFL]?
[0-9]+{EXP}[flFL]?
/* floating point number in hexadecimal form */
0[Xx]([0-9a-fA-F]*\.[0-9a-fA-F]+|[0-9a-fA-F]+\.?)[Pp][-+]?[0-9]+[flFL]?
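The floating point rules can be exercised in the same way. This Python sketch merges the two decimal rules into one alternation and keeps the hexadecimal rule separate. DECIMAL_FLOAT, HEX_FLOAT and is_float_literal are hypothetical names, and the transcription assumes the rules behave the same under Python's re module as under flex.

```python
import re

# Exponent part, transcribed from the EXP named pattern.
EXP = r"(?:[Ee][-+]?[0-9]+)"

# Both decimal rules in one pattern: either a decimal point is present
# (exponent optional) or there is no point (exponent required).
DECIMAL_FLOAT = re.compile(
    rf"(?:(?:[0-9]*\.[0-9]+|[0-9]+\.){EXP}?|[0-9]+{EXP})[flFL]?"
)

# Hexadecimal form: the binary exponent introduced by P or p is mandatory.
HEX_FLOAT = re.compile(
    r"0[Xx](?:[0-9a-fA-F]*\.[0-9a-fA-F]+|[0-9a-fA-F]+\.?)[Pp][-+]?[0-9]+[flFL]?"
)

def is_float_literal(text):
    """True if text is entirely one floating point literal."""
    return bool(DECIMAL_FLOAT.fullmatch(text) or HEX_FLOAT.fullmatch(text))

if __name__ == "__main__":
    for sample in ("3.14", ".5e-3f", "1.", "1e10", "42", "0x1.8p3", "0x1.8"):
        print(sample, is_float_literal(sample))
```

A plain 42 is rejected because, without a decimal point, the exponent is what distinguishes a floating point literal from an integer.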
For character and string literals:
/* character literal */
\'([^'\\]|\\['"?\\abfnrtv]|\\[0-7]{1,3}|\\[Xx][0-9a-fA-F]+|{UCN})+\'
/* string literal */
L?\"([^"\\]|\\['"?\\abfnrtv]|\\[0-7]{1,3}|\\[Xx][0-9a-fA-F]+|{UCN})*\"
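The character and string rules differ only in the quote character, the optional L prefix, and whether an empty body is allowed. The Python sketch below makes that visible by sharing the escape alternatives between the two. ESCAPE, CHAR_LITERAL and STRING_LITERAL are hypothetical names, and the patterns are a transcription to Python's re module rather than the flex rules themselves.

```python
import re

# Universal character name, transcribed from the UCN named pattern.
UCN = r"(?:\\u[0-9a-fA-F]{4}|\\U[0-9a-fA-F]{8})"

# Simple, octal and hexadecimal escape sequences shared by both rules.
ESCAPE = r"""\\['"?\\abfnrtv]|\\[0-7]{1,3}|\\[Xx][0-9a-fA-F]+"""

# Character literal: at least one character or escape between single quotes.
CHAR_LITERAL = re.compile(r"'(?:[^'\\]|" + ESCAPE + "|" + UCN + r")+'")

# String literal: optional L prefix, body may be empty.
STRING_LITERAL = re.compile(r'L?"(?:[^"\\]|' + ESCAPE + "|" + UCN + r')*"')

if __name__ == "__main__":
    # Raw strings keep the backslashes exactly as they appear in source text.
    for sample in (r"'a'", r"'\n'", r"'\x41'", r"''", r"'\q'"):
        print(sample, bool(CHAR_LITERAL.fullmatch(sample)))
    for sample in (r'""', r'"hello\tworld"', r'L"wide"', r'"open'):
        print(sample, bool(STRING_LITERAL.fullmatch(sample)))
```

The + in the character rule is what rejects an empty pair of single quotes, while the * in the string rule accepts an empty string.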
The following regular expression matches identifiers rather than a primitive data value. It is included because it is central to most compilers and closely related to primitive data types and values:
/* identifier */
([_a-zA-Z]|{UCN})([_a-zA-Z0-9]|{UCN})*
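A minimal Python sketch of the identifier rule, again as a transcription to the re module with hypothetical names (IDENTIFIER, is_identifier). Universal character names are matched in their escaped source form, such as \u00e9, exactly as the UCN pattern describes them.

```python
import re

# Universal character name, transcribed from the UCN named pattern.
UCN = r"(?:\\u[0-9a-fA-F]{4}|\\U[0-9a-fA-F]{8})"

# First character: letter, underscore or UCN; later characters may also be digits.
IDENTIFIER = re.compile(
    r"(?:[_a-zA-Z]|" + UCN + r")(?:[_a-zA-Z0-9]|" + UCN + r")*"
)

def is_identifier(text):
    """True if text is entirely one identifier."""
    return IDENTIFIER.fullmatch(text) is not None

if __name__ == "__main__":
    for sample in ("_tmp1", "x", "9lives", r"caf\u00e9", "a-b"):
        print(sample, is_identifier(sample))
```

This pattern also accepts reserved words such as int or while. In a flex scanner, one common way to handle that is to list the keyword rules before the identifier rule, since flex prefers the earlier rule when two rules match text of the same length.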


