Regular expressions for matching data values in compiler lexers.

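Below are some useful regular expressions for matching C-like primitive data values: integer, floating-point, character, and string literals. They are written in lex/flex syntax, where {NAME} refers to a previously defined named pattern.

Before presenting the regular expressions, I'll first introduce some named patterns: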

/* universal character name */
UCN (\\u[0-9a-fA-F]{4}|\\U[0-9a-fA-F]{8})

/* exponent part of a floating point number */
EXP ([Ee][-+]?[0-9]+)

/* suffix of an integer number (unsigned and/or long, long long) */
ILEN ([Uu](L|l|LL|ll)?|(L|l|LL|ll)[Uu]?)

With these named patterns as building blocks, we can describe complete literals.
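To experiment with these patterns outside a lexer generator, the {NAME} references have to be expanded by hand. Here is a minimal Python sketch of that substitution; the `expand` helper and the `DEFS` table are my own illustrative names, not part of lex or flex.

```python
import re

# Named patterns from the text, stored as raw strings.
DEFS = {
    "UCN": r"(\\u[0-9a-fA-F]{4}|\\U[0-9a-fA-F]{8})",
    "EXP": r"([Ee][-+]?[0-9]+)",
    "ILEN": r"([Uu](L|l|LL|ll)?|(L|l|LL|ll)[Uu]?)",
}

# Matches {NAME} references but not quantifiers such as {4} or {1,3}.
_NAME_REF = re.compile(r"\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand(pattern, defs=DEFS):
    """Replace every {NAME} in pattern with its definition (hypothetical helper)."""
    return _NAME_REF.sub(lambda m: defs[m.group(1)], pattern)


# The decimal integer pattern with {ILEN} expanded in place.
decimal = expand(r"[1-9][0-9]*{ILEN}?")
```

Because every definition is already wrapped in parentheses, the expanded text can be followed directly by `?` or `*` without changing its meaning.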

For integer numbers:

/* integer in octal form */
0[0-7]*{ILEN}?

/* integer in decimal form */
[1-9][0-9]*{ILEN}?

/* integer in hexadecimal form */
0[Xx][0-9a-fA-F]+{ILEN}?
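The three integer forms can be tried out with a small Python sketch. The patterns are the ones listed above with {ILEN} expanded by hand; `classify_integer` is a hypothetical helper name, and whole-string matching stands in for what a lexer would do on a token boundary.

```python
import re

# Integer suffix, copied from the ILEN named pattern.
ILEN = r"([Uu](L|l|LL|ll)?|(L|l|LL|ll)[Uu]?)"

# (name, compiled pattern) pairs in the order used in the text.
INTEGER_FORMS = [
    ("octal", re.compile(r"0[0-7]*" + ILEN + r"?")),
    ("decimal", re.compile(r"[1-9][0-9]*" + ILEN + r"?")),
    ("hexadecimal", re.compile(r"0[Xx][0-9a-fA-F]+" + ILEN + r"?")),
]


def classify_integer(text):
    """Return the name of the form that matches all of text, or None."""
    for name, regex in INTEGER_FORMS:
        if regex.fullmatch(text):
            return name
    return None
```

Note that a lone `0` is classified as octal, since the decimal pattern requires a leading non-zero digit. Inputs such as `089` or `0x` match none of the forms.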

For floating point numbers:

/* floating point number in decimal form */
([0-9]*\.[0-9]+|[0-9]+\.){EXP}?[flFL]?
[0-9]+{EXP}[flFL]?

/* floating point number in hexadecimal form */
0[Xx]([0-9a-fA-F]*\.[0-9a-fA-F]+|[0-9a-fA-F]+\.?)[Pp][-+]?[0-9]+[flFL]?
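As a quick check of the floating point patterns, here is a Python sketch. It merges the two decimal rules into a single alternation, which is my own rearrangement for convenience; `is_float_literal` is a hypothetical helper name.

```python
import re

# Exponent part, copied from the EXP named pattern.
EXP = r"([Ee][-+]?[0-9]+)"

# Either digits with a decimal point and an optional exponent,
# or digits without a point and a mandatory exponent.
DECIMAL_FLOAT = re.compile(
    r"(([0-9]*\.[0-9]+|[0-9]+\.)" + EXP + r"?|[0-9]+" + EXP + r")[flFL]?"
)

# Hexadecimal form: the binary exponent (p or P) is always required.
HEX_FLOAT = re.compile(
    r"0[Xx]([0-9a-fA-F]*\.[0-9a-fA-F]+|[0-9a-fA-F]+\.?)[Pp][-+]?[0-9]+[flFL]?"
)


def is_float_literal(text):
    """True if all of text is a decimal or hexadecimal floating literal."""
    return bool(DECIMAL_FLOAT.fullmatch(text) or HEX_FLOAT.fullmatch(text))
```

A plain `10` is rejected here because it has neither a decimal point nor an exponent, so it is left to the integer patterns. Likewise `0x1.8` is rejected because the hexadecimal form needs its `p` exponent.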

For character and string literals:

/* character literal */
L?\'([^'\\]|\\['"?\\abfnrtv]|\\[0-7]{1,3}|\\x[0-9a-fA-F]+|{UCN})+\'

/* string literal */
L?\"([^"\\]|\\['"?\\abfnrtv]|\\[0-7]{1,3}|\\x[0-9a-fA-F]+|{UCN})*\"
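The character and string patterns can be exercised the same way. This Python sketch assumes standard C rules: hexadecimal escapes use a lowercase `\x` only, and both kinds of literal may carry the wide prefix `L`. The names `ESCAPE`, `CHAR_LITERAL` and `STRING_LITERAL` are illustrative.

```python
import re

# Universal character name, copied from the UCN named pattern.
UCN = r"(\\u[0-9a-fA-F]{4}|\\U[0-9a-fA-F]{8})"

# Escape sequences: simple, octal, hexadecimal, universal character name.
ESCAPE = r"\\['\"?\\abfnrtv]|\\[0-7]{1,3}|\\x[0-9a-fA-F]+|" + UCN

# A character literal needs at least one element between the quotes.
CHAR_LITERAL = re.compile(r"L?'([^'\\]|" + ESCAPE + r")+'")

# A string literal may be empty.
STRING_LITERAL = re.compile(r'L?"([^"\\]|' + ESCAPE + r')*"')
```

The only difference between the two bodies is the quote excluded from the plain-character class and the `+` versus `*` repetition, so `''` is rejected while `""` is accepted.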

The following regular expression matches identifiers, so it does not describe a primitive data value. However, because it is one of the most important regular expressions in most compilers and is closely related to primitive data types and values, it is included here:

/* identifier */
([_a-zA-Z]|{UCN})([_a-zA-Z0-9]|{UCN})*
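For completeness, the identifier pattern can be checked with a short Python sketch. `is_identifier` is a hypothetical helper name, and the universal character name is matched in its escaped source form (a backslash followed by `u` or `U` and hex digits), exactly as the pattern describes it.

```python
import re

# Universal character name, copied from the UCN named pattern.
UCN = r"(\\u[0-9a-fA-F]{4}|\\U[0-9a-fA-F]{8})"

# First element: letter, underscore or UCN; later elements may also be digits.
IDENTIFIER = re.compile(
    r"([_a-zA-Z]|" + UCN + r")([_a-zA-Z0-9]|" + UCN + r")*"
)


def is_identifier(text):
    """True if all of text is an identifier according to the pattern above."""
    return IDENTIFIER.fullmatch(text) is not None
```

Keywords such as `int` or `while` also match this pattern, so a lexer usually has to separate them from ordinary identifiers by some other means, for example a lookup table.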