Regular expressions for matching data values in compiler lexers.

Here are some useful regular expressions for matching C-like primitive data values.

Before presenting them, I’ll first introduce some named patterns:

/* universal character name */
UCN (\\u[0-9a-fA-F]{4}|\\U[0-9a-fA-F]{8})

/* exponent part of a floating point number */
EXP ([Ee][-+]?[0-9]+)

/* suffix (signedness and length) of an integer number */
ILEN ([Uu](L|l|LL|ll)?|(L|l|LL|ll)[Uu]?)
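
To try these building blocks outside a lexer generator, they carry over almost unchanged to Python's `re` syntax. The following is a minimal sketch; the `matches` helper is hypothetical and exists only for illustration.

```python
import re

# Python translations of the named patterns above. (?:...) is used in place
# of plain groups so that nothing is captured.
UCN = r"(?:\\u[0-9a-fA-F]{4}|\\U[0-9a-fA-F]{8})"
EXP = r"(?:[Ee][-+]?[0-9]+)"
ILEN = r"(?:[Uu](?:L|l|LL|ll)?|(?:L|l|LL|ll)[Uu]?)"


def matches(pattern: str, text: str) -> bool:
    """Hypothetical helper: True if the whole of `text` matches `pattern`."""
    return re.fullmatch(pattern, text) is not None


if __name__ == "__main__":
    # Raw strings keep the backslash, so r"\u00E9" is six characters long.
    for pattern, sample in [(UCN, r"\u00E9"), (EXP, "e-10"), (ILEN, "ULL")]:
        print(sample, matches(pattern, sample))
```

Note that `ILEN` rejects mixed-case length suffixes such as `Ll`, because the alternation only lists `L`, `l`, `LL`, and `ll`.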

With these named patterns as building blocks, we can describe more complex tokens.

For integer numbers:

/* integer in octal form */

/* integer in decimal form */

/* integer in hexadecimal form */
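
As a rough illustration, the three integer forms could be written as follows with Python's `re` module rather than lex syntax. This sketch assumes C99 integer literal rules (octal starts with `0`, decimal with a nonzero digit, hexadecimal with `0x` or `0X`). The names `OCT_INT`, `DEC_INT`, `HEX_INT`, and `classify_int` are hypothetical.

```python
import re

# Integer suffix, same as the ILEN named pattern (non-capturing groups).
ILEN = r"(?:[Uu](?:L|l|LL|ll)?|(?:L|l|LL|ll)[Uu]?)"

# Assumed C99-style integer forms; each takes an optional suffix.
OCT_INT = re.compile(r"0[0-7]*" + ILEN + r"?")
DEC_INT = re.compile(r"[1-9][0-9]*" + ILEN + r"?")
HEX_INT = re.compile(r"0[Xx][0-9a-fA-F]+" + ILEN + r"?")


def classify_int(text: str):
    """Return 'octal', 'decimal', 'hexadecimal', or None for `text`."""
    for name, pattern in (("octal", OCT_INT),
                          ("decimal", DEC_INT),
                          ("hexadecimal", HEX_INT)):
        if pattern.fullmatch(text):
            return name
    return None


if __name__ == "__main__":
    for sample in ["0", "0755", "42u", "0x1FUL", "089", "0x"]:
        print(sample, classify_int(sample))
```

Under these rules a lone `0` is an octal literal, and `089` is rejected because `8` and `9` are not octal digits.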

For floating point numbers:

/* floating point number in decimal form */

/* floating point number in hexadecimal form */
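
The two floating point forms might look like this in Python's `re` syntax. This is a sketch that assumes C99 rules: a decimal float needs either a decimal point or an exponent, and a hexadecimal float always needs a binary exponent introduced by `p` or `P`. The optional suffix `FSUF` and the names `DEC_FLOAT` and `HEX_FLOAT` are assumptions, not part of the named patterns above.

```python
import re

# Exponent part, same as the EXP named pattern (non-capturing group).
EXP = r"(?:[Ee][-+]?[0-9]+)"
# Assumed optional float suffix: f/F for float, l/L for long double.
FSUF = r"[fFlL]?"

# Decimal: digits with a point and optional exponent, or digits with a
# mandatory exponent (so a plain "42" stays an integer).
DEC_FLOAT = re.compile(
    r"(?:(?:[0-9]*\.[0-9]+|[0-9]+\.)" + EXP + r"?|[0-9]+" + EXP + r")" + FSUF
)

# Hexadecimal: 0x prefix, hex digits with optional point, mandatory
# binary exponent written in decimal digits.
HEX_FLOAT = re.compile(
    r"0[Xx](?:[0-9a-fA-F]*\.[0-9a-fA-F]+|[0-9a-fA-F]+\.?)"
    r"[Pp][-+]?[0-9]+" + FSUF
)

if __name__ == "__main__":
    for sample in ["3.14", ".5e-3f", "1.", "1e10", "42"]:
        print(sample, bool(DEC_FLOAT.fullmatch(sample)))
    for sample in ["0x1.8p3", "0x.8P-1f", "0xAp2", "0x1.8"]:
        print(sample, bool(HEX_FLOAT.fullmatch(sample)))
```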

For character and string literals:

/* character literal */

/* string literal */
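
Character and string literals are where the `UCN` pattern becomes useful. The sketch below uses Python's `re` module and assumes C-style literals: an optional `L` prefix, simple escapes such as `\n`, octal and hexadecimal escapes, and universal character names. The names `ESC`, `CHAR_LIT`, and `STRING_LIT` are hypothetical.

```python
import re

# Universal character name, same as the UCN named pattern.
UCN = r"(?:\\u[0-9a-fA-F]{4}|\\U[0-9a-fA-F]{8})"

# Assumed C-style escape sequences: simple, octal, hexadecimal, or UCN.
ESC = r"""(?:\\['"?\\abfnrtv]|\\[0-7]{1,3}|\\x[0-9a-fA-F]+|""" + UCN + ")"

# A character literal holds one or more items; a string holds zero or more.
# Raw newlines and unescaped backslashes are not allowed inside either.
CHAR_LIT = re.compile(r"""L?'(?:[^'\\\n]|""" + ESC + r""")+'""")
STRING_LIT = re.compile(r'''L?"(?:[^"\\\n]|''' + ESC + r''')*"''')

if __name__ == "__main__":
    for sample in ["'a'", r"'\n'", r"L'\u00e9'", "''"]:
        print(sample, bool(CHAR_LIT.fullmatch(sample)))
    for sample in ['"hello"', r'"tab\there"', '""', r'"bad\q"']:
        print(sample, bool(STRING_LIT.fullmatch(sample)))
```

The only structural difference between the two patterns is the quote character and the repetition operator: `+` for character literals, so that `''` is rejected, and `*` for strings, so that `""` is accepted.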

The following regular expression matches identifiers rather than a primitive data value. However, because it is a very important regular expression in most compilers and is related to primitive data types and values, we present it as well:

/* identifier */
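
One plausible form of the identifier pattern, sketched with Python's `re` module: a letter, underscore, or universal character name, followed by any number of letters, digits, underscores, or universal character names. This assumes C99-style identifiers; the name `IDENT` is hypothetical.

```python
import re

# Universal character name, same as the UCN named pattern.
UCN = r"(?:\\u[0-9a-fA-F]{4}|\\U[0-9a-fA-F]{8})"

# Assumed C99-style identifier: the first character may not be a digit.
IDENT = re.compile(
    r"(?:[_a-zA-Z]|" + UCN + r")(?:[_a-zA-Z0-9]|" + UCN + r")*"
)

if __name__ == "__main__":
    for sample in ["foo_1", "_x", r"caf\u00e9", "1abc"]:
        print(sample, bool(IDENT.fullmatch(sample)))
```

In a real lexer, keywords such as `int` or `while` also match this pattern, so they are usually separated from identifiers by a table lookup after the match.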