Posts Tagged ‘lexer’

How to create an abstract syntax tree while parsing an input stream.

May 25, 2014 2 comments

In this article I’ll show you how to build the abstract syntax tree (AST) of an input stream while parsing it. A parser usually reads tokens from its input, using a lexer as a helper coroutine, and tries to match them against the grammar rules that specify the syntax of a language (the source language).

Read more…

Techniques for resolving common grammar conflicts in parsers.

May 17, 2014

In this article I’ll present some common conflicts that occur in Bison grammars and ways of resolving them. First of all, conflicts in the Bison context are situations where a sequence of input can be parsed in multiple ways according to the specified BNF grammar rules.

Read more…

How to unify similar tokens when constructing parsers and lexers.

May 12, 2014

In this article I’ll show you how syntactically similar tokens can be handled in a unified way in order to simplify both lexical and syntactic analysis. The trick is to use one token for several similar operators in order to keep down the size of the grammar in the parser and to simplify the regular expression rules in the lexer.
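As a rough illustration of the trick, here is a minimal hand-written sketch in C rather than actual Flex and Bison rules. It assumes the similar operators are the comparison operators; the names `TOK_RELOP`, `enum relop`, `struct token` and `next_token` are hypothetical and chosen only for this example. All six operators come back as the same token kind, and the specific operator travels as the token's semantic value.

```c
#include <string.h>

/* Hypothetical token kinds: one TOK_RELOP covers every comparison operator. */
enum token_kind { TOK_EOF, TOK_RELOP, TOK_OTHER };

/* The specific operator is carried as the token's semantic value. */
enum relop { OP_LT, OP_LE, OP_GT, OP_GE, OP_EQ, OP_NE };

struct token {
    enum token_kind kind;
    enum relop op;   /* meaningful only when kind == TOK_RELOP */
    int length;      /* number of input characters consumed */
};

/* Scan a single token at the start of s (illustrative only). */
struct token next_token(const char *s)
{
    static const struct { const char *text; enum relop op; } ops[] = {
        /* Two-character operators come first so "<=" is not split into "<" and "=". */
        { "<=", OP_LE }, { ">=", OP_GE }, { "==", OP_EQ }, { "!=", OP_NE },
        { "<",  OP_LT }, { ">",  OP_GT },
    };
    struct token t = { TOK_EOF, OP_LT, 0 };
    size_t i;

    if (*s == '\0')
        return t;                      /* end of input */

    for (i = 0; i < sizeof ops / sizeof ops[0]; i++) {
        size_t n = strlen(ops[i].text);
        if (strncmp(s, ops[i].text, n) == 0) {
            t.kind = TOK_RELOP;        /* same token kind for all six operators */
            t.op = ops[i].op;          /* the difference lives in the value */
            t.length = (int)n;
            return t;
        }
    }

    t.kind = TOK_OTHER;                /* anything else: consume one character */
    t.length = 1;
    return t;
}
```

With this shape, a grammar needs only one rule that mentions `TOK_RELOP` instead of six nearly identical rules, and the action attached to that rule can inspect `op` to decide which comparison to build.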

Read more…

Using literal character tokens when designing lexers and parsers.

May 11, 2014

Sometimes, while exploring the source code of various free software Flex lexers and Bison parsers, I see name declarations for single-character tokens.

Read more…

Implementing the “include” directive to support nested input files.

May 11, 2014

Many programming languages and computer files have a directive, often called “include” (as well as “copy” and “import”), that causes the contents of a second file to be inserted into the original file. These included files are called copybooks or header files. They are often used to define the physical layout of program data, pieces of procedural code and/or forward declarations while promoting encapsulation and the reuse of code.

Read more…

Regular expressions for matching data values in compiler lexers.

May 10, 2014

Below are some useful regular expressions for matching C-like primitive data values.

Read more…

Ignoring multiline comments with start states in compilers with Flex.

May 10, 2014

In a previous article I presented an old-fashioned way of ignoring multiline comments.

In this article I’ll demonstrate a more elegant, Flex-like way of ignoring multiline comments.

Read more…