Part I - Compiler

In Part I of this book, we focus on building the compiler and leave debugging for later. Many programming languages are developed in this order. The more cleverness one puts into the compiler, the more difficult it is to write the debugger. To keep the book and the project tractable, we will build a very simple compiler.

Overall structure

Like most compilers, our compiler is divided into a front end and a back end. The front end consists of a scanner and a parser, and builds the parse tree from the source text. The back end consists of a code generator, and generates the binary from the parse tree. In the following sections, we describe each component at a high level.

Scanner

A scanner, also known as a lexical analyzer, is the first step in compilation. It takes a file as input and generates a sequence of tokens. A token is simply a piece of the input that has been classified. For example, the word function is a keyword, and the number 10 is an integer literal.
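To make the idea of classified tokens concrete, here is a minimal sketch of a scanner in Python. The keyword set, the token kind names (KEYWORD, IDENTIFIER, INTEGER, SYMBOL) and the function name scan are assumptions chosen for illustration; they are not the scanner built later in this book.

```python
# Hypothetical keyword set for illustration only.
KEYWORDS = {"function", "return", "if", "else"}


def scan(text):
    """Split source text into a list of (kind, text) tokens."""
    tokens = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch.isspace():
            # Whitespace separates tokens but is not a token itself.
            i += 1
        elif ch.isdigit():
            # Consume a run of digits as one integer literal.
            start = i
            while i < len(text) and text[i].isdigit():
                i += 1
            tokens.append(("INTEGER", text[start:i]))
        elif ch.isalpha() or ch == "_":
            # Consume a word, then classify it as keyword or identifier.
            start = i
            while i < len(text) and (text[i].isalnum() or text[i] == "_"):
                i += 1
            word = text[start:i]
            kind = "KEYWORD" if word in KEYWORDS else "IDENTIFIER"
            tokens.append((kind, word))
        else:
            # Anything else is treated as a single-character symbol.
            tokens.append(("SYMBOL", ch))
            i += 1
    return tokens


print(scan("function f() { return 10; }"))
```

In this sketch the word function comes out as a KEYWORD token and 10 as an INTEGER token, matching the classification described above.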

Parser

The parser is the second step in compilation. Using the stream of tokens produced by the scanner, the parser builds a parse tree that captures the structure of the program.
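As a rough illustration of how a parser turns a flat token stream into a tree, the sketch below uses recursive descent on a tiny, hypothetical grammar with integers, + and parentheses. The grammar, the (kind, text) token shape and the tuple-based tree nodes are assumptions for this example only, not the language or the parser developed in this book.

```python
def parse_expression(tokens, pos=0):
    """expression := term ('+' term)*  -- returns (tree, next position)."""
    node, pos = parse_term(tokens, pos)
    while pos < len(tokens) and tokens[pos] == ("SYMBOL", "+"):
        right, pos = parse_term(tokens, pos + 1)
        # Nest to the left so that 1 + 2 + 3 groups as (1 + 2) + 3.
        node = ("add", node, right)
    return node, pos


def parse_term(tokens, pos):
    """term := INTEGER | '(' expression ')'"""
    if pos >= len(tokens):
        raise SyntaxError("unexpected end of input")
    kind, text = tokens[pos]
    if kind == "INTEGER":
        return ("int", int(text)), pos + 1
    if (kind, text) == ("SYMBOL", "("):
        node, pos = parse_expression(tokens, pos + 1)
        if pos >= len(tokens) or tokens[pos] != ("SYMBOL", ")"):
            raise SyntaxError("expected ')'")
        return node, pos + 1
    raise SyntaxError(f"unexpected token {text!r}")


# Tokens for the source text: 1 + (2 + 3)
tokens = [
    ("INTEGER", "1"), ("SYMBOL", "+"), ("SYMBOL", "("),
    ("INTEGER", "2"), ("SYMBOL", "+"), ("INTEGER", "3"), ("SYMBOL", ")"),
]
tree, _ = parse_expression(tokens)
print(tree)
```

The nesting of the resulting tuples mirrors the nesting of the source expression, which is what lets a later stage walk the tree without looking at the text again.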

Code generator

The code generator walks the parse tree and generates labeled code. We generate the instructions required to execute the program, and for convenience, we pretend that a branch can target a label instead of an address, since the real address is not yet known at this point.
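The following sketch shows one way a tree walk can emit labeled code. The instruction names (push, add, branch, branch_if_zero, label), the stack-machine model and the tuple-shaped tree nodes are all hypothetical and exist only to illustrate the idea of branching to a label whose address is not yet known.

```python
import itertools


def generate(node, code, counter):
    """Append instructions for `node` to `code`; branches refer to labels."""
    kind = node[0]
    if kind == "int":
        code.append(("push", node[1]))
    elif kind == "add":
        generate(node[1], code, counter)
        generate(node[2], code, counter)
        code.append(("add",))
    elif kind == "if":
        # Fresh label names; their addresses are unknown at this point.
        else_label = f"L{next(counter)}"
        end_label = f"L{next(counter)}"
        generate(node[1], code, counter)            # condition
        code.append(("branch_if_zero", else_label))
        generate(node[2], code, counter)            # then part
        code.append(("branch", end_label))
        code.append(("label", else_label))
        generate(node[3], code, counter)            # else part
        code.append(("label", end_label))
    else:
        raise ValueError(f"unknown node kind {kind!r}")
    return code


# A hypothetical tree for: if 0 then 1 else 2 + 3
tree = ("if", ("int", 0), ("int", 1), ("add", ("int", 2), ("int", 3)))
for instruction in generate(tree, [], itertools.count()):
    print(instruction)
```

Note that the generator never needs to know how long the then part or the else part will be; it only needs a unique label name for each branch target.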

Then we walk the generated code to eliminate the labels and bind each branch/call instruction to an address instead. This is roughly what a linker or assembler usually does.
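Eliminating labels can be sketched as two passes over the labeled code: the first records the address each label stands for, and the second drops the labels and rewrites the branch operands. The instruction tuples, the set of branching opcodes and the assumption that every instruction occupies exactly one address slot are simplifications made for this example.

```python
# Hypothetical opcodes whose operand is a label.
BRANCH_OPS = {"branch", "branch_if_zero", "call"}


def resolve_labels(labeled_code):
    """Return a copy of the code with labels removed and branches bound to addresses."""
    # Pass 1: record the address each label refers to.
    # A label occupies no space, so it does not advance the address.
    addresses = {}
    address = 0
    for instruction in labeled_code:
        if instruction[0] == "label":
            addresses[instruction[1]] = address
        else:
            address += 1

    # Pass 2: drop labels and replace label operands with addresses.
    resolved = []
    for instruction in labeled_code:
        op = instruction[0]
        if op == "label":
            continue
        if op in BRANCH_OPS:
            resolved.append((op, addresses[instruction[1]]))
        else:
            resolved.append(instruction)
    return resolved


labeled = [
    ("push", 0),
    ("branch_if_zero", "L0"),
    ("push", 1),
    ("branch", "L1"),
    ("label", "L0"),
    ("push", 2),
    ("label", "L1"),
    ("halt",),
]
for address, instruction in enumerate(resolve_labels(labeled)):
    print(address, instruction)
```

Two passes are needed because a branch may refer to a label that appears later in the code, so its address is only known once the whole sequence has been seen.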

Moving forward

In the next chapter, we will actually start coding, beginning with the scanner.
