pcc/ internals

The "internals" pages document the design and implementation of ccom (pcc's compiler).

Even though much of the compiler is rewritten, reading the original document still gives valuable information about the compiler structure.

The compiler is conceptually structured in two parts: pass1 which is language-dependent, does parsing, typechecking and build trees, and pass2 which is mostly language-independent.

The main data structure used through the whole compiler is the NODE. The parse trees are built out of NODEs and it is the underlying data structure when generating code.

The instruction generator searches for largest coverable tree, as described in the Dragon book for example, but with additions from me to find the most efficient instruction when there are multiple choices.

The Sethi-Ullman calculations are pretty much straightforward, can be found in almost any book on compilers. Its implementation in the original pcc were hand-crafted in target-dependent code, now it's only used to find the sub-tree evaluation order (and not to store trees) so the calculations are done based on the instructions generated.

The register allocator is implemented with its basis in the paper "Iterated Register Coalescing", ACM Transactions, No 3, May 1996. I've tried to keep it close to the pseudo-code.

The addition to the register allocator to handle multiple register classes (with overlapping regs) is mostly done by me. I haven't documented it, but some of the basics comes from the paper

A generalized algorithm for graph-coloring register allocation, ACM SIGPLAN Notices, v.39 n.6, May 2004, but i implemented a simpler and faster variant :-)

Most of the other code and ideas has its origin in either the dragon book or the Modern Compiler Implementation in Java book.

TODO: once cvsweb is setup, point to specific code as appropriate in this internals documentation.