
Revision 1.1


The process of taking a C program and converting it into a binary executable happens in quite a few steps.  The basic process is:

    pre-processor -> compiler -> assembler -> linker

The whole sequence is driven by _pcc_.  You can watch each of these steps being run by invoking _pcc_ with _-v_:

    $ pcc -v helloworld.c

_pcc_ is actually a "wrapper" program which invokes other programs to perform stages of the compilation process.

__The C source file__

The C source file contains the following information:

- C code - this is the code which is compiled into the executable
- pre-processor directives - these directives instruct the pre-processor to include files and expand macros
- pragmas - these appear similar to pre-processor directives but are interpreted by the C compiler to control the generation of instructions

__The pre-processor__

The pre-processor is the first step of the compilation process.  It reads the C source file and processes the pre-processor directives.  The most familiar directives are #include and #define.

You can see the output after the pre-processing step by invoking _pcc_ with _-E_:

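    $ pcc -E helloworld.c

The pre-processed result is written to standard output.  Pre-processed C source conventionally uses the suffix _.i_, so redirecting the output to _helloworld.i_ keeps it for inspection.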

__The compiler proper__

The real compilation work is performed by the compiler proper.  The compiler takes the pre-processed C source and generates assembly language code suitable for the system assembler.  The style and syntax of assembly language vary greatly between machines and operating systems.

You can see the output after the compilation step by invoking _pcc_ with _-S_:

    $ pcc -S helloworld.c

The compiled result is called _helloworld.s_.

__The assembler__

The assembler takes the assembly language program and creates object files.  Object files contain machine code, but not in a format suitable for executing yet.  These object files can be stored in libraries for later use.

You can see the output after the assembler step by invoking _pcc_ with _-c_:

    $ pcc -c helloworld.c

The assembled result is called _helloworld.o_.

__The linker__

The final step in the compilation process is to link the object file with the system library and startup files to generate the executable binary.

The system library is generally called libc and it is a library of other object files containing machine code.  The startup files are object files containing machine code to create an environment for the program.  Ever wondered where _argc_ and _argv_ come from?  The startup code obtains this information from the operating system and builds the parameters before invoking _main()_.

__Putting it all together__

Consider the following command-line using _pcc_ to compile a program:

    $ pcc -g -O -I/usr/local/include -D_DEBUG -Wl,-R/usr/local/lib prog.c

The first option is _-g_, which instructs the compilation process to include debugging information.  This option is not needed by the pre-processor, but is used by the compiler, assembler and sometimes the linker to put debugging information in the executable.  This information can be used by a symbolic debugger.

The second option is _-O_, which enables compiler optimizations.  It is only used by the compiler and is not needed by the pre-processor, assembler or linker.

The third option is _-I/usr/local/include_, which specifies an additional directory in which to find include files.  Since the pre-processor handles the #include directive, this option is only used by the pre-processor and is ignored by the compiler, assembler and linker.

The fourth option is _-D\_DEBUG_, which defines the pre-processor macro _\_DEBUG_.  It is only used by the pre-processor and is not needed by the compiler, assembler or linker.

The fifth option is _-Wl,-R/usr/local/lib_.  The _-Wl,_ prefix tells _pcc_ to pass the rest of the option through to the linker, and _-R/usr/local/lib_ instructs the linker to record this directory in the executable for use by the dynamic linker.  It is only used by the linker and is not needed by the pre-processor, compiler or assembler.

From this analysis, we can appreciate the steps required to achieve compilation and how all those command-line arguments work together to generate a binary executable.

If _pcc_ ever has a problem generating a binary executable, the problem can be traced to one of these steps of the build process.


