Compiler
Overview
A compiler is a computer program that translates source code written in one programming language into another computer language. The most common reason for translating source code is to create an executable program. The name "compiler" is primarily used for programs that translate source code from a high-level programming language to a lower-level language, such as assembly language or machine code.
History
The A-0 System, developed by Grace Hopper in the early 1950s, is generally credited as the first compiler, and Hopper also coined the term "compiler". Early compilers were written in assembly language. The first self-hosting compiler, capable of compiling its own source code in a high-level language, was created in 1962 for Lisp by Tim Hart and Mike Levin at MIT.
Compiler Construction
The process of creating a compiler is itself a subject of study within computer science. A compiler typically works in several phases, each of which may have its own sub-phases. The phases include preprocessing, lexical analysis, parsing, semantic analysis, code generation, and code optimization.
Preprocessing
A preprocessor performs macro substitution, language extension, and a minimal level of syntax checking. It also handles the inclusion of header files and any other textual processing of the source code before it is passed to the next stage.
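Macro substitution can be illustrated with a minimal sketch. The `#define NAME value` directive syntax, the `preprocess` function, and the restriction to simple name-to-text macros are all assumptions made for this example; real preprocessors also handle macros with parameters, conditionals, and file inclusion.

```python
import re

# Hypothetical directive syntax: "#define NAME replacement text"
DEFINE = re.compile(r"\s*#define\s+(\w+)\s*(.*)")
WORD = re.compile(r"\b\w+\b")

def preprocess(source: str) -> str:
    """Expand simple name-to-text macros and drop the directive lines."""
    macros = {}
    output = []
    for line in source.splitlines():
        directive = DEFINE.match(line)
        if directive:
            # Record the macro; the directive itself is not passed on.
            macros[directive.group(1)] = directive.group(2).strip()
            continue
        # Replace whole words that name a macro; other words are kept as-is.
        expanded = WORD.sub(lambda m: macros.get(m.group(0), m.group(0)), line)
        output.append(expanded)
    return "\n".join(output)
```

Because only whole words are matched, a macro named `SIZE` is not expanded inside a longer identifier such as `RESIZE`.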
Lexical Analysis
The lexical analyzer, or scanner, partitions the input text into a sequence of tokens. A token is a sequence of characters that represents a single logical entity, such as an identifier, a keyword, or an operator.
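A scanner of this kind might be sketched with regular expressions as follows. The token kinds, the keyword set, and the names `Token` and `tokenize` are hypothetical choices for a small C-like language, not part of any particular compiler.

```python
import re
from typing import Iterator, NamedTuple

class Token(NamedTuple):
    kind: str
    text: str

# Hypothetical keyword set for the illustrative language.
KEYWORDS = {"if", "else", "while", "return"}

# Order matters: earlier alternatives win when several could match.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("OP", r"==|!=|<=|>=|[+\-*/=<>(){};]"),
    ("SKIP", r"\s+"),
    ("ERROR", r"."),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

def tokenize(source: str) -> Iterator[Token]:
    """Yield tokens in source order, discarding whitespace."""
    for match in MASTER.finditer(source):
        kind, text = match.lastgroup, match.group()
        if kind == "SKIP":
            continue
        if kind == "ERROR":
            raise SyntaxError(f"unexpected character {text!r}")
        if kind == "IDENT" and text in KEYWORDS:
            kind = "KEYWORD"  # keywords are identifiers with reserved spellings
        yield Token(kind, text)
```

For example, `list(tokenize("if x1 >= 42"))` produces a keyword, an identifier, an operator, and a number token, in that order.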
Parsing
The parser takes the tokens produced during lexical analysis and constructs an abstract syntax tree (AST). The AST represents the grammatical structure of the program.
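As an illustration, a recursive-descent parser for a tiny expression grammar could look like the sketch below. The grammar (integers, `+`, `*`, parentheses), the nested-tuple AST shape such as `("add", left, right)`, and the function name `parse` are assumptions made for this example. The embedded tokenizer is deliberately crude and silently skips characters it does not recognize.

```python
import re

def parse(source: str):
    """Parse integers, '+', '*' and parentheses into a nested-tuple AST."""
    tokens = re.findall(r"\d+|[+*()]", source)  # crude tokenizer for the sketch
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def expr():    # expr := term ('+' term)*
        node = term()
        while peek() == "+":
            advance()
            node = ("add", node, term())
        return node

    def term():    # term := factor ('*' factor)*
        node = factor()
        while peek() == "*":
            advance()
            node = ("mul", node, factor())
        return node

    def factor():  # factor := NUMBER | '(' expr ')'
        tok = peek()
        if tok is None:
            raise SyntaxError("unexpected end of input")
        advance()
        if tok == "(":
            node = expr()
            if peek() != ")":
                raise SyntaxError("expected ')'")
            advance()
            return node
        if tok.isdigit():
            return ("num", int(tok))
        raise SyntaxError(f"unexpected token {tok!r}")

    tree = expr()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree
```

Giving `term` its own rule below `expr` is what makes `*` bind more tightly than `+`, so the tree for `1 + 2 * 3` has the multiplication as the right child of the addition.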
Semantic Analysis
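Semantic analysis is the phase in which the compiler adds semantic information to the AST and builds the symbol table. This phase performs semantic checks such as type checking, object binding (associating variable and function references with their definitions), and definite assignment (requiring every local variable to be initialized before use).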
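The interplay between the symbol table and type checking can be sketched as follows. The statement forms `("declare", name, type)` and `("assign", name, expr)`, the expression node kinds, and the functions `type_of` and `analyze` are hypothetical and chosen only to keep the example small.

```python
def type_of(node, symbols):
    """Return the type of an expression node, or raise on a semantic error."""
    kind = node[0]
    if kind == "num":
        return "int"
    if kind == "str":
        return "string"
    if kind == "var":
        name = node[1]
        if name not in symbols:  # binding check: the name must be declared
            raise NameError(f"undeclared variable {name!r}")
        return symbols[name]
    if kind == "add":
        left = type_of(node[1], symbols)
        right = type_of(node[2], symbols)
        if left != right:        # type check: operands must agree
            raise TypeError(f"cannot add {left} and {right}")
        return left
    raise ValueError(f"unknown node kind {kind!r}")

def analyze(program):
    """Build a symbol table from declarations and check every assignment."""
    symbols = {}
    for statement in program:
        if statement[0] == "declare":
            _, name, type_name = statement
            if name in symbols:
                raise NameError(f"{name!r} is already declared")
            symbols[name] = type_name
        elif statement[0] == "assign":
            _, name, expr = statement
            if name not in symbols:
                raise NameError(f"undeclared variable {name!r}")
            actual = type_of(expr, symbols)
            if actual != symbols[name]:
                raise TypeError(f"cannot assign {actual} to {symbols[name]} {name!r}")
        else:
            raise ValueError(f"unknown statement {statement[0]!r}")
    return symbols
```

This sketch covers only name binding and type checking; definite assignment would need additional tracking of which variables have been given a value.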
Code Generation
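The code generation phase translates the AST, or an intermediate representation derived from it, into machine code or bytecode. This involves resource allocation (for example, deciding which values are kept in registers), instruction scheduling, and the translation itself.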
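A minimal sketch of the translation step is shown below, targeting a hypothetical stack machine rather than real hardware so that register allocation and scheduling can be left out. The nested-tuple AST shape, the `PUSH`/`ADD`/`MUL` instruction set, and the helper `run` are assumptions made for this example.

```python
def generate(node):
    """Emit stack-machine instructions for an expression tree (post-order walk)."""
    kind = node[0]
    if kind == "num":
        return [("PUSH", node[1])]
    if kind in ("add", "mul"):
        # Operands first, then the operator that consumes them.
        return generate(node[1]) + generate(node[2]) + [(kind.upper(),)]
    raise ValueError(f"unknown node kind {kind!r}")

def run(code):
    """Tiny interpreter for the hypothetical instruction set, used to check output."""
    stack = []
    for instruction in code:
        if instruction[0] == "PUSH":
            stack.append(instruction[1])
        else:
            right, left = stack.pop(), stack.pop()
            stack.append(left + right if instruction[0] == "ADD" else left * right)
    return stack.pop()
```

For the tree of `1 + 2 * 3`, `generate` emits three pushes followed by `MUL` and `ADD`, and `run` evaluates that sequence to 7.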
Code Optimization
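Code optimization is listed here as the final phase of a compiler. Its purpose is to improve the intermediate code so that the output runs faster and takes less space. In practice, most optimization passes run on an intermediate representation before final code generation, and some run on the generated machine code itself.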
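Constant folding is one of the simplest optimizations and can serve as an illustration: subexpressions whose operands are all known at compile time are replaced by their value. The nested-tuple tree shape and the function name `fold` are assumptions made for this sketch.

```python
def fold(node):
    """Replace constant subexpressions with their computed value."""
    kind = node[0]
    if kind in ("num", "var"):
        return node  # leaves are already as simple as they can be
    if kind not in ("add", "mul"):
        raise ValueError(f"unknown node kind {kind!r}")
    left, right = fold(node[1]), fold(node[2])
    if left[0] == "num" and right[0] == "num":
        value = left[1] + right[1] if kind == "add" else left[1] * right[1]
        return ("num", value)
    # At least one operand is unknown until run time; keep the operation.
    return (kind, left, right)
```

Applied to the tree for `x + 2 * 3`, the multiplication is folded to `6` while the addition involving the variable `x` is left in place.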
Types of Compilers
There are several different types of compilers, each suited to different kinds of programming languages and hardware environments.
Single-pass Compilers
A single-pass compiler generates machine code directly while reading the source code once, without revisiting earlier parts of the program.
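The single-pass idea can be sketched by emitting instructions during parsing instead of first building a tree. The expression grammar, the textual `PUSH`/`ADD`/`MUL` instructions for a hypothetical stack machine, and the function name `compile_single_pass` are assumptions made for this example.

```python
import re

def compile_single_pass(source: str) -> list:
    """Emit stack-machine code while parsing; no AST is ever built."""
    tokens = re.findall(r"\d+|[+*()]", source)  # crude tokenizer for the sketch
    code = []
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take(expected=None):
        nonlocal pos
        tok = peek()
        if tok is None or (expected is not None and tok != expected):
            raise SyntaxError(f"unexpected token {tok!r}")
        pos += 1
        return tok

    def expr():    # expr := term ('+' term)*
        term()
        while peek() == "+":
            take("+")
            term()
            code.append("ADD")  # emitted as soon as both operands are done

    def term():    # term := factor ('*' factor)*
        factor()
        while peek() == "*":
            take("*")
            factor()
            code.append("MUL")

    def factor():  # factor := NUMBER | '(' expr ')'
        tok = take()
        if tok == "(":
            expr()
            take(")")
        elif tok.isdigit():
            code.append(f"PUSH {tok}")
        else:
            raise SyntaxError(f"unexpected token {tok!r}")

    expr()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return code
```

Since each instruction is appended the moment its construct is recognized, the compiler never looks back at earlier input, which is also why this style cannot easily perform optimizations that need a view of the whole program.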
Multi-pass Compilers
A multi-pass compiler breaks down the process of translating source code into machine code into multiple passes, each of which performs a specific task.
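One common reason for needing more than one pass is a forward reference, where code refers to something defined later in the input. The sketch below resolves jump labels in two passes over a hypothetical assembly-like listing; the `JMP` instruction, the `name:` label syntax, and the function name `assemble` are assumptions made for this example.

```python
def assemble(lines):
    """Resolve jump labels in two passes over an assembly-like listing."""
    # Pass 1: record the instruction index that each label refers to.
    labels = {}
    instructions = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.endswith(":"):
            labels[line[:-1]] = len(instructions)
        else:
            instructions.append(line.split())

    # Pass 2: every label is now known, so forward jumps can be resolved.
    resolved = []
    for op, *args in instructions:
        if op == "JMP":
            if len(args) != 1 or args[0] not in labels:
                raise NameError(f"bad jump target in {op} {' '.join(args)}")
            resolved.append((op, labels[args[0]]))
        else:
            resolved.append((op, *args))
    return resolved
```

In a listing such as `JMP end`, `PUSH 1`, `end:`, `HALT`, the target of the jump is not yet known when the first line is read, so the first pass only collects label positions and the second pass fills them in.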
Just-in-time Compilers
A just-in-time (JIT) compiler translates a program's code, typically bytecode or another intermediate form and sometimes source code, into machine code at run time, just before the code is executed.
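The control flow of just-in-time compilation, compiling on first use and reusing the result afterwards, can be sketched in Python. This is only an analogy: the built-in `compile` produces CPython bytecode rather than machine code, and the class `LazyCompiled` and its `compilations` counter are hypothetical names introduced for this example.

```python
class LazyCompiled:
    """Compile an expression on its first call, then reuse the compiled form."""

    def __init__(self, expression: str):
        self.expression = expression
        self.code = None        # nothing is compiled until it is needed
        self.compilations = 0   # counter kept only to make the behavior visible

    def __call__(self, **variables):
        if self.code is None:
            # Compilation happens here, just before the first execution.
            self.code = compile(self.expression, "<jit-sketch>", "eval")
            self.compilations += 1
        # Evaluate with no builtins and only the supplied variables in scope.
        return eval(self.code, {"__builtins__": {}}, variables)
```

Creating `square_plus_one = LazyCompiled("x * x + 1")` compiles nothing; the first call such as `square_plus_one(x=3)` triggers compilation, and later calls reuse the cached code object.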
Compiler Design
Compiler design can define an end-to-end solution or tackle a defined subset that interfaces with other compilation tools, such as preprocessors, assemblers, and linkers.
Compiler Correctness
Compiler correctness is the branch of software engineering that tries to show that a compiler behaves according to its language specification. Techniques include developing the compiler using formal methods and rigorously testing an existing compiler (often called compiler validation).
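One testing approach can be sketched as comparing a compiler's output against a reference interpreter on randomly generated programs. Everything in the example below is hypothetical: the nested-tuple expression trees, the toy compiler `compile_expr` targeting a stack machine, and the function names. A real validation suite would be far larger and would test against the language specification.

```python
import random

def interpret(node):
    """Reference semantics: evaluate the expression tree directly."""
    if node[0] == "num":
        return node[1]
    left, right = interpret(node[1]), interpret(node[2])
    return left + right if node[0] == "add" else left * right

def compile_expr(node):
    """Compiler under test: emit instructions for a toy stack machine."""
    if node[0] == "num":
        return [("PUSH", node[1])]
    return compile_expr(node[1]) + compile_expr(node[2]) + [(node[0].upper(),)]

def execute(code):
    """Run the compiled instructions and return the final value."""
    stack = []
    for instruction in code:
        if instruction[0] == "PUSH":
            stack.append(instruction[1])
        else:
            right, left = stack.pop(), stack.pop()
            stack.append(left + right if instruction[0] == "ADD" else left * right)
    return stack.pop()

def random_expr(rng, depth):
    """Generate a random expression tree of at most the given depth."""
    if depth == 0 or rng.random() < 0.3:
        return ("num", rng.randint(-9, 9))
    op = rng.choice(["add", "mul"])
    return (op, random_expr(rng, depth - 1), random_expr(rng, depth - 1))

def validate(trials=200, seed=0):
    """Return a tree on which compiler and interpreter disagree, or None."""
    rng = random.Random(seed)
    for _ in range(trials):
        tree = random_expr(rng, depth=4)
        if execute(compile_expr(tree)) != interpret(tree):
            return tree  # counterexample found
    return None
```

A result of `None` from `validate()` means no disagreement was found in the sampled programs, which raises confidence but, unlike a formal proof, does not establish correctness.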