Reverse Engineering/Compilers

From Wikibooks, open books for an open world

This chapter gives a basic overview of how compilers work (what a compiler does, and how, is a fairly complex topic that deserves a wikibook of its own). We start by building a basic vocabulary for the topic and then look at the general structure of a compiler.

Key terms

Compiler
A compiler is a program that converts instructions in one language into equivalent instructions in another. There is a common misconception that a compiler always converts high-level constructs into machine language; this is not always the case. Many compilers translate code in one language into code in another language, for example assembly language. Common examples of compiled languages: C/C++, Fortran, Ada, and Visual Basic.
Interpreter
An interpreter is a program that executes a file of instructions in human-readable form. Such programs, or "scripts", are not compiled ahead of time but are interpreted at runtime. Interpreting a script every time it is executed takes more time than running a compiled program, but the trade-off is ease of use. Common examples of interpreted languages are Perl, Python, Lisp, and PHP.
Virtual Machine
A virtual machine is a program that executes bytecode on a local machine. Since bytecode is not machine-dependent, only the virtual machine needs to be adapted to a target machine for the bytecode to run.
Source Language
The source language is what the compiler "compiles." For instance, a C compiler compiles the C language.
Intermediate Representation
When a compiler receives an input code file in the source language, it performs several steps. First, the file is read in and tokenized: parts of the code are converted to tokens, and those tokens are arranged internally in a form that helps the compiler with its other tasks. This internal form is the intermediate representation.
Target Language
The target language is what the compiler is supposed to produce. For instance, the target language of a C compiler is frequently either assembly language or native machine code.
Target Platform
The target platform is the machine on which the compiled code is meant to run. A "cross-compiler" is a compiler that takes high-level code as input and outputs instructions (usually in assembly or machine code) for a machine that is different from the machine the compiler runs on. For instance, a developer on an Intel machine may write code to be used on a SPARC target platform.
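
To make the "Virtual Machine" entry above more concrete, here is a minimal sketch of a stack-based virtual machine. The bytecode format (a list of opcode/operand pairs) and the opcode names PUSH, ADD and MUL are hypothetical, invented for this illustration; real virtual machines define their own formats.

```python
# Hypothetical bytecode for illustration: a list of (opcode, operand) pairs.
def run_bytecode(program):
    """Execute a tiny stack-based bytecode and return the final stack top."""
    stack = []
    for opcode, operand in program:
        if opcode == "PUSH":
            stack.append(operand)
        elif opcode == "ADD":
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif opcode == "MUL":
            right = stack.pop()
            left = stack.pop()
            stack.append(left * right)
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return stack.pop()

# Bytecode for the expression (2 + 3) * 4
program = [("PUSH", 2), ("PUSH", 3), ("ADD", None), ("PUSH", 4), ("MUL", None)]
result = run_bytecode(program)
print(result)  # prints 20
```

The same `program` list would run unchanged on any machine that has an implementation of `run_bytecode`, which is the sense in which only the virtual machine needs to be adapted to the target.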

Front end: Source to Intermediate Representation

The front end of a compiler is a module that reads in the source code data, tokenizes it, and converts the code into an intermediate representation. In a standard layered approach to compiler design, the front end encompasses the "Lexical Analyzer" and the "Parser" modules of the compiler.
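
As a rough illustration of the "Lexical Analyzer" step, the sketch below tokenizes a small arithmetic language with regular expressions. The token names (NUMBER, IDENT, OP) and the language itself are assumptions made for this example, not part of any particular compiler.

```python
import re

# Hypothetical token set for a tiny arithmetic language.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENT",  r"[A-Za-z_]\w*"),
    ("OP",     r"[+\-*/=()]"),
    ("SKIP",   r"\s+"),
]
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

def tokenize(source):
    """Split source text into (token_type, text) pairs, dropping whitespace."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at offset {pos}")
        if match.lastgroup != "SKIP":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens

tokens = tokenize("x = 2 + 3 * y")
print(tokens)
```

The parser then works on this token list rather than on raw characters, which keeps the two modules independent of each other.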

Front ends are commonly generated with tools such as Lex (for the lexical analyzer) and Yacc (for the parser), or their variants.

Intermediate representations vary for each compiler, but frequently take the shape of either a tree or an instruction stack.
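
A tree-shaped intermediate representation can be sketched as follows. This is a minimal recursive-descent parser for expressions built from integers, `+`, `*` and parentheses; the nested-tuple node layout (`("num", value)` and `(operator, left, right)`) is an assumption chosen for brevity, not a standard format.

```python
import re

def parse(source):
    """Parse an arithmetic expression into a nested-tuple tree."""
    # Simplified tokenizer: characters outside this tiny language are ignored.
    tokens = re.findall(r"\d+|[+*()]", source)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        return token

    def parse_expr():       # expr := term ('+' term)*
        node = parse_term()
        while peek() == "+":
            advance()
            node = ("+", node, parse_term())
        return node

    def parse_term():       # term := factor ('*' factor)*
        node = parse_factor()
        while peek() == "*":
            advance()
            node = ("*", node, parse_factor())
        return node

    def parse_factor():     # factor := NUMBER | '(' expr ')'
        token = peek()
        if token is None:
            raise SyntaxError("unexpected end of input")
        advance()
        if token == "(":
            node = parse_expr()
            if peek() != ")":
                raise SyntaxError("expected ')'")
            advance()
            return node
        if token.isdigit():
            return ("num", int(token))
        raise SyntaxError(f"unexpected token {token!r}")

    tree = parse_expr()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree

tree = parse("2 + 3 * 4")
print(tree)  # ('+', ('num', 2), ('*', ('num', 3), ('num', 4)))
```

Operator precedence ends up encoded in the shape of the tree: the multiplication sits below the addition, so later stages need no knowledge of the source syntax.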

Back end: Target Code Generation

Once the input file has been scanned and parsed into the intermediate representation, the code generator begins its job of outputting the target code. A code generator may be either a passive translator or an active optimizing generator.
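
The difference between a passive translator and an optimizing generator can be sketched on a small expression tree. The tree layout (nested tuples) and the target instructions (PUSH, LOAD, ADD, MUL for an imaginary stack machine) are assumptions made for this example, and constant folding is used here only as one simple representative of an optimization.

```python
def generate(node):
    """Passive translation: emit instructions for each tree node as-is."""
    kind = node[0]
    if kind == "num":
        return [f"PUSH {node[1]}"]
    if kind == "var":
        return [f"LOAD {node[1]}"]
    opcode = {"+": "ADD", "*": "MUL"}[kind]
    return generate(node[1]) + generate(node[2]) + [opcode]

def fold_constants(node):
    """Optimization pass: evaluate operations whose operands are both constants."""
    if node[0] in ("num", "var"):
        return node
    left = fold_constants(node[1])
    right = fold_constants(node[2])
    if left[0] == "num" and right[0] == "num":
        value = left[1] + right[1] if node[0] == "+" else left[1] * right[1]
        return ("num", value)
    return (node[0], left, right)

# Tree for the expression x + 3 * 4
tree = ("+", ("var", "x"), ("*", ("num", 3), ("num", 4)))
passive = generate(tree)
optimized = generate(fold_constants(tree))
print(passive)    # ['LOAD x', 'PUSH 3', 'PUSH 4', 'MUL', 'ADD']
print(optimized)  # ['LOAD x', 'PUSH 12', 'ADD']
```

For a reverse engineer, the second listing is the more typical case: the constant 12 appears in the output even though it never appears in the source.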

For a discussion of optimizations that occur during Code Generation, see Reverse Engineering/Code Optimization, and the chapters on Interleaving and Unintuitive Instructions.

Further reading

  • Aho, Alfred V. et al. "Compilers: Principles, Techniques and Tools," Addison Wesley, 1986. ISBN 0321428900
  • Steven Muchnick, "Advanced Compiler Design & Implementation," Morgan Kaufmann Publishers, 1997. ISBN 1-55860-320-4
  • Compiler Construction