Six Phases of a Compiler: Lexical Analysis, Syntax Analysis, Semantic Analysis, Intermediate Code Generation, Code Optimization, Code Generation
Six Phases of Compiler
A compiler translates high-level code into machine code through six main phases:
Lexical Analysis
Syntax Analysis
Semantic Analysis
Intermediate Code Generation
Code Optimization
Code Generation
Lexical Analysis (Scanning)
Input: Source Code (High-level language)
Output: Tokens
Purpose: Converts the source code into tokens (the smallest meaningful units of the program).
Performs:
Removes whitespace and comments
Identifies keywords, identifiers, operators, and symbols
Records identifiers in the Symbol Table
Example:
Input: int x = 10;
Output: int (keyword), x (identifier), = (operator), 10 (integer literal), ; (separator)
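As a rough sketch of how a scanner might tokenize the input above, the following uses Python's `re` module. The token categories, the keyword list, and the `tokenize` function are assumptions made for illustration, not a complete lexer for any real language.

```python
import re

# Hypothetical token categories for a tiny C-like language.
# Order matters: comments must be tried before the "/" operator,
# and keywords before general identifiers.
TOKEN_SPEC = [
    ("COMMENT",    r"//[^\n]*|/\*.*?\*/"),
    ("WHITESPACE", r"\s+"),
    ("KEYWORD",    r"\b(?:int|float|char|return|if|else|while)\b"),
    ("IDENTIFIER", r"[A-Za-z_]\w*"),
    ("NUMBER",     r"\d+"),
    ("OPERATOR",   r"[=+\-*/]"),
    ("SEPARATOR",  r"[;(){},]"),
]
MASTER = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC),
    re.DOTALL,
)

def tokenize(source):
    """Return a list of (kind, text) pairs, skipping whitespace and comments."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at position {pos}")
        kind = match.lastgroup
        if kind not in ("COMMENT", "WHITESPACE"):
            tokens.append((kind, match.group()))
        pos = match.end()
    return tokens

for kind, text in tokenize("int x = 10; // declare x"):
    print(kind, text)
```

Whitespace and the trailing comment are matched but never emitted, which mirrors the "removes whitespace and comments" step.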
Syntax Analysis (Parsing)
Input: Tokens from Lexical Analysis
Output: Parse Tree (Syntax Tree)
Purpose: Checks whether the token sequence follows the grammar rules of the language, using parsing techniques such as LL(1) or LR(1).
Performs:
Builds Parse Tree
Detects Syntax Errors
Example:
Input: int x = 10;
Output: Parse Tree
If we write int = x 10; the compiler will report a syntax error.
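The check can be sketched as a tiny hand-written recursive-descent parser. It assumes a hypothetical one-rule grammar (declaration → type identifier = number ;) and tokens written by hand as (kind, text) pairs; the `Parser` class and its method names are illustrative only.

```python
class Parser:
    """Recursive-descent parser for one hypothetical rule:
    declaration -> KEYWORD IDENTIFIER '=' NUMBER ';'
    """

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def expect(self, kind, text=None):
        # Consume the next token if it matches, otherwise report a syntax error.
        if self.pos >= len(self.tokens):
            raise SyntaxError(f"expected {text or kind}, found end of input")
        tok_kind, tok_text = self.tokens[self.pos]
        if tok_kind != kind or (text is not None and tok_text != text):
            raise SyntaxError(f"expected {text or kind}, found {tok_text!r}")
        self.pos += 1
        return tok_text

    def declaration(self):
        type_name = self.expect("KEYWORD")
        name = self.expect("IDENTIFIER")
        self.expect("OPERATOR", "=")
        value = self.expect("NUMBER")
        self.expect("SEPARATOR", ";")
        # The parse tree is represented as nested tuples.
        return ("declaration", ("type", type_name), ("name", name), ("value", value))


good = [("KEYWORD", "int"), ("IDENTIFIER", "x"), ("OPERATOR", "="),
        ("NUMBER", "10"), ("SEPARATOR", ";")]
bad = [("KEYWORD", "int"), ("OPERATOR", "="), ("IDENTIFIER", "x"),
       ("NUMBER", "10"), ("SEPARATOR", ";")]

print(Parser(good).declaration())
try:
    Parser(bad).declaration()
except SyntaxError as err:
    print("syntax error:", err)
```

The first call builds a tree for int x = 10; while the second fails as soon as = appears where an identifier is required.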
Semantic Analysis
Input: Parse Tree
Output: Annotated Parse Tree
Purpose: Checks that the program is meaningful, i.e. that it obeys the language's type and declaration rules.
Performs:
Type Checking (int x = "hello"; is invalid)
Variable declaration checking
Function parameter matching
Example:
Code: int x = "hello";
Error: Type Mismatch
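A minimal sketch of the type and declaration checks, assuming declarations have already been parsed into hypothetical (declared_type, name, value) triples. The helper names and error wording are illustrative, not taken from any real compiler.

```python
def literal_type(value):
    """Map a Python literal to the name of a hypothetical source-language type."""
    if isinstance(value, str):
        return "string"
    if isinstance(value, float):
        return "float"
    return "int"

def check_declarations(declarations):
    """Return (symbol_table, errors) for a list of (type, name, value) triples."""
    symbol_table = {}
    errors = []
    for declared_type, name, value in declarations:
        # Declaration checking: a name may be declared only once.
        if name in symbol_table:
            errors.append(f"'{name}' is already declared")
            continue
        # Type checking: the initializer must match the declared type.
        actual = literal_type(value)
        if actual != declared_type:
            errors.append(
                f"type mismatch: cannot assign {actual} to {declared_type} '{name}'"
            )
        symbol_table[name] = declared_type
    return symbol_table, errors

table, errors = check_declarations([("int", "x", 10), ("int", "y", "hello")])
print(table)
print(errors)
```

Here the declaration of y is syntactically valid but is rejected because a string initializer does not match the declared int type.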
Intermediate Code Generation
Input: Annotated Parse Tree
Output: Intermediate Code (3-address code, AST)
Purpose: Generates a machine-independent representation of the program.
Performs:
Converts code to Intermediate Representation (IR)
Uses Three-Address Code (TAC)
Example:
Code:
TAC Output:
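As an illustration of three-address code, the sketch below lowers a hypothetical statement a = b + c * d, with the expression given as a nested-tuple tree. The tree shape, the temporary names t1, t2, ..., and the `generate_tac` function are assumptions made for this example.

```python
def generate_tac(target, expr):
    """Lower an expression tree to three-address code.

    A tree node is either a variable name (str) or a tuple (op, left, right).
    Each emitted instruction has at most one operator on its right-hand side.
    """
    code = []
    temp_count = 0

    def visit(node):
        nonlocal temp_count
        if isinstance(node, str):
            return node  # a plain variable needs no instruction
        op, left, right = node
        left_name = visit(left)
        right_name = visit(right)
        temp_count += 1
        temp = f"t{temp_count}"
        code.append(f"{temp} = {left_name} {op} {right_name}")
        return temp

    result = visit(expr)
    code.append(f"{target} = {result}")
    return code

# a = b + c * d
for line in generate_tac("a", ("+", "b", ("*", "c", "d"))):
    print(line)
# prints:
# t1 = c * d
# t2 = b + t1
# a = t2
```

Because the multiplication sits deeper in the tree, it is emitted first, so operator precedence is already resolved in the resulting code.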
Code Optimization
Input: Intermediate Code
Output: Optimized Intermediate Code
Purpose: Makes the code faster and more memory-efficient without changing its behavior.
Performs:
Constant Folding → 5 + 3 → 8
Dead Code Elimination → Removes unreachable code and computations whose results are never used
Loop Optimization → Moves invariant expressions out of loops
Example:
Before Optimization:
After Optimization:
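Constant folding, for instance, can be sketched as a recursive pass over an expression tree. The nested-tuple tree format and the `fold` function are assumptions for illustration. Only +, - and * on integers are handled, and division is left out to avoid divide-by-zero questions.

```python
import operator

# Operators this sketch knows how to evaluate at compile time.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def fold(node):
    """Replace operator nodes whose operands are both integer constants
    with the computed value; leave everything else unchanged."""
    if not isinstance(node, tuple):
        return node  # a variable name or a constant
    op, left, right = node
    left, right = fold(left), fold(right)
    if isinstance(left, int) and isinstance(right, int):
        return OPS[op](left, right)
    return (op, left, right)

# Before: x + (5 + 3)      After: x + 8
print(fold(("+", "x", ("+", 5, 3))))   # ('+', 'x', 8)
# Before: (5 + 3) * 2      After: 16
print(fold(("*", ("+", 5, 3), 2)))     # 16
```

Subtrees that involve a variable, such as x + 8, cannot be folded further and are returned as they are.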
Code Generation
Input: Optimized Intermediate Code
Output: Target Machine Code (Assembly/Binary)
Purpose: Converts IR to actual machine code for execution.
Performs:
Register Allocation
Instruction Selection
Machine-specific code generation
Example:
IR:
Assembly Code (x86):
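A deliberately naive sketch of instruction selection follows. It assumes hypothetical IR tuples of the form (dest, left, op, right), keeps every variable in memory under its own label, and routes all arithmetic through the single register eax, so no real register allocation takes place. The output is x86-style text in Intel syntax, meant as an illustration and not as the output of any real compiler.

```python
# Mapping from IR operators to x86 mnemonics (only a small subset).
X86_OPS = {"+": "add", "-": "sub", "*": "imul"}

def operand(name):
    """Integer literals become immediates; any other name is a memory label."""
    return name if name.lstrip("-").isdigit() else f"[{name}]"

def generate_assembly(instructions):
    """Translate (dest, left, op, right) tuples into x86-style lines.
    A plain copy is written with op and right set to None."""
    asm = []
    for dest, left, op, right in instructions:
        asm.append(f"mov eax, {operand(left)}")       # load the first operand
        if op is not None:
            asm.append(f"{X86_OPS[op]} eax, {operand(right)}")
        asm.append(f"mov [{dest}], eax")              # store the result
    return asm

ir = [
    ("t1", "c", "*", "d"),     # t1 = c * d
    ("t2", "b", "+", "t1"),    # t2 = b + t1
    ("a", "t2", None, None),   # a = t2
]
print("\n".join(generate_assembly(ir)))
```

A real back end would keep t1 and t2 in registers and skip the redundant stores and reloads, which is what the register allocation step is for.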
Summary of Compiler Phases
Phase | Input | Output | Purpose |
---|---|---|---|
Lexical Analysis | Source Code | Tokens | Tokenization |
Syntax Analysis | Tokens | Parse Tree | Grammar Checking |
Semantic Analysis | Parse Tree | Annotated Parse Tree | Type Checking |
Intermediate Code Generation | Annotated Parse Tree | Intermediate Code | Machine Independence |
Code Optimization | Intermediate Code | Optimized Intermediate Code | Faster, Leaner Code |
Code Generation | Optimized Intermediate Code | Target Machine Code | Executable Output |
Final Notes
Lexical, Syntax, and Semantic Analysis → Analyze the source and detect errors.
Intermediate Code Generation and Code Optimization → Provide a machine-independent form and improve efficiency.
Code Generation → Produces the final machine-executable code.