DIZNR INTERNATIONAL

Six Phases of a Compiler – Lexical / Syntax / Semantic / Intermediate Code Generation / Code Optimization / Code Generation

https://www.gyanodhan.com/video/7B2.%20GATE%20CSEIT/Compiler%20Design/280.%20Six%20Phases%20of%20compiler%20-%20Lexical%20%20%20Syntax%20%20%20Semantic%20%20%20Intermediate%20code%20%20generation%20%20%20code%20optimiza.mp4

 Six Phases of Compiler

A compiler translates high-level code into machine code through six main phases:

Lexical Analysis
Syntax Analysis
Semantic Analysis
Intermediate Code Generation
Code Optimization
Code Generation

 Lexical Analysis (Scanning)

Input: Source Code (High-level language)
Output: Tokens
Purpose: Converts the source code into tokens (the smallest meaningful units of the program).
Performs:
 Removes whitespace and comments
 Identifies keywords, identifiers, operators, symbols
 Records identifiers in the Symbol Table

Example:
Input: int x = 10;
Output:

TOKEN( int ), TOKEN( x ), TOKEN( = ), TOKEN( 10 ), TOKEN( ; )
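
The scanning step above can be sketched with regular expressions. This is a minimal illustration, not how any particular compiler does it: the token categories, the keyword list, and the `tokenize` function name are all assumptions chosen for the example.

```python
import re

# Hypothetical token categories; order matters (keywords are tried before identifiers).
TOKEN_SPEC = [
    ("COMMENT",    r"//[^\n]*|/\*.*?\*/"),
    ("WHITESPACE", r"\s+"),
    ("KEYWORD",    r"\b(?:int|float|char|return|if|else|while)\b"),
    ("IDENTIFIER", r"[A-Za-z_]\w*"),
    ("NUMBER",     r"\d+"),
    ("OPERATOR",   r"[=+\-*/]"),
    ("SYMBOL",     r"[;(){},]"),
]
MASTER = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC),
    re.DOTALL,
)

def tokenize(source):
    """Return a list of (kind, text) pairs, dropping whitespace and comments."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at position {pos}")
        kind = match.lastgroup
        if kind not in ("COMMENT", "WHITESPACE"):
            tokens.append((kind, match.group()))
        pos = match.end()
    return tokens

for kind, text in tokenize("int x = 10;"):
    print(f"{kind}( {text} )")
```

Each pair corresponds to one TOKEN( ... ) entry in the example output, with the category made explicit.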

 Syntax Analysis (Parsing)

Input: Tokens from Lexical Analysis
Output: Parse Tree (Syntax Tree)
Purpose: Checks if the code follows the grammar rules (using parsing techniques like LL(1), LR(1)).
Performs:
 Builds Parse Tree
 Detects Syntax Errors

Example:
Input: int x = 10;
Output: Parse Tree

     Assignment
    /    |    \
  int    x    10

 If we write int = x 10;, the compiler will show a syntax error.
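
A hand-written parser for this one statement form might look like the sketch below. It assumes the tokens arrive as (kind, text) pairs and that the only grammar rule is `type identifier = number ;`. Both are simplifications for illustration, and `parse_declaration` is a hypothetical name.

```python
def parse_declaration(tokens):
    """Parse `type identifier = number ;` and return a small tree as a tuple."""
    pos = 0

    def expect(kind, text=None):
        # Consume the next token if it has the expected kind (and text, if given).
        nonlocal pos
        wanted = text or kind
        if pos >= len(tokens):
            raise SyntaxError(f"expected {wanted}, found end of input")
        tok_kind, tok_text = tokens[pos]
        if tok_kind != kind or (text is not None and tok_text != text):
            raise SyntaxError(f"expected {wanted}, found {tok_text!r}")
        pos += 1
        return tok_text

    type_name = expect("KEYWORD")
    name = expect("IDENTIFIER")
    expect("OPERATOR", "=")
    value = expect("NUMBER")
    expect("SYMBOL", ";")
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos][1]!r}")
    return ("Assignment", type_name, name, value)

good = [("KEYWORD", "int"), ("IDENTIFIER", "x"), ("OPERATOR", "="),
        ("NUMBER", "10"), ("SYMBOL", ";")]
print(parse_declaration(good))

bad = [("KEYWORD", "int"), ("OPERATOR", "="), ("IDENTIFIER", "x"),
       ("NUMBER", "10"), ("SYMBOL", ";")]
try:
    parse_declaration(bad)
except SyntaxError as error:
    print("syntax error:", error)
```

The second call fails at the second token because the grammar requires an identifier after the type keyword, which is the same reason int = x 10; is rejected.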

 Semantic Analysis

Input: Parse Tree
Output: Annotated Parse Tree
Purpose: Checks meaning and correctness of the program.
Performs:
 Type Checking (int x = “hello”; is invalid)
 Variable declaration checking
 Function parameter matching

Example:
Code:

int x = "hello"; // Error: String assigned to integer

Error: Type Mismatch
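
A type check of this kind can be sketched as follows. The sketch assumes the initializer is already available as a Python value and compares types strictly. A real C compiler also applies implicit conversions (for example int to float), which are left out here. `SemanticError` and `check_declaration` are hypothetical names.

```python
# Maps the Python type of a literal to the source-language type it stands for.
LITERAL_TYPES = {int: "int", float: "float", str: "string"}

class SemanticError(Exception):
    """Raised when a declaration is syntactically valid but meaningless."""

def check_declaration(declared_type, name, value, symbol_table):
    """Check one declaration and record the variable in the symbol table."""
    if name in symbol_table:
        raise SemanticError(f"'{name}' is already declared")
    value_type = LITERAL_TYPES.get(type(value))
    if value_type != declared_type:
        raise SemanticError(
            f"type mismatch: cannot assign {value_type} to {declared_type} '{name}'"
        )
    symbol_table[name] = declared_type

table = {}
check_declaration("int", "x", 10, table)           # accepted
try:
    check_declaration("int", "y", "hello", table)  # rejected
except SemanticError as error:
    print("Error:", error)
```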

 Intermediate Code Generation

Input: Annotated Parse Tree
Output: Intermediate Code (3-address code, AST)
Purpose: Generates a machine-independent representation of the program.
Performs:
 Converts code to Intermediate Representation (IR)
 Uses Three-Address Code (TAC)

Example:
Code:

a = b + c * d;

TAC Output:

t1 = c * d
t2 = b + t1
a = t2
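
The translation to three-address code can be sketched as a post-order walk over the expression tree, creating one temporary per operator. To keep the example short it borrows Python's own `ast` module as the parser, which works only because this statement (without the semicolon) happens to be valid Python too. The `to_tac` name and the `t1`, `t2` naming scheme are assumptions.

```python
import ast

OPS = {ast.Add: "+", ast.Sub: "-", ast.Mult: "*", ast.Div: "/"}

def to_tac(statement):
    """Translate one assignment such as 'a = b + c * d' into three-address code."""
    node = ast.parse(statement).body[0]
    if not isinstance(node, ast.Assign) or not isinstance(node.targets[0], ast.Name):
        raise ValueError("expected a simple assignment")
    code = []
    counter = 0

    def emit(expr):
        # Returns the name that holds the value of expr, emitting code as needed.
        nonlocal counter
        if isinstance(expr, ast.Name):
            return expr.id
        if isinstance(expr, ast.Constant):
            return str(expr.value)
        if isinstance(expr, ast.BinOp) and type(expr.op) in OPS:
            left = emit(expr.left)
            right = emit(expr.right)
            counter += 1
            temp = f"t{counter}"
            code.append(f"{temp} = {left} {OPS[type(expr.op)]} {right}")
            return temp
        raise ValueError("unsupported expression")

    result = emit(node.value)
    code.append(f"{node.targets[0].id} = {result}")
    return code

print("\n".join(to_tac("a = b + c * d")))
```

Because operands are visited before their operator, the multiplication is emitted first, matching the order of the TAC output above.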

 Code Optimization

Input: Intermediate Code
Output: Optimized Intermediate Code
Purpose: Makes the code faster and memory-efficient.
Performs:
Constant Folding → Evaluates constant expressions at compile time (5 + 3 becomes 8)
Dead Code Elimination → Removes code that is unreachable or whose result is never used
Loop Optimization → Moves invariant expressions out of loops

Example:
Before Optimization:

int x = 2 * 10; // Multiplication every time

After Optimization:

int x = 20; // Precomputed at compile-time
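
Constant folding can be sketched as a bottom-up pass over an expression tree. The tree encoding used here, tuples of the form (operator, left, right) with integers and variable names as leaves, is an assumption made for the example, as is the `fold` function name.

```python
import operator

# Operators that are safe to evaluate at compile time in this sketch.
FOLDABLE = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def fold(expr):
    """Replace every subtree whose operands are all integer constants by its value."""
    if not isinstance(expr, tuple):
        return expr  # leaf: a constant or a variable name
    op, left, right = expr
    left, right = fold(left), fold(right)
    if op in FOLDABLE and isinstance(left, int) and isinstance(right, int):
        return FOLDABLE[op](left, right)
    return (op, left, right)

print(fold(("*", 2, 10)))            # the initializer of x
print(fold(("+", "y", ("+", 5, 3))))  # only the constant part is folded
```

Subtrees that mention a variable are left in place, since their value is not known until run time.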

 Code Generation

Input: Optimized Intermediate Code
Output: Target Machine Code (Assembly/Binary)
Purpose: Converts IR to actual machine code for execution.
Performs:
Register Allocation
Instruction Selection
Machine-specific code generation

Example:
IR:

t1 = c * d
t2 = b + t1
a = t2

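Assembly Code (x86, Intel syntax, assuming a, b, c, d are 32-bit integers in memory):

mov eax, [c]      ; eax = c
imul eax, [d]     ; eax = c * d
add eax, [b]      ; eax = b + c * d
mov [a], eax      ; a = eax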
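
A naive code generator for three-address code can be sketched as below. The LOAD/STORE/ADD/MUL instruction set and the single register R1 are invented for illustration and do not correspond to x86 or any other real machine. `generate` is a hypothetical name.

```python
MNEMONICS = {"+": "ADD", "-": "SUB", "*": "MUL", "/": "DIV"}

def generate(tac_lines):
    """Translate each three-address instruction into load / operate / store."""
    asm = []
    for line in tac_lines:
        target, expr = [part.strip() for part in line.split("=", 1)]
        parts = expr.split()
        if len(parts) == 1:               # plain copy, e.g. a = t2
            asm.append(f"LOAD R1, {parts[0]}")
        elif len(parts) == 3:             # binary operation, e.g. t1 = c * d
            left, op, right = parts
            asm.append(f"LOAD R1, {left}")
            asm.append(f"{MNEMONICS[op]} R1, {right}")
        else:
            raise ValueError(f"cannot translate: {line}")
        asm.append(f"STORE {target}, R1")
    return asm

print("\n".join(generate(["t1 = c * d", "t2 = b + t1", "a = t2"])))
```

This produces eight instructions where the hand-written version needs four, because every temporary is stored to memory and reloaded. Keeping temporaries in registers is the job of register allocation.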

 Summary of Compiler Phases

Phase                        | Input                | Output               | Purpose
Lexical Analysis             | Source Code          | Tokens               | Tokenization
Syntax Analysis              | Tokens               | Parse Tree           | Grammar Checking
Semantic Analysis            | Parse Tree           | Annotated Parse Tree | Type Checking
Intermediate Code Generation | Annotated Parse Tree | Intermediate Code    | Machine Independence
Code Optimization            | Intermediate Code    | Optimized Code       | Faster Execution
Code Generation              | Optimized Code       | Machine Code         | Final Execution

 Final Notes

Lexical, Syntax, and Semantic Analysis → Detect errors.
Intermediate Code & Optimization → Improve efficiency.
Code Generation → Final machine-executable code.

The six phases of a compiler, which together translate source code into executable code, are reviewed again below in more detail.


Six Phases of a Compiler

Each phase takes the output of the previous phase and refines it toward machine code. The main phases are:


1. Lexical Analysis (Scanner)

  • Purpose: Converts the sequence of characters (source code) into a sequence of tokens.

  • Output: Tokens (identifiers, keywords, symbols).

  • Tool Used: Lex

  • Example:

    int a = 5;

    Tokens: int, a, =, 5, ;


2. Syntax Analysis (Parser)

  • Purpose: Analyzes the token sequence to check if it follows the grammar of the programming language.

  • Output: Parse tree (also called syntax tree).

  • Tool Used: Yacc, Bison

  • Error Detected: Syntax errors like missing semicolons or braces.

  • Example: Ensures that expressions like a + b * c are parsed correctly.
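
How a parser gets a + b * c "right" can be sketched with a small recursive-descent parser in which each precedence level has its own function. The grammar (expr → term (('+'|'-') term)*, term → factor (('*'|'/') factor)*) and the tuple tree encoding are assumptions made for this example.

```python
def parse_expression(tokens):
    """Parse a flat token list into a tree of (operator, left, right) tuples."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def factor():
        # A factor is a single operand (variable name or number).
        nonlocal pos
        tok = peek()
        if tok is None or tok in ("+", "-", "*", "/"):
            raise SyntaxError(f"expected an operand, found {tok!r}")
        pos += 1
        return tok

    def term():
        # Multiplication and division bind tighter, so they are parsed lower down.
        nonlocal pos
        node = factor()
        while peek() in ("*", "/"):
            op = tokens[pos]
            pos += 1
            node = (op, node, factor())
        return node

    def expr():
        nonlocal pos
        node = term()
        while peek() in ("+", "-"):
            op = tokens[pos]
            pos += 1
            node = (op, node, term())
        return node

    tree = expr()
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return tree

print(parse_expression(["a", "+", "b", "*", "c"]))
```

The multiplication ends up as a subtree of the addition, so b * c is evaluated first without any explicit precedence table.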


3. Semantic Analysis

  • Purpose: Ensures the meaning of the program is correct (i.e., checks for semantic errors).

  • Tasks:

    • Type checking

    • Scope resolution

    • Variable declarations

  • Output: Annotated syntax tree or abstract syntax tree (AST)

  • Example: Checks if variables are declared before use.
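
Scope resolution and the declared-before-use check can be sketched with a stack of dictionaries, one per nested block. The class and method names below are hypothetical, and real compilers store much more per symbol than a type name.

```python
class ScopedSymbolTable:
    """A stack of scopes; the innermost scope is the last element."""

    def __init__(self):
        self.scopes = [{}]          # start with the global scope

    def enter_scope(self):
        self.scopes.append({})

    def exit_scope(self):
        self.scopes.pop()

    def declare(self, name, type_name):
        if name in self.scopes[-1]:
            raise NameError(f"'{name}' is already declared in this scope")
        self.scopes[-1][name] = type_name

    def lookup(self, name):
        # Search from the innermost scope outwards.
        for scope in reversed(self.scopes):
            if name in scope:
                return scope[name]
        raise NameError(f"'{name}' is used before it is declared")

table = ScopedSymbolTable()
table.declare("a", "int")
table.enter_scope()
table.declare("a", "float")        # shadows the outer a
print(table.lookup("a"))
table.exit_scope()
print(table.lookup("a"))
```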


4. Intermediate Code Generation

  • Purpose: Converts the syntax tree or AST into an intermediate representation (IR).

  • IR: Easier to optimize and closer to machine language than source code.

  • Formats: Three-address code, postfix notation.

  • Example:

    t1 = a * b
    t2 = t1 + c
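
Postfix notation, the other format listed above, can be produced by a post-order walk of the same expression. The sketch assumes the expression is given as nested (operator, left, right) tuples; `to_postfix` is a hypothetical name.

```python
def to_postfix(node):
    """Return the postfix form of an expression tree as a list of symbols."""
    if not isinstance(node, tuple):
        return [str(node)]          # leaf: operand
    op, left, right = node
    # Operands first, operator last.
    return to_postfix(left) + to_postfix(right) + [op]

tree = ("+", ("*", "a", "b"), "c")  # a * b + c
print(" ".join(to_postfix(tree)))
```

Read left to right, the postfix string performs the multiplication before the addition, the same order as the two three-address instructions.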

5. Code Optimization

  • Purpose: Improves the intermediate code to make it faster or smaller.

  • Types:

    • Peephole optimization: Local improvements

    • Loop optimization: Reduce redundant computations

  • Output: Optimized intermediate code

  • Example: Replace x = x * 2 with x = x << 1, a strength reduction that swaps a multiplication for a cheaper bitwise shift.
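
The multiply-to-shift rewrite can be sketched as a peephole pass over three-address code. It assumes integer operands and only handles a constant power of two on the right-hand side. `strength_reduce` is a hypothetical name.

```python
def strength_reduce(tac_lines):
    """Rewrite `x = y * 2^k` as `x = y << k`; leave every other line unchanged."""
    result = []
    for line in tac_lines:
        target, expr = [part.strip() for part in line.split("=", 1)]
        parts = expr.split()
        if len(parts) == 3 and parts[1] == "*" and parts[2].isdigit():
            factor = int(parts[2])
            # A power of two greater than 1 has exactly one bit set.
            if factor > 1 and (factor & (factor - 1)) == 0:
                shift = factor.bit_length() - 1
                result.append(f"{target} = {parts[0]} << {shift}")
                continue
        result.append(line)
    return result

print(strength_reduce(["x = x * 2", "y = a * 8", "z = a * 3"]))
```

Multiplications by values that are not powers of two, or by variables, pass through untouched.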


6. Code Generation

  • Purpose: Converts intermediate code into target (machine or assembly) code.

  • Output: Executable code or assembly code.

  • Tasks:

    • Register allocation

    • Instruction selection

    • Address translation

  • Example:

    MOV R1, a
    MUL R1, b
    ADD R1, c

Summary Table

Phase                 | Input             | Output                | Focus
Lexical Analysis      | Source code       | Tokens                | Pattern matching
Syntax Analysis       | Tokens            | Parse tree            | Grammar
Semantic Analysis     | Parse tree        | AST                   | Meaning, type checking
Intermediate Code Gen | AST               | Intermediate code     | Simplified representation
Code Optimization     | Intermediate code | Optimized code        | Performance improvement
Code Generation       | Optimized code    | Machine/Assembly code | Hardware-specific translation
