# A Short Course Compiler Grammar Tutorial Study Notes with Examples

**What is Compiler Grammar**

In 1956, Noam Chomsky gave a model of grammar. A grammar is defined as 4-tuples (V_{N}, Ʃ, P, S),

Where, **V _{N}** is finite non-empty set of non-terminals.

**Ʃ** is finite non-empty set of input terminals.

**P** is finite set of product rules.

**S** is the start symbol.

**What is Chomsky Hierarchy**

The Chomsky hierarchy is a containment hierarchy of classes of formal grammars that generate formal languages. The formal grammars consist of a finite set of terminal symbols, a finite set of non-terminal symbols, a set of production rules with a left and a right hand side consisting of a word of these symbols and a start symbol.

The Chomsky hierarchy consists of following four levels.

**Type 0 Grammar** (Unrestricted Grammar)

These are unrestricted grammars which include all formal grammars. These grammars generate exactly all language that can be recognized by a Turing machine.

Rules are of the form α → β.

Where, α and β are arbitrary strings over a vocabulary V and α ≠ ^ (null).

**Type 1 Grammar **(Context Sensitive Grammar)

Language defined by type-1 grammars are accepted by linear bounded automata.

Rules are of the form α A β → α B β

Where, A ϵ V_{N }

α , β, B ϵ (V_{N }U Ʃ)

B ≠ ^

**Type 2 Grammar** (Context-free Grammar)

Language defined by type-2 grammars are accepted by push-down automata.

Rules are of the form A → α

Where, A ϵ V_{N}

α ϵ (V_{N }U Ʃ)^{*}

**Type 3 Grammar** (Regular Grammar)

Language defined by type-3 grammars are accepted by finite state automata.

Rules are of the form A → ^

A → α

A → α β

Where, A, B ϵ V_{N }and a ϵ Ʃ

**Regular Expression**

Regular expression mean to represent certain sets of strings in some algebraic fashion.

A regular expression over the alphabet E is defined as follows

- ɸ is a regular expression corresponding to the empty language ɸ.
- ^ is a regular expression corresponding to the language { ^}.
- For each symbol a ϵ E, a is a regular expression corresponding to the language {a}.

**Regular Language **

The Language accepted by FA are regular language and these language are easily described by simple expressions called regular expressions.

For any regular expression r and s over E, Corresponding to the language L_{r }and L_{S} respectively, each of the following is a regular expression corresponding to the language indicated.

- (r s) corresponding to the language L
_{r }L - (r + S) corresponding to the language L
_{r }U L - r* corresponding to the language L
_{r}.

Some examples of regular expression are

L (01) = {0,1}

L (01 +0) = {01,0}

L(0(1+0))={01,00}

L(0^{*}) = {Ɛ, 0, 00, 000, …..}

L((0 +10)^{*} (^ + 1) = all string of 0’s and 1’s without two consecutive 1’s.

**Key Points**

- If L
_{1 }and L_{2}are regular language in E^{*}, then L_{1 }U L_{2, }L_{1 }∩ L_{2, }L_{1 }– L_{2 }and L_{1}^{’}(complement of L_{1 }), are all regular languages. - Pumping lemma is a useful tool to prove that a certain language is not regular.
- The
**Myhill-Nerode theorem**tells that the given language might be regular or not regular.

**Regular Set**

A set represented by a regular expression is called regular set.

E.g. If Ʃ = {a, b} is an alphabet, then

**Different Regular Sets and their Expressions**

Sets |
Regular Expression |

{} |
ɸ |

{a} |
a |

{a ,b} |
a+b |

{ ɛ, a, aa, aaa,….} |
a* |

{a, aa, aaa,….} |
aa* = a^{+} |

**Identities for Regular Expressions**

The following points are the some identities for regular expressions.

ɸ + L = L + ɸ = L ɸ L = L ɸ = ɸ

^R = R^ = R ^^{*} = ^ and ɸ^{* }= Ɛ

R+ R = R, where R is the regular expression RR* = R*R = R^{*}

(R*)* = R* R*R* = R*

(P + Q)^{*} = (P^{*}Q^{*})^{* }(P^{*} + Q^{*})^{*}, where P and Q are regular expressions.

R (P + Q) = RP + RQ and (P + Q) R = PR + QR

R (P + Q) = RP + RQ and (P + Q) R = PR + QR

P (QP)^{* }= (PQ)^{*}P

**Regular Expression and Finite Automata**

NFA with ^-moves Let r_{1} and r_{2 }be the two regular expressions over Ʃ_{1} Ʃ_{2} and E_{1} E2 are two NFA for r_{1} and r_{2} respectively.

**Regular Expression to ^-NFA** (union)

**For **E_{1} U E_{2}

**Regular Expression to ^-NFA** (Concatenation)

**For **E_{1} E_{2}

**Regular Expression to ^-NFA** (Clousre)

**For **E^{*}

** ****Construction of Regular Expression from DFA**

If a DFA recognizes the regular language L_{1} then there exists regular expression, which describes L.

**Arden’s Theorem**

- If P and Q are two expressions over an alphabet Ʃ such that P does not contain ^, then the following equation R = Q + RP.
- The above equation has a unique solution i.e. , R=QP
^{*}. Arden’s theorem is used to determine the regular expression represented diagrams.

The following points are assumed regarding position diagrams.

- The transition system does not have any ^.
- It has only one initial (starting) state.

**Properties of Regular Languages**

Regular Language are closed under following properties.

**Union****Concatenation**

**Kleene closure**

**Complementation****Transpose****Intersection**

**If A is polynomial time reducible to B (i.e., A ≤ B) and B is in P, then Also in P.****If A is NP-complete problem, then it is member of P if and only if P = NP.****The complexity class NP-complete is the intersection of NP and NP-hard classes.**

**Key Points**

- All NP problems can be solved, if an NP problem can be reduced to an NP- hard problem before solving it in polynomial time.
- If A is NP- complete and there is a polynomial time reduction of A to B, then B is NP-complete.
- If there is a language L such that L ϵ P, then complement of L is also in P.
- P complexity class is closed under union, intersection, complementation, kleene closure and concatenation.
- NP complexity class is closed under union, intersection, concatenation, and Kleene closure.

**Design of Automaton**

The design can be illustrated by under following three categories

**Sequence of string is fixed**

If the sequence of string is fixed that suppose an automaton is designed to accepted a string with abbacb.

**The value of string is changed by changing input value**

Suppose, an automaton is designed to accepted the string abba or bbba.

**When input values are repeated**

Suppose an automaton is to be designed for accepting all string generated by input a. That means it has to accept a, aa, aaa, ……

This will be designed as

** **

Now, if we have to design a machine which accepts all inputs containing string ab

# Context Free Language Tutorial Notes Study Material with Examples

Follow Us on Social Platforms to get Updated : **twiter, ****facebook, Google Plus**

Learn Context Free Language in Handbook Series: **Click here **