Finite
state machine programs and state transition diagrams
We
begin with a kind of finite state machine program called a deterministic
finite acceptor (DFA).
Its function is to accept certain strings over its input alphabet.
A DFA consists
of:
- an input
alphabet S
- a finite
set Q whose elements are called states
- a transition
function d which maps state-input pairs to states
- a distinguished
state (often called q0), the start (or initial) state
- F, a
subset of Q, called the set of final states
A string is given as input to the DFA.
- It begins
in the start state scanning the first character of the input string.
- It continues
to execute by following the transition given by d for each character
in the input string.
- When
the end of the string is reached:
- if
the state is a final state, the string is accepted
- if
the state is not a final state, the string is not accepted
The language of a DFA M is denoted L(M) and is the set of all strings
accepted by M.
State
transition diagrams
A DFA and
its execution are made easier to visualize by means of state transition
diagrams.
The state
transition diagram of a DFA is a labeled digraph.
- The vertices
are the states (elements of Q)
- The edges
are the transitions.
- They
are labeled with the input character that causes the transition.
An example state transition diagram of a DFA that accepts all strings
over {a, b} ending in a:
- Final
states are indicated by a circle in a circle.
- The start
(initial) state has an edge to it coming from no vertex.
- The language
of this DFA is the set of all strings over {a, b} ending in a.
Regular expressions and regular languages
Regular expressions
are used to represent languages.
- They
are built up from basic symbols (Æ, L, and individual characters
of an alphabet) and the regular operations.
- The regular
operations are:
- concatenation
- union
(denoted either + or È)
- Kleene
closure (*)
Regular languages
- The languages
that can be represented by regular expressions are called the regular
languages.
A formal definition of regular expression follows, but here are some examples:
a
b
ab
a+b
(a+b)*
(a+b)*a
- represents
the set of strings over {a, b} ending in a (the language of the DFA
above)
b(a+b)*
- represents
the set of strings over {a, b} starting with b
Formal recursive definition of regular expression:
Base cases:
- Æ
is a regular expression
- L is
a regular expression
- an individual
character is a regular expression
Recursive cases:
- if r1
and r2 are regular expressions, then (r1)+(r2) is a regular expression
- if r1
and r2 are regular expressions, then (r1)(r2) is a regular expression
- if r
is a regular expression, then (r)* is a regular expression
In practice many parentheses are omitted by adoption of precedence rules.
Precedence
of the regular operations:
- *
- concatenation
- +
The language of a regular expression, L(r), is also called the language
represented or generated by r.
- L(Æ)
= Æ
- L(L)
= {L}
- L(c)
= {c} where c is a character
- L((r1)+(r2))
= L(r1)ÈL(r2)
- L((r1)(r2))
= L(r1)L(r2)
- L((r)*)
= (L(r))*
Fact:
- The class
of languages which are accepted by some DFA is the same as the class
of languages representable by regular expressions.
- These
are the regular languages.
Examples:
- (a+b)*aba(a+b)*
generates the set of all strings over {a, b} containing aba as a substring.
- (a+b)*aba
- The
language of this regular expression is the set of all strings over
{a, b} ending in aba.
- 0(0+1)*1
generates the set of strings over {0, 1} starting with 0 and ending
with 1.
|
| Review
Questions & Answers 1.
What are the components of a DFA?
2.
What is the language of a DFA?
3.
What are the regular operations?
4.
What are the regular laguages?
5.
What is the relation of DFAs to the regular languages?
Answers:
1.
A DFA consists of:
- an input
alphabet S
- a finite
set Q whose elements are called states
- a transition
function d which maps state-input pairs to states
- a distinguished
state (often called q0), the start (or initial) state
- F, a
subset of Q, called the set of final states
2. The language of a DFA M is denoted L(M) and is the set of all
strings accepted by M.
3.
The regular operations are:
- concatenation
- union
(denoted either + or È)
- Kleene
closure (*)
4. The languages that can be represented by regular expressions.
5.
Every regular language is accepted by some DFA and every language accepted
by a DFA is regular.
|
Practice
Test & Answers
1.
Is the set of stings over {0, 1} starting with 00 and ending in 11 a regular
language? Why?
2.
Describe the language generated by the regular expression (a+b)*aba(a+b)*bb.
3.
What is a regular expression for the set of strings over {a, b, c} with
abc as a prefix?
4.
What is a regular expression for the set of strings over {a, b, c} with
abc as a sufffix?
5.
What is a regular expression for the set of strings over {a, b, c} with
abc as a prefix or suffix?
Answers:
1.
Yes, because it is generated by the regular expression 00(0+1)*11.
2.
The set of strings over {a, b} with aba as a substring and bb as a suffix.
3.
abc(a+b+c)*
4.
(a+b+c)*abc
5.
abc(a+b+c)* + (a+b+c)*abc
|