Informally, a context free grammar (CFG), like any grammar, is a collection of rules that together can be used to define how units can be built into larger and larger structures until sentences are formed.
Formally, a CFG can be defined as a four-tuple
where
is a set of nonterminals,
is a set of terminals,
is the start symbol where
, and
is
a set of production rules of the form
where
and
.
An example CFG is shown in Figure 2.2.
From the ninth rule in Figure 2.2 the expression ``Jeff''
is classified as a
, (alternatively the nonterminal
is said to cover the string ``Jeff''). The twelfth and
seventh rules determine that the non-terminal
covers the
string ``flies'' and from rules one and three it can be
concluded that the expression ``Jeff flies'' is a sentence
(following the convention that
is the root symbol).
The classification of sentences by a CFG can be summarised in a parse-tree. The parse tree for ``Jeff flies'' is given in Figure 2.3. Each parent node, together with its children, correspond to a rule application while the leaves of the tree correspond to the terminals of the grammar.
The symbol '
' in the first rule for
is used
to denote that there are no elements on the right hand side of that
rule, that is to say that
may cover the empty string.
To determine the parse tree for a particular string the rules are
applied in one of two particular orders; starting with the string
and trying to build up the symbol
(bottom-up derivation),
or starting with
and trying to apply the rules that will
cover the string (top-down derivation).
If there is no possible parse tree for a string, given a CFG, then that string is not accepted by the grammar and is said to be syntactically incorrect.