27. Backus-Naur Form (BNF)

Computer languages need to be defined in terms of their grammar and syntax. Data encoding also need to be defined on some way so that programmers can validate data input and allocate the appropriate data type to the information.

The question is how is this defining to be done?

The obvious first choice is to write out the rule in English or other spoken language. For example try to define what an 'Integer' data type consists of.

Effort 1

"The Integer data type represents whole numbers. Each integer is made up of one or more characters ranging from zero (0) to nine (9). Other than memory limitations, an integer can be of any length."

Not bad for a first effort. But there are problems with this approach

  • Only English speakers can read and understand it.
  • Very verbose, it takes an awful lot of words to define something as basic as an integer

This is why Mr Backus and Mr Naur came up with a more elegant solution nearly fifty years ago.

Their method is now called the 'Backus-Naur Form' or BNF.

BNF uses a small set of symbols to describe a grammar rule.

The authors of any programming language (e.g. C++, Visual Basic) need to write down the grammatical rules of that language. They often follow the BNF notation to describe these rules. The compiler author can then read this document and make their compiler program conform to it.

Some of the BNF notation include:

 < > used to enclose a syntactic item 
 ::= means "is defined by" or "consists of"
 | when placed between two items means an OR choice between them
 { } curly brackets means zero or more repetitions of the content.    

Example:

A programmer is about to develop a new database. However, other people will be writing other parts of the program and will need to know the grammatical rules that will be used to form a postal address.

The original programmer might write the grammatical rules in a document for others to follow

BNF grammar example

In English this says "A postal address consists of a name, a street address and a post code with a semi-colon separating them".

Thus, any programmers wishing to use 'postal address' as part of their module need to follow this rule. If they don't, a syntax error will be generated.

The <name> item has the following rule

Backus-Naur Form BNF example

In English this becomes "A name is defined by a first name, followed by a comma, then possibly a middle name, followed by a comma and finally a surname"

As you can see, writing a grammar rule using BNF is a lot more concise and accurate than stating the same rule in English.

English (or any other language!) is wonderful for human communication but pretty hopeless to guarantee accuracy and objectivity!

The other handy thing is that BNF can easily be understood internationally, so making the rule available to compiler writers anywhere in the world..

Example 2 Defining a digit

Backus-Naur Form BNF example 2

So a digit is any number between zero and nine

 

Example 3 Defining an integer

Backus-Naur Form BNF example 3

Note that <integer> appears twice in the rule

This is called a 'recursive' statement. A recursive statement makes use of itself as part of its definition.

In this case the rule, stated in English becomes "An integer consists of a single digit or a single digit followed by another integer."

So you can see that this allows an integer to be any length and yet it is a very concise statement in BNF.

 

Challenge see if you can find out one extra fact on this topic that we haven't already told you

Click on this link: BNF notation

 

Copyright © www.teach-ict.com