The Use of Conditional Grammars for Specifying and Verifying Communication Protocols

Petr Matousek
Dept. of Computer Science and Informatics
University of Technology Brno
Bozetechova 2, 612 66 Brno
Czech republic
E-mail: matousp@dcse.fee.vutbr.cz

Keywords: protocol, formal method, conditional grammar, verification, communication

Abstract

Formal specification of communication protocols is not a new problem. It seems to be a routine to formally specify protocols using regular grammars, finite state machines, extended FSM, Petri nets or other tools. These can partially satisfy designer's requirements for protocol design but mostly they can be applied only on a simple version of a protocol. To describe real-world application protocols with protocol parameters, time constraints, synchronization in distributed environment seems to be difficult by these means. The paper presents a new approach to protocol specification based on conditional grammars. It demonstrates that proposed formal specification is both intuitive and descriptive, and it offers uniform formal approach to different classes of protocols. This specification is a good base for following simulation and verification.

1 Introduction

Importance of Internet grows year to year. Many services are built to be accessed via Internet. They are based on a standard client-server architecture communicating by a protocol. The protocol defines communication sequences interchanged among two or more partners.

For successful building of such system based on communication protocols formal specification and further analysis of the specified protocol is required. The specification should unambiguously describe the system, show its main features and behavior and - on the other hand - reveal possible bottle-necks and flaws in design. Formal specification is a good base for system verification and simulation.

Mostly the formal specification and testing is applied only on a simple version of a protocol, because standard formal methods for protocol description (regular grammars, finite state machines, Petri nets) are not suitable for the specification of a full protocol, e.g. with protocol parameters, time constraints etc. By that simplification the protocols are examined only from structural points of view.

The paper presents a new formal framework for the protocol specification based on conditional grammars. It is a part of the author's PhD. project ``Formal description and verification of a class of high-level protocols with real-time constraints''.

The first part of this paper introduces the motivation and a part of background theory of the presented work. In next parts the proposed theory is presented and demonstrated on an example. The last part of the paper resumes benefit of the proposed approach in protocol verification and depicts future steps of the research.

2 Theoretical background

Considering the protocol as a dialogue consisting of sentences it is natural to use formal grammars to formally specify it. The sentences characterize the communication and the interaction between peer entities.

The grammar is formally characterized by a tuple

G = (N, T, P, S),

where N is a final set of nonterminals, T is a final set of terminals, P is a set of rules and S is the starting nonterminal (S Î V).

Considering communication protocols, the set T is composed of symbols sent by both communicating parties (i.e. commands). Both set of production rules and set of nonterminals are constructed according to expected communication sequences. For a class of high-level communication protocols we expect a half-duplex communication channel, and the protocol to be of type request/response, i.e. every command sent by one partner (called request) yields appropriate response (confirmation, data, etc.).

Mostly only single commands without any additional information are supposed. This can model a simple structure of the system using, for instance, a finite state machine or regular grammar. However, response depends not only on a request command but mostly on request command parameters that determine an appropriate answer of the system.

The proposed formalism is based on conditional grammars. Its design is descriptive and easy to read. It provides formalism to define parameters as a part of terminals. Every rule contains a condition - to make behavior deterministic according to protocol parameters. Consequently, only those productions are applied on a sentential form that satisfy a corresponding condition evaluated by protocol parameters.

A global context conditional grammar (GCC-grammar) of degree i is according to [] a triple

(G,FOR,PER),

where G=(N,T,P,S) is a context free grammar, FOR Í (NÈT), and PER Í (V ÈN)⁺ such that y Î PER implies |x| £ i.

A context free grammar is a quadruple G = (N,T,P,S), where N, V and S have the same meaning as written above and every production in P is of the form A ® x , where A Î N, x Î (N ÈT)^*.

A production as A ® x Î P can be applied to a word w only when

A global context conditional grammar works with static conditions, i.e. the definition of GCC grammar contains both sets FOR and PER, and conditions depends only on a current sentential form. For real world protocols those conditions that depend not only on a current sentential form but also on protocol parameters and the state of internal memory of the system are required.

The principle of the internal control is not new. In their work [,pp. 98-100], the authors introduce a cooperating distributed grammar system with memories (mCD grammar system). The system is based on semi-conditional grammars. The productions depend on a left-side nonterminal and as well on permitting and forbidding context conditions and the state of a memory string. The memory is implemented as a queue or a stack containing both terminals and nonterminals. Then rewriting is regulated by existence of a certain symbol in the memory string.

In our approach we propose a different view on context conditions. Just as in GCC grammar or in mCD grammar we test existence of a substring u in a given set or a memory string. Unlike GCC grammar our ``set FOR'' is not static but it is dynamically ``updated''. In contrast to mCD grammar we allow searching a substring along the whole sentential form, not only at the beginning or at the end.

Conditions in our case are represented in form of dynamic memory string - a sentential form that is rewrited by special grammar. This sentential form contains parameters of the protocol. Instead of rewriting the first or the last symbol of the memory string (as proposed by mCD grammar systems) we allow rewriting of any symbol within the memory string. Productions of this grammar supports adding of a parameter, deleting or testing the existence of a parameter in the memory string.

A memory grammar is a quadruple G_m = (N_m = {S_m, Symbol(x)},T_m = {x},P_m,S_m), where P_m is a final set of productions of three types:
- [(r₁)] S_m ® S_m Symbol(x) ... addition rule
- [(r₂)] Symbol(x) ® e ... deletion rule
- [(r₃)] Symbol(x) ® Symbol(x) ... existential rule

In our proposal a parameter of a protocol command is a value associated with a terminal representing protocol command - for instance add_user(x) represents command add_user and its parameter x. The value is a part of a given sentential form closed in braces (). In one derivation step the values can be assigned to nonterminals as attributes (cf. attribute grammars []). Besides, they are evaluated and assigned using productions of context conditions.

For the use of parameters we define a grammar with parameters (GP) as a quadruple

G₁ = (N₁,T₁,P₁,S₁),

where N₁ is a final set of nonterminals, T₁ a final set of terminals, P₁ is a set of semantic productions, and S₁ is the starting symbol. Productions are of form

(A ® x; Æ), where A Î N₁, x Î (N₁ ÈT₁)^*, or

(A ® xy(a)z; C.par = a), where A Î N₁, x,z Î (N₁ÈT₁)^*, y(a) Î T₁×N₁ and C Î N₁ is a right side nonterminal of this production.

A production can contain a semantic part that assigns a value of parameter a to a nonterminal on the right site of a production. Values of parameters are stored in a similar way like attributes in the attribute grammar.

Roughly speaking, the GP grammar is a context free grammar with some internal semantic actions.

Let u,v Î (N₁ ÈT₁)^*. Then we say u directly derives v according to A ® y or A ® y(a), denoted by u Þ v [p], if exists p Î P₁, p = A ® y or A ® y(a) and u = xAz, v = xyz or v = xy(a)z for some x,y,z Î (N₁ÈT₁)^*, y(a) Î T₁×N₁.

Now, we define a grammar system with conditions and parameters (GCP). A GCP grammar system is a tuple G = (G,G₁,S), where

[(i)] G₁ is GP grammar G₁ = (N₁,T₁,P₁,S₁) as defined above

[(ii)] G is a quadruple (N,T,P,S), where
- N = {N₁ È{S} ÈN_m = {S_m, Symbol(a)}}
- T = {T₁ È{a}}
- P is a final set of production of form
  1. p = (A ® x; Æ; r), or
  2. p = (A ® xy(a)z; C.par = a; r)
  where C Î N is a substring of x, A Î N, x,z Î (NÈT)^*, y(a) Î (N ×T) and r is either one production from a set R
  - [(r₁)] {(S_m® S_m Symbol(x)),
  - [(r₂)] (Symbol(x) ® e),
  - [(r₃)] (Symbol(x) ® Symbol(x))},
  or empty set Æ.

[(iii)] S is the starting symbol in a form (S₁; S_m)

Let u,v Î (N ÈT)^*, m,m¢ Î {S_m, Symbol(a)}^*, u = (xAz;m), v = (xyz;m¢) for some x,y,z Î (NÈT)^*. Then we say u directly derives v according to p = (A ® y; Æ,r) or p = (A ® y(a);C.par = a;r), denoted by u Þ v [p], iff p Î P and one of the following conditions holds:

(i) r is empty set Æ, or

(ii) exists r Î R such that m Þ m¢[r].

In the standard manner, we can define Þⁱ (for i ³ 0), Þ^*, Þ⁺.

The language of G, denoted by L(G), is defined by

L(G) = {(w;m) Î (T^*;m Î N_m^*) | (S;S_m) Þ^* (w;m)}.

Language L(G) describes a protocol sentence accepted by the protocol defined using grammar G. In comparison with other approaches the specification includes parameters of the protocol and the memory that represents the internal state of the protocol. This brings us advantages as follows:

A protocol is specified using simple context free grammar which is easy to create and read.

Protocol parameters are added to a grammar definition and to a sentential form in a descriptive form.

Context conditions are defined in a form of productions over memory that contains parameters. A string of parameters (the memory) represents dynamic set of context conditions. Using special grammar we are not limited by certain number and type of parameters and before applying any rule we can test their existence in a memory string.

Memory string evolves in the course of rewriting. At the end it represents a state of a system after protocol operation. This can be used for server/client simulation.

3 Case study

The formalism introduced in the previous part is demonstrated on a simple communication protocol of user database. Following requests are implemented: ADD_USER, REMOVE_USER, BYE, and corresponding answers: USER_ADDED, USER_REMOVED, ALREADY_EXISTS, NO_SUCH_USER.

Request commands ADD_USER, REMOVE_USER contain a parameter - username. According to the value of the parameter corresponding answer is generated.

For instance, possible answers for command ADD_USER are USER_ADDED or ALREADY_EXISTS. Regular grammar would apply one of these answers in a non-deterministic way. But the answer depends good deal on an argument (parameter) of the command. Considering e.g. argument username we can determine the answer after evaluation of that argument.

Context free grammar G=(N,T,P,S) of the system is defined as follows:

[(i)] N = {A,B,C,S}

[(ii)] T = {add_user, remove_user, bye, user_added, user_removed, already_exist, no_such_user}

[(iii)] P is a set of productions:
1. S ® hello A
2. A ® add_user B | remove_user C | bye
3. B ® user_added A | already_exists A
4. C ® user_removed A | no_such_user A

Now we add parameters (semantic action) to this grammar so a new GP grammar is G₁ = (N₁ = {NÈ{a}},T,P₁,S₁). Set of productions P₁ is following:

[(a)] {(S₁ ® hello A;Æ),

[(b)](A ® add_user(a) B; B.par = a),
(A ® remove_user(a) C; C.par = a),
(A ® bye;Æ),

[(e)] (B ® user_added A; Æ),
(B ® already_exists A;Æ),

[(d)](C ® user_removed A ; Æ),
(C ® no_such_user A ; Æ)}

A GCP system of this system is a triple G = (G,G₂,S), where

[(i)] G₁ = (N = {N₁È{S_m,User(x)}},T₁ È{x},P,S) is an GP grammar with productions P as follows:
1. {(S₁ ® hello A;Æ;Æ),
2. (A ® add_user(a)  B; B.par = a;Æ),
  (A ® remove_user(a)  C;
        C.par = a;Æ),
  (A ® bye;Æ;Æ),
3. (B ® user_added  A; Æ;
        S_m ® User(B.par)  S_m),
  (B ® already_exists  A;Æ;
        User(B.par) ® User(B.par)),
4. (C ® user_removed  A ; Æ;
        User(C.par) ® e),
  (C ® no_such_user  A ; Æ;Æ)}

[(ii)] S = (S₁;S_m)

Here, there are some examples of derivation process:

(S₁;S_m) Þ (hello A; S_m) Þ (hello add_user(a) B;S_m) Þ (hello add_adduser(a) user_added A; User(a) S_m) Þ (hello add_adduser(a) user_added bye; User(a) S_m)

(S₁;S_m) Þ (hello A; S_m) Þ (hello add_user(a) B; S_m) Þ (hello add_adduser(a) user_added A; User(a) S_m) Þ (hello add_adduser(a) user_added remove_user(a) C; User(a) S_m) Þ (hello add_adduser(a) user_added remove_user(a) user_removed A; S_m) Þ (hello add_adduser(a) user_added remove_user(a) user_removed bye; S_m)

4 Conclusion

This paper introduces a new uniform technique for formal specification of high-level communication protocols. It is based on conditional grammars theory. The paper shows a complete theory and demonstrates the proposed approach on a simple example. Because of its descriptive power seems to be a good tool for real-world protocol specification.

Using GCP grammar we can describe a communication protocol more precisely - parameters are part of the language. In addition context conditions specify how communication partners behave depending on parameters - that is very useful for implementation. Because of the formal description we can use various verification techniques to prove that the communication system responses as expected. Future work is focused on protocol verification using PVS (Prototype Verification System) and definition of real-time constraints as a part of the GCP language.

Acknowledgment

The work presented here in the paper is a part of a project supported by grant FRVS no. 99/2001 ``Selection of methods for formal specification and verification of communication protocols'' of Czech Ministry for Education. The author is indebted to Dr. Alexander Meduna and Martin Svec for their valuable suggestions and advises about the grammar systems.

References

[]: Aho, A.V., Sethi, R., Ullman, J.D.:Compilers. Principles, Techniques, and Tools. Addison-Wesley publishing Company, 1986.
[]: Crow, J., Owre, S., Shankar, N., Srivas, M.: A Tutorial Introduction to PVS, SRI Int., 1995. Available at URL http://www.csl.sri.com/sri-csl-fm.html.
[]: Csuhaj-Varjú, E., Dassow, J., Kelemen, J., Paun, G.:Grammar Systems - A Grammatical Approach to Distribution and Cooperation, Budapest, 1992.
[]: Matousek, P.: Protocol Proving Using PVS: A Case Study, to appear in the Proceedings of Conference MOSIS 2001.
[]: Meduna, A.: Global Context Conditional Grammars, J. Inform. Process. Cybern. EIK 27 (1991) 3, pp.159 - 165.
[]: Murphy, S.L., Shankar, A.U.:Service Specification and Protocol Construction for the Transport Layer, ACM Communication architectures and protocols, 1988, Pages 88-97.
[]: Tarnay, K.:Protocol Specification and Testing, Akademia Kiado es Nyomda Vallalat, Budapest, 1991.

Biography

Petr Matousek was born in Ústí nad Labem, Czech republic and went to University of Technology Brno, where he received in 1997 an MSc. degree in Informatics and Computer Science. In 1997-1998 he worked at CERN, Geneva in IT division on the project of global distributed LDAP directory. Since 1998 he continues his PhD. studies at University of Technology, Brno, focused on formal description and verification of protocols with real-time constraints. He externally teaches at the University.
Contact - http://www.fee.vutbr.cz/~matousp.

File translated from T_EX by T_TH, version 2.78.
On 11 Jun 2001, 13:52.