# Regulated Grammars and Automata

Authors: | Alexander Meduna and Petr Zemek |

Title: | Regulated Grammars and Automata |

Publisher: | Springer, New York, US |

ISBN: | 978-1-4939-0368-9 |

Publication Date: | 2014-03-17 |

Details: | Hardcover, 694 pages |

## Authors

**Alexander Meduna**, Full Professor of Computer Science at the Brno University of Technology, received his PhD from this university in 1988. He has taught theoretical computer science at various European, Asian, and American universities, including the University of Missouri, where he spent a decade teaching advanced topics of the formal language theory. He is the author of the books *Automata and Languages* (Springer, 2000) and *Elements of Compiler Design* (Taylor and Francis, 2008) and a co-author of *Grammars with Context Conditions and Their Applications* (Wiley, 2005) and *Scattered Context Grammars and their Applications* (WIT Press, 2010). He has published over seventy papers related to the subject of this book.

**Petr Zemek** received his PhD from the Brno University of Technology in 2014 under the supervision of Alexander Meduna. He has published many studies on regulated grammars and automata in international conferences and distinguished computer science journals such as *Acta Informatica*, *Theoretical Computer Science* and *International Journal of Computer Mathematics*.

## Book

### Motivation and Subject

Language processors have become an inseparable part of our daily life. For instance, all the sophisticated modern means of communication, such as Internet with its numerous information processing tools, are based upon them to some extent, and indisputably, literary billions of people use these means on a daily basis. It thus comes as no surprise that the scientific development and study of languages and their processors fulfill a more important role today than ever before. Naturally, we expect that this study produces concepts and results that are as reliable as possible. As a result, we tend to base this study upon mathematics as a systematized body of unshakable knowledge obtained by exact and infallible reasoning. In this respect, we pay our principal attention to *formal language theory* as a branch of mathematics that formalizes languages and devices that define them strictly rigorously.

This theory defines languages mathematically as sets of sequences consisting of symbols. This definition encompasses almost all languages as they are commonly understood. Indeed, natural languages, such as English, are included in this definition. Of course, all artificial languages introduced by various scientific disciplines can be viewed as formal languages as well; perhaps most illustratively, every programming language represents a formal language in terms of this definition. Consequently, formal language theory is important to all the scientific areas that make use of these languages to a certain extent.

The strictly mathematical approach to languages necessitates introducing *formal language models* that define them, and formal language theory have introduced a great variety of them over its history. Most of them are based upon rules by which they repeatedly rewrite sequences of symbols, called strings. Despite their diversity, they can be classified into two basic categories—generative and recognition language models. Generative models, better known as *grammars*, define strings of their language so their rewriting process generates them from a special start symbol. On the other hand, recognition models, better known as *automata*, define strings of their language by rewriting process that starts from these strings and ends in a special set of strings, usually called final configurations.

Like any branch of mathematics, formal language theory has defined its language models generally. Unfortunately, from a practical viewpoint, this generality actually means that the models work in a completely non-deterministic way, and as such, they are hardly implementable and, therefore, applicable in practice. Being fully aware of this pragmatic difficulty, formal language theory have introduced fully deterministic versions of these models; sadly, their application-oriented perspectives are also doubtful. First and foremost, in an ever-changing environment in which real language processors work, it is utterly naive, if not absurd, that these deterministic versions might adequately reflect and simulate real language processors applied in such pragmatically oriented areas as various engineering techniques for language analysis. Second, in many case, this determinism decreases the power of their general counterparts—another highly undesirable feature of this strict determinism.

Considering all these difficulties, formal language theory have introduced yet another version of language models, generally referred to as *regulated language models*, which formalize real language processors perhaps most adequately. In essence, these models are based upon their general versions extended by an additional mathematical mechanism that prescribes the use of rules during the generation of their languages. From a practical viewpoint, an important advantage of these models consists in controlling their language-defining process and, therefore, operating in a more deterministic way than general models, which perform their derivations in a quite unregulated way. Perhaps even more significantly, the regulated versions of language models are stronger than their unregulated versions. Considering these advantages, it comes as no surprise that formal language theory has paid an incredibly high attention to *regulated grammars and automata*, which represent the principal subject of the present book.

### Purpose

Over the past quarter century, literally hundreds studies were written about regulated grammars, and their investigation represents an exciting trend within formal language theory. Although this investigation has introduced a number of new regulated grammatical concepts and achieved many remarkable results, all these concepts and results are scattered in various conference and journal papers. The principle *theoretical purpose* of the present book is to select crucially important concepts of this kind and summarize key results about them in a compact, systematic and uniform way.

From a more practical viewpoint, as already stated, the developers of current and future language processing technologies need a systematized body of mathematically precise knowledge upon which they can rely and build up their methods and techniques. The *practical purpose* of this book is to provide them with this knowledge.

### Focus

The material concerning regulated grammars and automata is so huge that it is literally impossible to cover it completely. Considering the purpose of this book, we restrict our attention to four crucially important topics concerning these grammars and automata—their power, properties, reduction, and convertibility.

As obvious, the *power* of the regulated language models under consideration represents perhaps the most important information about them. Indeed, we always want to know the family of languages that these models define.

A special attention is paid to algorithms that arrange regulated grammars and automata so they satisfy some prescribed *properties* while the generated languages remain unchanged because many language processors strictly require their satisfaction in practice. From a theoretical viewpoint, these properties frequently simplify proofs demonstrating results about these grammars and automata.

The *reduction* of regulated grammars and automata also represents an important investigation area of this book because their reduced versions define languages in a succinct and easy-to-follow way. As obvious, this reduction simplifies the development of language processing technologies, which then work economically and effectively.

Of course, the same languages can be defined by different language models. We obviously tend to define them by the most appropriate models under given circumstances. Therefore, whenever discussing different types of equally powerful language models, we also study their mutual *convertibility*. More specifically, given a language model of one type, we explain how to convert it to a language model of another equally powerful type so both the original model and the model produced by this conversion define the same language.

We prove most of the results concerning the topics mentioned above *effectively*. That is, within proofs demonstrating them, we give algorithms that describe how to achieve these results. For instance, we often present conversions between equally powerful models as algorithms, whose correctness is then rigorously verified. In this way, apart from their theoretical value, we actually demonstrate how to implement them.

### Organization

The text is divided into nine parts, each of which consists of several chapters. Every part starts with an abstract that summarizes its chapters. Altogether, the book contains twenty-two chapters.

### Approach

This book represents a theoretically oriented treatment of regulated grammars and automata. We introduce all formalisms concerning these grammars with enough rigor to make all results quite clear and valid. Every complicated mathematical passage is preceded by its intuitive explanation so that even the most complex parts of the book are easy to grasp. As most proofs of the achieved results contain many transformations of regulated grammars and automata, the present book also maintains an emphasis on algorithmic approach to regulated grammars and automata under discussion and, thereby, their use in practice. Several worked-out examples illustrate the theoretical notions and their applications.

### Use

Primarily, this book is useful to all researchers, ranging from mathematicians through computer scientists up to linguists, who deal with language processors based upon regulated grammars or automata.

Secondary, the entire book can be used as a text for a two-term course in regulated grammars and automata at a graduate level. The text allows the flexibility needed to select some of the discussed topics and, thereby, use it for a one-term course on this subject.

Tertiary and finally, serious undergraduate students may find this book useful as an accompanying text for a course that deals with formal languages and their models.

### Samples

## Errata

- Ordered by appearance in the text.
- Last updated on 2017-05-03.
- Send additional errors and comments to: meduna@fit.vutbr.cz

### List of Errors

- Page 21, Chapter 3 (Rudiments of Formal Language Theory), Section 3.3 (Grammars)
- In Definition 3.3.1, instead of
*x = x_1ux_2, y = y_1vy_2*, there should be*x = x_1ux_2, y = x_1vx_2*. The subsequent occurrences of*y_1*and*y_2*should be removed from the definition. - Reported 2017-05-02 by Radek Vít of Brno University of Technology.

- Page 511, Chapter 15 (Self-Regulating Automata), Section 15.1.1 (Definitions and Examples)
- In Definition 15.1.2, instead of “j = 0, 1, …, n”, there should be “j = 0, 1, …, n-1”.
- Reported 2018-11-06 by Roman Andriushchenko of Brno University of Technology.

- Page 511, Chapter 15 (Self-Regulating Automata), Section 15.1.1 (Definitions and Examples)
- In Example 15.1.3, 1-first-SFA
*M*does not accept*ab*, so*L(M)*should be*{a^n b^n | n > 1}*. Note that we get original*L(M)*by adding*(2,3)*into the last component of*M*. - Reported 2018-11-06 by Roman Andriushchenko of Brno University of Technology.

- Page 576, Chapter 17 (Jumping Finite Automata), Section 17.4 (Closure Properties)
- Theorem 17.4.15 does not hold. For example, language
*L = {a}**belongs to**GJFA**, but*K = {bc}**does not belong to**GJFA**. Language*K*is obtained from*L*by a finite substitution that maps*a*to*bc*. - Reported 2015-03-16 by Vojtěch Vorel of the Charles University.

- Page 576, Chapter 17 (Jumping Finite Automata), Section 17.4 (Closure Properties)
- Corollary 17.4.16 does not hold. This fact follows from the above example (notice that the finite substitution in there is, in fact, a homomorphism).
- Reported 2015-03-16 by Vojtěch Vorel of the Charles University.

- Page 577, Chapter 17 (Jumping Finite Automata), Section 17.4 (Closure Properties)
- Theorem 17.4.19 does not hold for
**GJFA**. A GJFA*M*and a homomorphism*h*can be constructed so that*h^-1(L(M))*does not belong to**GJFA**. - Reported 2015-03-16 by Vojtěch Vorel of the Charles University.

The authors' thanks go to the readers who pointed out the errors.

## Springer Website

To buy the book, visit this website.