Journal articleHOLÍK Lukáš, JANKŮ Petr, LIN Anthony W., RUMMER Philipp and VOJNAR Tomáš. String constraints with concatenation and transducers solved efficiently. Proceedings of the ACM on Programming Languages. New York: Association for Computing Machinery, 2018, vol. 2018, no. 2, pp. 96127. ISSN 24751421. Available from: http://doi.acm.org/10.1145/3158092  Publication language:  english 

Original title:  String constraints with concatenation and transducers solved efficiently 

Title (cs):  Řetězcová omezení s konkatenací a převodníky řešena efektivně 

Pages:  96127 

Place:  US 

Year:  2018 

URL:  http://doi.acm.org/10.1145/3158092 

Journal:  Proceedings of the ACM on Programming Languages, Vol. 2018, No. 2, New York, US 

ISSN:  24751421 

DOI:  10.1145/3158092 

Keywords 

Alternating Finite Automata, Decision Procedure, IC3, String Solving 
Annotation 

String analysis is the problem of reasoning about how strings are
manipulated by a program. It has numerous applications including
automatic detection of crosssite scripting, and automatic testcase
generation. A popular string analysis technique includes symbolic
executions, which at their core use constraint solvers over the string
domain, a.k.a. string solvers. Such solvers typically reason about
constraints expressed in theories over strings with the concatenation
operator as an atomic constraint. In recent years, researchers started
to recognise the importance of incorporating the replaceall operator
(i.e. replace all occurrences of a string by another string) and, more
generally, finitestate transductions in the theories of strings with
concatenation. Such string operations are typically crucial for
reasoning about XSS vulnerabilities in web applications, especially for
modelling sanitisation functions and implicit browser transductions
(e.g. innerHTML). Although this results in an undecidable theory in
general, it was recently shown that the straightline fragment of the
theory is decidable, and is sufficiently expressive in practice. In this
paper, we provide the first string solver that can reason about
constraints involving both concatenation and finitestate transductions.
Moreover, it has a completeness and termination guarantee for several
important fragments (e.g. straightline fragment). The main challenge
addressed in the paper is the prohibitive worstcase complexity of the
theory (doubleexponential time), which is exponentially harder than the
case without finitestate transductions. To this end, we propose a
method that exploits succinct alternating finitestate automata as
concise symbolic representations of string constraints. In contrast to
previous approaches using nondeterministic automata, alternation offers
not only exponential savings in space when representing Boolean
combinations of transducers, but also a possibility of succinct
representation of otherwise costly combinations of transducers and
concatenation. Reasoning about the emptiness of the AFA language
requires a statespace exploration in an exponentialsized graph, for
which we use model checking algorithms (e.g. IC3). We have implemented
our algorithm and demonstrated its efficacy on benchmarks that are
derived from crosssite scripting analysis and other examples in the
literature. 
BibTeX: 

@ARTICLE{
author = {Luk{\'{a}}{\v{s}} Hol{\'{i}}k and Petr Jank{\r{u}}
and W. Anthony Lin and Philipp Rummer and
Tom{\'{a}}{\v{s}} Vojnar},
title = {String constraints with concatenation and
transducers solved efficiently},
pages = {96127},
journal = {Proceedings of the ACM on Programming Languages},
volume = 2018,
number = 2,
year = 2018,
ISSN = {24751421},
doi = {10.1145/3158092},
language = {english},
url = {http://www.fit.vutbr.cz/research/view_pub.php.en.iso88592?id=11615}
} 
