TY - GEN
T1 - Accurate string constraints solution counting with weighted automata
AU - Sherman, Elena
AU - Harris, Andrew
N1 - Publisher Copyright:
© 2019 IEEE.
PY - 2019/11
Y1 - 2019/11
N2 - As an important extension of symbolic execution (SE), probabilistic symbolic execution (PSE) computes execution probabilities of program paths. Using this information, PSE can prioritize path exploration strategies. To calculate the probability of a path PSE relies on solution counting approaches for the path constraint. The correctness of a solution counting approach depends on the methodology used to count solutions and whether a path constraint maintains a one-to-one relation with program input values. This work focuses on the latter aspect of the solution counting correctness for string constraints. In general, maintaining a one-to-one relation is not always possible, especially in the presence of non-linear constraints. To deal with this issue, researchers that work on PSE for numerical domains either analyze programs with linear constraints, or develop novel techniques to handle solution counting of non-linear constraints. For the string domain, however, previous work on PSE mainly focuses on efficient and accurate solution counting for automata-based string models and has not investigated whether a one-to-one relationship between the strings encoded by automata and input string values is preserved. In this work we demonstrate that traditional automata-based string models fail to maintain one-to-one relations and propose to use the weighted automata model, which preserves the one-to-one relation between the path constraint it encodes and the input string values. We use this model to implement a string constraint solver and show its correctness on a set of non-trivial synthetic benchmarks. We also present an empirical evaluation of traditional and proposed automata solvers on real-world string constraints. The evaluations show that while being less efficient than traditional automata models, the weighted automata model maintains correct solution counts.
AB - As an important extension of symbolic execution (SE), probabilistic symbolic execution (PSE) computes execution probabilities of program paths. Using this information, PSE can prioritize path exploration strategies. To calculate the probability of a path PSE relies on solution counting approaches for the path constraint. The correctness of a solution counting approach depends on the methodology used to count solutions and whether a path constraint maintains a one-to-one relation with program input values. This work focuses on the latter aspect of the solution counting correctness for string constraints. In general, maintaining a one-to-one relation is not always possible, especially in the presence of non-linear constraints. To deal with this issue, researchers that work on PSE for numerical domains either analyze programs with linear constraints, or develop novel techniques to handle solution counting of non-linear constraints. For the string domain, however, previous work on PSE mainly focuses on efficient and accurate solution counting for automata-based string models and has not investigated whether a one-to-one relationship between the strings encoded by automata and input string values is preserved. In this work we demonstrate that traditional automata-based string models fail to maintain one-to-one relations and propose to use the weighted automata model, which preserves the one-to-one relation between the path constraint it encodes and the input string values. We use this model to implement a string constraint solver and show its correctness on a set of non-trivial synthetic benchmarks. We also present an empirical evaluation of traditional and proposed automata solvers on real-world string constraints. The evaluations show that while being less efficient than traditional automata models, the weighted automata model maintains correct solution counts.
KW - Probabilistic symbolic execution
KW - Quantitative program analysis
KW - String constraints
UR - http://www.scopus.com/inward/record.url?scp=85078887594&partnerID=8YFLogxK
U2 - 10.1109/ASE.2019.00049
DO - 10.1109/ASE.2019.00049
M3 - Conference contribution
AN - SCOPUS:85078887594
T3 - Proceedings - 2019 34th IEEE/ACM International Conference on Automated Software Engineering, ASE 2019
SP - 440
EP - 452
BT - Proceedings - 2019 34th IEEE/ACM International Conference on Automated Software Engineering, ASE 2019
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 34th IEEE/ACM International Conference on Automated Software Engineering, ASE 2019
Y2 - 10 November 2019 through 15 November 2019
ER -