Runtime predictive analysis

From HandWiki

Runtime predictive analysis (or predictive analysis) is a runtime verification technique in computer science for detecting property violations in program executions inferred from an observed execution. An important class of predictive analysis methods has been developed for detecting concurrency errors (such as data races) in concurrent programs, where a runtime monitor is used to predict errors which did not happen in the observed run, but can happen in an alternative execution of the same program. The predictive capability comes from the fact that the analysis is performed on an abstract model extracted online from the observed execution, which admits a class of executions beyond the observed one.[1]

Overview

Informally, given an execution [math]\displaystyle{ t }[/math], predictive analysis checks errors in a reordered trace [math]\displaystyle{ t' }[/math] of [math]\displaystyle{ t }[/math]. [math]\displaystyle{ t' }[/math] is called feasible from [math]\displaystyle{ t }[/math] (alternatively a correct reordering[2] of [math]\displaystyle{ t }[/math]) if any program that can generate [math]\displaystyle{ t }[/math] can also generate [math]\displaystyle{ t' }[/math].

In the context of concurrent programs, a predictive technique is sound if it only predicts concurrency errors in feasible executions of the causal model of the observed trace. Assuming the analysis has no knowledge about the source code of the program, the analysis is complete (also called maximal[3][4]) if the inferred class of executions contains all executions that have the same program order and communication order prefix of the observed trace.

Applications

Predictive analysis has been applied to detect a wide class of concurrency errors, including:

Implementation

As is typical with dynamic program analysis, predictive analysis first instruments the source program. At runtime, the analysis can be performed online, in order to detect errors on the fly. Alternatively, the instrumentation can simply dump the execution trace for offline analysis. The latter approach is preferred for expensive refined predictive analyses that require random access to the execution trace or take more than linear time.

Incorporating data and control-flow analysis

Static analysis can be first conducted to gather data and control-flow dependence information about the source program, which can help construct the causal model during online executions. This allows predictive analysis to infer a larger class of executions based on the observed execution. Intuitively, a feasible reordering can change the last writer of a memory read (data dependence) if the read, in turn, cannot affect whether any accesses execute (control dependence).[10][11]

Approaches

Partial order based techniques

Partial order based techniques are most often employed for online race detection. At runtime, a partial order over the events in the trace is constructed, and any unordered pairs of critical events are reported as races. Many predictive techniques for race detection are based on the happens-before relation or a weakened version of it. Such techniques can typically be implemented efficiently with vector clock algorithms, allowing only one pass of the whole input trace as it is being generated, and are thus suitable for online deployment. [2] [12] [13]

SMT-based techniques

SMT encodings allow the analysis to extract a refined causal model from an execution trace, as a (possibly very large) mathematical formula. Furthermore, control-flow information can be incorporated into the model. SMT-based techniques can achieve soundness and completeness (also called maximal causality[4] [3]), but has exponential-time complexity with respect to the trace size. In practice, the analysis is typically deployed to bounded segments of an execution trace, thus trading completeness for scalability. [10] [14] [15] [16]

Lockset-based approaches

In the context of data race detection for programs using lock based synchronization, lockset-based[17] techniques provide an unsound, yet lightweight mechanism for detecting data races. These techniques primarily detect violations of the lockset principle. which says that all accesses of a given memory location must be protected by a common lock. Such techniques are also used to filter out candidate race reports in more expensive analyses.[4]

Graph-based techniques

In the context of data race detection, sound polynomial-time predictive analyses have been developed, with good, close to maximal predictive capability based on a graphs. [18]

Computational Complexity

Given an input trace of size [math]\displaystyle{ n }[/math] executed by [math]\displaystyle{ k }[/math] threads, general race prediction is NP-complete and even W[1]-hard parameterized by [math]\displaystyle{ k }[/math], but admits a polynomial-time algorithm when the communication topology is acyclic.[19] Happens-before races are detected in [math]\displaystyle{ O(n\cdot k) }[/math] time, and this bound is optimal.[20] Lockset races over [math]\displaystyle{ d }[/math] variables are detected in [math]\displaystyle{ O(n\cdot d) }[/math] time, and this bound is also optimal.[20]

Tools

Here is a partial list of tools that use predictive analyses to detect concurrency errors, sorted alphabetically.

See also

References

  1. "Runtime Predictive Analysis". November 10, 2008. http://fsl.cs.illinois.edu/index.php/Runtime_Predictive_Analysis. 
  2. 2.0 2.1 Smaragdakis, Yannis; Evans, Jacob; Sadowski, Caitlin; Yi, Jaeheon; Flanagan, Cormac (2012). "Sound predictive race detection in polynomial time". ACM SIGPLAN Notices 47 (1): 387. doi:10.1145/2103621.2103702. ISSN 0362-1340. 
  3. 3.0 3.1 Şerbănuţă, Traian Florin; Chen, Feng; Roşu, Grigore (2013). "Maximal Causal Models for Sequentially Consistent Systems". Runtime Verification. Lecture Notes in Computer Science. 7687. pp. 136–150. doi:10.1007/978-3-642-35632-2_16. ISBN 978-3-642-35631-5. 
  4. 4.0 4.1 4.2 Huang, Jeff (2015). "Stateless model checking concurrent programs with maximal causality reduction". Proceedings of the 36th ACM SIGPLAN Conference on Programming Language Design and Implementation. pp. 165–174. doi:10.1145/2737924.2737975. ISBN 9781450334686. 
  5. Kalhauge, Christian Gram; Palsberg, Jens (2018). "Sound deadlock prediction". Proceedings of the ACM on Programming Languages 2 (OOPSLA): 1–29. doi:10.1145/3276516. ISSN 2475-1421. 
  6. "Sound Dynamic Deadlock Prediction in Linear Time". http://umathur3.web.engr.illinois.edu/papers/dcp.pdf. 
  7. Sound Dynamic Deadlock Prediction in Linear Time, 2023, https://dl.acm.org/doi/10.1145/3591291, retrieved 29 September 2023 
  8. "Atomicity Checking in Linear Time using Vector Clocks". http://umathur3.web.engr.illinois.edu/papers/aerodrome-asplos2020.pdf. 
  9. Huang, Jeff (2018). "UFO". Proceedings of the 40th International Conference on Software Engineering. pp. 609–619. doi:10.1145/3180155.3180225. ISBN 9781450356381. 
  10. 10.0 10.1 Huang, Jeff; Meredith, Patrick O'Neil; Rosu, Grigore (2013). "Maximal sound predictive race detection with control flow abstraction". Proceedings of the 35th ACM SIGPLAN Conference on Programming Language Design and Implementation. pp. 337–348. doi:10.1145/2594291.2594315. ISBN 9781450327848. 
  11. Genç, Kaan; Roemer, Jake; Xu, Yufan; Bond, Michael D. (2019). "Dependence-aware, unbounded sound predictive race detection". Proceedings of the ACM on Programming Languages 3 (OOPSLA): 1–30. doi:10.1145/3360605. ISSN 2475-1421. 
  12. Kini, Dileep; Mathur, Umang; Viswanathan, Mahesh (2017). "Dynamic race prediction in linear time". Proceedings of the 38th ACM SIGPLAN Conference on Programming Language Design and Implementation. pp. 157–170. doi:10.1145/3062341.3062374. ISBN 9781450349888. 
  13. Roemer, Jake; Genç, Kaan; Bond, Michael D. (2018). "High-coverage, unbounded sound predictive race detection". Proceedings of the 39th ACM SIGPLAN Conference on Programming Language Design and Implementation. pp. 374–389. doi:10.1145/3192366.3192385. ISBN 9781450356985. 
  14. Liu, Peng; Tripp, Omer; Zhang, Xiangyu (2016). "IPA: improving predictive analysis with pointer analysis". Proceedings of the 25th International Symposium on Software Testing and Analysis. pp. 59–69. doi:10.1145/2931037.2931046. ISBN 9781450343909. 
  15. Wang, Chao; Kundu, Sudipta; Ganai, Malay; Gupta, Aarti (2009). "Symbolic Predictive Analysis for Concurrent Programs". FM 2009: Formal Methods. Lecture Notes in Computer Science. 5850. pp. 256–272. doi:10.1007/978-3-642-05089-3_17. ISBN 978-3-642-05088-6. 
  16. Said, Mahmoud; Wang, Chao; Yang, Zijiang; Sakallah, Karem (2011). "Generating Data Race Witnesses by an SMT-Based Analysis". NASA Formal Methods. Lecture Notes in Computer Science. 6617. pp. 313–327. doi:10.1007/978-3-642-20398-5_23. ISBN 978-3-642-20397-8. 
  17. Savage, Stefan; Burrows, Michael; Nelson, Greg; Sobalvarro, Patrick; Anderson, Thomas (1997). "Eraser". ACM Transactions on Computer Systems (Association for Computing Machinery (ACM)) 15 (4): 391–411. doi:10.1145/265924.265927. ISSN 0734-2071. 
  18. Pavlogiannis, Andreas (2020). "Fast, sound, and effectively complete dynamic race prediction". Proceedings of the ACM on Programming Languages 4 (POPL): 1–29. doi:10.1145/3371085. ISSN 2475-1421. 
  19. Mathur, Umang; Pavlogiannis, Andreas; Viswanathan, Mahesh (2020). "The Complexity of Dynamic Data Race Prediction". Proceedings of the 35th Annual ACM/IEEE Symposium on Logic in Computer Science. Association for Computing Machinery. pp. 713–727. doi:10.1145/3373718.3394783. ISBN 9781450371049. 
  20. 20.0 20.1 Kulkarni, Rucha; Mathur, Umang; Pavlogiannis, Andreas (2021). "Dynamic Data-Race Detection Through the Fine-Grained Lens". 32nd International Conference on Concurrency Theory (CONCUR 2021). Leibniz International Proceedings in Informatics (LIPIcs) (Schloss Dagstuhl -- Leibniz-Zentrum fur Informatik) 203: 16:1–16:23. doi:10.4230/LIPIcs.CONCUR.2021.16. ISBN 9783959772037.