Probabilistic programming

From HandWiki

Probabilistic programming (PP) is a programming paradigm in which probabilistic models are specified and inference for these models is performed automatically.[1] It represents an attempt to unify probabilistic modeling and traditional general purpose programming in order to make the former easier and more widely applicable.[2][3] It can be used to create systems that help make decisions in the face of uncertainty.

Programming languages used for probabilistic programming are referred to as "probabilistic programming languages" (PPLs).
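The separation between model specification and inference can be illustrated with a minimal sketch in plain Python. It is not taken from any particular PPL: the model (a rain and sprinkler example), its probabilities and the helper name `rejection_query` are all assumptions chosen for illustration, and rejection sampling stands in for the more sophisticated inference engines that real systems provide.

```python
import random


def model(rng):
    """Generative model written as ordinary code (hypothetical example)."""
    rain = rng.random() < 0.2        # assumed prior probability of rain
    sprinkler = rng.random() < 0.1   # assumed prior probability of sprinkler use
    wet = rain or sprinkler          # grass is wet if either occurred
    return {"rain": rain, "sprinkler": sprinkler, "wet": wet}


def rejection_query(model, condition, query, n_samples=20000, seed=0):
    """Generic inference: run the model repeatedly, keep only the runs that
    satisfy the condition, and average the queried quantity over them."""
    rng = random.Random(seed)
    accepted = [
        query(trace)
        for trace in (model(rng) for _ in range(n_samples))
        if condition(trace)
    ]
    if not accepted:
        raise ValueError("no sample satisfied the condition")
    return sum(accepted) / len(accepted)


# Estimate P(rain | grass is wet); the exact value is 0.2 / 0.28.
p_rain = rejection_query(
    model,
    condition=lambda t: t["wet"],
    query=lambda t: t["rain"],
)
print(round(p_rain, 3))
```

The inference routine never inspects the internals of `model`, so the same routine could be reused with a different model, which is the property that PPLs aim to provide at scale.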

Applications

Probabilistic reasoning has been used for a wide variety of tasks such as predicting stock prices, recommending movies, diagnosing computers, detecting cyber intrusions and image detection.[4] However, until recently (partially due to limited computing power), probabilistic programming was limited in scope, and most inference algorithms had to be written manually for each task.

Nevertheless, in 2015, a 50-line probabilistic computer vision program was used to generate 3D models of human faces based on 2D images of those faces. The program used inverse graphics as the basis of its inference method, and was built using the Picture package in Julia.[4] This made possible "in 50 lines of code what used to take thousands".[5][6]

The Gen probabilistic programming library (also written in Julia) has been applied to vision and robotics tasks.[7]

More recently, the probabilistic programming system Turing.jl has been applied in various pharmaceutical[8] and economics applications.[9]

Probabilistic programming in Julia has also been combined with differentiable programming by using the Julia package Zygote.jl together with Turing.jl.[10]

Probabilistic programming languages are also commonly used in Bayesian cognitive science to develop and evaluate models of cognition.[11]

Probabilistic programming languages

PPLs often extend from a basic language. The choice of underlying basic language depends on the similarity of the model to the basic language's ontology, as well as commercial considerations and personal preference. For instance, Dimple[12] and Chimple[13] are based on Java, Infer.NET is based on .NET Framework,[14] while PRISM extends from Prolog.[15] However, some PPLs such as WinBUGS offer a self-contained language that maps closely to the mathematical representation of the statistical models, with no obvious origin in another programming language.[16][17]

The language for WinBUGS was implemented to perform Bayesian computation using Gibbs sampling (and related algorithms). Although WinBUGS itself is implemented in a relatively unknown programming language (Component Pascal), the BUGS language permits Bayesian inference for a wide variety of statistical models using a flexible computational approach. The same BUGS language may be used to specify Bayesian models for inference via different computational choices ("samplers") and conventions or defaults, using the standalone package WinBUGS (or the related R packages rbugs and R2WinBUGS) or JAGS (Just Another Gibbs Sampler, a standalone program that is commonly run from R). More recently, other languages that support Bayesian model specification and inference allow different or more efficient choices for the underlying Bayesian computation and are accessible from the R data analysis and programming environment, e.g. Stan (whose default sampler is the No-U-Turn Sampler, NUTS) and NIMBLE. The influence of the BUGS language is evident in these later languages, which even use the same syntax for some aspects of model specification.
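To make the idea of Gibbs sampling concrete, the following is a hand-written sketch in Python of the kind of sampler that BUGS-style systems derive automatically from a model description. It is not BUGS code and does not reproduce any BUGS implementation. The model (normally distributed data with unknown mean `mu` and precision `tau`), the weak conjugate priors, the data values and the function name `gibbs_normal` are all assumptions made for illustration.

```python
import random
import statistics


def gibbs_normal(y, n_iter=5000, burn_in=1000, seed=0,
                 mu0=0.0, tau0=1e-3, a=1e-3, b=1e-3):
    """Gibbs sampler for y[i] ~ Normal(mu, 1/tau) with assumed priors
    mu ~ Normal(mu0, 1/tau0) and tau ~ Gamma(shape=a, rate=b)."""
    rng = random.Random(seed)
    n = len(y)
    sum_y = sum(y)
    mu, tau = statistics.fmean(y), 1.0  # starting values
    draws = []
    for i in range(n_iter):
        # Full conditional of mu given tau: a normal distribution.
        precision = tau0 + n * tau
        mean = (tau0 * mu0 + tau * sum_y) / precision
        mu = rng.gauss(mean, precision ** -0.5)
        # Full conditional of tau given mu: a gamma distribution.
        sse = sum((v - mu) ** 2 for v in y)
        # random.gammavariate takes (shape, scale), so the rate is inverted.
        tau = rng.gammavariate(a + n / 2, 1.0 / (b + sse / 2))
        if i >= burn_in:
            draws.append((mu, tau))
    return draws


# Made-up observations, used only to exercise the sampler.
y = [4.8, 5.1, 5.6, 4.9, 5.3, 5.0, 4.7, 5.2]
draws = gibbs_normal(y)
post_mean_mu = statistics.fmean([m for m, _ in draws])
print(round(post_mean_mu, 2))
```

Each iteration draws every unknown in turn from its distribution conditional on the current values of the others. In a BUGS-style language the user would write only the model statements, and the system would work out and run the corresponding updates.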

Several PPLs are in active development, including some in beta test. Two popular tools are Stan and PyMC.[18]

Relational

A probabilistic relational programming language (PRPL) is a PPL specially designed to describe and infer with probabilistic relational models (PRMs).

A PRM is usually developed together with a set of algorithms for reducing the distributions concerned, performing inference about them and discovering them; these algorithms are embedded into the corresponding PRPL.

List of probabilistic programming languages

This list summarises the variety of PPLs that are currently available, and clarifies their origins.

Name Extends from Host language
Analytica[19] C++
bayesloop[20][21] Python Python
Bean Machine[22] PyTorch Python
CuPPL[23] NOVA[24]
Venture[25] Scheme C++
Probabilistic-C[26] C C
Anglican[27] Clojure Clojure
IBAL[28] OCaml
BayesDB[29] SQLite, Python
PRISM[15] B-Prolog
Infer.NET[14] .NET Framework .NET Framework
Dimple[12] MATLAB, Java
Chimple[13] MATLAB, Java
BLOG[30] Java
diff-SAT[31] Answer set programming, SAT (DIMACS CNF)
PSQL[32] SQL
BUGS[16] Component Pascal
FACTORIE[33] Scala Scala
PMTK[34] MATLAB MATLAB
Alchemy[35] C++
Dyna[36] Prolog
Figaro[37] Scala Scala
Church[38] Scheme Various: JavaScript, Scheme
ProbLog[39] Prolog Python
ProBT[40] C++, Python
Stan[17] BUGS C++
Hakaru[41] Haskell Haskell
BAli-Phy[42] Haskell C++
ProbCog[43] Java, Python
Gamble[44] Racket
PWhile[45] While Python
Tuffy[46] Java
PyMC[47] Python Python
Rainier[48][49] Scala Scala
greta[50] TensorFlow R
pomegranate[51] Python Python
Lea[52] Python Python
WebPPL[53] JavaScript JavaScript
Let's Chance[54] Scratch JavaScript
Picture[4] Julia Julia
Turing.jl[55] Julia Julia
Gen[56] Julia Julia
Low-level First-order PPL[57] Python, Clojure, PyTorch Various: Python, Clojure
Troll[58] Moscow ML
Edward[59] TensorFlow Python
TensorFlow Probability[60] TensorFlow Python
Edward2[61] TensorFlow Probability Python
Pyro[62] PyTorch Python
NumPyro[63] JAX Python
Saul[64] Scala Scala
RankPL[65] Java
Birch[66] C++
PSI[67] D
Blang[68]
MultiVerse[69] Python Python

Difficulty

Reasoning about variables as probability distributions causes difficulties for novice programmers, but these difficulties can be addressed through use of Bayesian network visualisations and graphs of variable distributions embedded within the source code editor.[70]

Notes

  1. "Probabilistic programming does in 50 lines of code what used to take thousands". phys.org. April 13, 2015. http://phys.org/news/2015-04-probabilistic-lines-code-thousands.html. 
  2. "Probabilistic Programming". probabilistic-programming.org. http://probabilistic-programming.org/wiki/Home. 
  3. Pfeffer, Avrom (2014), Practical Probabilistic Programming, Manning Publications. p. 28. ISBN 978-1-61729-233-0
  4. "Short probabilistic programming machine-learning code replaces complex programs for computer-vision tasks". KurzweilAI. April 13, 2015. http://www.kurzweilai.net/short-probabilistic-programming-machine-learning-code-replaces-complex-programs-for-computer-vision-tasks. 
  5. Hardesty, Larry (April 13, 2015). "Graphics in reverse". https://news.mit.edu/2015/better-probabilistic-programming-0413. 
  6. "MIT shows off machine-learning script to make CREEPY HEADS". https://www.theregister.co.uk/2015/04/14/mit_shows_off_machinelearning_script_to_make_creepy_heads/. 
  7. "MIT's Gen programming system flattens the learning curve for AI projects" (in en-US). 2019-06-27. https://venturebeat.com/2019/06/27/mits-gen-programming-system-allows-users-to-easily-create-computer-vision-statistical-ai-and-robotics-programs/. 
  8. Semenova, Elizaveta; Williams, Dominic P.; Afzal, Avid M.; Lazic, Stanley E. (2020-11-01). "A Bayesian neural network for toxicity prediction" (in en). Computational Toxicology 16: 100133. doi:10.1016/j.comtox.2020.100133. ISSN 2468-1113. https://www.sciencedirect.com/science/article/pii/S2468111320300438. 
  9. Williams, Dominic P.; Lazic, Stanley E.; Foster, Alison J.; Semenova, Elizaveta; Morgan, Paul (2020), "Predicting Drug-Induced Liver Injury with Bayesian Machine Learning", Chemical Research in Toxicology 33 (1): 239–248, doi:10.1021/acs.chemrestox.9b00264, PMID 31535850, https://pubs.acs.org/doi/10.1021/acs.chemrestox.9b00264 
  10. Innes, Mike; Edelman, Alan; Fischer, Keno; Rackauckas, Chris; Saba, Elliot; Viral B Shah; Tebbutt, Will (2019). "∂P: A Differentiable Programming System to Bridge Machine Learning and Scientific Computing". arXiv:1907.07587 [cs.PL].
  11. Goodman, Noah D; Tenenbaum, Joshua B; Buchsbaum, Daphna; Hartshorne, Joshua; Hawkins, Robert; O'Donnell, Timothy J; Tessler, Michael Henry. "Probabilistic Models of Cognition". http://probmods.org/. 
  12. "Dimple Home Page". analog.com. July 2, 2021. https://github.com/analog-garage/dimple. 
  13. "Chimple Home Page". analog.com. April 16, 2021. https://github.com/analog-garage/chimple. 
  14. "Infer.NET". microsoft.com. Microsoft. http://research.microsoft.com/en-us/um/cambridge/projects/infernet/. 
  15. "PRISM: PRogramming In Statistical Modeling". http://rjida.meijo-u.ac.jp/prism/. 
  16. "The BUGS Project - MRC Biostatistics Unit". cam.ac.uk. http://www.mrc-bsu.cam.ac.uk/bugs/. 
  17. "Stan". mc-stan.org. http://mc-stan.org/. 
  18. "The Algorithms Behind Probabilistic Programming". http://blog.fastforwardlabs.com/2017/01/30/the-algorithms-behind-probabilistic-programming.html. 
  19. "Analytica-- A Probabilistic Modeling Language". lumina.com. http://www.analytica.com. 
  20. "bayesloop - Probabilistic programming framework". http://bayesloop.com/. 
  21. "GitHub -- bayesloop". December 7, 2021. https://github.com/christophmark/bayesloop. 
  22. "Bean Machine - A universal probabilistic programming language to enable fast and accurate Bayesian analysis". beanmachine.org. https://beanmachine.org. 
  23. "Probabilistic Programming with CuPPL". popl19.sigplan.org. https://popl19.sigplan.org/event/lafi-2019-probabilistic-programming-with-cuppl. 
  24. Collins, Alexander; Grewe, Dominik; Grover, Vinod; Lee, Sean; Susnea, Adriana (June 9, 2014). "NOVA: A Functional Language for Data Parallelism". Proceedings of ACM SIGPLAN International Workshop on Libraries, Languages, and Compilers for Array Programming. Array'14. pp. 8–13. doi:10.1145/2627373.2627375. ISBN 9781450329378. https://dl.acm.org/citation.cfm?id=2627375. 
  25. "Venture -- a general-purpose probabilistic programming platform". mit.edu. http://probcomp.csail.mit.edu/venture/. 
  26. "Probabilistic C". ox.ac.uk. http://www.robots.ox.ac.uk/~brooks/probabilistic-c/. 
  27. "The Anglican Probabilistic Programming System". ox.ac.uk. January 6, 2021. https://github.com/probprog/anglican-infcomp. 
  28. "IBAL Home Page". http://www.eecs.harvard.edu/~avi/IBAL/. 
  29. "BayesDB on SQLite. A Bayesian database table for querying the probable implications of data as easily as SQL databases query the data itself". GitHub. December 26, 2021. https://github.com/probcomp/bayeslite. 
  30. "Bayesian Logic (BLOG)". mit.edu. http://people.csail.mit.edu/milch/blog/. 
  31. "diff-SAT (probabilistic SAT/ASP)". October 8, 2021. https://github.com/MatthiasNickles/diff-SAT/. 
  32. Dey, Debabrata; Sarkar, Sumit (1998). "PSQL: A query language for probabilistic relational data". Data & Knowledge Engineering 28: 107–120. doi:10.1016/S0169-023X(98)00015-9. 
  33. "Factorie - Probabilistic programming with imperatively-defined factor graphs - Google Project Hosting". google.com. http://code.google.com/p/factorie/. 
  34. "PMTK3 - probabilistic modeling toolkit for Matlab/Octave, version 3 - Google Project Hosting". google.com. http://code.google.com/p/pmtk3/. 
  35. "Alchemy - Open Source AI". washington.edu. http://alchemy.cs.washington.edu/. 
  36. "Dyna". http://www.dyna.org/. 
  37. "Charles River Analytics - Probabilistic Modeling Services". cra.com. February 9, 2017. http://www.cra.com/figaro. 
  38. "Church". mit.edu. http://projects.csail.mit.edu/church/wiki/Church. 
  39. "ProbLog: Probabilistic Programming". http://dtai.cs.kuleuven.be/problog. 
  40. ProbaYes. "ProbaYes - Ensemble, nous valorisations vos données". probayes.com. http://www.probayes.com/fr/Bayesian-Programming-Book/downloads/. 
  41. "Hakaru Home Page". hakaru-dev.github.io/. https://hakaru-dev.github.io/. 
  42. "BAli-Phy Home Page". bali-phy.org. http://www.bali-phy.org/. 
  43. "ProbCog". GitHub. https://github.com/opcode81/ProbCog/wiki/Features. 
  44. Culpepper, Ryan (January 17, 2017). "gamble: Probabilistic Programming". https://github.com/rmculpepper/gamble. 
  45. "PWhile Compiler". GitHub. May 25, 2020. https://github.com/zz5013/pwCompiler. 
  46. "Tuffy: A Scalable Markov Logic Inference Engine". stanford.edu. http://i.stanford.edu/hazy/tuffy/home. 
  47. PyMC devs. "PyMC". pymc-devs.github.io. https://docs.pymc.io/en/v3/. 
  48. stripe/rainier, Stripe, 2020-08-19, https://github.com/stripe/rainier, retrieved 2020-08-26 
  49. "Rainier · Bayesian inference for Scala". https://samplerainier.com/. 
  50. "greta: simple and scalable statistical modelling in R". https://greta-dev.github.io/greta/. 
  51. "Home — pomegranate 0.10.0 documentation" (in en). https://pomegranate.readthedocs.io/en/latest/index.html. 
  52. "Lea Home Page". bitbucket.org. https://bitbucket.org/piedenis/lea. 
  53. "WebPPL Home Page". github.com/probmods/webppl. http://dippl.org/. 
  54. "Let's Chance: Playful Probabilistic Programming for Children". Extended Abstracts of the 2020 CHI Conference on Human Factors in Computing Systems. CHI EA '20. April 25, 2020. pp. 1–7. doi:10.1145/3334480.3383071. ISBN 9781450368193. https://dl.acm.org/doi/abs/10.1145/3334480.3383071. Retrieved 2020-08-01. 
  55. "The Turing language for probabilistic programming". December 28, 2021. https://github.com/yebai/Turing.jl. 
  56. "Gen: A General Purpose Probabilistic Programming Language with Programmable Inference". https://probcomp.github.io/Gen/. 
  57. "LF-PPL: A Low-Level First Order Probabilistic Programming Language for Non-Differentiable Models". ox.ac.uk. November 2, 2019. https://github.com/bradleygramhansen/PyLFPPL. 
  58. "Troll dice roller and probability calculator". http://topps.diku.dk/torbenm/troll.msp. 
  59. "Edward – Home". http://edwardlib.org/. 
  60. TensorFlow (2018-04-11). "Introducing TensorFlow Probability". https://medium.com/tensorflow/introducing-tensorflow-probability-dca4c304e245. 
  61. "'Edward2' TensorFlow Probability module" (in en). https://github.com/tensorflow/probability/tree/master/tensorflow_probability/python/edward2. 
  62. "Pyro" (in en). http://pyro.ai. 
  63. "NumPyro" (in en). http://num.pyro.ai/en/stable/. 
  64. "CogComp - Home". https://cogcomp.org/page/software_view/Saul. 
  65. Rienstra, Tjitze (2018-01-18), RankPL: A qualitative probabilistic programming language based on ranking theory, https://github.com/tjitze/RankPL, retrieved 2018-01-18 
  66. "Probabilistic Programming in Birch". http://birch-lang.org/. 
  67. "PSI Solver - Exact inference for probabilistic programs". https://psisolver.org/. 
  68. "Home". https://www.stat.ubc.ca/~bouchard/blang/. 
  69. Perov, Yura; Graham, Logan; Gourgoulias, Kostis; Richens, Jonathan G.; Lee, Ciarán M.; Baker, Adam; Johri, Saurabh (2020-01-28), MultiVerse: Causal Reasoning using Importance Sampling in Probabilistic Programming 
  70. Gorinova, Maria I.; Sarkar, Advait; Blackwell, Alan F.; Syme, Don (2016-01-01). "A Live, Multiple-Representation Probabilistic Programming Environment for Novices". Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems. CHI '16. New York, NY, USA: ACM. pp. 2533–2537. doi:10.1145/2858036.2858221. ISBN 9781450333627. 

External links