22 Annotated Bibliography
This is less an annotated bibliography and more of a citation and link dump while I move the references into the main text.
22.1 Textbooks
Kruschke (2015) Doing Bayesian data analysis (Kruschke 2015) Another accessible introduction aimed at psychology. Website with additional material.
McElreath (2016) Statistical rethinking (McElreath 2016) An accessible introduction to Bayesian stats; effectively an intro-stats/linear models course taught from a Bayesian perspective.
Lee (2012) Bayesian Statistics : An Introduction (Lee 2012)
Marin and Robert (2014) Bayesian Essentials with R (Marin and Robert 2014) and solutions manual
Robert and Casella. 2009. Introducing Monte Carlo Methods with R (Robert and Casella 2009)
Robert and Casella. 2004. Monte Carlo statistical methods (Robert and Casella 2004)
Albert (2009) Bayesian Computation with R (Albert 2009)
Jackman (2009) Bayesian Analysis for the Social Sciences (Jackman 2009) Covers commonly used models in the social sciences. Largely covers Gibbs sampling methods and
Hoff (2009) A First Course in Bayesian Statistical Methods (Hoff 2009)
Gelman, Carlin, Stern, Dunson, Vehtari, and Rubin (2013) Bayesian data analysis (3rd Edition) (A. Gelman, Carlin, et al. 2013)
Gelman and Hill (2007) Data analysis using regression and multilevel/hierarchical models (A. Gelman and Hill 2007) An accessible introduction to linear models and multilevel models.
Efron and Hastie (2016) Computer Age Statistical Inference: Algorithms, Evidence, and Data Science This is a unique work that blends an overview of statistical methods with a history of statistics. (Efron and Hastie 2016)
Robert (2007) The Bayesian Choice A statistics graduate-level book on Bayesian statistics.
Berger (1993) Statistical Decision Theory and Bayesian Analysis (Berger 1993) The classic book on Bayesian inference and decision theory. The underlying statistical theory is still relevant even if its date makes the computational aspects less so.
Murphy (2012) Machine Learning: A Probabilistic Perspective (Murphy 2012) A machine learning book with a heavy Bayesian influence.
MacKay (2003) Information Theory, Inference, and Learning Algorithms (MacKay 2003) On information theory, but combines it with Bayesian statistics, and is ultimately about learning and evidence. Lectures from the course are available here.
Lynch, Scott M. 2007. Introduction to Applied Bayesian Statistics and Estimation for Social Scientists
Lunn, Jackson, Best, Thomas, and Spiegelhalter (2012) The BUGS Book: A Practical Introduction to Bayesian Analysis (Lunn et al. 2012)
Congdon. 2014. Applied Bayesian Modelling.
22.2 Syllabi
- Ryan Bakker and Johannes Karreth, “Introduction to Applied Bayesian Modeling” ICPSR. Summer 2016. Syllabus; code
- Justin Esarey. “Advanced Topics in Political Methodology: Bayesian Statistics” Winter 2015. Syllabus; Lectures.
- Kruschke. Doing Bayesian Data Analysis site.
- Nick Beauchamp. “Bayesian Methods.” NYU. syllabus.
- Alex Tahk. “Bayesian Methods for the Social Sciences” U of Wisconsin. Spring 2017. syllabus.
- MTH225 Statistics for Science Spring 2016. github website.
- Ben Goodrich, “Bayesian Statistics for Social Sciences” Columbia University. Spring 2016.
- Bakker. “Introduction to Applied Bayesian Analysis” University of Georgia. syllabus; site
- Miyamoto. “Advances in Quantitative Psychology: Bayesian Statistics, Modeling & Reasoning” U of Washington. Winter 2017. site
- Neil Frazer. Bayesian Data Analysis. Hawaii. Spring 2017. syllabus
- Lopes. 2016. Bayesian Statistical Learning: Readings in Statistics and Econometrics. syllabus.
- Lopes. 2012 Simulation-based approaches to modern Bayesian econometrics. Short course.
- Lopes. 2015. Bayesian Econometrics. syllabus.
22.3 Topics
22.4 Bayes’ Theorem
- Puga, Krzywinski, and Altman (2015) “Points of significance: Bayes’ theorem” Nature Methods
22.5 Article Length Introductions to Bayesian Statistics
- Stan Modeling 2.17. Ch. 29. “Bayesian Inference”
- Michael Clark. Bayesian Basics.
- Eddy (2004) “What is Bayesian Statistics?” Nature Biotechnology
- Jackman. 2004. Bayesian Analysis for Political Research. Annual Review of Political Science DOI:10.1146/annurev.polisci.7.012003.104706.
- Kruschke and Liddell (2017) “The Bayesian new statistics: hypothesis testing, estimation, meta-analysis, and power analysis from a Bayesian perspective” Psychonomic Bulletin & Review. doi:10.3758/s13423-016-1221-4
22.5.1 Why Bayesian
- Jim Savage. Why learn Bayesian Modeling? April 10, 2017.
22.5.2 Modern Statistical Workflow
- Savage, Jim. 2017. A Brief Introduction to Econometrics in Stan
- Betancourt, Michael. Robust Statistical Workflow with RStan
- Stan Modeling Guide “Model Building as Software Development”
- Gabry, J., Simpson, D., Vehtari, A., Betancourt, M., Gelman, A. (2018). Visualization in Bayesian workflow
22.5.3 Bayesian Philosophy
Gelman (2008) “Objections to Bayesian Statistics” Bayesian Analysis
Gelman and Shalizi (2012) “Philosophy and the practice of Bayesian statistics” British Journal of Mathematical and Statistical Psychology
Borsboom and Haig (2012) “How to practice Bayesian statistics outside the Bayesian church: What philosophy for Bayesian statistical modelling?” British Journal of Mathematical and Statistical Psychology
Berger and Berry (1988) “Statistical Analysis and the Illusion of Objectivity” American Scientist
Efron (2010) “The Future of Indirect Evidence”
Efron (1986) “Why Isn’t Everyone a Bayesian?” American Statistician (B. Efron 1986b). See comments Chernoff (1986), Lindley (1986), Morris (1986), Smith (1986), Press (1986), B. Efron (1986a).
Philosophy and the practice of Bayesian statistics in the social sciences
Rubin (1984) “Bayesianly Justifiable and Relevant Frequency Calculations for the Applied Statistician”
Andrew Gelman Induction and Deduction in Bayesian Data Analysis
Berger (2003) “Could Fisher, Jeffreys and Neyman Have Agreed on Testing?” Statistical Science
22.5.4 Bayesian Hypothesis Testing
- Gross, J. H. (2015) “Testing What Matters (If You Must Test at All): A Context-Driven Approach to Substantive and Statistical Significance” American Journal of Political Science (Gross 2014)
22.5.5 Bayesian Frequentist Debates
- Bayesians and Frequentists : Models, Assumptions, and Inference (slides)
- Kass Statistical Inference: The Big Picture
- Noah Smith Bayesian vs. Frequentist: Is there any “there” there?
- Kass Kinds of Bayesians
- Anthony O’Hagan. Science, Subjectivity and Software (Comments on the articles by Berger and Goldstein)
- VanderPlas (2014) Frequentism and Bayesianism: A Python-driven Primer. posts
22.5.6 Categorical
- Agresti. Bayesian Inference for Categorical Data Analysis
- Gelman. 2008. “A weakly informative default prior distribution for logistic and other regression models”
- Rainey. 2016. “Dealing with Separation in Logistic Regression Models” Political Analysis
- Wechsler, Izbicki, and Esteves (2013) “A Bayesian look at nonidentifiability: a simple example”
22.5.7 Variable Selection
- Ghosh and Ghattas (2015) “Bayesian Variable Selection Under Collinearity” American Statistician
- Scott and Berger (2010) “Bayes and empirical-Bayes multiplicity adjustment in the variable-selection problem” Annals of Statistics (Scott and Berger 2010)
- Ishwaran and Rao (2005) “Spike and slab variable selection: Frequentist and Bayesian strategies” Annals of Statistics
- Ishwaran, Kogalur, and Rao (2010) “spikeslab: prediction and variable selection using spike and slab regression” R Journal
- Polson and Scott. “Shrink globally, act locally: sparse Bayesian regularization and prediction” Bayesian Statistics
- Projection predictive variable selection using Stan + R
- Lasso Meets Horseshoe
- Piironen and Vehtari, Sparsity information and regularization in the horseshoe and other shrinkage priors
- Hahn and Carvalho. Decoupling Shrinkage And Selection In Bayesian Linear Models: A Posterior Summary Perspective
- Michael Betancourt Bayes Sparse Regression
22.5.8 Multiple Testing
- Gelman, Hill, and Yajima (2012) “Why we (Usually) don’t have to worry about multiple comparisons” Journal of Research on Educational Effectiveness
22.5.9 Rare Events
- King and Zeng. 2001. “Explaining Rare Events in International Relations”
- King, Gary, and Langche Zeng. 2001. “Logistic Regression in Rare Events Data”
22.5.10 Identifiability
- Wechsler et al. 2013. A Bayesian Look at Nonidentifiability: A Simple Example
22.5.11 Shrinkage
- Efron and Morris (1975) “Data Analysis Using Stein’s Estimator and its Generalizations” JASA (Efron and Morris 1975)
22.6 Software
Software for general-purpose Bayesian computation is often called a probabilistic programming language.
- Joseph Rickert. 2016. R Stan and Statistics
BUGS modeling language. Models are specified in a different language.
NIMBLE A very new BUGS-like language that works with R.
JAGS Gibbs/MCMC based
WinBUGS Gibbs and MCMC based software. It was one of the first but is now obsolete and unmaintained. Use JAGS or Stan instead.
OpenBUGS The continuation of the WinBUGS project. Also no longer well maintained. Use JAGS or Stan instead.
R has multiple packages that implement some Bayesian methods. See the Bayesian Task View
Python
- PyMC Very complete general-purpose Python package for Bayesian Analysis
- The various Machine learning packages like scikit-learn.
Edward. By David Blei. Deep generative models, variational inference. Runs on TensorFlow. Implements variational and HMC methods, as well as optimization.
Stata: Since version 14 it can estimate some Bayesian models. It uses Metropolis-Hastings and Gibbs methods.
Julia
- Mamba MCMC supporting multiple methods including Gibbs, MH, HMC, slice
22.6.1 Stan
Official Stan-dev R packages:
Others:
22.6.2 Diagrams
22.6.2.1 DAGs and Plate Notation
See Plate notation
tikz-bayesnet A TikZ library for drawing Bayesian networks
Daft A Python package to draw DAGs
Relevant Stack Overflow questions:
22.6.2.2 Kruschke Diagrams
Diagrams in the style of Kruschke’s Doing Bayesian Data Analysis:
Blog posts
22.6.2.3 Venn Diagrams/Eikosograms
- Oldford and W.H. Cherry. 2006. “Picturing Probability: the poverty of Venn diagrams, the richness of Eikosograms”
22.6.3 Priors
- Betancourt (2017) “How the shape of a weakly informative prior affects inferences” Stan Case Studies
- Stan, Prior Choice Recommendations
22.7 Bayesian Model Averaging
- Montgomery, Hollenbach and Ward (2012) “Improving Predictions Using Ensemble Bayesian Model Averaging” Political Analysis
- Montgomery and Nyhan (2011) Bayesian Model Averaging: Theoretical Developments and Practical Applications
- BMA Package
- BMS Package
- BAS Package
- Amini and Parmeter (2011) “Bayesian Model Averaging in R” Journal of Economic and Social Measurement
- Fragoso and Neto (2015) Bayesian model averaging: A systematic review and conceptual classification (Fragoso and Neto 2015)
- Ley and Steel (2012) “Mixtures of g-priors for Bayesian model averaging with economic applications” Journal of Econometrics
- Ley and Steel (2009) “On the effect of prior assumptions in Bayesian model averaging with applications to growth regression” Journal of Applied Econometrics
- Hoeting, Madigan, Raftery, and Volinsky (1999) “Bayesian model averaging: A Tutorial” Statistical Science
22.8 Multilevel Modeling
- Stegmueller (2013), “How Many Countries for Multilevel Modeling? A Comparison of Frequentist and Bayesian Approaches” American Journal of Political Science (Stegmueller 2013)
- Shor, Bafumi, Keele, and Park (2007) “A Bayesian multilevel modeling approach to time-series cross-sectional data” Political Analysis
- Beck and Katz (2007) “Random coefficient models for time-series—cross-section data: Monte Carlo experiments” Political Analysis (Beck and Katz 2007)
- Western and Jackman (1994). “Bayesian Inference for Comparative Research” American Political Science Review (Western and Jackman 1994)
- Andersen and Fetner. 2008. “Economic inequality and intolerance: attitudes toward homosexuality in 35 democracies” American Journal of Political Science
22.9 Mixture Models
- Imai, K. and Tingley, D. (2012) “A Statistical Method for Empirical Testing of Competing Theories” AJPS
22.10 Inference
22.10.1 Discussion of Bayesian Inference
- Lindley. The Analysis of Experimental Data: The Appreciation of Tea and Wine
22.11 Model Checking
22.11.1 Posterior Predictive Checks
- Gelman, Andrew (2007) “A Bayesian Formulation of Exploratory Data Analysis and Goodness-of-fit Testing” International Statistical Review
- Gelman, Meng, and Stern (1996) “Posterior Predictive Assessment of Model Fitness Via Realized Discrepancies”
- Kruschke. Posterior predictive checks can and should be Bayesian: Comment on Gelman and Shalizi, ‘Philosophy and the practice of Bayesian statistics’
- Confusions about posterior predictive checks
- Gabry, Jonah. Graphical posterior predictive checks using the bayesplot package
22.11.2 Prediction Criteria
- Gelman, Andrew, Jessica Hwang, and Aki Vehtari. 2014. “Understanding Predictive Information Criteria for Bayesian Models.” Statistics and Computing
- Vehtari, Gelman, and Gabry. 2016 Practical Bayesian model evaluation using leave-one-out cross-validation and WAIC
- Vehtari and Lampinen (2002) Bayesian model assessment and comparison using cross-validation predictive densities
- Vehtari and Ojanen (2012) “A survey of Bayesian predictive methods for model assessment, selection and comparison”
22.11.3 Software Validation
- Cook, Gelman, and Rubin (2006) “Validation of Software for Bayesian Models Using Posterior Quantiles” and Correction
- Savage, Jim. An easy way to simulate fake data from your Stan model
- Stan Best Practices
22.12 Hierarchical Modeling
- Kruschke and Vanpaemel “Bayesian Estimation in Hierarchical Models”
- Park, Gelman, and Bafumi (2004) “Bayesian Multilevel Estimation with Poststratification: State-Level Estimates from National Polls” Political Analysis
- Lax and Phillips. 2009. “How Should We Estimate Public Opinion in the States?” AJPS
22.13 Shrinkage/Regularization
- Piironen and Vehtari. 2016. On the Hyperprior Choice for the Global Shrinkage Parameter in the Horseshoe Prior
- Lopes. 2015. Bayesian Regularization slides.
22.14 Empirical Bayes
- Berger (2006) “The case for objective Bayesian analysis” Bayesian Analysis
- Efron (2014) “Frequentist accuracy of Bayesian estimates” JRSS B
- Efron (2010) “The Future of Indirect Evidence” Statistical Science
22.15 History of Bayesian Statistics
- Robert and Casella (2011) “A Short History of Markov Chain Monte Carlo: Subjective Recollections from Incomplete Data” Statistical Science
- Stigler (2018) “Richard Price, the first Bayesian” Statistical Science (Stigler 2018)
- Stigler (1983) “Who discovered Bayes’s theorem?” American Statistician (Stigler 1983)
- Fienberg (2006) “When did Bayesian Inference Become "Bayesian"?” Bayesian Analysis (Fienberg 2006)
22.16 Sampling Difficulties
- Carpenter (2017) “Typical sets and the curse of dimensionality” Stan Case Studies
- Betancourt (2017) “Diagnosing biased inference with divergences” Stan Case Studies
- Betancourt (2016) “Diagnosing suboptimal cotangent disintegrations in Hamiltonian Monte Carlo”
- Betancourt and Girolami (2013) “Hamiltonian Monte Carlo for Hierarchical Models”
22.17 Complicated Estimation and Testing
- King, Tomz, and Wittenberg (2000) “Making the most of statistical analyses: improving interpretation and presentation” Propose a pseudo-Bayesian method.
- Golder “Interactions”. See referenced papers.
- Hanmer and Kalkan (2012) “Behind the curve: clarifying the best approach to calculating predicted probabilities and marginal effects from limited dependent variable models” American Journal of Political Science
22.18 Pooling Polls
- Jackman (2000) “Pooling the Polls over an Election Campaign” Australian Journal of Political Science
- Linzer (2013) “Dynamic Bayesian forecasting of presidential elections in the States” JASA
22.19 Visualizing MCMC Methods
- https://chi-feng.github.io/mcmc-demo/
- https://mimno.infosci.cornell.edu/hmc/ and http://www.mimno.org/articles/hmc/
- http://twiecki.github.io/blog/2014/01/02/visualizing-mcmc/
- https://ridlow.wordpress.com/category/animation/
- http://people.math.aau.dk/~kkb/Undervisning/Bayes14/sorenh/docs/sampling-notes.pdf
- https://rpubs.com/mv2521/mcmc-animation
- http://blog.revolutionanalytics.com/2013/09/an-animated-peek-into-the-workings-of-bayesian-statistics.html
- https://people.duke.edu/~ccc14/sta-663/Animation.html
- https://artax.karlin.mff.cuni.cz/r-help/library/asbio/html/anm.mc.bvn.html
- https://groups.google.com/forum/#!topic/stan-users/nOk80xTlSyE
- https://www.youtube.com/watch?v=Vv3f0QNWvWQ
- https://theclevermachine.wordpress.com/2012/11/18/mcmc-hamiltnonian-monte-carlo-a-k-a-hybrid-monte-carlo/
- https://www.youtube.com/watch?v=pHsuIaPbNbY&list=PLqdbxUnkqOw2nKn7VxYqIrKWcqRkQYOsF&index=11
- http://arogozhnikov.github.io/2016/12/19/markov_chain_monte_carlo.html
22.20 Bayesian point estimation / Decision
- Stan Modeling Language. Ch 32. Bayesian Point Estimation.
- Modes, Medians and Means: A Unifying Perspective. Not explicitly motivated with Bayesian decision theory; nevertheless, it is a good intuitive explanation of these estimators.
- The Impact of Reparameterization on Point Estimates
- Rainey
22.21 Stan Modeling Language
- Ch 1–8 Introduction.
- Pay attention to Ch 1 and 8; skim the rest. Know where to look for help.
- Ch 28. Optimizing Stan Code for Efficiency (Neal’s funnel, reparameterization, vectorization)
- Ch 22. Reparameterization and change of variables
- Ch 23. Customized
- Ch 24. User-defined functions
- Ch 25. Problematic Posteriors
- Ch 29. Bayesian Data Analysis
- Ch 30. Markov Chain Monte Carlo Sampling (R hat, ESS, convergence, thinning)
- Ch 31. Penalized MLE
- Ch 32. Bayesian Point Estimation
- Ch 34. Hamiltonian Monte Carlo Sampling
- Ch 35. Transformations of Constrained Variables - changes of variables.
22.22 Bayes Factors
- Lindley’s Paradox
- Bayes’ Factors
- Robert (2016) The expected demise of the Bayes factor (Robert 2016).
- Kass and Raftery (1995) “Bayes factors” (Kass and Raftery 1995)