EmpiricalBayes-Baseball

These notebooks follow along with the code in the book Introduction to Empirical Bayes: Examples from Baseball Statistics. The code in the book is written in R so I wanted to give it a shot in python.

Here are the chapters and the corresponding notebooks.

Chapter	Description	File
Chapter 1	Introduction	no notebook - no code in this chapter
Chapter 2	The beta distribution	Chapter02-TheBetaDistribution.ipynb
Chapter 3	Emperical Bayes estimation	Chapter03-EmpericalBayesEstimation.ipynb
Chapter 4	Credible intervals	Chapter04-CredibleIntervals.ipynb
Chapter 5	Hypothesis Testing and FDR	Chapter05-HypothesisTestingFDR.ipynb
Chapter 6	Bayesian A/B testing	Chapter06-BayesianABtesting.ipynb
Chapter 7	Beta binomial regression	Chapter07-BetaBinomialRegression.ipynb
Chapter 8	Emperical Bayesian hierarchical modeling	Chapter08-EmpericalBayesHierarchicalModeling.ipynb
Chapter 9	Mixture models and expectation maximization	Chapter09- MixtureModels_ExpectationMaximization.ipynb
Chapter 10	The multinomial and the Dirichlet	Chapter10-MultinomialAndTheDirichlet.ipynb
Chapter 11	The ebbr package	no notebook - chapter just introduces R package
Chapter 12	Simulation	Chapter12-Simulation.ipynb
Chapter 13	Simulating replications	no notebook - see below

Note: There are no notebooks for chapters 1, 11, and 13 (for now). Chapter 1 is just text and chapter 11 introduces a package that is specific to R. Because we are rewriting the entire project in python, all of the things in the ebbr package have already been implemented in the notebooks for chapters 2-10. There is currently no notebook for chapter 13. This chapter is simply doing replications of the simulations in chapter 12. It would be simple to add in by doing replications of the same work in chapter 12. I might work on this in the future, but for now I will probably leave it out.

Another thing to note is that the MLE solutions can be a little unstable. We need to play with the bounds and initial parameters to get a reasonable result. I found that using pymc can be much better, but it is also very slow. That is primarily why I did not perform the replications in Ch 13. The MLE was not giving the best paremeters but the probabilistic programming appraoch was a bit slow to perform replications without pushing the analysis onto a more powerful computing cluster for parallel operations.

Please let me know if you find any errors or issues!

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

EmpiricalBayes-Baseball

About

Releases

Packages

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 20 Commits
data		data
.gitignore		.gitignore
Chapter02-TheBetaDistribution.ipynb		Chapter02-TheBetaDistribution.ipynb
Chapter03-EmpericalBayesEstimation.ipynb		Chapter03-EmpericalBayesEstimation.ipynb
Chapter04-CredibleIntervals.ipynb		Chapter04-CredibleIntervals.ipynb
Chapter05-HypothesisTestingFDR.ipynb		Chapter05-HypothesisTestingFDR.ipynb
Chapter06-BayesianABtesting.ipynb		Chapter06-BayesianABtesting.ipynb
Chapter07-BetaBinomialRegression.ipynb		Chapter07-BetaBinomialRegression.ipynb
Chapter08-EmpericalBayesHierarchicalModeling.ipynb		Chapter08-EmpericalBayesHierarchicalModeling.ipynb
Chapter09- MixtureModels_ExpectationMaximization.ipynb		Chapter09- MixtureModels_ExpectationMaximization.ipynb
Chapter10-MultinomialAndTheDirichlet.ipynb		Chapter10-MultinomialAndTheDirichlet.ipynb
Chapter12-Simulation.ipynb		Chapter12-Simulation.ipynb
README.md		README.md

jabrantley/EmpiricalBayes-Baseball

Folders and files

Latest commit

History

Repository files navigation

EmpiricalBayes-Baseball

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages