This is the repository for the Computational Statistics course covering numerical linear algebra, Gaussian processes, Newton’s method and optimization, numerical integration, Markov chain Monte Carlo (MCMC), the Bootstrap, density estimation, and machine learning (neural networks and deep learning). The course will focus on the development of various algorithms for optimization and simulation, the workhorses of much of computational statistics. A variety of algorithms and data sets of gradually increasing complexity (1 dimension
- Practices for reproducible analysis
- Fundamentals of data management and munging
- Use Python as a language for statistical computing
- Use mathematical and statistical libraries effectively
- Profile and optimize serial code
- Use different parallel programming paradigms effectively
In particular, the focus is on algorithms for:
- Optimization
    - Newton-Raphson (functional programming and vectorization)
    - Quadrature (adaptive methods)
    - Gradient descent (multivariable)
    - Solving GLMs (multivariable)
    - Expectation-maximization (multivariable + finite mixture models)
- Simulation and resampling
    - Bootstrap (basics of parallel programming)
    - Map-reduce applications in statistics for big data
    - Monte Carlo simulations (more parallel programming)
    - MCMC (various samplers - GPU programming)
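
As a rough illustration of two of the topics listed above, the sketch below shows a vectorized Newton-Raphson iteration and a nonparametric bootstrap estimate of a standard error, both written with NumPy. It is not taken from the course materials: the function names `newton_raphson` and `bootstrap_se`, their parameters, and the simulated data are all hypothetical choices made for this example.

```python
import numpy as np


def newton_raphson(f, fprime, x0, tol=1e-10, max_iter=50):
    """Vectorized Newton-Raphson: iterate x <- x - f(x) / f'(x) elementwise.

    Hypothetical helper for illustration; assumes f'(x) stays nonzero.
    """
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(max_iter):
        step = f(x) / fprime(x)
        x = x - step
        # Stop once every component has effectively stopped moving.
        if np.max(np.abs(step)) < tol:
            break
    return x


def bootstrap_se(data, statistic, n_boot=2000, seed=0):
    """Nonparametric bootstrap estimate of the standard error of a statistic.

    Hypothetical helper for illustration; resamples serially, not in parallel.
    """
    rng = np.random.default_rng(seed)
    data = np.asarray(data)
    n = data.size
    # Each row of idx is one resample (with replacement) of the original indices.
    idx = rng.integers(0, n, size=(n_boot, n))
    replicates = np.array([statistic(data[row]) for row in idx])
    return replicates.std(ddof=1)


# Square roots of several targets at once: solve x**2 - a = 0 for each a.
targets = np.array([2.0, 9.0, 10.0])
roots = newton_raphson(lambda x: x**2 - targets, lambda x: 2 * x, x0=np.ones(3))

# Bootstrap standard error of the mean of a simulated (made-up) sample.
sample = np.random.default_rng(42).normal(loc=0.0, scale=1.0, size=200)
se_mean = bootstrap_se(sample, np.mean)
```

Passing `f` and `fprime` as functions and operating on whole arrays reflects the "functional programming and vectorization" pairing in the list. The loop over bootstrap replicates is independent across iterations, which is what makes the bootstrap a natural first exercise in parallel programming.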