Overview of Statistical Machine Learning
Director: Nic Schraudolph (SML, NICTA and adjunct with CSL, RSISE)
The course is a general introduction to the methods and practice
of statistical machine learning.
Prerequisites and Assumed Knowledge
A bachelor's degree in a relevant subject area;
confident use of a common programming language;
mathematical training at second-year undergraduate level,
including basic linear algebra and probability theory.
Dates
- Registration: by 04 Apr 06
- Course Dates: 25 Apr to 01 Jun 06 (6 weeks)
- Lectures: Tue & Thu, 10-12
- Tutorial/Exercise sessions: once a week, time and place TBD
- Assignments Due: by 09 Jun 06
- Notification: by 26 Jun 06
Presenters
- Simon Guenter
- Nic Schraudolph
- Doug Aberdeen
- SVN Vishwanathan
- Alex Smola
(all SML, NICTA and adjunct with CSL, RSISE)
Location
NICTA on Northbourne Ave., or RSISE on the ANU campus, depending on majority
of participants.
Workload
- Weekly contact hours: 4h lecture, 2h tutorial
- Total contact hours: 24h lecture, 12h tutorial
- Assignments: 3 required, 5h each, 15h total
- Preparation/Reading: 1.5h per week, 9h total
- Total workload: 24 + 12 + 15 + 9 = 60h (3 units)
Assessment
Only a pass or fail mark will be awarded. To pass the course, students
must obtain a pass mark on at least 3 of the assignments offered
(at least 4 assignments will be offered).
Detailed Syllabus
DRAFT - subject to change at the discretion of the
course organizer.
- Bayesian Inference
- frequentists vs. Bayesians
- derivation of Bayes' Rule
- use for inference
Assignment 1 (theory): Ovarian Cancer Screening
Reading: Euro coin tosses (MacKay)
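The inference step above can be sketched numerically. The sketch below applies Bayes' rule to a generic screening test; all the probabilities are illustrative placeholders, not figures from the assignment or the readings:

```python
# Bayes' rule for a screening test:
#   P(disease | positive) = P(positive | disease) * P(disease) / P(positive)
# All numbers below are illustrative, not taken from the course materials.

prior = 0.01          # P(disease): base rate in the screened population
sensitivity = 0.95    # P(positive | disease)
false_pos = 0.05      # P(positive | no disease)

# Total probability of a positive test (law of total probability).
p_positive = sensitivity * prior + false_pos * (1 - prior)

# Posterior probability of disease given a positive result.
posterior = sensitivity * prior / p_positive
print(round(posterior, 3))  # → 0.161
```

Even with a 95%-sensitive test, the low base rate keeps the posterior around 16% — the kind of counter-intuitive result the screening assignment turns on.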
- Maximum Likelihood Modeling
- regression, classification, density estimation
- maximum likelihood loss functions
Reading: Maximum Likelihood--Mixture of Gaussians (Schiele)
- Density Estimation
- parametric vs. non-parametric
- classification via density estimation
- semi-parametric and mixture models
- Expectation-Maximisation (EM) algorithm
Assignment 2 (programming): EM
Reading: A Gentle Tutorial of the EM Algorithm (pages 1-3)
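A minimal sketch of EM for a two-component 1-D Gaussian mixture, in the spirit of the tutorial above; the synthetic data, initial guesses, and iteration count are illustrative assumptions, not course-specified values:

```python
import math
import random

random.seed(0)
# Synthetic 1-D data from two Gaussians (illustrative, not course data).
data = ([random.gauss(-2.0, 1.0) for _ in range(200)] +
        [random.gauss(3.0, 1.0) for _ in range(200)])

def gauss_pdf(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

# Initial guesses for mixing weights, means, and standard deviations.
w, mu, sigma = [0.5, 0.5], [-1.0, 1.0], [1.0, 1.0]

for _ in range(50):
    # E-step: responsibility of each component for each point.
    resp = []
    for x in data:
        p = [w[k] * gauss_pdf(x, mu[k], sigma[k]) for k in range(2)]
        s = sum(p)
        resp.append([pk / s for pk in p])
    # M-step: re-estimate parameters from the responsibilities.
    for k in range(2):
        nk = sum(r[k] for r in resp)
        w[k] = nk / len(data)
        mu[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
        sigma[k] = math.sqrt(sum(r[k] * (x - mu[k]) ** 2
                                 for r, x in zip(resp, data)) / nk)

print([round(m, 1) for m in sorted(mu)])
```

The recovered means land close to the generating values -2 and 3; the programming assignment would flesh out exactly this E/M loop.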
- Least Squares Regression
- linear vs. non-linear models
- simple gradient descent
- singular value decomposition
- basis functions, generalized least squares
- classification via regression
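The "simple gradient descent" item above can be shown on the smallest possible case: fitting a line by minimising squared error. The data, learning rate, and iteration count are illustrative assumptions:

```python
# Fit y = a*x + b by minimising the sum of squared errors with simple
# (batch) gradient descent. Data and step size are illustrative.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]   # exactly y = 2x + 1

a, b, lr = 0.0, 0.0, 0.02
for _ in range(5000):
    # Gradient of sum_i (a*x_i + b - y_i)^2 with respect to a and b.
    ga = sum(2 * (a * x + b - y) * x for x, y in zip(xs, ys))
    gb = sum(2 * (a * x + b - y) for x, y in zip(xs, ys))
    a -= lr * ga
    b -= lr * gb

print(round(a, 3), round(b, 3))  # → 2.0 1.0
```

For linear least squares the same solution can be read off in closed form (e.g. via the singular value decomposition, also on the syllabus); gradient descent is shown here because it generalises to the non-linear models that follow.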
- Neural Networks
- biological background
- learning in neural networks
- backpropagation algorithm
Assignment 3 (programming): implement neural network
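A useful habit when implementing backpropagation (and a common sanity check for Assignment 3) is to verify the analytic gradient against a finite-difference approximation. The one-hidden-unit network below is an illustrative toy, not the assignment specification:

```python
import math

# A one-hidden-unit network y = v * tanh(w * x), with squared-error loss
# on a single example. Backpropagation gives dL/dw analytically; a
# finite-difference check confirms it. All values are illustrative.
x, target = 0.5, 0.8
w, v = 0.3, 0.7

def loss(w_, v_):
    return 0.5 * (v_ * math.tanh(w_ * x) - target) ** 2

# Forward pass.
h = math.tanh(w * x)
y = v * h
err = y - target

# Backward pass (chain rule): dL/dv and dL/dw.
grad_v = err * h
grad_w = err * v * (1 - h ** 2) * x   # tanh'(z) = 1 - tanh(z)^2

# Central finite-difference approximation of dL/dw.
eps = 1e-6
fd_w = (loss(w + eps, v) - loss(w - eps, v)) / (2 * eps)
print(abs(grad_w - fd_w) < 1e-8)  # → True
```

Scaling this up to vectors of weights and multiple layers is essentially what the backpropagation algorithm organises.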
- Classical (Batch) Optimization
- Newton, quasi-Newton
- conjugate gradient
Reading: An Introduction to the Conjugate Gradient Method Without the Agonizing Pain (Shewchuk), chapters 1-4
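The conjugate gradient recurrence is short enough to sketch directly. The 2x2 system below is an illustrative example, assuming A is symmetric positive definite as the method requires:

```python
# Conjugate gradient for A x = b, A symmetric positive definite.
# Small illustrative 2x2 example; in exact arithmetic CG solves an
# n-dimensional SPD system in at most n iterations.

def matvec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def conjgrad(A, b, iters=10, tol=1e-12):
    x = [0.0] * len(b)
    r = [bi - ri for bi, ri in zip(b, matvec(A, x))]  # residual b - A x
    d = r[:]                                          # search direction
    for _ in range(iters):
        rr = dot(r, r)
        if rr < tol:
            break
        Ad = matvec(A, d)
        alpha = rr / dot(d, Ad)                       # exact line search
        x = [xi + alpha * di for xi, di in zip(x, d)]
        r = [ri - alpha * adi for ri, adi in zip(r, Ad)]
        beta = dot(r, r) / rr                         # keeps directions conjugate
        d = [ri + beta * di for ri, di in zip(r, d)]
    return x

A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
x = conjgrad(A, b)
print([round(v, 4) for v in x])  # → [0.0909, 0.6364]
```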
- Stochastic (Online) Optimization
- need for online learning
- direct (gradient-free) methods
- gradient step size adaptation
Assignment 4
- Overfitting, Validation, and Regularisation
- empirical vs. true risk
- cross-validation techniques
- Ockham's razor, regularization
- minimum description length
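The mechanics of k-fold cross-validation can be sketched with a deliberately trivial "model" (the training mean), so the splitting and averaging stand out; the data and fold count are illustrative:

```python
# k-fold cross-validation: hold out each fold in turn, fit on the rest,
# and average the held-out error. The "model" is just the training mean,
# keeping the sketch free of fitting code.

def k_fold_cv(data, k):
    folds = [data[i::k] for i in range(k)]  # simple interleaved split
    errors = []
    for i in range(k):
        test = folds[i]
        train = [x for j, f in enumerate(folds) if j != i for x in f]
        model = sum(train) / len(train)     # "fit": training mean
        errors.append(sum((x - model) ** 2 for x in test) / len(test))
    return sum(errors) / k                  # cross-validated MSE

data = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
print(round(k_fold_cv(data, 3), 3))  # → 3.75
```

Because every error is measured on data the model never saw, the cross-validated score estimates the true risk rather than the (optimistic) empirical risk.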
- Reinforcement Learning (Doug Aberdeen)
- dynamic programming
- function approximation
- simulation
- policy based methods
- Tesauro's backgammon
Assignment 5 (programming): reinforcement learning
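The dynamic-programming item above can be sketched as value iteration on a toy MDP. The chain below (walk left/right, reward on reaching the last state) is an illustrative construction, not the assignment's environment:

```python
# Value iteration (dynamic programming) on a tiny deterministic chain MDP:
# states 0..3, actions left/right, reward 1 for entering terminal state 3.
# The MDP and discount factor are illustrative.

gamma = 0.9
n = 4
V = [0.0] * n

def step(s, a):             # deterministic transition and reward
    s2 = max(0, min(n - 1, s + a))
    return s2, (1.0 if s2 == n - 1 and s != n - 1 else 0.0)

for _ in range(100):        # synchronous Bellman backups until convergence
    V_new = V[:]
    for s in range(n - 1):  # state n-1 is terminal; its value stays 0
        V_new[s] = max(r + gamma * V[s2]
                       for s2, r in (step(s, a) for a in (-1, 1)))
    V = V_new

print([round(v, 3) for v in V])  # → [0.81, 0.9, 1.0, 0.0]
```

The values decay by a factor of gamma per step from the goal, and the greedy policy with respect to V (always move right) is optimal; function approximation and policy-based methods take over when the state space is too large for such a table.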
- Kernel Methods 1 (Alex Smola / SVN Vishwanathan)
- Kernel Methods 2 (Alex Smola / SVN Vishwanathan)
Assignment 6: kernel methods
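As a foretaste of the kernel lectures, the kernel trick can be shown on the smallest interesting case: a kernel perceptron with a Gaussian (RBF) kernel separating XOR, which no linear classifier can. The kernel width and epoch count are illustrative assumptions:

```python
import math

# Kernel perceptron with a Gaussian (RBF) kernel on the XOR problem.
# The dual coefficients alpha count mistakes on each training example.

X = [(0, 0), (0, 1), (1, 0), (1, 1)]
Y = [-1, 1, 1, -1]

def k(a, b, gamma=2.0):      # RBF kernel: exp(-gamma * ||a - b||^2)
    return math.exp(-gamma * sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

alpha = [0] * len(X)
for _ in range(20):          # perceptron epochs
    for i, (x, y) in enumerate(zip(X, Y)):
        f = sum(alpha[j] * Y[j] * k(X[j], x) for j in range(len(X)))
        if y * f <= 0:       # mistake: add this example to the expansion
            alpha[i] += 1

preds = [1 if sum(alpha[j] * Y[j] * k(X[j], x) for j in range(len(X))) > 0
         else -1 for x in X]
print(preds == Y)  # → True
```

The classifier never touches feature vectors explicitly; it only evaluates the kernel, which is the structural idea the two kernel-methods lectures develop.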
10/05 - N. Schraudolph