Statistical learning theory
Year:  2016-2017

Catalog number:  4433STLT4Y 
Teacher(s): 

Language:  English 
Blackboard:  Yes 
EC:  4 
Level:  500 
Period:  Semester 1, Block II 
 Yes Elective choice
 Yes Contractonderwijs (contract education)
 Yes Exchange
 Yes Study Abroad
 No Evening course
 Yes A la Carte
 No Honours Class
Description
This course gives an overview of techniques for automated learning from ill-understood data, for which it is hard or impossible to formulate a model that is even approximately correct. Here “learning” means “finding structure, patterns, regularities” and using these patterns to predict future data. The field is closely related to the area of computer science called “machine learning”, since many of its contributions originated in computer science (pattern recognition, artificial intelligence).
Main topics in the course will be (1) supervised learning (regression and classification, with a strong focus on the latter); (2) model selection; (3) basic clustering. The methods discussed will include various classical and state-of-the-art classification methods: LDA (1930s), naive Bayes, perceptrons (1960s), neural networks and deep learning, decision trees (1980s), logistic regression, boosting and support vector machines (2000s). We explain interrelations between these methods and analyze their behaviour. As for model selection, we again consider both classical and state-of-the-art methods, including various forms of cross-validation, Ridge, Lasso and other L1 methods. As for clustering, we consider the classic K-means and EM methods.
See www.timvanerven.nl/teaching/statlearn2016/ for detailed course information.
Prerequisites
•Familiarity with least squares linear regression.
•Ability to program in R or Python
Course objectives
An introduction to Statistical Learning
Time Table
For the course days, course location and class hours, check the Time Table under the tab “Statsci Students → Program Schedule” at http://www.math.leidenuniv.nl/statscience
Mode of Instruction
Lectures and computer practicals.
Assessment method
•A written open-book exam (50%)
•Two assignments (each 25%)
A passing score is required both for the assignments and for the exam. This means at least a 5.5 average for the assignments and at least a 5.5 for the exam.
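As an illustration only, the weighting and pass rule above can be sketched in Python. The function names are hypothetical, and the sketch assumes that the final grade is the plain weighted average with no rounding, which the course description does not specify.

```python
# Weights and pass mark are taken from the assessment rules above.
# The function names are hypothetical and used for illustration only.
EXAM_WEIGHT = 0.50
ASSIGNMENT_WEIGHT = 0.25  # applies to each of the two assignments
PASS_MARK = 5.5


def final_grade(exam, assignment_1, assignment_2):
    """Weighted average of the three parts (assumption: no rounding)."""
    return EXAM_WEIGHT * exam + ASSIGNMENT_WEIGHT * (assignment_1 + assignment_2)


def passes(exam, assignment_1, assignment_2):
    """True if both the exam and the assignment average reach the pass mark."""
    assignment_average = (assignment_1 + assignment_2) / 2
    return exam >= PASS_MARK and assignment_average >= PASS_MARK
```

For example, an exam grade of 8.0 with assignment grades of 5.0 and 5.5 does not pass under this rule, because the assignment average is 5.25.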
Both homework assignments involve setting up some experiments in R or Python, experimenting, and writing a short report about the results. Discussing the problems in the group is encouraged, but every participant must do their own experiments and write a report on their own.
The dates of the exam and resit can be found in the Time Table pdf document under the tab “Masters Programme” at http://www.math.leidenuniv.nl/statscience. The room and building for the exam will be announced on the electronic billboard opposite the entrance; its content can also be viewed online.
Reading list
 T. Hastie, R. Tibshirani, J. Friedman, The Elements of Statistical Learning, 2nd edition, 2009.
Handouts of some (very few) papers
Registration
Enroll in Blackboard for the course materials and course updates.
To obtain a grade and the ECTS for the course, sign up for the (re)exam in uSis ten calendar days before the actual (re)exam takes place. Note that students are expected to participate actively in all activities of the programme and therefore to register for and use the first exam opportunity.
Exchange and Study Abroad students, please see the Prospective students website for information on how to apply.
Contact information
Tim van Erven: tim@timvanerven.nl
Remarks
 This is an elective course in the Master’s programme of the specialisation Statistical Science for the Life & Behavioural sciences.
Is part of | Programme type | Semester | Block
Computer Science with the specialization Data Science | Master | - | -
Mathematics | Master | 1 | -
Statistical Science for the Life & Behavioural Sciences: Data Science | Master | 1 | II
Statistical Science for the Life and Behavioural Sciences | Master | 1 | II