DA 546 Introduction to Statistical Learning

Course Schedule

DAYS: Tuesday, Wednesday, Thursday

TIME: 10:00 am – 10:55 am

VENUE: Microsoft Teams

Course Syllabus

The course syllabus is divided into the following five modules.

01

The scatter diagram, the correlation coefficient, potential problems with the correlation coefficient: outliers, non-linear association, association and causation.

02

Least squares fitting, computational issues in fitting the model, hypothesis testing on the coefficients, interval estimation, prediction of new observations, assessing the accuracy of the model, detection of outliers, multicollinearity, variable selection, predictions.

03

An overview, logistic regression, Bayes classifier, linear discriminant analysis, quadratic discriminant analysis, naive Bayes, k-nearest neighbours, support vector machines.

04

Testing and training data, cross-validation, leave-one-out cross-validation, k-fold cross-validation, bias-variance trade-off; bootstrapping, jackknife.

05

An overview, basics of decision trees, regression trees, classification trees.

Textbooks and References:

  • James, G., Witten, D., Hastie, T. and Tibshirani, R. An introduction to statistical learning. 2nd edition. Springer. 2021.

  • Hastie, T., Tibshirani, R. and Friedman, J. The elements of statistical learning. 2nd edition (corrected 9th printing). Springer. 2017.

  • Montgomery, D.C., Peck, E.A. and Vining, G.G. Introduction to linear regression analysis. 6th edition. Wiley. 2021.

  • Freedman, D., Pisani, R., Purves, R. and Adhikari, A. Statistics. 4th edition. Viva Books. 2011.

Broad Course Objectives

  • Learn to formulate a statistical problem in mathematical terms from a real-life situation.

  • Get a solid foundation in statistical learning.

  • Learn to select appropriate statistical learning methods and carry out statistical analysis.

  • Understand the implications and limitations of various methods.

  • Gain hands-on experience through a project.

Course Logistics

  • The course is being offered to Ph.D. students and M.Tech students from specific branches only.

  • The course is a 6-credit course (3-0-0-6).

  • All queries relating to course material should be asked in class. This is to ensure that all students can
    benefit from the discussion.

  • I am available by appointment. Please send me an email with a subject line starting with the course code: DA546.

Honor Code

  • Be honest and transparent in your exams, quizzes, and project. Any form of cheating is unacceptable and will lead to consequences.

  • All lectures are mandatory.

Evaluation Pattern

Mid-semester examination: 30%

End-semester examination: 30%

Project: 20%

Quizzes: 20% (Two or three quizzes will be conducted.)

If you solve the exercises given in lectures and formally submit your solutions by email, you can earn 5 bonus marks in each of the mid-semester and end-semester examinations, subject to the following conditions:

  • All exercises given before the mid-sem exam should be solved and submitted for 5 bonus marks in the mid-sem. 

  • All exercises given before the end-sem exam should be solved and submitted for 5 bonus marks in the end-sem. 

  • The exercises given during a lecture should be solved and submitted within two days of that particular lecture.

The project may be a group project or an individual one, depending on the final strength of the class. The format will be announced in the next few weeks.

Course Materials

  • The course materials will be uploaded to Microsoft Teams. They will mainly include lecture slides, recordings, and exercises for practice.
