

DS 5220: Supervised Machine Learning and Learning Theory


GENERAL INFORMATION

  • Instructor: Prof. Ehsan Elhamifar
  • Instructor Office Hours: Mondays, 4:30pm-5:30pm, 310E WVH
  • Class: Mondays and Wednesdays, 14:50-16:30, Behrakis Health Sciences Center 315
  • TA: Shantam Gupta (gupta.sha [at] husky.neu.edu), Office Hours: Fridays, 10-11am, 462 WVH
  • Discussions, lectures, and homeworks are posted on Piazza
    DESCRIPTION

    This course covers the practical algorithms and theory of supervised machine learning from a variety of perspectives. Topics include generative/discriminative learning, parametric/non-parametric learning, deep neural networks, support vector machines, decision trees and forests, as well as learning theory (bias/variance tradeoffs, VC theory). The course will also discuss recent applications of machine learning in areas such as computer vision, data mining, natural language processing, speech recognition and robotics.

    PREREQUISITES

    Introduction to Probability and Statistics, Linear Algebra, Algorithms.

    SYLLABUS
    1. Linear regression, Overfitting, Regularization, Sparsity

    2. Maximum likelihood estimation

    3. Bayesian learning, MAP estimation

    4. Logistic regression

    5. Naive Bayes

    6. Perceptron

    7. Convex optimization, Lagrangian function, Optimality conditions

    8. SVM and kernels

    9. Neural networks and deep learning: DNNs, CNNs

    10. Decision trees and Ensemble methods

    11. Hidden Markov Models

    GRADING

    Homeworks are due at the beginning of class on the specified dates. No late homeworks or projects will be accepted.

    • Homeworks: 4 HWs (40%)

    • Project (30%)

    • Final Exam (30%)

    Homeworks consist of both analytical questions and programming assignments. Programming assignments must be done in Python. Both the code and the results of running it on the data must be submitted.

    The exam consists of analytical questions on topics covered in class. Students are allowed to bring a single cheat sheet to the exam.

    TEXTBOOKS

    • [CB] Christopher Bishop, Pattern Recognition and Machine Learning. [Required]

    • [KM] Kevin P. Murphy, Machine Learning: A Probabilistic Perspective. [Optional]

    • [KF] Daphne Koller and Nir Friedman, Probabilistic Graphical Models. [Optional]

    READINGS

      Lecture 1: Introduction to ML, Linear Algebra Review

      Lecture 2: Introduction to Regression

      • Chapter 3 from CB book.

      Lecture 3: Linear Regression: Convexity, Closed-form Solution, Gradient Descent

      • Chapter 3 from CB book.

      Lecture 4: Robust Regression, Overfitting, Regularization

      • Chapter 3 from CB book.

      Lecture 5: Basis Function Expansion, Hyper-parameter Tuning, Cross Validation, Probability Review

      Lecture 6: Maximum Likelihood Estimation

      • Chapter 2 from CB book.

      Lecture 7: Bayesian Learning, Maximum A Posteriori (MAP) Estimation, Classification

      • Chapter 3 and 4.3 from CB book.

      Lecture 8: Logistic Regression, Parameter Learning via Maximum Likelihood, Overfitting

      • Chapter 4.3 from CB book.

      Lecture 9: Softmax Regression, Discriminative vs. Generative Modeling, Generative Classification

      • Chapter 4.2 from CB book.

      Lecture 10: Generative Classification, Naive Bayes

      • Chapter 4.2 from CB book.

      Lecture 11: Generative Classification, Naive Bayes

      • Chapter 4.2 from CB book.

      Lecture 12: Convex Optimization, Lagrangian Function, KKT Conditions

      • See lecture notes on Piazza.

      Lecture 13: Project pitch

      Lecture 14: Support Vector Machines

      • Chapter 7 from CB book.

      Lecture 15: Support Vector Machines: Vanilla SVM, Dual SVM

      • Chapter 7 from CB book.

      Lecture 16: Support Vector Machines: Soft-Margin SVM, Kernel SVM, Multi-Class SVM

      • Chapter 7 from CB book.

      Lecture 17: Neural Networks

      • Chapter 5 from CB book.

      Lecture 18: Neural Networks: Training, Forward and Back Propagation

      • Chapter 5 from CB book.

    ADDITIONAL RESOURCES

    ETHICS

    All students in the course are subject to Northeastern University's Academic Integrity Policy. Any report, homework, or project submitted by a student in this course for academic credit should be the student's own work. Collaboration is allowed only if explicitly permitted. Per CCIS policy, violations of the rules, including cheating, fabrication and plagiarism, will be reported to the Office of Student Conduct and Conflict Resolution (OSCCR). This may result in deferred suspension, suspension, or expulsion from the university.