CPSC/AMTH 445/545 - Introduction to Data Mining - Fall 2016 Yale

CPSC/AMTH 445/545

Introduction to Data Mining

Fall 2016

Instructor: Guy Wolf (guy.wolf@yale.edu)

TA: Nicholas Marshall (nicholas.marshall@yale.edu)
ULA: Yutaro Yamada (yutaro.yamada@yale.edu)

The ability to process and extract insightful information from large amounts of data has become a desirable, if not necessary, skill in almost every field of industry and science. Among other benefits, such information can provide useful knowledge, support decision-making, uncover hidden trends, and enable a deeper understanding of observed phenomena. This course will cover some of the main problems and challenges encountered in data analysis and its applications, and provide fundamental tools and techniques for solving them. We will discuss popular algorithms for data organization and visualization, such as principal component analysis (PCA) and multidimensional scaling (MDS). Students will become familiar with a variety of machine learning and data mining approaches, including both supervised ones, such as classification with support vector machines (SVM), and unsupervised ones, such as clustering with k-means.

The lectures and class discussions will be accompanied by homework exercises that combine theoretical questions, which emphasize understanding of the underlying data mining principles, with programming tasks (e.g., in MATLAB and/or Python) that demonstrate practical implementations of the techniques studied. Grades in this course will be based on these exercises, a project, and an exam.

The course assumes basic prior knowledge of probability, linear algebra, data structures, algorithms, and programming.



Tuesdays & Thursdays 1:00-2:15, GR109 (Rosenfeld Hall, 109 Grove St)


Wednesdays 7:00 PM, AKW 100

Office Hours:

Instructor: Wednesdays 6:00-7:00 PM, AKW 103
TA: Mondays 5:00-6:00 PM (or by appointment), AKW 307
ULA: Fridays 4:00-5:00 PM (or by appointment), AKW 200


No required textbook, but the following books are recommended for the course:


This is a tentative list of topics we intend to cover, which may change as we progress through the course:


Extra topics (slides not prepared specifically for this course):
NOTE: This webpage is outdated since it relates to a past iteration of this course.