Gerard de Melo
CBIM 8, Dept. of Computer Science

Office hours: By appointment (due to pandemic). Please send an email or use Sakai forums.

All emails must include "[CS439]" in the subject line to be considered.

Teaching Assistants

Abu Shoeb
Office hours: Tue 3–4 PM
WebEx Meeting# 790 739 681
See announcement for password.

Shahab Raji
Office hours: Thu 3–4 PM
WebEx Meeting# 790 739 681
See announcement for password.

Liqin Long (Grader)

Time and Location

Tuesdays and Thursdays, 18:40 – 20:00
Richard Weeks Hall (RWH) 102, Busch Campus
Online via Sakai

Section 1 (Tuesdays, 20:25 – 21:20)
Section 2 (Thursdays, 20:25 – 21:20)
WebEx Meeting# 791 508 063
See announcement for password.



Our modern world is increasingly driven by data. Data now helps determine which companies succeed, who wins elections, and even who marries whom. In this course, we will cover fundamental techniques in the emerging field of Data Science. The course is aimed at computer science students, so we will focus in particular on important computational aspects such as working with massive amounts of data ("Big Data") and learning from data ("machine learning").


01-21 Tue  Introduction
01-23 Thu  Data Collection: Gathering Data
01-28 Tue  Data Collection: Parsing Data
01-30 Thu  Data Analysis: Data Frames and Preprocessing
02-04 Tue  Data Analysis: Basic Statistics
02-06 Thu  Big Data Analysis (Hadoop)
02-11 Tue  Big Data Analysis (Spark)
02-13 Thu  Big Data Analysis (Spark, Ethical Aspects)
02-18 Tue  Textual Data
02-20 Thu  Data Visualization (Guest lecture by Professor James Abello)
02-25 Tue  Textual Data, Data Visualization
02-27 Thu  Social Networks, Link Analysis, Graph Data Mining
03-03 Tue  In-Class Mid-Term Exam
03-05 Thu  Finding Groups (Clustering)
03-10 Tue  Learning from Data: Feature Extraction and Dimensionality Reduction
03-12 Thu  Spring Recess (extended)
03-17 Tue  Spring Recess
03-19 Thu  Spring Recess
03-24 Tue  Learning from Data: Predicting Numbers (Regression)
03-26 Thu  Learning from Data: Predicting Numbers (Regression)
03-31 Tue  Learning from Data: Simple Classification Algorithms
04-02 Thu  Learning from Data: Essential Practices, Modern Classification Algorithms
04-07 Tue  Learning from Data: Modern Classification Algorithms
04-09 Thu  Learning from Data: Evaluating Machine Learning Models
04-14 Tue  Learning from Data: Multi-Class Classification
04-16 Thu  Learning from Data: Fairness and Bias, Interpretability and Explainability
04-21 Tue  Learning from Data: Deep Learning
04-23 Thu  Data Mining Algorithms
04-28 Tue  Recommending Items
04-30 Thu  Project Presentations
05-07 Thu  Final Exam (8:00 PM, officially until 11:00 PM)

See also: Rutgers Academic Calendar.

Slides, Discussion Forum

Sakai (coming soon) will be used to host slides, as well as to provide a forum for discussions.

Grading and Course Project

The grades will be determined as follows:

We will occasionally sample attendance in recitations. Students who attend and participate regularly can receive a grade bonus of up to 10%.

Graded homework assignments will be announced on Sakai. Make sure to enable email notifications.
A set of slides with details about the course project requirements will be on Sakai as well.



Since we are focusing on the latest developments, this course does not strictly follow any designated coursebook. Rather, specific references for further reading will be posted at the end of the slides for each unit (typically the last slide). Still, the following (optional) books may be useful.


For problems or questions about this site, please contact Gerard de Melo. Rutgers is an equal access/equal opportunity institution. Individuals with disabilities are encouraged to submit suggestions, comments, or complaints concerning any accessibility issues with Rutgers web sites via the Report Accessibility Barrier / Provide Feedback form.