Instructor

Gerard de Melo
CBIM 8, Dept. of Computer Science
Office hours: Tuesdays, 6-7 PM
(Note: Office hours on Feb. 13 have moved to Wednesday, Feb. 14, 4-5 PM, due to a faculty interview.)

All emails must have "[CS439]" in the subject line to be considered.

Teaching Assistants

Hanxiong Chen
hc691@scarletmail.rutgers.[...]
Office hours:
Fridays, 10:30-11:30 AM in Hill 206

Shahab Raji
sr1101@rutgers.[...]
Office hours:
Tuesdays, 2:30-3:30 PM in Hill 206

Time and Location

Tuesdays and Fridays, 12:00 – 13:20
Tillett Hall 242, Livingston Campus

Recitations:
(Section 1) Tuesdays, 8:55 - 9:50 AM
Beck Hall 252, Livingston Campus
(Section 2) Fridays, 8:55 - 9:50 AM
Beck Hall 250, Livingston Campus

Overview

Our modern world is increasingly driven by data. Data now helps determine which companies succeed, who wins elections, and even who marries whom. In this course, we will cover fundamental techniques in the emerging field of Data Science. The course is aimed at computer science students, so we will focus in particular on important computational aspects such as working with massive amounts of data ("Big Data") and learning from data ("machine learning").

Schedule/Syllabus (subject to change)

Date Topics
01-16 Tue Introduction
01-19 Fri Data Representation, Preprocessing
01-23 Tue Data Preprocessing ("Data Wrangling")
01-26 Fri Data Wrangling and Data Management
01-30 Tue Exploratory Data Analysis
02-02 Fri Data Visualization (Guest Lecture by Professor James Abello)
02-06 Tue No class
02-09 Fri Exploratory Data Analysis / Big Data (Hadoop)
02-13 Tue Big Data (Hadoop/Spark)
02-16 Fri Big Data (Spark)
02-20 Tue Data Streams and Big Data Algorithms
02-23 Fri Learning from Data: Basics, Evaluation
02-27 Tue Learning from Data: Algorithms
03-02 Fri Learning from Data: Algorithms
03-06 Tue Learning from Data: Algorithms
03-09 Fri Learning from Data: Learning Representations
03-13 Tue Spring Recess
03-16 Fri Spring Recess
03-20 Tue In-Class Mid-Term Exam
03-23 Fri Data Mining Algorithms
03-27 Tue Data Mining Algorithms
03-30 Fri Social Networks, Link Analysis, Graph Data Mining
04-03 Tue Clustering
04-06 Fri Clustering
04-10 Tue Dimensionality Reduction
04-13 Fri Practical Issues: Data Integration
04-17 Tue Practical Issues: Ethics and Data Science
04-20 Fri Outlook
04-24 Tue Project Presentations
04-27 Fri Project Presentations

See also: Rutgers Academic Calendar.

Slides, Discussion Forum

Sakai hosts the slides and provides a forum for discussions.

Grading and Course Project

The grades will be determined as follows:

Graded homework assignments will be announced on Sakai. Make sure to enable e-mail notifications.
A set of slides with details about the course project requirements will be on Sakai as well.

Policies:

References

Since we are focusing on the latest developments, this course does not strictly follow any designated textbook. Rather, specific references for further reading will be posted at the end of the slides for each unit (typically on the last slide). Still, the following (optional) books may be useful.