Using CAREER funds, we've developed three new courses that teach distributed systems and data management concepts to undergraduate and graduate students in the CS department, as well as to MS students in the brand-new Data Sciences MS program of Columbia's Data Science Institute (DSI).
Distributed Systems Fundamentals:
The first undergraduate course at Columbia to teach concepts of large-scale distributed system design. Topics include distributed
communication protocols, remote procedure calls, consistency models, fault tolerance, the consensus problem, Paxos, security, and
several case studies of real distributed systems design, including Google's Bigtable, Chubby, and Spanner, and Amazon's Dynamo.
A distinctive aspect of this course is that its classes alternate between concepts and the real-life application of those concepts in
large-scale systems.
Website: http://roxanageambasu.github.io/ds-class/ (teaching materials available).
Advanced Distributed Systems:
Graduate-level research seminar that provides an overview of the influential research underlying most of today's large-scale
cloud infrastructures. Students read and discuss papers on important distributed systems topics, including distributed consensus,
consistency models and algorithms, service-oriented architectures, large-scale data storage, distributed transactions, and big-data
processing frameworks. CAREER-inspired topics related to privacy and responsible data management are discussed in class and
explored in semester-long team projects.
Website: http://roxanageambasu.github.io/ds2-class/ (teaching materials will be available).
Computer Systems for Big Data:
Graduate-level systems course designed specifically for, and included from the start in the curriculum of, the new Master's program of
Columbia's Data Science Institute (DSI). Topics include hardware building blocks, fundamental computing tradeoffs, various distributed and
parallel processing models (batch, iterative, streaming), popular frameworks that exemplify each, plus their internal designs and tradeoffs.
It is being co-taught by Geambasu, Sethumadhavan, and Erran Li from the CS department.
Website: http://columbia.github.io/systems-bigdata-class/ (teaching materials will be available).