Topic Description
Fitting a model to a collection of observations is one of the quintessential questions in machine learning.
The standard assumption is that the data was generated by a model of a given type (e.g., a mixture model).
This simplifying assumption is at best only approximately valid, as
real datasets are typically exposed to some source of contamination.
Hence, any estimator designed for a particular model that is to be used in practice
must also be
robust in the presence of model misspecification. This is the prototypical goal in
robust statistics, a field that took shape in the 1960s
with the pioneering works of Tukey and Huber.
Until recently, even for the basic problem
of robustly estimating the mean of a high-dimensional dataset,
all known robust estimators were hard to compute.
Moreover, the quality of common heuristics (e.g., RANSAC) degrades badly as the dimension increases.
This state of affairs prompted the following natural question:
Can we reconcile robustness and computational efficiency in high-dimensional estimation?
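To make the failure mode concrete, here is a minimal numerical sketch (not part of the workshop materials; the contamination rate and outlier shift are illustrative assumptions): in d dimensions, an ε-fraction of planted outliers drags the empirical mean far from the truth, while even a simple coordinate-wise median stays much closer, though its error still grows with the dimension, which is the gap the efficient algorithms discussed here address.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, eps = 10_000, 100, 0.05          # sample size, dimension, contamination rate

X = rng.standard_normal((n, d))        # inliers drawn from N(0, I_d); true mean is the origin
X[: int(eps * n)] += 50.0              # adversary shifts a 5% fraction of the points

err_mean = np.linalg.norm(X.mean(axis=0))          # roughly eps * 50 * sqrt(d) = 25
err_median = np.linalg.norm(np.median(X, axis=0))  # roughly O(eps * sqrt(d)), under 1 here

print(f"empirical mean error:         {err_mean:.2f}")
print(f"coordinate-wise median error: {err_median:.2f}")
```

A single adversarial direction suffices to move the empirical mean by an amount growing with the outlier magnitude, while the coordinate-wise median's error is bounded independently of that magnitude.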
A recent line of work in computer science has sought to circumvent this apparent computational
barrier for a range of natural generative models. Specifically, two concurrent works gave the first provably robust and
efficiently computable estimators for several fundamental questions that were previously thought to be computationally intractable.
These include robust estimation of mean and covariance in high dimensions, and robust learning of various latent variable models.
Since the dissemination of these works, there has been a flurry of research activity on algorithmic robust estimation
in a variety of high-dimensional settings. Moreover, the developed ideas have been useful in
several practical applications, including finding patterns in genomic datasets
and improving the security of machine learning systems.
Goals and Target Audience
The notion of robustness lies at the core of machine learning.
The first objective of the workshop will be to introduce the local machine learning community
to the new insights and techniques in the exciting area of algorithmic robust statistics.
During the last couple of years, several algorithmic techniques
for robust high-dimensional estimation have been developed by various groups
of researchers. There are intriguing connections between these techniques,
and the relations among them can best be drawn out when they are presented together.
By bringing a number of experts together, we expect to foster collaboration
and open exchange of ideas that will significantly push the boundary of the field.
This includes both circumventing technical obstacles and discussing new directions
and opportunities for future work.
Specifically, we will examine the following concrete questions:
What are the limits of current algorithmic techniques, and
can we develop new techniques that yield efficient robust estimators
for richer families of problems and models? Is there a formal connection between adversarial
robustness and other notions of algorithmic stability, such as differential privacy and adaptive data analysis?
Can we leverage insights from the theory of robustness to improve the security
of machine learning systems against adversarial attacks?
List of Participants
Pranjal Awasthi (Rutgers)
Sivaraman Balakrishnan (CMU)
Nina Balcan (CMU)
Moses Charikar (Stanford)
Yu Cheng (Duke)
Ilias Diakonikolas (USC)
Chao Gao (Chicago)
Sam Hopkins (Cornell)
Prateek Jain (MSR)
Gautam Kamath (MIT)
Daniel Kane (UCSD)
Ravi Kannan (MSR)
Weihao Kong (Stanford)
Pravesh Kothari (Princeton)
Po-Ling Loh (UW Madison)
Stas Minsker (USC)
Ankur Moitra (MIT)
Anup Rao (Adobe Research)
Sankeerth Rao (UCSD)
Mahdi Soltanolkotabi (USC)
Gregory Valiant (Stanford)
Santosh Vempala (GaTech)
David Woodruff (CMU)
Tentative Schedule
Monday
9:30–9:40    Opening Remarks
9:40–11:40   Ilias Diakonikolas: Algorithmic High-Dimensional Robust Statistics [pdf]
12:00–12:30  Daniel Kane: Good Sets and Sample Complexity [pdf]
12:30–2:00   Lunch offered at TTI
2:00–4:00    Pravesh Kothari: SoS Method for Robust Mean Estimation [pdf]
4:30–5:30    David Woodruff: Sketching for M-Estimators and Robust Numerical Linear Algebra [pdf]
Tuesday
9:30–10:00   Daniel Kane: Robust Covariance Estimation [pdf]
10:15–12:00  Sam Hopkins: SoS Method for Robust Estimation of Higher Moments [pdf] [board photo 1] [board photo 2]
12:00–2:00   Lunch (on your own)
2:00–2:45    Daniel Kane: Computational Lower Bounds for Robust Estimation [pdf] or [pdf]
3:00–4:00    Daniel Kane: List-Decodable Learning and Learning Mixtures of Distributions [pdf]
4:30–5:30    Open Problems Session
Wednesday
9:30–10:15   Chao Gao: Robust Estimation: Optimal Rates, Adaptation and Computation [pdf]
10:45–11:30  Po-Ling Loh: Robust estimation via (non)convex M-estimation [pdf]
12:00–2:00   Lunch (on your own)
2:00–2:45    Stas Minsker: Statistical estimation with heavy-tailed data [pdf]
2:45–5:00    Working groups / Collaboration
Thursday
9:30–10:30   Pranjal Awasthi: Malicious PAC Learning of Halfspaces [pdf]
10:45–11:15  Ilias Diakonikolas: Robust PAC Learning of Geometric Concepts [pdf]
11:30–12:15  Prateek Jain: Hard Thresholding Method for Robust Regression
12:30–1:30   Nina Balcan (TTIC Seminar): Foundations of Data-Driven Algorithm Design [pdf]
2:00–5:00    Working groups / Social event
Friday
9:30–10:20   Gautam Kamath: Beyond Theory: Realizing Robustness [pdf]
10:20–2:00   Working groups and Lunch
2:00–2:30    Weihao Kong: Algorithms and Lower Bounds for Robust Linear Regression [pdf]
2:45–3:15    Yu Cheng: Faster Algorithms for Robust Estimation [pdf]
3:15–5:30    Working groups / Open Problems