Faculty Candidate Seminar

What we can solve with a sublinear number of samples

Dr. Sofya Raskhodnikova

Dr. Raskhodnikova is a Post-doctoral Fellow at the Weizmann Institute of Science in Rehobot, Israel; she received her PhD from MIT in 2003
Suppose we are given a list of numbers and we wish to determine
whether it is sorted. That problem obviously requires reading the
entire list. But Ergun, Kannan, Kumar, Rubinfeld and Viswanathan
showed in 1998 that if we know in advance that our list is either
sorted or far from sorted, we can perform the test by examining only a
small portion of the list. Testing if a list is sorted is thus an
example of a task that can be performed with a small number of samples
into the input.

As data of all types gets easier to obtain and cheaper to store,
datasets are becoming increasingly large, and we are faced with a
need to perform computational tasks on massive datasets. Can we
still compute something useful about a dataset when reading all of
it is prohibitively expensive? In other words, what can we solve
with a sublinear number of samples from the input? This is the main
question addressed by the newly emerging area, called Sublinear

In this talk, we will give a few examples of specific problems that
can be solved with a sublinear number of samples, starting with the
sorting example and moving on to simple properties of images, edit
distance between strings, and approximating compressibility of a
string. We will also explain a framework called "property testing"
that was defined by Rubinfeld and Sudan in 1996 to formalize the kind
of approximation that sublinear algorithms are good at. We will
present a partial classification of what can and cannot be solved in
that model when an algorithm can query the input only in a sublinear
number of places.

Sponsored by

CSE Division