Dissertation Defense

Sublinear Time Algorithms for the Sparse Recovery Problem

Yi Li

Tracking heavy hitters in high-volume, high-speed data streams, monitoring changes in data streams, designing pooling schemes for biological tests (e.g., high throughput sequencing, testing for genetic markers), localizing sources in sensor networks, and combinatorial pattern matching are all quite different technological challenges, yet they can all be expressed in the same mathematical formulation, called the sparse recovery problem. This problem has further application to telecommunications and medical imaging processing. In the sparse recovery problem, we have a signal x of length N that is sparse or highly compressible; i.e., it consists of k significant entries ('heavy hitters') while the rest of the entries are essentially negligible. We wish to acquire a small amount of information (approximately commensurate with the sparsity) about this signal in a linear, non-adaptive fashion, and then use that information to recover the significant entries quickly. In a data stream setting, our signal is the distribution of items seen, while in biological group testing, the signal is proportional to the binding affinity of each drug compound (or the expression level of a gene in a particular organism). We want to recover the positions and values of only the heavy hitters, as the rest of the signal is not of interest. In this talk I shall discuss two variants of the problem (ell_2/ell_2 and ell_1/ell_1) and extend the techniques to the off-grid Fourier case.

Sponsored by

Martin Strauss