Theory Seminar

Deterministic Clustering in High Dimensional Spaces: Sketches and Approximation

Chris SchwiegelshohnAarhus university
4941 Beyster BuildingMap
Moreso than other areas, algorithms used for large scale data analysis in high dimensional spaces rely on randomization. From the celebrated Johnson-Lindenstrauss lemma, to streaming and sketching algorithms, to even popular heuristics like k-means++, randomization seems to have a strong competitive edge. A natural question to ask whether this is necessary?

In this talk we make present results that mark the first step in this direction. For k-clustering problems such as k-means and k-median, we provide deterministic algorithms for dimension reduction, coresets, and polynomial time approximation schemes. All algorithms are fixed parameter tractable in the number of clusters k and the desired precision parameter epsilon. Based on joint work with Vincent Cohen-Addad and David Saulpic.


Greg Bodwin


Euiwoong Lee