Dissertation Defense

Estimation of Information Measures and Applications in Machine Learning

Morteza Noshad
GM Conference Room, Lurie Engineering Center (4th floor)
Information-theoretic measures such as Shannon entropy, mutual information, and the Kullback-Leibler (KL) divergence have a broad range of applications in information and coding theory, statistics, machine learning, and neuroscience. KL divergence is a measure of the difference between two distributions, while mutual information captures the dependence between two random variables. Furthermore, the binary Bayes classification error rate specifies the best achievable classifier performance and is directly related to an information divergence measure.
In most practical applications, the underlying probability distributions are not known, and the information measures must be estimated empirically from data. In this thesis, we propose scalable and time-efficient estimators of information measures that achieve the parametric mean square error (MSE) rate of O(1/N). Our approaches are based on several methods, including k-Nearest Neighbor (k-NN) graphs, Locality Sensitive Hashing (LSH), and Dependence Graphs. The core idea in all of these estimation methods is a unique plug-in estimator of the density ratio of the samples. We prove that the average of an appropriate function of the density-ratio estimates over all of the points converges to the divergence or mutual information measure. We apply our methods to several machine learning problems, such as structure learning, feature selection, and the information bottleneck (IB) in deep neural networks.
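To make the plug-in idea concrete, here is a minimal sketch of a classical k-NN density-ratio estimator of KL divergence (in the Wang-Kulkarni-Verdú style), which averages the log density ratio, approximated by k-NN distances, over the sample points. This is an illustrative assumption for exposition, not the thesis's exact estimator; the function name `knn_kl_divergence` and parameter choices are hypothetical.

```python
import numpy as np

def knn_kl_divergence(x, y, k=5):
    """Plug-in k-NN estimate of KL(p || q) from samples x ~ p and y ~ q.

    Hypothetical sketch of a classical k-NN density-ratio estimator;
    not the thesis's specific construction.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.ndim == 1:
        x = x[:, None]
    if y.ndim == 1:
        y = y[:, None]
    n, d = x.shape
    m = y.shape[0]
    # rho[i]: distance from x_i to its k-th nearest neighbor within x
    # (index k, since index 0 is the zero distance to x_i itself)
    rho = np.sort(np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1), axis=1)[:, k]
    # nu[i]: distance from x_i to its k-th nearest neighbor within y
    nu = np.sort(np.linalg.norm(x[:, None, :] - y[None, :, :], axis=-1), axis=1)[:, k - 1]
    # average log density ratio plus a bias-correcting constant
    return d * np.mean(np.log(nu / rho)) + np.log(m / (n - 1))
```

For identical distributions the estimate concentrates near zero, and it grows as the two sample distributions separate; the brute-force distance matrix here is only for clarity, since scalable variants would use approximate neighbor search such as LSH.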

Sponsored by

Alfred O. Hero III


Sonya Siddique