Mining Multiple Perspectives From Social Media and Scientific Literature
This thesis studies how Natural Language Processing techniques can be used to mine
perspectives from textual data. The first part of the thesis focuses on analyzing the text posted
by people who participate in discussions on social media sites. We focus in particular on threaded
discussions of ideological and political topics. The goal is to identify the different viewpoints
that the discussants hold with respect to the discussion topic. We use subjectivity and sentiment
analysis techniques to identify the attitudes that the participants hold toward one another and
toward the different aspects of the discussion topic. This involves identifying opinion expressions
and their polarities, and identifying the targets of opinion. We use this information to represent
each discussion in one of two forms: discussant attitude vectors or signed attitude networks.
We use data mining and network analysis techniques to analyze these representations to detect rifts
in discussion groups and study how the discussants split into subgroups with contrasting opinions.
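
As a rough illustration of the signed attitude network idea, the sketch below builds a network from (source, target, polarity) triples and splits the discussants into two camps by propagating signs along edges. It is only a toy under stated assumptions: the triple format, the function names, and the sign-propagation heuristic are hypothetical and are not taken from the thesis, which may use different mining and network analysis techniques. Opinions aimed at aspects of the topic, rather than at other discussants, are left out.

```python
from collections import defaultdict, deque


def build_signed_network(opinions):
    """Sum polarities (+1 or -1) per unordered pair of discussants.

    `opinions` is an iterable of (source, target, polarity) triples;
    this input format is an assumption made for the illustration.
    """
    weights = defaultdict(int)
    for source, target, polarity in opinions:
        if source != target:
            weights[frozenset((source, target))] += polarity
    return weights


def split_into_subgroups(weights):
    """Assign each discussant to side 0 or 1 by breadth-first sign propagation.

    A positive edge keeps the neighbour on the same side and a negative
    edge flips it. The first assignment wins, so conflicting evidence in
    an unbalanced network is simply ignored by this heuristic.
    """
    adjacency = defaultdict(list)
    for pair, weight in weights.items():
        if weight == 0:
            continue  # mixed evidence cancels out, so no edge is added
        a, b = tuple(pair)
        sign = 1 if weight > 0 else -1
        adjacency[a].append((b, sign))
        adjacency[b].append((a, sign))

    side = {}
    for start in sorted(adjacency):
        if start in side:
            continue
        side[start] = 0
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for neighbour, sign in adjacency[node]:
                if neighbour not in side:
                    side[neighbour] = side[node] if sign > 0 else 1 - side[node]
                    queue.append(neighbour)
    return side


# Hypothetical attitudes extracted from a four-person thread.
example_opinions = [
    ("ann", "bob", +1),
    ("bob", "ann", +1),
    ("ann", "cat", -1),
    ("cat", "dan", +1),
    ("bob", "dan", -1),
]
sides = split_into_subgroups(build_signed_network(example_opinions))
```

In this example, ann and bob land on one side and cat and dan on the other, because the only negative edges run between the two pairs.
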
In the second part of the thesis, we focus on mining perspectives from scientific literature. We
analyze the text adjacent to reference anchors in scientific articles as a means to identify researchers'
viewpoints toward previously published work. We propose methods for identifying, extracting, and
cleaning citation text. We analyze this text to identify the purpose (author's intention) and polarity
(author's sentiment) of each citation. Finally, we present several applications that can benefit from this
analysis, such as generating multi-perspective summaries of scientific articles and predicting future
prominence of publications.
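
The identify, extract, and clean steps for citation text can be sketched in a few lines. This is a simplified sketch and not the thesis's method: it assumes anchors look like `[12]` or `(Smith et al., 2010)`, treats the single sentence containing an anchor as its context, and uses a naive sentence splitter. The pattern and function names are hypothetical, and purpose and polarity classification are not covered.

```python
import re

# Assumed anchor styles: numeric "[3]" or "[3, 7]", and "(Name et al., 2010)".
ANCHOR = re.compile(
    r"\[\d+(?:\s*,\s*\d+)*\]"
    r"|\([A-Z][A-Za-z-]+(?: et al\.)?,? \d{4}\)"
)


def split_sentences(text):
    """Naive splitter: break after ., ! or ? when an uppercase letter follows."""
    return re.split(r"(?<=[.!?])\s+(?=[A-Z])", text.strip())


def citation_contexts(text):
    """Return (anchors, cleaned_sentence) for each sentence that cites something."""
    contexts = []
    for sentence in split_sentences(text):
        anchors = ANCHOR.findall(sentence)
        if not anchors:
            continue
        # Cleaning: drop the anchors, collapse whitespace, and remove any
        # space left in front of punctuation.
        cleaned = " ".join(ANCHOR.sub("", sentence).split())
        cleaned = re.sub(r"\s+([.,;:])", r"\1", cleaned)
        contexts.append((anchors, cleaned))
    return contexts


# Hypothetical input text with two citing sentences.
sample = (
    "Parsing is hard. This parser works well on news text [12]. "
    "We build on it (Smith et al., 2010) for tweets."
)
contexts = citation_contexts(sample)
```

Here `contexts` holds two entries, one per citing sentence, each pairing the anchors found with the sentence text after the anchors are stripped. Anchors that act as a noun in the sentence (for example "the method of [12]") would need replacing rather than deleting, which this sketch does not attempt.
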