Machine Reading of Natural Language and Interactive Visualization
In natural language processing, the summarization of information
in a large amount of text has typically been viewed as a type of
natural language generation problem, e.g., "produce a 250-word summary
of some documents based on some input query". An alternative view,
which will be the focus of this talk, is to use natural language
parsing to extract facts from a collection of documents and then
use information visualization to provide an interactive summarization
of these facts.
The first step is to extract detailed facts about events from natural
language text using a predicate-centered view of events (who did
what to whom, when, and how). We exploit semantic roles to
create a predicate-centric ontology for entities, which is then used to
build a knowledge base of facts about entities and their relationships
with other entities.
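
As a rough illustration of the predicate-centered view, the sketch below stores each event as a predicate with role fillers and indexes it by entity. It is not the system described in the talk: the names `Fact`, `KnowledgeBase`, `facts_about` and `related_entities` are hypothetical, the role set (agent, patient, time, manner) is an assumed simplification of semantic roles, and the two facts are entered by hand rather than extracted by a parser.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Fact:
    """One event: who did what to whom, when and how (hypothetical schema)."""
    predicate: str                   # what was done
    agent: Optional[str] = None      # who did it
    patient: Optional[str] = None    # to whom or what
    time: Optional[str] = None       # when
    manner: Optional[str] = None     # how


class KnowledgeBase:
    """Indexes facts by predicate and by each entity that fills a role."""

    def __init__(self):
        self._by_predicate = defaultdict(list)
        self._by_entity = defaultdict(list)

    def add(self, fact):
        self._by_predicate[fact.predicate].append(fact)
        # A set avoids indexing the fact twice when agent == patient.
        for entity in {fact.agent, fact.patient} - {None}:
            self._by_entity[entity].append(fact)

    def facts_about(self, entity):
        return list(self._by_entity.get(entity, []))

    def related_entities(self, entity):
        """Entities that share at least one fact with `entity`."""
        related = set()
        for fact in self._by_entity.get(entity, []):
            for other in (fact.agent, fact.patient):
                if other is not None and other != entity:
                    related.add(other)
        return related


# Hand-written facts standing in for parser output (illustrative only).
kb = KnowledgeBase()
kb.add(Fact("invade", agent="Napoleon", patient="Russia", time="1812"))
kb.add(Fact("crown", agent="Pope Leo III", patient="Charlemagne", time="800"))

print(kb.related_entities("Napoleon"))
```

Indexing every fact under each entity it mentions is one simple way to answer both "what do we know about X" and "which entities is X related to" from the same store.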
The next step is to use information visualization to provide a
summarization of the facts in this automatically extracted knowledge
base. The user can interact with the visualization to explore summaries
at different levels of granularity, which makes even extremely uncommon
facts easy to discover.
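
One way to picture summaries at different granularities is to count the same facts under coarser or finer buckets. The sketch below assumes that time is the dimension being zoomed, which the abstract does not state; `summarize`, `rare_buckets` and `BUCKET_SIZE` are hypothetical names, and the event list is a hand-written toy sample, not output of the system.

```python
from collections import Counter

# Toy (year, predicate) pairs standing in for extracted facts.
events = [
    (1812, "invade"),
    (1812, "retreat"),
    (1815, "defeat"),
    (1066, "invade"),
    (800, "crown"),
]

# Width of each time bucket, in years, per granularity level.
BUCKET_SIZE = {"year": 1, "decade": 10, "century": 100}


def summarize(events, granularity):
    """Count events per time bucket at the requested granularity."""
    size = BUCKET_SIZE[granularity]
    return Counter((year // size) * size for year, _ in events)


def rare_buckets(events, granularity, max_count=1):
    """Buckets holding at most `max_count` events, i.e. the uncommon ones."""
    counts = summarize(events, granularity)
    return sorted(bucket for bucket, n in counts.items() if n <= max_count)


print(summarize(events, "century"))
print(rare_buckets(events, "century"))
```

Moving from "century" to "year" in this sketch plays the role of zooming in: the same facts are regrouped, and sparsely populated buckets surface the uncommon ones.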
We have used this methodology to build an interactive visualization
of events in human history by machine reading Wikipedia articles.
I will demo the visualization and describe the results of a user
study that evaluates this interactive visualization for a summarization
task.
Anoop Sarkar is a Professor at Simon Fraser University in British
Columbia, Canada where he co-directs the Natural Language Laboratory
(http://natlang.cs.sfu.ca). His research is focused on machine
learning approaches to multilingual natural language processing.