AI Seminar
Towards Open-domain Generation of Programs from Natural Language
Code generation from natural language is the task of generating
programs written in a programming language (e.g. Python) given a
command in natural language (e.g. English). For example, if the input
is "sort list x in reverse order", then the system would be required
to output "x.sort(reverse=True)" in Python. In this talk, I will
discuss (1) machine learning models to perform this code generation,
(2) methods for mining data from programming websites such as Stack
Overflow, and (3) methods for semi-supervised learning that allow the
model to learn from either English or Python on its own, without the
corresponding parallel data.
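To make the input/output format of the task concrete, the example pair from the abstract can be written out and checked in a few lines of Python. This is only an illustrative sketch: the variable names (intent, snippet, env) and the sample list are assumptions made here, and the snippet is hard-coded rather than produced by any model described in the talk.

```python
# Illustrative only: the (intent, snippet) pair is the example from the abstract;
# the names and the sample list below are hypothetical.
intent = "sort list x in reverse order"   # natural-language command (input)
snippet = "x.sort(reverse=True)"          # program the system should output

# Run the snippet in its own namespace against a sample list to confirm
# that it does what the intent asks.
env = {"x": [3, 1, 2]}
exec(snippet, env)
result = env["x"]
print(result)  # [3, 2, 1]
```

A learned system would have to produce the snippet from the intent alone; here the pair is simply given.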
Graham Neubig is an assistant professor at the Language Technologies
Institute of Carnegie Mellon University. His work focuses on natural
language processing, specifically multilingual models that work in
many different languages, and natural language interfaces that allow
humans to communicate with computers in their own language. Much of
this work relies on machine learning to create these systems from
data, and he is also active in developing methods and algorithms for
machine learning over natural language data. He publishes regularly in
the top venues in natural language processing, machine learning, and
speech, and his work has occasionally won awards, including best paper
awards at EMNLP, EACL, and WNMT. He is also active in developing open-source
software, and is the main developer of the DyNet neural network
toolkit.