CSE researchers win Area Chair Award at ACL 2024
CSE PhD student Inderjeet Nair and Professor Lu Wang have been selected to receive the Area Chair Award at the 2024 Annual Meeting of the Association for Computational Linguistics (ACL) for their paper titled “MIDGARD: Self-Consistency Using Minimum Description Length for Structured Commonsense Reasoning.” Their research explores ways to improve structured reasoning in large language models (LLMs) through the generation of more accurate and consistent reasoning graphs.
The ACL Area Chair Award is given to one paper per track at the ACL conference, with winners selected by the Senior Area Chair for their relevance and innovation. Out of more than 4,000 submissions and nearly 2,000 accepted papers appearing at the conference this year, just 21 were selected for the award.
In their paper, Nair and Wang propose a novel method, called MIDGARD, that improves the performance of LLMs on structured reasoning tasks through the use of a minimum description length-based approach to reduce errors.
Whereas LLMs consistently perform well on traditional commonsense reasoning tasks, such as reading comprehension or question answering, they struggle with structured commonsense tasks, which require generating structured output, such as a graph, from natural language. Two persistent challenges for LLMs on such tasks are discrepancies in graph style and error propagation.
Nair and Wang’s solution, MIDGARD, addresses both of these issues by using a minimum description length-based approach to aggregate multiple graph samples generated by an LLM. Using this method, MIDGARD identifies properties that are consistent across samples and retains them, while discarding those that are likely erroneous. This not only mitigates the impact of error propagation but also ensures the inclusion of nodes and edges that may have been omitted due to the limitations of single-pass decoding.
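To give a rough sense of how aggregating graph samples can work, the toy Python sketch below represents each sampled graph as a set of edges and keeps the edges that appear in more than half of the samples. Under the simplified cost used here (the number of edge edits needed to turn the aggregate into each sample), that majority rule yields the lowest total cost. This cost function, the function names, and the example edges are all assumptions made for illustration; the sketch is not the formulation used in the MIDGARD paper.

```python
from collections import Counter
from typing import List, Set, Tuple

Edge = Tuple[str, str]  # a directed edge (source node, target node)


def description_length(hypothesis: Set[Edge], samples: List[Set[Edge]]) -> int:
    """Toy cost (an assumption, not the paper's): total number of edge
    additions and deletions needed to turn the hypothesis into each sample."""
    return sum(len(hypothesis ^ sample) for sample in samples)


def aggregate(samples: List[Set[Edge]]) -> Set[Edge]:
    """Keep each edge that appears in more than half of the samples.

    Keeping an edge seen in c of n samples costs n - c edits; dropping it
    costs c edits. Keeping it is cheaper exactly when 2 * c > n.
    """
    counts = Counter(edge for sample in samples for edge in sample)
    n = len(samples)
    return {edge for edge, c in counts.items() if 2 * c > n}


# Three hypothetical samples of the same reasoning graph.
samples = [
    {("a", "b"), ("b", "c")},                          # omits ("c", "d")
    {("a", "b"), ("b", "c"), ("c", "d"), ("d", "y")},  # adds a spurious edge
    {("a", "b"), ("c", "d"), ("a", "x")},              # omits one, adds one
]

consensus = aggregate(samples)
print(sorted(consensus))
print(description_length(consensus, samples))
```

In this example the aggregate differs from every individual sample: it drops the edges that appear only once and recovers an edge that the first sample omitted.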
The ability to generate accurate and reliable reasoning graphs is a significant step forward for LLM development, making the models more trustworthy and useful in a variety of applications.