The ability to understand natural language has been a long-standing dream of the AI community. In the past decade, using representative tasks such as Natural Language Inference (NLI) and large publicly available datasets, the community has made impressive progress towards that goal using machine learning (especially deep learning) techniques. Recently, various researchers have shown that state-of-the-art (SOTA) data-driven deep learning models are brittle, often generalize poorly, and rely on hidden patterns in the data rather than actually reasoning to derive their conclusions. While many analyses have traced the root cause of such behavior to shortcomings in public datasets, recent work (TaxiNLI and CheckList) has also shown that the models generalize poorly even in the presence of adequate amounts of data. A recent ACL 2020 theme paper (Bender and Koller 2020) traces these shortcomings back to the core idea behind the data-driven (supervised and unsupervised) training paradigm. By proposing an Octopus test, the authors suggest that a model of natural language “trained purely on form will never learn meaning”.

This lack of grounding in the real world and of interpretability raises the question of whether the vast literature on Symbolic Logic may come to the rescue. Advances in statistical learning based on concepts from Logic have been (re)coined under the umbrella of Neuro-symbolic methods. The recent famed AI Debates have seen participation from top researchers in Deep Learning, Symbolic Logic, Psychology, and Cognitive Science. Together, these researchers have cast doubt on whether purely data-driven methods based on neural architectures would be sufficient to achieve understanding, and on whether current popular benchmarks reflect relevant aspects of understanding. In parallel, neuro-symbolic methods have proliferated, ranging from rule-based auxiliary objective formulation and simple pipelined execution of sequential data-driven learning followed by probabilistic logical methods, to neural architectures defined by declarative logic and more end-to-end learnable systems. But a closer look at current Neuro-symbolic systems (NLProlog, DeepProbLog) tells us that extending such methods to a practical in-the-wild task such as NLI is still non-trivial.

Somewhat independently, the ML community has also seen rapid improvements in new architectures for multi-hop reasoning in Automated Theorem Proving, and in efforts to integrate ML procedures with programming languages (PL+ML). Hence, it is legitimate to revisit the potential and practicality of Neuro-Symbolic methods in natural language reasoning.

Goal of the Workshop

The broader goal of NLU requires generalization across various reasoning aspects in the presence of linguistic variability, and this is hard. Instead, we want to take specific reasoning dimensions, informed by the Logic, Knowledge Representation and Reasoning, and Linguistics literature, and track the progress of both Neural and Neuro-Symbolic efforts along them. Specifically, we are interested in the following questions: 1) are there linguistic and logical combinations where neural methods fail to generalize even with large amounts of data; 2) can neuro-symbolic methods (or PL+ML) generalize better for certain reasoning dimensions; and 3) can we build more informed benchmarks (datasets and metrics) that track success on each such reasoning capability more explicitly?

Organizing Committee



Somak Aditya
Microsoft Research India

Maria D. Chang
IBM Research Almaden

Swarat Chaudhuri
UT Austin

Monojit Choudhury
Microsoft Research India

List of Comparable Workshops