Introduction
The ability to understand natural language has been a long-standing dream of the AI community. In the past decade,
using representative tasks such as Natural Language Inference (NLI) and popular large datasets, the community has made impressive
progress using machine learning (especially deep learning) techniques. Recently, several studies have shown
that state-of-the-art (SOTA) models based on data-driven deep learning methods are brittle, often generalize poorly, and rely on hidden patterns in the data rather than actually reasoning to
derive the conclusion. While many analyses have traced the root cause of such behavior to shortcomings in public datasets, recent work (TaxiNLI [1], CheckList [2], InfoTABS [3]) also
shows that the models generalize poorly even in the presence of adequate amounts of data. Recent ACL 2020 theme papers trace these shortcomings back to the core idea behind
the data-driven training paradigm. By proposing an Octopus test, Bender and Koller (2020) suggest that a model of natural language “trained purely on form will never learn
meaning”.
This lack of grounding in the real world, together with the lack of interpretability, raises the question of whether the vast literature on Symbolic Logic may come to the rescue. Advances in
statistical learning based on concepts from logic have been (re)grouped under the umbrella of neuro-symbolic methods. The recent famed AI Debates have seen participation
from top researchers in Deep Learning, Symbolic Logic, Psychology, and Cognitive Science. Together, these researchers have cast doubt on whether purely data-driven methods based on
neural architectures would be sufficient to achieve understanding and whether currently popular benchmarks reflect relevant aspects of understanding. In parallel, neuro-symbolic
methods have proliferated, from rule-based auxiliary objective formulations and simple pipelines in which data-driven learning is followed by probabilistic logical methods, to
neural architectures defined by declarative logic, and more end-to-end learnable systems. But a closer look at the
current neuro-symbolic systems (NLProlog, DeepProbLog) shows that extending such methods to a practical in-the-wild task such
as NLI is still non-trivial.
Somewhat independently, the ML community has also seen rapid improvements in new architectures for multi-hop reasoning in Automated
Theorem Proving, and in efforts to integrate ML procedures with programming languages (PL+ML). Hence, it is legitimate to revisit the potential and practicality of neuro-symbolic
methods in natural language reasoning.
Our goal is to track efforts towards algorithmic (or logical) generalization in the presence of linguistic variability. Specifically, we pose the
following questions: 1) Are there linguistic and logical combinations where neural methods fail to generalize even with
large amounts of data? 2) Can neuro-symbolic methods (or PL+ML) generalize for certain combinations? 3) Can we track such capability-wise
success more explicitly with more informed benchmarks? We believe this would be the first workshop to bring together researchers from Deep
Learning, Symbolic Logic, and Programming Languages on the same platform.