Introduction

The ability to understand natural language has been a long-standing dream of the AI community. Over the past decade, using representative tasks such as Natural Language Inference (NLI) and popular large datasets, the community has made impressive progress with machine learning (especially deep learning) techniques. Recently, however, various works have shown that state-of-the-art (SOTA) models based on data-driven deep learning are brittle, often generalize poorly, and rely on hidden patterns in the data rather than actually reasoning to derive their conclusions. While many analyses have traced the root cause of such behavior to shortcomings in public datasets, recent work (TaxiNLI [1], CheckList [2], InfoTABS [3]) also shows that models generalize poorly even in the presence of adequate amounts of data.

Recent ACL 2020 theme-track papers trace these shortcomings back to the core idea behind the data-driven training paradigm. Through their octopus-test thought experiment, Bender and Koller (2020) argue that a model of natural language "trained purely on form will never learn meaning". This lack of grounding in the real world, together with a lack of interpretability, raises the question of whether the vast literature on Symbolic Logic may come to the rescue. Advances in statistical learning that build on concepts from logic have been (re)grouped under the umbrella of Neuro-symbolic methods. The recent, widely followed AI Debates brought together top researchers in Deep Learning, Symbolic Logic, Psychology, and Cognitive Science, who collectively cast doubt on whether purely data-driven methods based on neural architectures are sufficient to achieve understanding, and on whether current popular benchmarks reflect relevant aspects of understanding.

In parallel, neuro-symbolic methods have proliferated, ranging from rule-based auxiliary training objectives and simple pipelines that follow sequential data-driven learning with probabilistic logical inference, to neural architectures defined by declarative logic and more end-to-end learnable systems. A closer look at current neuro-symbolic systems (NLProlog, DeepProbLog), however, shows that extending such methods to a practical, in-the-wild task such as NLI remains non-trivial (see the illustrative sketch below). Somewhat independently, the ML community has also seen rapid improvements in new architectures for multi-hop reasoning in Automated Theorem Proving, as well as efforts to integrate ML procedures with programming languages (PL+ML). It is therefore timely to revisit the potential and practicality of Neuro-symbolic methods for natural language reasoning.

Our goal is to track efforts towards algorithmic (or logical) generalization in the presence of linguistic variability. Specifically, we pose the following questions: 1) are there linguistic and logical combinations where neural methods fail to generalize even with large amounts of data; 2) can neuro-symbolic methods (or ML+PL) generalize for certain such combinations; and 3) can we track such capability-wise progress more explicitly with more informed benchmarks? We believe this would be the first workshop to bring together researchers from Deep Learning, Symbolic Logic, and Programming Languages on a common platform.
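To make the preceding discussion concrete, the minimal sketch below illustrates, in plain Python, the pipelined neuro-symbolic pattern mentioned above on a toy NLI instance: a neural component (here a trivial stand-in) scores soft word alignments, and a hand-written symbolic rule derives the label. All function names, the rule, and the examples are hypothetical illustrations for this proposal, not taken from NLProlog, DeepProbLog, or any cited system.

```python
# A hypothetical, minimal neuro-symbolic NLI pipeline:
# a neural scorer feeds a symbolic entailment rule.

NLI_LABELS = ("entailment", "contradiction", "neutral")

def neural_alignment_score(premise_word: str, hypothesis_word: str) -> float:
    """Stand-in for a learned similarity model (e.g., embedding cosine).
    Here: 1.0 for an exact match, 0.0 otherwise."""
    return 1.0 if premise_word.lower() == hypothesis_word.lower() else 0.0

def symbolic_entailment_rule(premise: str, hypothesis: str,
                             threshold: float = 0.9) -> str:
    """Toy rule: if every hypothesis token aligns to some premise token
    with high confidence, predict entailment; otherwise neutral."""
    p_tokens = premise.split()
    for h_tok in hypothesis.split():
        if max(neural_alignment_score(p, h_tok) for p in p_tokens) < threshold:
            return "neutral"
    return "entailment"

# The rule handles the trivial case ...
print(symbolic_entailment_rule("A man is sleeping", "A man is sleeping"))
# -> "entailment"

# ... but fails under linguistic variability: a stronger neural scorer
# would need to align man~person and sleeping~napping.
print(symbolic_entailment_rule("A man is sleeping", "A person is napping"))
# -> "neutral" (arguably wrong)
```

The second example is precisely the kind of case we target: the symbolic rule generalizes logically but breaks under linguistic variability, so the burden shifts to the learned component, and evaluating that division of labor requires the more informed, capability-wise benchmarks posed in question 3 above.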