Document Type
Conference Proceeding
Publication Title
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing
Version
Final Published Version
Volume
1804
Publication Date
2018
Abstract
We present a large-scale collection of diverse natural language inference (NLI) datasets that help provide insight into how well a sentence representation captures distinct types of reasoning. The collection results from recasting 13 existing datasets from 7 semantic phenomena into a common NLI structure, resulting in over half a million labeled context-hypothesis pairs in total. We refer to our collection as the DNC: Diverse Natural Language Inference Collection. The DNC is available online at https://www.decomp.net, and will grow over time as additional resources are recast and added from novel sources.
DOI
https://doi.org/10.48550/arXiv.1804.08207
Citation
A Poliak, A Haldar, R Rudinger, JE Hu, E Pavlick, AS White, B Van Durme. 2018. "Collecting diverse natural language inference problems for sentence representation evaluation." Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing 1804.08207: 67-81.