May 5, 2022, 1:11 a.m. | Sheshera Mysore, Arman Cohan, Tom Hope

We present a new scientific document similarity model based on matching
fine-grained aspects of texts. To train our model, we exploit a
naturally occurring source of supervision: sentences in the full text of papers
that cite multiple papers together (co-citations). Such co-citations not only
reflect close paper relatedness, but also provide textual descriptions of how
the co-cited papers are related. This novel form of textual supervision is used
for learning to match aspects across papers. We develop multi-vector
representations where vectors correspond …

