http://arxiv.org/abs/2209.10702

Sept. 23, 2022 | Peter Eastman, Pavan Kumar Behara, David L. Dotson, Raimondas Galvelis, John E. Herr, Josh T. Horton, Yuezhi Mao, John D. Chodera, Benjamin P. Pritcha

cs.LG updates on arXiv.org

Machine learning potentials are an important tool for molecular simulation,
but their development is held back by a shortage of high-quality datasets to
train them on. We describe the SPICE dataset, a new quantum chemistry dataset
for training potentials relevant to simulating drug-like small molecules
interacting with proteins. It contains over 1.1 million conformations for a
diverse set of small molecules, dimers, dipeptides, and solvated amino acids.
It includes 15 elements, charged and uncharged molecules, and a wide range …

