Accurate Thermochemistry with Small Data Sets: A Bond Additivity Correction and Transfer Learning Approach

Grambow, C.A.; Li, Y.-P.; Green, W.H.

Journal of Physical Chemistry A 123(27): 5826-5835


ISSN/ISBN: 1089-5639
PMID: 31246465
DOI: 10.1021/acs.jpca.9b04195
Accession: 069097062

Machine learning provides promising new methods for accurate yet rapid prediction of molecular properties, including thermochemistry, which is an integral component of many computer simulations, particularly automated reaction mechanism generation. Often, very large data sets with tens of thousands of molecules are required for training the models, but most data sets of experimental or high-accuracy quantum mechanical quality are much smaller. To overcome these limitations, we calculate new high-level data sets and derive bond additivity corrections to significantly improve enthalpies of formation. We adopt a transfer learning technique to train neural network models that achieve good performance even with a relatively small set of high-accuracy data. The training data for the entropy model are carefully selected so that important conformational effects are captured. The resulting models are generally applicable thermochemistry predictors for organic compounds with oxygen and nitrogen heteroatoms that approach experimental and coupled cluster accuracy while only requiring molecular graph inputs. Due to their versatility and the ease of adding new training data, they are poised to replace conventional estimation methods for thermochemical parameters in reaction mechanism generation. Since high-accuracy data are often sparse, similar transfer learning approaches are expected to be useful for estimating many other molecular properties.
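
As an illustration of the bond additivity correction idea mentioned in the abstract, the following minimal sketch applies a correction in its simplest common form: a calculated enthalpy of formation is shifted by a sum of per-bond-type parameters weighted by the number of bonds of each type. The abstract does not give the authors' functional form or fitted parameters, so the function name `apply_bac`, the bond labels, the correction values, and the uncorrected enthalpy below are all hypothetical and chosen only to show the arithmetic.

```python
from collections import Counter


def apply_bac(h_calc, bond_counts, corrections):
    """Return h_calc plus the sum of (bond count x per-bond correction).

    All quantities are assumed to share the same energy unit.
    """
    return h_calc + sum(n * corrections[bond] for bond, n in bond_counts.items())


# Hypothetical per-bond corrections in kcal/mol (illustrative values only,
# not taken from the paper).
corrections = {"C-H": -0.10, "C-C": -0.30, "C-O": 0.20, "O-H": -0.40}

# Bond inventory of ethanol (CH3CH2OH): 5 C-H, 1 C-C, 1 C-O, 1 O-H.
ethanol_bonds = Counter({"C-H": 5, "C-C": 1, "C-O": 1, "O-H": 1})

# Hypothetical uncorrected quantum-chemical enthalpy of formation in kcal/mol.
h_uncorrected = -55.0

h_corrected = apply_bac(h_uncorrected, ethanol_bonds, corrections)
print(f"corrected enthalpy of formation: {h_corrected:.2f} kcal/mol")
```

In practice the per-bond parameters would be fitted, for example by least squares, against reference enthalpies for a set of molecules; that fitting step is omitted here.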
