High-Throughput Calculations of Molecular Properties in the MedeA® Environment: Accuracy of PM7 in Predicting Vibrational Frequencies, Ideal Gas Entropies, Heat Capacities, and Gibbs Free Energies of Organic Molecules.
Xavier Rozanska, James J P Stewart, Philippe Ungerer, Benoit Leblanc, Clive Freeman, Paul Saxe, and Erich Wimmer.
J Chem Eng Data, 2014 p. 140603143218005.
The functionality and graphical user interface of the atomistic and molecular simulation environment MedeA have been enhanced to prepare and submit on the order of 1000 simulations on different structures, and to collect the results and help analyze them. We illustrate this by determining the accuracy of the semiempirical (SE) package MOPAC2012 (Stewart, J. J. P. MOPAC2012; Stewart Computational Chemistry: Colorado Springs, CO, USA, 2012; http://OpenMOPAC.net) with the PM7 method (Stewart, J. J. P. Optimization of parameters for semiempirical methods VI: more modifications to the NDDO approximations and reoptimization of parameters. J. Mol. Model. 2013, 19, 1−32) in computing vibrational frequencies and thermodynamic properties, specifically the zero point energy, ideal gas heat capacity at constant pressure, entropy, and Gibbs free energy, between 200 K and 1000 K for 795 organic molecules.
The results were compared with experimental data and density functional theory (DFT) values (using the B3LYP/TZVP and BP86/TZVP DFT methods). This comparison showed that the PM7 vibrational frequencies above 2500 cm−1 are systematically underestimated. An a posteriori correction that rescales the frequencies with a linear relationship reduced the average relative deviations with respect to experimental reference values to zero. This frequency correction also removed the bias from the average deviations of the PM7 zero point energies, ideal gas heat capacities, and entropies. For 160 organic molecules, the root-mean-square deviations (RMSD) of the PM7 and DFT heat capacities with respect to experimental values were equivalent, at about 5 %, 2.5 %, and 3 % at 300 K, 600 K, and 1000 K, respectively. When the analysis was extended to a set of 795 molecules, the RMSD of PM7 relative to the DFT values was 4 %, 2 %, and 1 % at the same temperatures. For the ideal gas entropies, the RMSD of PM7 relative to the DFT values was 5 % at 300 K and 4 % at 1000 K. The RMSD of the PM7 Gibbs free energies was 15 kJ mol−1 at 300 K and 30 kJ mol−1 at 1000 K. The efficiency of this semiempirical approach was tested on a set of approximately 5800 molecules. This set was processed in about a day, demonstrating that the approach scales to large data sets.
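As an illustration of the quantities discussed above, the following minimal Python sketch shows how a linear frequency rescaling, the harmonic-oscillator zero point energy, the vibrational contributions to the ideal gas heat capacity and entropy, and a relative RMSD could be computed from a list of vibrational wavenumbers. It is not the MedeA or MOPAC2012 implementation. All function names are hypothetical, the slope and intercept are placeholders rather than the fitted values from the paper, and the rescaling is applied to every frequency because the abstract does not state the exact form of the correction. Translational and rotational contributions are omitted.

```python
import math

# Exact SI constants (2019 redefinition)
H = 6.62607015e-34      # Planck constant, J s
C_CM = 2.99792458e10    # speed of light, cm/s
KB = 1.380649e-23       # Boltzmann constant, J/K
NA = 6.02214076e23      # Avogadro constant, 1/mol
R = KB * NA             # gas constant, J/(mol K)


def rescale(wavenumbers, slope, intercept):
    """Linear rescaling nu' = slope * nu + intercept (cm^-1).

    slope and intercept are placeholders (assumption), not values
    taken from the paper.
    """
    return [slope * nu + intercept for nu in wavenumbers]


def zero_point_energy(wavenumbers):
    """Harmonic zero point energy in J/mol: (1/2) * NA * h * c * sum(nu)."""
    return 0.5 * NA * H * C_CM * sum(wavenumbers)


def vibrational_cv_and_entropy(wavenumbers, temperature):
    """Harmonic vibrational heat capacity and entropy, both in J/(mol K).

    For an ideal gas the vibrational contribution to Cp equals that to Cv.
    """
    cv = 0.0
    s = 0.0
    for nu in wavenumbers:
        x = H * C_CM * nu / (KB * temperature)   # reduced energy h c nu / kB T
        em = math.exp(-x)
        cv += x * x * em / (1.0 - em) ** 2
        s += x * em / (1.0 - em) - math.log1p(-em)
    return R * cv, R * s


def relative_rmsd(predicted, reference):
    """Root-mean-square of the relative deviations (predicted/reference - 1)."""
    devs = [(p / r - 1.0) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(devs) / len(devs))


if __name__ == "__main__":
    # Illustrative input only: three wavenumbers in cm^-1
    freqs = [1595.0, 3657.0, 3756.0]
    scaled = rescale(freqs, slope=1.0, intercept=0.0)  # identity placeholder
    for t in (300.0, 600.0, 1000.0):
        cv, s = vibrational_cv_and_entropy(scaled, t)
        print(f"T = {t:6.0f} K  Cv_vib = {cv:8.4f}  S_vib = {s:8.4f} J/(mol K)")
    print(f"ZPE = {zero_point_energy(scaled) / 1000.0:.2f} kJ/mol")
```

Writing the Boltzmann factors in terms of exp(−x) keeps the expressions numerically stable for stiff modes at low temperature, where exp(x) would become very large.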