Understanding Chemical Trends in Generated Databases of Functional Molecules

Title

Understanding Chemical Trends in Generated Databases of Functional Molecules

Subject

Chemistry

Creator

Francesco Bartucca

Contributor

Reinhard Maurer, Zsuzsanna Koczor-Benda

Abstract

This project investigates the output of G-SchNet, a generative machine learning model widely used in computational chemistry. It highlights discrepancies between the distributions of generated and training data, in particular a tendency to generate molecules containing more heavy atoms than those in the training datasets. The databases used for training include a thiols set composed of commercially available molecules and the more widely used OE62 and QM9 sets.

Meta Tags

Chemistry, Machine Learning, Computational Chemistry, Data Analysis

Citation

Francesco Bartucca, “Understanding Chemical Trends in Generated Databases of Functional Molecules,” URSS SHOWCASE, accessed January 10, 2025, https://linen-dog.lnx.warwick.ac.uk/items/show/574.