The color of a molecule (more precisely, its absorption wavelength) is determined by its chemical structure. However, the quantitative relationship between chemical structure and color remains difficult to predict. Establishing this relationship, that is, predicting the color from the chemical structure, would provide valuable chemical insight.

To establish this structure-property relationship, it is useful to consider basic features of the molecular structure: for instance, if a molecule contains a path of alternating single and double bonds (“conjugated double bonds”), it tends to absorb light at longer wavelengths. Here, we aim to gain further insight using a data-driven approach.
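As a rough illustration of the conjugation heuristic, the sketch below counts the double bonds along the longest path of alternating single and double bonds in a small molecular graph. The function name `max_conjugated_double_bonds` and the `(atom_i, atom_j, bond_order)` input format are assumptions made for this example, and the count is only a crude structural feature, not a wavelength prediction.

```python
from collections import defaultdict


def max_conjugated_double_bonds(bonds):
    """Return the largest number of double bonds found along any simple path
    whose consecutive bond orders alternate between 1 and 2.

    bonds: list of (atom_i, atom_j, order) tuples (hypothetical input format).
    Bonds with an order other than 1 or 2 are ignored.
    """
    adjacency = defaultdict(list)
    for i, j, order in bonds:
        if order not in (1, 2):
            continue
        adjacency[i].append((j, order))
        adjacency[j].append((i, order))

    best = 0

    def dfs(atom, last_order, visited, n_double):
        nonlocal best
        best = max(best, n_double)
        for neighbor, order in adjacency[atom]:
            # Skip atoms already on the path and bonds that break alternation.
            if neighbor in visited or order == last_order:
                continue
            visited.add(neighbor)
            dfs(neighbor, order, visited, n_double + (order == 2))
            visited.remove(neighbor)

    # Exhaustive search: fine for small molecules, exponential in general.
    for start in list(adjacency):
        dfs(start, None, {start}, 0)
    return best


# 1,3-butadiene (C=C-C=C) versus 1,4-pentadiene (C=C-C-C=C)
butadiene = [(0, 1, 2), (1, 2, 1), (2, 3, 2)]
pentadiene = [(0, 1, 2), (1, 2, 1), (2, 3, 1), (3, 4, 2)]
print(max_conjugated_double_bonds(butadiene))   # 2
print(max_conjugated_double_bonds(pentadiene))  # 1
```

In butadiene the two double bonds are conjugated, whereas in 1,4-pentadiene the extra single bond isolates them, so the count drops to one.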

In this project, you will investigate to what extent machine learning can predict the color of a molecule purely from its chemical structure. Choosing adequate structural representations for molecular systems will be key to faithfully encoding complex atom-in-molecule environments while accounting for translational, rotational, and permutational invariance. You will train a machine learning model on a database of experimentally determined absorption and fluorescence wavelengths. The project will offer you a chance to work on state-of-the-art methods for machine learning applied to molecular systems.
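To make the invariance requirement concrete, here is a minimal sketch of one well-known representation, the sorted eigenvalue spectrum of the Coulomb matrix. It is shown only as an example of a descriptor that is unchanged under translation, rotation, and atom reordering; it is not necessarily the representation used in this project, and the function name and the approximate water geometry are assumptions for illustration.

```python
import numpy as np


def coulomb_matrix_eigenvalues(charges, positions):
    """Sorted eigenvalues of the Coulomb matrix of a molecule.

    charges:   nuclear charges Z_i, shape (n,)
    positions: Cartesian coordinates R_i, shape (n, 3)
    Off-diagonal entries are Z_i * Z_j / |R_i - R_j|; diagonal entries are
    0.5 * Z_i ** 2.4.
    """
    Z = np.asarray(charges, dtype=float)
    R = np.asarray(positions, dtype=float)
    dist = np.linalg.norm(R[:, None, :] - R[None, :, :], axis=-1)
    np.fill_diagonal(dist, 1.0)  # placeholder to avoid division by zero
    M = np.outer(Z, Z) / dist
    np.fill_diagonal(M, 0.5 * Z ** 2.4)
    # The matrix is symmetric, so eigvalsh applies; sort in descending order.
    return np.sort(np.linalg.eigvalsh(M))[::-1]


# Approximate water geometry (illustrative values only)
Z = np.array([8, 1, 1])
R = np.array([[0.0, 0.0, 0.1173],
              [0.0, 0.7572, -0.4692],
              [0.0, -0.7572, -0.4692]])

# Rotate about the x axis, translate, and reorder the atoms.
theta = 0.7
rotation = np.array([[1.0, 0.0, 0.0],
                     [0.0, np.cos(theta), -np.sin(theta)],
                     [0.0, np.sin(theta), np.cos(theta)]])
perm = [2, 0, 1]
R_moved = (R @ rotation.T + np.array([1.0, -2.0, 3.0]))[perm]

same = np.allclose(coulomb_matrix_eigenvalues(Z, R),
                   coulomb_matrix_eigenvalues(Z[perm], R_moved))
print(same)
```

The descriptor depends only on interatomic distances and nuclear charges, which gives translational and rotational invariance, while taking the eigenvalue spectrum removes the dependence on atom ordering.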

Contact: Dr. Tristan Bereau, Prof. Sander Woutersen