Designing better drugs with machine learning

We will show how we use our scikit-learn-based toolkit cream to build and validate predictive models for properties that matter in drug design, such as biochemical and cellular activity and ADME properties.

Tags: Artificial Intelligence, Deep Learning & Artificial Intelligence, Data Science, Jupyter, Machine Learning, Science

Scheduled on Thursday at 16:35 in room lecture


Daniel Kuhn

Daniel Kuhn is a computational chemist and discovery project leader at Merck in Darmstadt, Germany. After studying pharmacy in Marburg, he obtained his PhD in computational chemistry from Philipps-University, Marburg, working with Gerhard Klebe on the classification of protein binding sites. In 2004 he joined Boehringer Ingelheim in Vienna as a computational chemist working in oncology research. In 2010 he joined Merck as a principal scientist, contributing to projects in early and late drug discovery stages. He also leads a lead optimization project towards preclinical development. His research interests include protein kinase drug discovery, structure-based design, and machine learning approaches to hit identification and lead optimization.


Developing safe and efficacious drugs is a multi-parameter optimization process. To be successful, a good drug needs not only affinity to its receptor or protein target but also good ADME (Absorption, Distribution, Metabolism, and Excretion) properties. Machine learning models are used to assess compound ideas early on in the design process and to give guidance on what to synthesize next. We will present our scikit-learn-based toolkit cream for building and validating predictive models. Results from validation studies using existing data, as well as prospective prediction results, will be shared. Special focus will be placed on how confidence in predictions can be increased by using cross-validation procedures and by considering the domain of applicability of the predictions.
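
To make the last two ideas concrete, here is a minimal sketch in plain scikit-learn. It is not cream's API or implementation, which this abstract does not describe. The data are synthetic stand-ins for compound descriptors and a measured property, and the names (in_domain, threshold, k) as well as the choice of a k-nearest-neighbour distance with a 95th-percentile cutoff as the domain-of-applicability criterion are assumptions made for illustration only.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import KFold, cross_val_score
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)

# Synthetic stand-in for descriptors (X) and a measured property (y);
# real projects would use computed molecular descriptors and assay data.
X_train = rng.normal(size=(200, 8))
y_train = X_train[:, 0] - 0.5 * X_train[:, 1] + rng.normal(scale=0.1, size=200)

model = RandomForestRegressor(n_estimators=100, random_state=0)

# 5-fold cross-validation: each fold is predicted by a model that
# never saw it, giving an estimate of performance on unseen compounds.
cv = KFold(n_splits=5, shuffle=True, random_state=0)
cv_r2 = cross_val_score(model, X_train, y_train, cv=cv, scoring="r2")

# Final model trained on all available data.
model.fit(X_train, y_train)

# Assumed domain-of-applicability criterion: mean distance to the k
# nearest training compounds, compared with the training distribution.
k = 5
nn = NearestNeighbors(n_neighbors=k + 1).fit(X_train)
train_dist, _ = nn.kneighbors(X_train)
# Column 0 is each training point's distance to itself, so skip it.
train_score = train_dist[:, 1:].mean(axis=1)
threshold = np.percentile(train_score, 95)


def in_domain(X_new):
    """Return a boolean array: True where a new compound lies close
    enough to the training set for its prediction to be trusted."""
    dist, _ = nn.kneighbors(X_new, n_neighbors=k)
    return dist.mean(axis=1) <= threshold


X_new = np.vstack([np.zeros((1, 8)), np.full((1, 8), 50.0)])
predictions = model.predict(X_new)
trusted = in_domain(X_new)
```

In this sketch the first new point sits in the middle of the training data and is flagged as in domain, while the second lies far outside it and is flagged as out of domain, so its prediction would be treated with caution regardless of the cross-validated score.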