Precision Medicine Project - AI-driven drug discovery for diseases of unmet need with high-throughput phenomics data

Supervisor(s): Dr Diego Oyarzún, Prof Neil Carragher & Prof Asier Unciti-Broceta

Centre/Institute: School of Informatics

Abstract

Drug discovery is extremely expensive and most candidate compounds fail at various stages of clinical trials. Artificial Intelligence (AI) has recently emerged as a promising tool to accelerate the search for new active compounds. Because they can sift through large amounts of data, AI algorithms are being deployed at every stage of the drug discovery pipeline. In this project, we aim to leverage AI to identify early lead compounds that are active against specific disease endotypes. By expanding the number of chemical starting points, the project will deliver a powerful tool to support downstream chemical optimisation toward increased efficacy and safety.

Background

Despite the notable achievements of modern target-focused approaches to drug discovery, the field suffers from stubbornly high failure rates during clinical development. Approximately 90% of drug candidates fail clinical trials and do not ultimately provide substantial benefits to patients [1]. These limitations become even more pronounced given rising R&D costs and urgent unmet therapeutic needs in specific disease areas. To address these challenges, “phenotypic drug discovery” strategies have resurfaced as a valuable approach alongside standard target-based drug discovery [2]. This approach involves screening and selecting candidate compounds based on the observable phenotypes that drug administration causes in biological models. These phenotypic endpoints are typically derived from experiments using cell-based assays and do not require prior knowledge of the specific drug target.
The supervisory team recently employed phenotypic data and machine learning to discover three anti-senescence compounds [3]. Building on this success, in this project we aim to leverage progress in AI and multiparametric image-based phenotypic screening (“cell painting” data [4]) performed across genetically distinct cell panels to accelerate the discovery and optimisation of lead compounds, and thus support the translation of preclinical discoveries and improve clinical success.

Aims

Our general aim is to build and validate an end-to-end digital discovery platform to process large phenomics screens and extract actionable insights for compound optimisation, target identification and mechanism-of-action understanding. The platform will integrate unsupervised machine learning algorithms to identify phenotypic endpoints from screening data, which will then be employed to train predictive models of compound activity. Through a combination of data analysis, virtual screening and chemical synthesis, the project will lead to a suite of optimised chemical structures for downstream experimental validation.

In the first phase, we will use existing data from the Carragher lab to build and fine-tune the AI pipeline, working toward a full experimental validation using our in-house high-throughput automated imaging platform in the second half of the project. Existing datasets include over 10,000 compounds tested across more than a dozen disease-relevant cell lines.

Specific objectives:

1. Develop a robust clustering pipeline to extract phenotypic endpoints from phenomic drug screens.
2. Train predictive machine learning models using a combination of statistical learning algorithms and modern deep learning approaches.
3. Validate the platform on a fresh phenomics screen and identify candidate structures for chemical optimisation.
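As a rough illustration of how objectives 1 and 2 fit together, the sketch below clusters toy morphological profiles into putative phenotypic endpoints and then trains a simple model to predict those endpoints from compound descriptors. Everything here is an assumption made for illustration: the data are synthetic, the function names (`make_toy_screen`, `kmeans`, `fit_centroids`) are hypothetical, and k-means with a nearest-centroid classifier merely stands in for whichever clustering and predictive methods the project eventually adopts.

```python
import random

def make_toy_screen(n_per_group=30, seed=1):
    """Synthetic stand-in for a screen: two phenotype groups (assumption)."""
    rng = random.Random(seed)
    profiles, descriptors = [], []
    for group in (0, 1):
        for _ in range(n_per_group):
            # Image-derived morphological profile (2 toy features).
            profiles.append([rng.gauss(5.0 * group, 0.5), rng.gauss(5.0 * group, 0.5)])
            # Chemical descriptor vector, loosely tied to the phenotype group.
            descriptors.append([rng.gauss(3.0 * group, 0.5), rng.gauss(-3.0 * group, 0.5)])
    return profiles, descriptors

def nearest(point, centroids):
    """Index of the centroid closest to point (squared Euclidean distance)."""
    dists = [sum((a - b) ** 2 for a, b in zip(point, c)) for c in centroids]
    return dists.index(min(dists))

def kmeans(points, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means; returns one cluster label per point."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(n_iter):
        labels = [nearest(p, centroids) for p in points]
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return [nearest(p, centroids) for p in points]

def fit_centroids(features, labels):
    """Nearest-centroid 'model': mean feature vector per label."""
    model = {}
    for lab in set(labels):
        rows = [f for f, l in zip(features, labels) if l == lab]
        model[lab] = [sum(col) / len(rows) for col in zip(*rows)]
    return model

profiles, descriptors = make_toy_screen()

# Step 1: unsupervised clustering of profiles gives putative endpoints.
labels = kmeans(profiles, k=2)

# Step 2: learn to predict the endpoint from chemical descriptors,
# training on even-indexed compounds and testing on odd-indexed ones.
train = range(0, len(descriptors), 2)
test = range(1, len(descriptors), 2)
model = fit_centroids([descriptors[i] for i in train], [labels[i] for i in train])
keys = sorted(model)
centroid_list = [model[key] for key in keys]
predicted = [keys[nearest(descriptors[i], centroid_list)] for i in test]
accuracy = sum(p == labels[i] for p, i in zip(predicted, test)) / len(test)
print(f"held-out agreement with cluster labels: {accuracy:.2f}")
```

On real cell painting data the profiles would have hundreds or thousands of features per compound and cell line, so normalisation, dimensionality reduction and a principled choice of the number of clusters would all matter far more than in this toy setting.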
Training Outcomes:

The student will gain extensive expertise in advanced statistics, machine learning, biomedical data science, and chemoinformatics. The student will be trained in a multidisciplinary environment with access to both computational and experimental infrastructure and expertise available from the three supervisors, who have collaborated for over 2 years. The project will expose the student to the challenges of noisy, high-dimensional biological datasets and offer a broad range of skills that will enhance their future employment opportunities in AI for drug discovery, an area of top priority for startups and big pharma.

References:

[1] Hay, M., et al., Clinical development success rates for investigational drugs. Nat Biotechnol, 2014. 32(1): p. 40-51.
[2] Moffat, J.G., et al., Opportunities and challenges in phenotypic drug discovery: an industry perspective. Nat Rev Drug Discov, 2017. 16(8): p. 531-543.
[3] Smer-Barreto, V., et al., Discovery of senolytics using machine learning. Nat Commun, 2023.
[4] Warchal, S.J., Unciti-Broceta, A., and Carragher, N.O., Next-generation phenotypic screening. Future Med Chem, 2016. 8(11): p. 1331-1347.

Apply Now

The deadline for 24/25 applications is Monday 15th January 2024.

Applicants must apply to a specific project. Ensure you include details of the project on the Recruitment Form below, which you must submit to the research proposal section of your EUCLID application.

Document: Precision Medicine Recruitment Form (878.6 KB / DOCX)

Please ensure you upload as many of the requested documents as possible, including a CV, at the time of submitting your EUCLID application.

Q&A Sessions

The supervisor(s) of each project will hold a 30-minute Q&A session in the first two weeks of December. If you have any questions regarding this project, you are invited to attend the session on 4th December at 10am GMT via Microsoft Teams.

This article was published on 2024-09-24