Functional Stratification of Metabolic Disease Gene Variants to Accelerate Diagnosis, and Guide Treatment

Precision Medicine Project - Functional Stratification of Metabolic Disease Gene Variants to Accelerate Diagnosis, and Guide Treatment

Supervisor(s): Prof Rob Semple, Prof Grzegorz Kudla, Prof Joseph Marsh, Prof Neil Carragher
Centre/Institute: Centre for Cardiovascular Science

Background

Increasing use of next-generation sequencing (NGS) in diagnostic laboratories means that rare variants in disease genes are being discovered at a rapidly increasing rate. However, our ability to distinguish true loss-of-function variants from incidental benign variants has not kept pace, resulting in an exponential rise in reporting of “Variants of Unknown Significance” (VUS). A transformational approach known as multiplexed assays of variant effects (MAVE) now allows assessment of up to tens of thousands of variants in a single experiment[1]. MAVEs have been enabled by development of methods to engineer large numbers of variants and to quantify their function through NGS coupled to high-throughput disease-relevant assays. MAVEs reduce experimental costs and capture variants from all ethnic backgrounds, while pooling and parallel analysis of large numbers of variants, with multiple replicates and under the same conditions, minimises noise.

Project Workplan and Aims

MAVEs will be applied to one or more genes causing metabolic disease. The gene(s) to be studied will be decided by the student in discussion with supervisors. Criteria will be: (A) VUS must pose a substantial problem in testing of the gene; (B) the gene must show complex genotype-phenotype correlation, being implicated in more than one disease; and (C) there must be unanswered structure-function questions. Aims will be to classify all VUS in the target gene as benign or pathogenic, to associate them with specific disease phenotypes, to make data freely accessible to accelerate genetic diagnosis, and to facilitate development of targeted therapies. Genes of interest include MFN2 (in which different mutations cause peripheral neuropathy, adipose tissue abnormalities and insulin resistance), PIK3R1 (insulin resistance, immunodeficiency, developmental abnormalities) and POLD1 (cancer, immunodeficiency, premature ageing and insulin resistance)[2].

The project will adopt an established workflow using a large pool of primers including “NNS” codons for all wild-type codons. Primers will be used in an established nicking mutagenesis strategy to construct a barcoded variant library in plasmids[3]. Long-read sequencing will phase barcodes and mutations. In parallel with library generation, high-content cellular imaging will be optimised using known disease-causing mutations in the gene of interest. Assays will be both tailored to the cellular phenotype caused by known disease mutations (e.g. visualisation of mitochondrial network morphology for MFN2) and cellular-process agnostic (e.g. “cell painting” using multiple vital dyes to mark different organelles).
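To illustrate why “NNS” codons are a convenient choice for saturation mutagenesis, the following minimal sketch enumerates them (N = any base, S = C or G) and translates them with the standard genetic code. The function names are hypothetical and are not part of the project's workflow; the sketch only shows that the 32 NNS codons cover all 20 amino acids while admitting a single stop codon.

```python
from itertools import product

# Standard genetic code, written in the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def nns_codons():
    """Return every codon matching the degenerate pattern NNS."""
    return ["".join(c) for c in product("ACGT", "ACGT", "CG")]


def nns_summary():
    """Return (codon count, sorted amino acids encoded, stop codons)."""
    codons = nns_codons()
    encoded = {CODON_TABLE[c] for c in codons}
    amino_acids = sorted(encoded - {"*"})
    stops = sorted(c for c in codons if CODON_TABLE[c] == "*")
    return len(codons), amino_acids, stops


if __name__ == "__main__":
    n_codons, amino_acids, stops = nns_summary()
    print(n_codons, "codons;", len(amino_acids), "amino acids; stops:", stops)
```

Restricting the third base to C or G halves the codon pool relative to NNN and removes two of the three stop codons, which reduces the fraction of truncating variants in the library.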

Variant libraries will be delivered to cells via a high efficiency landing pad incorporated in the genome by lentivirus or gene editing. The resulting cells will be selected and stored, before the pre-optimised high content assays are used. Barcode sequencing will be used to count every occurrence of every mutation in different phenotypic groups, allowing stratification of the whole mutational library by functional effect. Candidate treatments will also be tested on the library and assessed similarly.
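The barcode-counting step can be sketched as follows. This is a simplified illustration under stated assumptions, not the project's analysis pipeline: it assumes cells have been sorted into just two phenotypic groups, that a barcode-to-variant lookup is already available from long-read phasing, and that a log2 ratio with a pseudocount is an acceptable score. All names, barcodes and variant labels are hypothetical.

```python
import math
from collections import Counter


def count_variants(barcode_reads, barcode_to_variant):
    """Count reads per variant, ignoring barcodes absent from the lookup."""
    counts = Counter()
    for barcode in barcode_reads:
        variant = barcode_to_variant.get(barcode)
        if variant is not None:
            counts[variant] += 1
    return counts


def log2_enrichment(abnormal_counts, normal_counts, pseudocount=0.5):
    """Log2 ratio of each variant's frequency in the abnormal vs normal group.

    The pseudocount (an assumption of this sketch) avoids division by zero
    for variants that are missing from one group.
    """
    variants = set(abnormal_counts) | set(normal_counts)
    total_abnormal = sum(abnormal_counts.values()) + pseudocount * len(variants)
    total_normal = sum(normal_counts.values()) + pseudocount * len(variants)
    scores = {}
    for variant in variants:
        f_abnormal = (abnormal_counts.get(variant, 0) + pseudocount) / total_abnormal
        f_normal = (normal_counts.get(variant, 0) + pseudocount) / total_normal
        scores[variant] = math.log2(f_abnormal / f_normal)
    return scores


if __name__ == "__main__":
    # Toy data: two barcodes tag the same hypothetical variant.
    lookup = {"AAAC": "variant_A", "GGTT": "variant_A", "CCCA": "WT"}
    abnormal = count_variants(["AAAC", "AAAC", "GGTT", "CCCA", "TTTT"], lookup)
    normal = count_variants(["CCCA", "CCCA", "CCCA", "AAAC"], lookup)
    for variant, score in sorted(log2_enrichment(abnormal, normal).items()):
        print(variant, round(score, 2))
```

A positive score marks a variant that is over-represented in the abnormal group. With more than two phenotypic groups, or with drug-treated and untreated libraries, the same counts could feed a weighted or regression-based score instead.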

Finally, the student will test the performance of the MAVEs they have undertaken against published variant effect predictors (VEPs)[4]. They will compare strategies to fill in experimental gaps, using a novel approach that exploits the high correlation between VEPs and MAVEs, while investigating whether variant-level quality score thresholds can be established, below which imputation is preferable to using low-quality experimental data. They will integrate data into a searchable online database, with tools for visualisation of variant effect maps, using generalisable platforms currently being developed. This will align with guidelines for release of VEPs, and recommendations regarding sharing of training data and code.
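As a rough illustration of the comparison and gap-filling ideas above, the sketch below computes a Spearman rank correlation between MAVE and VEP scores, then replaces experimental scores that fall below a quality threshold with imputed values. The function names, the quality scale and the threshold value are all assumptions made for this example; the project's actual imputation approach is not specified here.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def fill_gaps(mave_scores, quality, imputed_scores, min_quality):
    """Keep a MAVE score if its quality meets the threshold, else impute.

    Variants with no MAVE measurement (None) are always imputed.
    """
    filled = {}
    for variant, imputed in imputed_scores.items():
        measured = mave_scores.get(variant)
        if measured is not None and quality.get(variant, 0.0) >= min_quality:
            filled[variant] = measured
        else:
            filled[variant] = imputed
    return filled
```

For example, `spearman(mave, vep)` over the variants measured by both methods gives a single agreement figure, and repeating `fill_gaps` across a range of `min_quality` values is one way to examine where imputation starts to outperform noisy measurements.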

Project-specific training/experience will be provided in:

  • Rare disease genetics, monogenic metabolic disease, and clinical translation
  • Cutting-edge molecular biology, and high-content imaging of engineered eukaryotic cells
  • Analysis of long- and short-read next-generation sequencing data, and associated computation
  • State-of-the-art variant effect prediction strategies, protein language models, and supervised machine learning techniques. This will enable comparison with experimental MAVE data, and integration of MAVE measurements into tailored predictive models of variant effects.

References

[1] Araya & Fowler (2011) Trends Biotechnol 29(9): 435-442

[2] Bonnefond & Semple (2022) Diabetologia 65(11): 1782-1795

[3] Wrenbeck et al (2016) Nat Methods 13(11): 928-930

[4] Livesey & Marsh (2020) Mol Syst Biol 16(7): e9380

Apply Now

Click here to Apply Now

  • The deadline for 24/25 applications is Monday 15th January 2024
  • Applicants must apply to a specific project. Ensure you include details of the project on the Recruitment Form below, which you must submit to the research proposal section of your EUCLID application.
  • Please ensure you upload as many of the requested documents as possible, including a CV, at the time of submitting your EUCLID application.  

Q&A Sessions

Supervisor(s) of each project will be holding a 30-minute Q&A session in the first two weeks of December.

If you have any questions regarding this project, you are invited to attend the session on 6th December at 3.45pm GMT via Microsoft Teams. Click here to join the session.