Deep learning methods in biomedical research: from genomics to multi-omics approaches

Principal Investigator: Julie Hussin
Theme: Health
Competition: 2017 Competition: IVADO's Grants for fundamental research projects
Status: Completed
Start: Mar. 19, 2018
End: Sept. 18, 2020
Budget: $245,000.00



In this project, we implemented several machine learning and deep learning methods to analyze data from a variety of omics technologies, including genomics, epigenomics, transcriptomics, proteomics and metabolomics. We explored several representations of genetic data, based on the encoding of whole sequences (RecDL), genetic variants (Diet Network), and Gene Ontology terms (DeepSimDef). We conducted metabolomics studies of heart disease, using machine learning approaches to investigate the impact of myocardial infarction on patients' metabolomes. Unsupervised learning methods revealed a fatty acid signature that differed depending on the drug. Using XGBoost, we also identified lignoceric acid, a metabolite potentially important in heart failure that is currently undergoing biological validation. Finally, we evaluated the generalizability of our approaches, including the Diet Network approach to predicting ethnicity from genomic data. We demonstrated that this approach can make accurate predictions on independent datasets with different sets of genetic markers and different levels of missing data, which are ubiquitous in omics data. Our work has also revealed the importance of biological interpretation in prediction, an aspect on which our future work will focus.


Lead Genome Centre: Génome Québec

Partner: IVADO


Simon Gravel, Université McGill
Yoshua Bengio, Université de Montréal