I am a researcher in statistical genetics. From January 2016 to August 2019, I was a postdoctoral fellow in Ron Do’s lab at the Icahn School of Medicine at Mount Sinai, in New York, USA.
Before joining the Institute for Personalized Medicine at Mount Sinai, I was a biostatistician in the Genomics and Metabolic Diseases laboratory of the Institut Pasteur in Lille, France. I defended my PhD thesis, “Exploratory Analysis of transcriptomic data: from their visualisation to the integration of external information”, on 4 September 2013 at Agrocampus Ouest, Rennes, France.
September 2019
Start at Université Paris Cité
Marie Verbanck starts a new position at Université Paris Cité as Assistant Professor and creates the GEML Research group.
September 2021
ANR grant
The PleioMap project, which aims to leverage pleiotropy in human genetic architecture to build a map of pleiotropy using machine learning, has been funded by the French National Research Agency (ANR JCJC, PI: Marie Verbanck).
February 2022
PleioMap kick-off
The PleioMap project, funded by the French National Research Agency for 3.4 years, started with two graduate students, Martin Tournaire and Martin K. Amouzou.
October 2022
Martin Tournaire
Martin Tournaire, MS, officially starts as a PhD student in the GEML group after graduating summa cum laude from his master's program.
Our research expertise
Statistical genetics
Omics studies
Pharmacogenomics
Complex diseases
Machine learning
Data recycling
The GEML Team
Marie Verbanck
Assistant Professor - Team leader
Asma Nouira
Postdoctoral Fellow
Martin Tournaire
PhD student
Treudsky Antoine
Graduate student
Our research projects
My research is inherently multidisciplinary, at the crossroads between statistics and genomics: I have developed several statistical methodologies and led several studies in human genetics. Through my collaborations and experience, I have developed expertise in the genetics of the metabolic syndrome, with the long-term ambition of taking part in the rising era of personalized medicine.
Omics analysis
Mendelian randomization
Pleiotropy
Pharmacogenetics
Clustering
Deep learning
Our publications
Research publications led by members of the GEML team.
Contact us
Marie Verbanck
Biostatistique, Traitement et Modélisation des données biologiques
UR 7537 – BioSTM
Faculté de Pharmacie de Paris - Université Paris Cité
4 avenue de l’Observatoire
75006 Paris, France
Email: marie.verbanck[at]u-paris.fr
Omics analyses
We have been involved in several omics studies
At the European Diabetes Institute, I collaborated closely with Drs. Odile Poulain-Godefroy and Marie Favennec. We conducted several studies on the kynurenine pathway and showed a link between this pathway and the metabolic syndrome, including obesity and type 2 diabetes. We also conducted a transcriptomic study on bisphenol A and its substitutes, which showed that exposure to bisphenols S and F has very similar consequences to bisphenol A exposure in terms of target genes, and that both substitutes behave as endocrine disruptors.
Mendelian randomization
A major research focus of my second postdoctoral fellowship, in Dr. Do's team at Mount Sinai in New York, was causal inference. We developed a new method to identify and correct for horizontal pleiotropy in Mendelian randomization (MR) and applied it to 4250 MR tests {verbanck_widespread_2017}. We showed that horizontal pleiotropy exists in more than 48% of the causal relationships between risk factors and diseases, and that it has real consequences: it can produce false positives among the causal relationships and bias the estimate of the causal effect. We also applied this method in an extensive causal inference study of the effect of uric acid level on the risk of developing chronic kidney disease. We were unable to detect any causal link between uric acid and chronic kidney disease, a result that is directly relevant to deciding whether or not to continue with ongoing clinical trials.
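To give a flavour of the quantities involved, here is a minimal sketch of a standard inverse-variance weighted (IVW) MR estimate computed from GWAS summary statistics, together with each variant's contribution to Cochran's Q heterogeneity statistic. This is a textbook illustration under simplifying assumptions (independent variants, negligible error in the exposure effects), not the method described above. The function names and the toy numbers are invented for the example.

```python
from typing import List, Sequence


def ivw_estimate(beta_exposure: Sequence[float],
                 beta_outcome: Sequence[float],
                 se_outcome: Sequence[float]) -> float:
    """Inverse-variance weighted causal estimate from summary statistics.

    This is the slope of a weighted regression through the origin of the
    variant-outcome effects on the variant-exposure effects.
    """
    weights = [1.0 / se ** 2 for se in se_outcome]
    numerator = sum(w * bx * by
                    for w, bx, by in zip(weights, beta_exposure, beta_outcome))
    denominator = sum(w * bx ** 2 for w, bx in zip(weights, beta_exposure))
    return numerator / denominator


def q_contributions(beta_exposure: Sequence[float],
                    beta_outcome: Sequence[float],
                    se_outcome: Sequence[float]) -> List[float]:
    """Per-variant contributions to Cochran's Q around the IVW slope.

    A variant with a large contribution departs from the common trend,
    which is one possible symptom of horizontal pleiotropy.
    """
    slope = ivw_estimate(beta_exposure, beta_outcome, se_outcome)
    return [((by - slope * bx) / se) ** 2
            for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome)]


# Toy, made-up summary statistics: the fifth variant departs from the trend.
beta_x = [0.1, 0.2, 0.3, 0.4, 0.2]
beta_y = [0.05, 0.10, 0.15, 0.20, 0.30]
se_y = [0.02] * 5

contributions = q_contributions(beta_x, beta_y, se_y)
suspect = max(range(len(contributions)), key=contributions.__getitem__)
print(f"IVW estimate: {ivw_estimate(beta_x, beta_y, se_y):.3f}")
print(f"Variant with the largest heterogeneity contribution: index {suspect}")
```

In this toy example a single outlying variant pulls the IVW slope away from the value supported by the other four, which illustrates how pleiotropic variants can bias a causal estimate.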
PleioMap
Leveraging pleiotropy in human genetic architecture by building a map of pleiotropy using machine learning.
Although pleiotropy, which occurs when a genetic element has a causal effect on at least two traits, is thought to play a central role in the genetic architecture of complex traits and diseases, it remains a poorly understood mechanism. Here, we reexamined known concepts of human genetics through the prism of pleiotropy and formulated the 5-pleiotropy hypothesis: observed pleiotropy, which occurs when one genetic variant affects more than one trait, can stem from 5 biological mechanisms: 1) LD pleiotropy (linkage disequilibrium); 2) vertical pleiotropy (causality); 3) network pleiotropy (genetic correlation); 4) serendipitous pleiotropy (polygenicity); 5) horizontal pleiotropy (independent effects).
Our global objective is to study pleiotropy and to model the 5 types of pleiotropy using publicly available summary statistics from genome-wide association studies (GWASs). More specifically, in Workpackage 1, we will build a comprehensive framework to disentangle the 5 types of pleiotropy by modeling pleiotropy due to relationships between traits while quantifying pleiotropy at the level of genetic variants, ultimately providing a genome-wide map of pleiotropy. Strategies to achieve this first goal include i) improving on a proof-of-concept method; ii) rerouting existing methods used to model relationships between complex traits and diseases; iii) building a novel statistical framework based on machine learning algorithms, notably semi-supervised learning, using additional eQTL (expression Quantitative Trait Loci) data and a colocalization method to label genetic variants for pleiotropy.
In Workpackage 2.1, we propose to study relationships between complex traits and diseases stemming from pleiotropy. Special attention will be paid to disentangling vertical and network pleiotropy, which stem from causal relationships between traits, from the other forms of pleiotropy. In addition, after applying PleioMap to many traits, we intend to develop visualization tools to build networks of relationships between complex traits and diseases.
In Workpackage 2.2, we will study the pleiotropic effects of the genetic variants themselves and make an inventory of the 5 types of pleiotropy, validating or invalidating our 5-pleiotropy model. By identifying pleiotropic effects of genetic variants, which we think can be much weaker than the effects identified by traditional GWASs, we hope to provide updated heritability estimates for complex traits and diseases and to contribute to improving fine-mapping.
The full code to produce the PleioMap, the PleioMap itself as well as the network of relationships between traits and diseases will be made publicly available as a resource to the scientific community.
The PleioMap project is ambitious and challenging, but we strongly believe that this field of research is of interest to the human genetics community and will thrive in the coming years. We expect PleioMap to open new avenues for additional applications, such as the prediction of drug side effects or of off-target effects in genome editing, and to provide insights into new biological mechanisms behind the shared etiology of traits and diseases. Ultimately, we hope that PleioMap and the study of pleiotropy will enable the development of novel preventive and therapeutic strategies and of personalized medicine applications.
Pharmacogenomics
Genetics is increasingly used to inform the search for therapeutic targets in several ways.
On the one hand, a better understanding of the genetic architecture of complex diseases in general, including the study of pleiotropy, makes it possible to predict the behavior of therapeutic targets, for example in terms of side effects. Thus, during my second postdoctoral fellowship at Mount Sinai in New York, I developed a new statistical method to quantify horizontal pleiotropy at the scale of a genetic variant, in collaboration with Drs. Do and Jordan. We showed that horizontal pleiotropy is ubiquitous and closely related to polygenicity, or even omnigenicity, in complex traits and diseases. On the other hand, genetics can be used to predict the effectiveness and side effects of potential treatments. With Drs. Do, Duffy and Dobbyn, we developed a new model for predicting side effects in clinical trials based on the integration of electronic medical records and transcriptomic data. Finally, pharmacogenomics is in some ways the reverse of patient stratification, as it involves treating all patients similarly and identifying those who respond similarly, and in particular those who respond best, to treatment. The surgicogenomics study conducted during my first postdoctoral fellowship at the European Diabetes Institute is a concrete example of this.
Clustering
Clustering methods combined with multidimensional exploratory methods are valuable for studying high-dimensional data: they make it possible to combine a large amount of information and to classify elements (patients, genes) into homogeneous groups that are easier to interpret.
During my PhD at Agrocampus Ouest, I specialized in the study of transcriptomic data and in particular in the development of exploratory multidimensional analysis methods.
In collaboration with Drs. Julie Josse and François Husson, we developed a regularized version of principal component analysis that can be used for denoising data. Moreover, with Drs. Jérôme Pagès and Sébastien Lê, we developed a clustering algorithm for transcriptomic data with integrated gene ontology annotations. In addition, in collaboration with the team of Prof. Bruce Gelb (pediatric cardiologist) at Mount Sinai in New York, I developed a new stratification algorithm for the sub-phenotyping of patients with congenital heart disease.
Deep Learning
In human genetics, and especially in the study of pleiotropy, the major issue is obtaining labeled data, since the ground truth is unknown. However, semi-supervised learning strategies have already been applied in such settings, with a high gain in classification performance. We have therefore developed a strategy to partially label genetic variants for pleiotropy. Building on it, we will explore a well-established machine learning method, namely the **Convolutional Neural Network** (CNN), commonly applied to analyze images. Receptive fields will contain genomic windows instead of windows of pixels. In a CNN architecture, the receptive fields overlap with each other and a kernel is convolved with the data: this is analogous to the sliding window approach, a traditional method in genetics, with genomic intervals "sliding" across the genome. Furthermore, the block architecture of CNNs is comparable to LD blocks (the dependence structure between alleles), one of the major obstacles to mapping pleiotropy.
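The analogy between a convolution and a sliding window can be sketched in a few lines. The example below is only an illustration, not the group's model: it slides a fixed kernel along a list of per-variant scores ordered by genomic position, which is what a single one-dimensional CNN filter does before any learning takes place. The function names, the scores and the kernel are all invented for the example.

```python
from typing import List, Sequence


def sliding_windows(values: Sequence[float],
                    window_size: int,
                    stride: int = 1) -> List[List[float]]:
    """Return overlapping windows of consecutive values.

    With stride < window_size the windows overlap, like the receptive
    fields of a convolutional layer.
    """
    last_start = len(values) - window_size
    return [list(values[start:start + window_size])
            for start in range(0, last_start + 1, stride)]


def convolve_1d(values: Sequence[float],
                kernel: Sequence[float],
                stride: int = 1) -> List[float]:
    """Apply one filter along the sequence, one window at a time.

    As in deep learning libraries, the kernel is not flipped, so this is
    strictly speaking a cross-correlation.
    """
    return [sum(v * w for v, w in zip(window, kernel))
            for window in sliding_windows(values, len(kernel), stride)]


# Made-up per-variant scores along a stretch of genome (one value per variant).
variant_scores = [0.1, 0.0, 0.2, 2.5, 3.1, 2.8, 0.3, 0.1]

# A simple averaging kernel over 3 neighbouring variants.
smoothing_kernel = [1 / 3, 1 / 3, 1 / 3]

window_signal = convolve_1d(variant_scores, smoothing_kernel)
peak_window = max(range(len(window_signal)), key=window_signal.__getitem__)
print(f"Number of windows: {len(window_signal)}")
print(f"Window with the strongest signal starts at variant {peak_window}")
```

In a real CNN the kernel weights are learned from the labeled variants rather than fixed by hand, and many filters are stacked in successive layers.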
This project has been funded by the Data Intelligence Institute of Paris.
Dr Marie Verbanck
Assistant Professor in genetics machine learning
Since September 2019, I have been an Assistant Professor (Maître de Conférences) in statistics at Université de Paris.
From January 2016 to August 2019, I was a postdoctoral fellow in Ron Do’s lab at the Icahn School of Medicine at Mount Sinai, in New York, USA.
Dr Asma Nouira
Postdoctoral fellow in machine learning
In 2017, I obtained my engineering diploma in embedded systems from the National Engineering School of Sousse (ENISo), Tunisia. I then obtained my master's degree in Intelligent Communicating Systems in 2018. I did an internship at the East Paris Institute of Chemistry and Materials Science (ICMPE) in Paris, where I worked on machine learning techniques applied to chemistry to discover novel chemical compounds for hydrogen storage. This project was supervised by Jean-Claude Crivello and Nataliya Sokolovoska. In January 2019, I started my PhD project at CBIO, a team directed by Thomas Walter. I am mostly interested in machine learning and deep learning applications. At CBIO, I aim to solve biological problems using machine learning techniques, in particular to find associations between the human genome and cancer.
Martin Tournaire
PhD student in statistical genetics
Martin is currently a Ph.D. student in the BioSTM lab. He completed his Master's degree in computational biology at the Engineering Faculty of Life Sciences (ENSAT) in Toulouse in 2022. His Master's internship focused on labeling pleiotropic genetic variants in GWAS data in order to create a deep learning classification algorithm. He is pursuing this work during his Ph.D., mentored by Prof. Rozenholc and Dr. Verbanck.
Treudsky Antoine
Graduate student in machine learning
Treudsky completed his Bachelor's degree in Applied Economics, with a focus on statistics. He then pursued a first-year Master's degree in quantitative methods and econometrics for health research. He is currently completing his second-year Master's degree in Statistics, Modelling, and Data Science in Health. With this academic background, Treudsky is highly motivated to become a health data scientist and is eager to apply his knowledge and skills to health research.