Early detection can sometimes help limit the serious effects of disease. A new method aims to support it by using statistical machine learning to sort through mountains of complex biological data.
The method, called SLIDE, first integrates several complex biological datasets and then pulls out unique factors to produce results that are easy to understand.
It may transform how we think about multi-omics data. Such large and varied datasets can yield detailed information on genetics, metabolism and more, according to Cornell researchers and a Cornell Ph.D. at the University of Pittsburgh.
The study, titled “SLIDE: Significant Latent Factor Interaction Discovery and Exploration across Biological Domains,” is published in the journal Nature Methods. Co-author Florentina Bunea said, “I love it because it is interpretable… Essentially, we can find interpretable hidden mechanisms from measurable biological input.”
Florentina Bunea is a professor of statistics and data science in the Cornell Ann S. Bowers College of Computing and Information Science.
The study builds on theoretical work by co-authors including Bunea; Marten Wegkamp, professor of statistics and data science in Cornell Bowers CIS, and of mathematics in the College of Arts and Sciences; and Xin Bing, Ph.D., a former Cornell doctoral student in the field of statistics who is now at the University of Toronto.
Bunea said that SLIDE offers both confirmation and discovery: it can corroborate previous findings and point to unknown mechanisms.
The Cornell theoreticians partnered with Jishnu Das, Ph.D., assistant professor of immunology at the University of Pittsburgh, to develop the application.
SLIDE is an advance over previous methods: it can take multi-omics data profiles from samples and predict whether those samples come from healthy or diseased organisms.