In-depth profiling of single cells
21.07.2022 / Single-cell analyses provide a wealth of molecular and genetic information. A team led by MDC researcher Uwe Ohler is using machine learning to combine these data and produce meaningful profiles of the cells. The Chan Zuckerberg Initiative is now funding the project.
Despite its minuscule size, a cell is an incredibly complex thing. From the vast libraries of DNA in its nucleus, a cell can read the exact genetic information it needs in a given moment, then transcribe it into RNA and ultimately translate it into a vast array of different proteins. Various technologies exist today that can characterize the properties and status of cells in unprecedented detail. These allow scientists to do things like read DNA and the corresponding genetic switches, analyze RNA, and identify the different proteins and their forms. The problem, though, is that this produces huge volumes of data that describe entirely different aspects of the cell. And even if it’s actually the same information from the same type of cell, the data still vary depending on which lab produced them, at what time, and with which technology.
A model cell
The Computational Regulatory Genomics Lab led by Professor Uwe Ohler at the Berlin Institute for Medical Systems Biology (BIMSB), which is part of the Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), is seeking to solve this problem. The team is working on computer-aided methods that automatically combine the data in a model of the cell and analyze it, without the need to process the data first. To achieve this, the researchers are using machine learning, which is a field of artificial intelligence. The work involved in developing the data processing and analysis is now being funded by the Chan Zuckerberg Initiative (CZI). CZI was set up in 2015 by Facebook founder Mark Zuckerberg and Priscilla Chan. It funds projects in science and research, education, social justice, and inclusion.
Combining the different data like this should make it possible to build detail-rich molecular profiles of individual cells and cell types. The profiles should then help to answer specific questions, such as: Which properties characterize a specific cell type? Is the cell healthy or sick? How can I infer the number of certain surface proteins from the genes that are read (gene expression)? Which regulatory segments of DNA are involved in hereditary diseases?
When it comes to characterizing cell types, the group already has a few answers up its sleeve. For instance, Pia Rautenstrauch, a researcher in Ohler’s lab, combined three different types of data on gene regulation in a way that allowed her to filter out the biologically relevant data from the cells. Her deep-learning model ignored the noise (the differences that are caused solely by things like the measurement technology and that make it hard to interpret the data). The model earned Rautenstrauch success at the NeurIPS competition in winter 2021. In another project, scientists working with Ohler used self-learning algorithms to categorize data sets from genetic switches in zebrafish. They wanted to find out which switches are active in which cell type.
An ecosystem for data analysis
“CZI is hugely interested in these methods of automated data integration. They fund a large network of biomedical laboratories,” says Ohler. Programs like the ones underway in Ohler’s lab should connect the varied data from the different research groups and make it possible to analyze them together. The ultimate aim is to create “a large ecosystem for analysis platforms that is available to all interested parties,” says Ohler. He adds that the methods and principles being developed could also be transferred to completely different fields of research. For instance, they could be used to integrate satellite monitoring data from different wavelengths. Here, too, the information needs to be carefully connected in order to produce a meaningful overall picture.
Text: Janosch Deeg
Source: Press Release MDC