Dr. Cole Trapnell: 'One of the things we aspire to do is to build predictive models of how complex biological systems, like whole mouse embryos, will behave.'
Editor’s Note: On December 7, 2023, UW Medicine, the Allen Institute, and the Chan Zuckerberg Initiative announced the launch of the Seattle Hub for Synthetic Biology. Among the UW Medicine researchers leading the project is Dr. Cole Trapnell, a member of BBI. Here are some of Trapnell’s initial observations about the project.
What is your role in the Seattle Hub for Synthetic Biology?
Sea-Hub has two main scientific initiatives. One centers on molecular recording technologies and using those technologies to understand areas of biological development and immunology. The second centers on using high-throughput, large-scale perturbation experiments, coupled with single-cell sequencing as a way of phenotyping and measuring what happens.
In the first initiative, we’re trying to understand how to install devices that allow us to record or even rewire how genomes work. And in the second, we are trying to perturb the genome and use single-cell molecular profiling to understand how the genome works, particularly in the context of development. As one of Sea-Hub’s scientific co-directors, I’ll be leading the second initiative. Dr. Marion Pepper is the other co-director; Dr. Jay Shendure is the overall scientific director. My focus is the genetic perturbation and single-cell analysis.
How will artificial intelligence be used?
AI will play a central role. One of the things we aspire to do is to build predictive models of how complex biological systems, like whole mouse embryos, will behave. Then, we subject those systems to different kinds of perturbations, for example by introducing disease-causing mutations, environmental stresses, or drugs. Right now, it’s hard to make forecasts of what biology is going to do.
There has been tremendous progress in machine learning over the past few years. AI systems are now able to make forecasts about things that were incredibly hard to do even a few years ago. But one thing is very clear: In order to work well, those AI systems need a mountain of training data. AI systems are taught how to make predictions by showing them examples. In something like ChatGPT, the training data is all publicly available text on the Internet. If you want to teach an AI system to make predictions about biology, you have to give it measurements of biological systems.
You have to say to the AI, “When I messed with this gene, this happened, and when I messed with this other gene, this happened. OK, now you know how that works. So, what happens when I mess with this third gene that you’ve never seen before?” We are going to be generating a mountain of data from experiments in the lab and then feeding that data into systems that we can use to make forecasts on what the embryo is going to do. We’re also going to be doing these experiments in multiple animal models and human stem-cell systems, and we plan to train AI systems to make predictions about human biology based on animal data. That will give us a better understanding of human mutations and diseases that are very difficult to study in the lab.
This is an extraordinary scientific undertaking – building new technologies to record the history of cells over time. Can this be accomplished in only five years?
When I first started my lab, about 10 years ago, we were at the outset of another daunting technical challenge: measuring what goes on inside all the different kinds of cells from the human body, or from animals used in disease research, like zebrafish.
At that time, we could sequence a handful of individual cells, say a few hundred, for thousands of dollars. And we could only do it in cells that we grew in dishes, or that were from very particular samples. Five years later, in 2019, Jay’s lab and mine – together with our collaborators – sequenced millions of cells from whole mouse embryos at a similar cost. So, in five years, the size and scale of single-cell experiments that were possible for a given budget went up exponentially. And the technology continues to become even more powerful. We can now sequence millions of cells from thousands of specimens in just one experiment. This is the kind of scale that will unlock AI and other advanced analysis tools to really understand how our genomes are wired.
The rate of technological progress has been incredible. Not just us at the UW, but everywhere. And yet, if you had told me things would go this quickly when I started my lab, I would not have believed you. At moments like this, when you have what seems to be an enormous technological task in front of you, it’s reassuring to look at the expansion of throughput in genome sequencing and the expansion of single-cell technology. Now, we are at a similar moment where we need to expand our capacity to record biological histories inside cells and write them into the genome. We need to perturb the genome in myriad ways to see how those perturbations impact how it works (or doesn’t work). And we need to distill that vast amount of data down into knowledge – into descriptions that people can understand of how cells and genomes and embryos work. We are going to need technological leaps like what we’ve seen in single-cell sequencing to achieve what we want to achieve at the Sea-Hub. But I think past will be prologue.