To learn how today’s proteins evolved, scientists are reconstructing ancient molecules.
The influenza virus is a quick-change artist. In a few decades, its genome can evolve as much as animal genomes can over millions of years. That means that the viral proteins, including those that alert our bodies to an infection, constantly reinvent themselves, threatening our immune systems and frustrating vaccine developers.
For Jesse Bloom, a biologist studying how evolution affects proteins, that relentless change is an opportunity. Thanks to data collected during past flu seasons, Bloom knows the exact genetic makeup of some ancestors of today’s influenza viruses. His lab group at the Fred Hutchinson Cancer Research Center in Seattle uses that information to figure out how the viruses made their immunity-dodging transformations.
Bloom and others are part of a growing group of scientists who practice “evolutionary biochemistry.” They seek to explain life’s tremendous diversity and determine exactly how that diversity emerged. Rather than focusing on how plants or animals adapted to different environments, however, these researchers consider diversity on a much smaller scale: Their work aims to explain how the small set of proteins that powered primitive life-forms evolved into the millions of specialized proteins that drive biological processes today.
Exploiting the genetic records, Bloom can assemble virus proteins that existed in bygone times, then reconstruct how they evolved, one amino acid at a time. Other researchers are analyzing modern species to resurrect the ancestral forms of biological molecules that have evolved over millions of years.
With a historical protein in hand, researchers can test how swapping out a single amino acid — as evolution might have done — changes how the protein flexes or folds and connects (or doesn’t) with other molecules. By trying out alternate versions of a protein’s history through stepwise amino acid changes, scientists can learn how a protein’s physical form has both enabled and constrained its evolution.
Ultimately, this work might answer some long-standing questions: To what extent does evolution depend on chance events? Can evolution reach the same point by traveling different paths? How does biological complexity evolve? Such experiments are also helping researchers who study modern proteins sort out how the order of amino acids relates to biological function.
Form is function
That ordered series of amino acids is spelled out by the gene that holds the blueprint for a protein. Once the proper amino acids are strung together, they origami-fold into tiny structures with nooks and protrusions that determine what the protein does inside a cell. A protein’s folded shape lets it grab on to specific bits of DNA or hasten certain chemical reactions. Mutations in a gene can shift the resulting protein’s shape or alter subtle aspects of its behavior so that, over time, a protein’s function can change. But the possibilities are not endless. New proteins that fall apart, fail to fold or don’t perform as needed don’t survive the tests of natural selection.
“The physical determinants of folding, stability, solubility, function and specificity are absolutely essential aspects of the evolutionary process,” says University of Chicago biologist Joe Thornton. “That has not been widely appreciated or explicitly addressed until pretty recently.” Now, Thornton says, it’s clear that to understand molecular evolution, it’s important to study proteins as functioning, physical objects.
As they reconstruct proteins’ pasts, researchers are finding that genetic mutations sometimes remodel a molecule just enough to give a chance to other mutations that would have failed earlier. That creates opportunities for new features and functions to evolve — an idea that biologists have considered for decades but have only just begun to explore in the lab.
Bloom and colleagues, for instance, used an influenza virus protein called nucleoprotein to examine how interactions among mutations have affected the overall evolution of the virus. Understanding the combined effects of several mutations could allow researchers to anticipate the short-term effects of new genetic variation. That knowledge could help improve forecasts of which viral strains are likely to circulate in upcoming flu seasons, important information for designing effective vaccines.
Comparing nucleoprotein genes from strains of the virus isolated in 1968 and 2007, Bloom’s team mapped out the most likely steps by which the 1968 protein morphed into its newer form. Though nucleoprotein still plays the same role that it did in 1968 — aiding in the assembly of viral RNA — 33 of its 498 amino acids changed over those four decades, and a few changed more than once, the researchers reported in 2013 in eLife.
Bloom’s team built the 1968 nucleoprotein, then tested the effects of introducing each historical mutation. Some of the mutations affected parts of the protein that tip off a person’s immune cells that an invader is present — they probably helped the flu virus avoid detection. But on their own, some of those changes were bad for the virus: The nucleoprotein could no longer stay properly folded long enough to do its job.
During the course of the nucleoprotein’s evolution, some mutations boosted the protein’s stability, giving it a bit of a buffer. When later mutations occurred, allowing the virus to buck immune recognition, these earlier changes probably held the structure stable so the protein could still function.
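The buffering idea can be illustrated with a toy stability-threshold model. The sketch below is not the analysis from the nucleoprotein study. It assumes that each mutation shifts a protein's stability by a fixed amount, that those shifts simply add up, and that the protein works only while its total stability margin stays at or above zero. The mutation labels and all numbers are invented for illustration.

```python
# Toy stability-threshold model (hypothetical numbers, not data from the study).
# Assumption: stability effects add up, and the protein functions only while
# its total stability margin stays at or above zero.

START_MARGIN = 1.0  # hypothetical stability margin of the ancestral protein

# Hypothetical effects on the margin: positive stabilizes, negative destabilizes.
EFFECTS = {
    "stabilizing": 2.0,     # an early mutation that adds a stability buffer
    "immune_escape": -2.5,  # a later mutation that evades immunity but destabilizes
}


def is_functional(mutations):
    """Return True if the protein keeps a non-negative stability margin."""
    margin = START_MARGIN + sum(EFFECTS[m] for m in mutations)
    return margin >= 0.0


escape_alone = is_functional(["immune_escape"])
escape_after_buffer = is_functional(["stabilizing", "immune_escape"])

print("escape mutation alone tolerated:", escape_alone)
print("escape mutation after stabilizing mutation tolerated:", escape_after_buffer)
```

In this toy setup the immune-escape change is lethal on its own but tolerated once the stabilizing change is in place, which is the pattern the researchers describe.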
When a mutation’s effects depend on other mutations, this interplay is called epistasis. These interactions within individual molecules have been important in shaping evolutionary trajectories, says University of Oregon biophysicist Michael Harms, who is studying how diverse functions evolved in a group of proteins called S100s. He calls epistasis “the common feature in all of evolution.”
Codependent interactions don’t occur just between pairs of mutations. They can be significantly more complex. Analyzing data from other labs, Harms has found epistatic interactions involving up to six different mutations. Such interplay means that in many cases, if genes had transformed themselves just a bit differently, evolution would have veered onto a different course.
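One common way to put a number on pairwise epistasis is to compare a double mutant's measured effect with what would be expected if the two mutations acted independently. The sketch below uses a simple additive convention on made-up fitness values. The function name and the numbers are assumptions for illustration, and real studies often work on a log scale or include higher-order terms.

```python
def pairwise_epistasis(f_wt, f_a, f_b, f_ab):
    """Observed double-mutant value minus the additive expectation.

    A result of zero means the two mutations act independently under this
    additive model; any other value indicates epistasis.
    """
    expected_ab = f_wt + (f_a - f_wt) + (f_b - f_wt)
    return f_ab - expected_ab


# Hypothetical fitness values (wild type scaled to 1.0).
no_interaction = pairwise_epistasis(f_wt=1.0, f_a=1.5, f_b=0.5, f_ab=1.0)
rescue = pairwise_epistasis(f_wt=1.0, f_a=1.5, f_b=0.5, f_ab=1.5)

print("independent mutations:", no_interaction)
print("mutation A rescues mutation B:", rescue)
```

A positive value in the second case means the double mutant does better than the single-mutant effects predict, as when one change makes another tolerable.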
Thornton uses ancestral protein reconstruction to study how steroid hormones (which control stress responses, growth and sexual development in vertebrates) evolved partnerships with their receptors. Receptors are proteins that bind to specific partners to activate responses in the cell. By comparing steroid receptors in different species, Thornton can map the evolutionary relationships between the molecules and infer the likely amino acid sequence of their common ancestor. Then he introduces a DNA molecule that encodes the long-extinct protein into lab-grown cells. Those cells use the genetic instructions to manufacture a tiny piece of the deep past.
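The inference step can be sketched with a much simpler stand-in than the statistical methods used in practice. The example below applies Fitch parsimony, a classic rule of thumb and not the approach the article attributes to any lab, to a made-up four-species tree and a made-up three-residue alignment. The species names, tree shape and sequences are all hypothetical.

```python
# Toy ancestral reconstruction with Fitch parsimony (a simplified stand-in;
# the tree, species names and sequences below are invented for illustration).

def root_states(tree, column):
    """Bottom-up Fitch pass for one alignment column.

    `tree` is either a leaf name (str) or a pair of subtrees.
    Returns the set of most-parsimonious amino acids at the root of `tree`.
    """
    if isinstance(tree, str):
        return {column[tree]}
    left, right = tree
    a = root_states(left, column)
    b = root_states(right, column)
    shared = a & b
    # Keep shared states if any exist; otherwise keep every candidate.
    return shared if shared else a | b


def reconstruct_root(tree, alignment):
    """Apply the Fitch pass to each column of an aligned set of sequences."""
    length = len(next(iter(alignment.values())))
    result = []
    for i in range(length):
        column = {name: seq[i] for name, seq in alignment.items()}
        result.append(root_states(tree, column))
    return result


TREE = (("species_A", "species_B"), ("species_C", "species_D"))
ALIGNMENT = {
    "species_A": "MLS",
    "species_B": "MLT",
    "species_C": "MQS",
    "species_D": "MQS",
}

ancestor = reconstruct_root(TREE, ALIGNMENT)
for position, states in enumerate(ancestor, start=1):
    print(position, "/".join(sorted(states)))
```

Positions where more than one letter is printed are ambiguous under parsimony. Likelihood-based methods handle such cases by assigning probabilities to each candidate amino acid.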
Many of Thornton’s studies begin with a 450-million-year-old receptor protein that he and colleagues reconstructed in 2006. The protein gave rise to modern receptor molecules that are activated by different hormones. One receptor, the glucocorticoid receptor, responds to the stress hormone cortisol. The other, the mineralocorticoid receptor, controls levels of salt and other electrolytes in response to the hormone aldosterone. Thornton’s team found that their reconstructed ancestor could be activated by both cortisol and aldosterone.
A receptor that responded only to cortisol appeared 40 million years after the promiscuous receptor, Thornton showed. His team found a set of amino acid changes that converted the general ancestral receptor into the cortisol-specific one. But the mutations that changed the ancient receptor’s preference couldn’t have generated a functional receptor by themselves, experiments showed.
“The function-switching mutations are not tolerated on their own,” Thornton says. They destabilize parts of the receptor. Like the flu virus’s evolving nucleoprotein, the ancestral receptor’s structure had to be buttressed before it could withstand the mutations that would make the receptor choosier.
Two amino acid changes quietly readied the ancient receptor for its transformation, Thornton and colleagues reported in 2009 in Nature. Without them, the path to the function-switching mutation would have been inaccessible. “If we were to wind back the clock and set history rolling again, it’s very unlikely that those permissive mutations would occur,” he says. “We would have ended up with a very different glucocorticoid receptor and a very different endocrine system.”
Thornton and Harms, then a postdoctoral researcher in Thornton’s lab at the University of Oregon in Eugene, explored whether evolution could have taken an alternate route to the same end. Harms created and screened thousands of variants of the ancestral protein, searching for alternative mutations that might have set it up for the same functional switch. He found none, the researchers reported in Nature in 2014. Evolution, it seems, had acted on a rare opportunity.
Biophysical analyses of variant receptor proteins showed why so few mutations enabled cortisol-specific binding to evolve. Although certain parts need extra support, the receptor also needs to be able to transition between two forms: an inactive conformation when no cortisol is present, and a gene-activating conformation when the hormone binds. Some mutations stabilize the active form of the receptor too much, locking it into an “always-on” configuration. Mutations also had to be compatible with the ancestral protein on their own, before the function-switching mutations were introduced.
“A mutation has to fulfill all these requirements, and that is not easy to do,” Thornton says. “That seems to be the explanation for why permissive mutations [for this functional switch] are so rare.”
But not every new function is the result of complicated epistatic interactions. In January in eLife, Thornton and Ken Prehoda of the University of Oregon described an ancient protein that gained a completely new function by way of a single amino acid change.
The team studied the origins of an animal protein that helps cells orient themselves in space before dividing. Doing so is vital for positioning new cells in the right places within a growing body. Single-celled life-forms had to get this right before multicellular organisms could evolve.
Thornton, Prehoda and colleagues focused on a segment of the protein called GKPID (for GK protein-interaction domain), which orients cells by acting as a scaffold during division. The billion-year-old ancestor of GKPID did nothing of the sort. It was an enzyme predecessor to the modern guanylate kinase, which catalyzes a chemical reaction that cells use to make some of the building blocks of DNA. Amazingly, Thornton says, one mutation was enough to transform the ancestral protein from an enzyme to a working scaffold.
That surprising result is an example of why developing general theories about the physical principles shaping evolution requires a grasp of the evolutionary histories of a broader collection of proteins.
“Every time people take [a protein] apart, they see a new feature,” Harms says. Fortunately, he says, thanks to faster computers, better software and a growing number of genomes to reference, research on ancestral protein reconstruction is on the rise.
While chance events can shift the landscape of evolution’s possibilities, evolving proteins also have some freedom to explore. They can take more than one path to some functions.
Douglas Theobald, a biochemist at Brandeis University in Waltham, Mass., has seen this in his own investigations of an enzyme that many cells use to produce energy without oxygen. The enzyme, lactate dehydrogenase, evolved from structurally similar enzymes not just once, but at least four times in different groups of organisms. By reconstructing the evolutionary events that transformed a similar enzyme, malate dehydrogenase, into lactate dehydrogenase, Theobald and colleagues found that two groups of single-celled parasites came by the same enzyme in different ways. The researchers reported the findings in eLife in 2014 and in Protein Science in February.
The work demonstrates that different genetic backgrounds may steer evolution along different paths in different organisms but still lead to similar outcomes, Theobald says. “Even if there is a lot of epistasis, there’s still lots of different ways to the same function.”
Biochemist Susan Marqusee of the University of California, Berkeley has also found that there’s more than one way for a protein to do something new.
Marqusee collaborated with Thornton’s team to compare how two bacteria, Escherichia coli and the heat-loving Thermus thermophilus, evolved enzymes that do the same job at very different temperatures. T. thermophilus thrives in hot springs, at temperatures that would cause most proteins to fall apart. Biochemists are eager to borrow from nature’s strategies to engineer heat-tolerant proteins but have struggled to find general principles that account for this property. By reconstructing the common ancestor of the RNA-snipping enzyme known as ribonuclease H1 from E. coli and T. thermophilus, Marqusee’s team found out how the bacterial protein takes the heat.
That 3-billion-year-old common ancestor was less stable than the enzyme that T. thermophilus uses today, the team reported in 2014 in PLOS Biology. As the heat-tolerant protein evolved, its stability steadily increased — not because of any one innovation, but by virtue of distinct biophysical strategies at different points in time.
“The physical chemistry doesn’t really matter as long as in the end, they add up to the right phenotype,” Marqusee says. Because evolution was able to take advantage of different amino acids to boost stability in a variety of ways, the enzyme’s growing resilience to hot environments didn’t depend on the chance occurrence of a particular set of mutations.
Studies of how proteins have evolved in the past are unlikely to spell out how evolution will proceed in the future. “The emerging picture is that the role of chance is so great that long-term predictions of the future evolution of any protein is a very risky enterprise,” Thornton says. But recent research does offer insights into how and why today’s proteins do what they do.
One example comes from Thornton’s work on how the DNA-binding sites on steroid receptors have evolved along with their DNA targets. The hormone-activated receptors act as transcription factors, binding to specific sections of DNA to switch on certain genes. In 2014, Thornton’s team reported in Cell that a bulky amino acid in an ancestral protein prevented the protein from binding to the stretch of DNA favored by many of today’s steroid receptors. The ancestral protein awkwardly bumped up against the DNA, unable to make enough contact to really grab on. The receptor gained its new specificity when mutations removed those obstacles and introduced new clashes that blocked its access to the former binding site.
Researchers often can’t tell which differences between two related proteins make them behave differently. But reconstructing evolutionary paths can point them in the right direction.
Using ancestral reconstruction, Theobald and Brandeis colleague Dorothee Kern studied how Abl, a growth-promoting protein linked to chronic myelogenous leukemia, diverged from the related Src protein. The researchers wanted to know why the anticancer drug Gleevec binds to and shuts off Abl without obstructing Src, even though Src has a very similar structure. Theobald, Kern and colleagues identified 15 amino acids in Abl that are crucial for Gleevec binding. The amino acids influence how the protein transitions between two different configurations (that shape-shifting is disrupted in some patients with Gleevec-resistant cancers). The finding, published last year in Science, suggests that researchers may be able to develop better drugs by considering these conformational shifts.
Some proteins, or parts of proteins, might even be inherently more able to evolve than others. Certain parts of the fast-evolving viral protein hemagglutinin are unusually tolerant of change, Bloom and Bargavi Thyagarajan, who was a postdoctoral researcher in Bloom’s lab, reported in 2014 in eLife. Antibodies against hemagglutinin are the immune system’s best defense against influenza, but the protein is adept at escaping detection.
The researchers used a relatively new method called deep mutational scanning to build and test hemagglutinin proteins with nearly every possible amino acid change — about 10,000 in all — in viruses grown in the lab. In a host, changes that disguise hemagglutinin from the immune system would be advantageous. Even though there was no immune system to hide from in the lab, viruses still survived more changes to parts of hemagglutinin that would be recognized by an immune system than they did changes to other parts of the protein. Bloom and his graduate student Michael Doud reported a more detailed view of the protein and the areas that are more and less likely to tolerate mutations online on bioRxiv.org in April.
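The "about 10,000" figure follows from simple counting: each position in a protein can be changed to any of the 19 other standard amino acids. The sketch below enumerates every single-amino-acid variant of a sequence. The three-letter peptide is a made-up example, and the hemagglutinin length of 565 residues is an assumption used only to check the order of magnitude, not a figure from the study.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # one-letter codes for the 20 standard amino acids


def single_mutants(sequence):
    """Yield (position, original, replacement, mutant) for every single change.

    Positions are 1-based, as is conventional for protein sequences.
    """
    for i, original in enumerate(sequence):
        for replacement in AMINO_ACIDS:
            if replacement != original:
                mutant = sequence[:i] + replacement + sequence[i + 1:]
                yield i + 1, original, replacement, mutant


toy_peptide = "MKT"  # hypothetical peptide, for illustration only
variants = list(single_mutants(toy_peptide))
print(len(variants))  # 3 positions x 19 alternatives = 57

ASSUMED_HA_LENGTH = 565  # assumed protein length, for a rough count only
print(ASSUMED_HA_LENGTH * 19)  # 10735 possible single changes
```

An experimental scan builds and tests such variants in bulk; the code only shows where the size of the variant library comes from.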
[Figure caption: Most vaccines against influenza virus aim at segments on the head of the viral protein hemagglutinin. The protein’s head, recent research finds, is more likely to take on mutations. In contrast, the stalk of the protein is less amenable to mutations and therefore may be a better target for vaccine development.]
That’s good for the virus, but bad for people. Hemagglutinin seems capable of accumulating change in the very sites that vaccine developers would like to remain the same. But the finding also suggests that flu vaccines designed to target less mutation-tolerant regions of hemagglutinin might be more likely to protect against the flu from season to season. That’s a strategy some labs are already exploring — targeting the less-evolvable stalk of hemagglutinin’s lollipop-shaped structure.
It’s not yet clear why certain parts of the hemagglutinin protein tolerate change so well; Bloom hopes that studying the mutational tolerance of other proteins will help researchers figure that out.
“We’re never going to be able to predict evolution precisely, because it’s a highly stochastic process,” Bloom says. “But I think we can make better forecasts about many of the evolutionary processes that affect us. These are really challenging problems, but I think we are getting to the point where we can use experiments and molecular understanding to help us think about these processes.”