Bioinformaticist Steven Salzberg, Ph.D., recently joined—or, more accurately, rejoined—the Johns Hopkins faculty. From 1989 to 1997, Salzberg was a member of the Department of Computer Science. In recent years, he’s directed the Center for Bioinformatics and Computational Biology at the University of Maryland. Salzberg is now a member of the Institute of Genetic Medicine, with faculty posts in the Departments of Medicine and Biostatistics.
The term bioinformatics probably did not exist a few decades ago. What do you do as a bioinformaticist, and how did you arrive at this profession? Did you begin as a computer scientist and then gravitate toward biology or vice versa?
SALZBERG: As a bioinformaticist, I focus on developing new technologies to analyze DNA sequences in different ways. I started out as an English literature major at Yale and got interested in computer science when I was a junior. I got a job programming after graduation and returned to graduate school in computer science, first at Yale, then later at Harvard, where I finished my Ph.D.
When the Human Genome Project was announced while I was in grad school, I thought immediately that it sounded like the most exciting scientific project around, and I wanted to be a part of it. I started teaching myself about genomes while finishing my Ph.D., and I sat in on Stephen Jay Gould’s introduction to evolution course. Later, as an assistant professor of Computer Science at Hopkins, I audited the undergraduate course in genetics, sitting in the back of a large lecture hall and just enjoying the science.
In the early 1990s, the first DNA sequences started to become publicly available, and I saw the chance to use my computer science skills to help interpret the genome. I published my first bioinformatics paper in 1992 and really dove in with both feet a couple of years later.
What aspect of the field drew you back to Hopkins?
SALZBERG: What excites me about Hopkins are the opportunities to work on the human genome and human genetics. Working on genetic diseases requires close collaborations with scientists who are experts on those diseases and who are working directly with patients and human tissue samples. Hopkins is one of the world’s top places for this kind of work.
From a bioinformatics perspective, what’s involved in finding genetic clues to disease?
SALZBERG: We now have new tools available—large-scale DNA sequencing—that make it possible to sequence the entire genomes of people with a particular disease. (You can sequence a person’s genome for $5,000. That isn't cheap, but it's 10,000 times cheaper than it was 10 years ago.)
One of our goals is to find mutations in the sequence data that could be clues to disease.
So today, if I want to study heart disease or diabetes or another disease, it is feasible to sequence the genomes of 100, 200 or more people with that condition. Then I can look for places where each person’s DNA sequence differs from a normal genome and I can compare the variants in the whole group to look for ones that are shared. Those common variations might be genetic differences that cause or contribute to the disease, and they provide targets for further research.
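As a rough illustration of the cohort comparison described above, the following Python sketch counts how many people carry each variant and keeps those that are widely shared. It is a toy example under stated assumptions, not the method used by Salzberg's group: the function name `shared_variants`, the `min_fraction` cutoff, the tuple layout for a variant, and the sample data are all hypothetical.

```python
from collections import Counter


def shared_variants(cohort, min_fraction=0.8):
    """Return variants carried by at least min_fraction of the cohort.

    cohort maps a person ID to the set of places where that person's DNA
    differs from the reference genome. Each variant is a hypothetical
    (chromosome, position, reference_base, observed_base) tuple.
    """
    if not cohort:
        return {}
    counts = Counter()
    for variants in cohort.values():
        # set() guards against a variant being listed twice for one person
        counts.update(set(variants))
    threshold = min_fraction * len(cohort)
    return {variant: n for variant, n in counts.items() if n >= threshold}


# Made-up data for three people; the coordinates are placeholders.
cohort = {
    "patient_1": {("chr1", 1000, "A", "G"), ("chr2", 500, "C", "T")},
    "patient_2": {("chr1", 1000, "A", "G"), ("chr3", 42, "G", "A")},
    "patient_3": {("chr1", 1000, "A", "G"), ("chr2", 500, "C", "T")},
}

# Variants present in every person in the cohort
print(shared_variants(cohort, min_fraction=1.0))
```

A variant shared by most of the group is only a candidate for follow-up. A real study would also need to compare against people without the condition, since many common variants are harmless.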
What about the vaunted promise of the Human Genome Project? Some scientists and journalists said that the HGP would lead researchers to learn the genes and genetic variations that predispose people to cancer and perhaps all diseases. Has the project lived up to its promise?
SALZBERG: I think the HGP has more than lived up to what I expected. There are incredible discoveries being made every day, and every time I look at the latest issues of Nature or Science I find something new and exciting about the human genome. But over the course of the HGP, some scientists made some pretty grand claims about it, and those claims have been slow to be realized. I think it's just a matter of time before the promises of individualized, genome-based medicine come true. Science moves slower than the public or Congress might want it to, but the science is actually moving ahead with remarkable speed.
What’s most challenging about using bioinformatics to find mutations that may be tied to disease?
SALZBERG: The deluge of information is overwhelming.
A human genome contains about 3 billion nucleotide base pairs. That’s three billion bits of sequence data. Each person’s genome differs from the reference genome in about three million places. Now multiply that times the 100 or more people enrolled in a study.
My group writes programs that compare the DNA sequence of one person with a reference genome and identify all the differences. We then use other programs to compare all of those differences among, say, 100 people enrolled in a study. If these programs aren't very, very efficient, then they will overwhelm even today's fastest multiprocessor computers. So we are constantly working on more efficient algorithms, ones that use less CPU time and less memory.
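To make the idea of comparing one person's sequence with a reference concrete, here is a deliberately simplified Python sketch. It assumes the two sequences are already aligned and of equal length, and it reports only single-base substitutions. Production tools must also align short reads, handle insertions and deletions, and cope with sequencing errors, so this should be read as an illustration rather than as the group's software. The function name `find_substitutions` and the sequences are hypothetical.

```python
def find_substitutions(reference, sample):
    """List single-base differences between two pre-aligned sequences.

    Returns (position, reference_base, sample_base) tuples, with positions
    counted from 1. Assumes equal-length, already-aligned input.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (i + 1, ref_base, sample_base)
        for i, (ref_base, sample_base) in enumerate(zip(reference, sample))
        if ref_base != sample_base
    ]


# Made-up eight-base sequences that differ at one position
print(find_substitutions("ACGTACGT", "ACGAACGT"))
```

Even this single pass touches every base once, which at 3 billion base pairs per genome and 100 or more genomes per study hints at why efficiency matters so much.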
Have scientists identified any disease genes using this approach?
SALZBERG: Yes, but only a handful so far. In one study, researchers at the University of Washington and Seattle Children’s Hospital discovered a gene for a rare condition called Miller syndrome using a method called exome sequencing. That was low-hanging fruit: Miller syndrome is caused by a single mutation in a single gene. Many of the diseases we’re trying to understand are complex—involving multiple genes. Our approach requires more patients in order to track down the genes responsible, and as the number of people gets larger, the analysis gets more complicated.
What other challenges lie ahead?
SALZBERG: Computing infrastructure is a big hurdle. We need many computer servers with more than 500 gigabytes of memory, computing grids with hundreds of CPUs, and file systems that can handle hundreds of terabytes. Moving files like that around, from one central storage location to your computer, could take hours or days. We also need expert IT personnel who can set up and manage these scientific computing resources, which are very different from desktop computers, and different, too, from large patient databases. I'm still learning what's available at JHU, but it seems that the campus needs to invest in more research computing infrastructure.
Switching to another issue, you’ve become a vocal public critic of gene patenting. How did you get involved in that issue?
SALZBERG: A colleague and I wrote a program last year that would allow anyone to search their genome for mutations in the BRCA genes, which are associated with breast and ovarian cancer. We did it as a proof of concept: If you had your own genome sequence, you could run our app and see if you had any of 70 mutations that have been tied to breast cancer. We made our software freely available online, and we chose the BRCA1 and 2 genes because they are patented. (A company called Myriad Genetics holds the patents.) We were trying to illustrate the point that now that it’s become so much less expensive to sequence a genome, people will want to search their own genomes for disease mutations, and the notion that they can't do that without paying licensing fees is absurd.
Why are you opposed to gene patenting?
SALZBERG: Science is all about making discoveries and publishing them, sharing them, and using the information others have found. Science moves much faster when everything is open. Patenting genes slows progress.
Patent lawyers argue the opposite, of course. But we don’t always see how patents slow down science, because scientists simply avoid studying genes that are patented. More than 4,000 genes are now patented—about 20 percent of the human genome. These patents are a major disincentive to scientists: If I want to study a patented gene, I need to pay a licensing fee and deal with lawyers. It's much more productive for me to simply focus on the 80 percent of human genes that aren't patented.
The other problem is that patents were created to protect inventions that people come up with. Genes aren’t inventions; they're products of nature. So it’s just nonsense to say they are inventions. The U.S. Patent Office made a big mistake when they first started allowing gene patents, decades ago, and we need to undo that.
How did Myriad respond to the publication of your test?
SALZBERG: I never heard directly from anyone associated with the patent. In the meantime, a federal judge issued a ruling saying that Myriad’s patents on the BRCA genes were invalid. Myriad is now fighting to overturn that decision. If they can't win that battle, the game is over for them, so I think they have bigger fish to fry. I hope the courts will uphold the initial decision, and gene patents will be on their way out.
In addition to contributing to the debate on gene patenting, you’ve blogged and written critically about a number of other biomedical issues. In particular, you write a column for Forbes that often assails pseudoscience. With the many demands on your schedule, why do you add on this other commitment to your time? You’re not required to write a column.
SALZBERG: I started blogging just for fun, but then my blog started gaining followers and I realized I could reach a different, and larger, audience than I do with my scientific papers. The Forbes column reaches an even larger audience than my original blog (I have two blog sites now), and most of them are nonscientists. I think that part of my role as a professor is to educate the public (not only my students) as much as I can. I also enjoy writing about science from a more popular standpoint.
The reason I've focused on pseudoscience is simple. I discovered a few years ago, when working on the influenza virus, that many people were reluctant to get vaccinated or to vaccinate their children because of fears that vaccines would cause autism. I was startled to discover this (no reasonable scientists believed it), and when I investigated I found that although the fears were completely unfounded, they were being encouraged by a highly vocal group of activists, including some very bad scientists as well as some people who just saw an opportunity to make money. I realized that these anti-scientists could undermine decades’ worth of research by real scientists trying to cure disease and improve public health.
From that topic I've expanded my focus to look at other questionable scientific claims, particularly those that affect public health. It's amazing how much pseudoscience is out there, and how many people—some of them with M.D.s and Ph.D.s—are eager to sell so-called cures that don't actually cure anything. It's important for scientists to speak out when we can, especially when we see the public being misinformed, and that's what I am trying to do with my blogs.
--Interviewed by Melissa Hendricks