Dome - The Power and Promise of Big Data

Dome June 2014

Date: June 5, 2014

More information, faster computing add up to better health care.

Illustration by Phil Wrigglesworth

How does a doctor know when to recommend a prostate-specific antigen (PSA) screening? Because the cancer detection test can yield false positives and spur unnecessary procedures, screening can be harmful to men who are likely to die of another cause before a malignancy becomes fatal.

The problem is that doctors lack access to information that could help them decide whether to recommend the screening, says H. Ballentine Carter, director of the Division of Adult Urology in the Brady Urological Institute of The Johns Hopkins Hospital. “How in the world could a primary care physician possibly know what to do when recommendations vary from ‘don’t screen’ to ‘screen selectively’?” he asks.

The answer may lie in extreme number crunching. At Johns Hopkins and elsewhere, excitement is growing about the power of “big data,” massive amounts of electronic patient information that can be mined to yield medical results tailored to each individual. Brain scans, genetic codes, family histories, eating habits and medical claim records are just a few examples of the information being collected and analyzed. This paradigm shift will replace a “one size fits all” concept of health care with an individualized approach that delivers better results at lower cost. “The idea is to mine this data, to figure out what matters,” says Carter.

A university-wide Individualized Health Initiative, known as Hopkins inHealth, seeks to capture this promise. Launched in July 2012, the initiative is running pilot projects that harness big data to improve cancer screenings, cystic fibrosis treatment, heart care decisions, autoimmune disease management, and diagnosis and treatment of age-related diseases.

Carter, a leader of the cancer screening pilot, is helping doctors analyze the potential benefits and risks of a PSA screening with an algorithm that takes into account factors including a patient’s age, chronic illnesses, life expectancy, race, family history and results from previous PSA tests. The goal is to add the algorithm to the Epic electronic medical record system, giving doctors information they can share while patients are still in the exam room.

“Having tools to calculate life expectancy and guidance about how often to perform the screening would be useful in making these decisions quickly at the point of care,” says Craig Pollack, a Johns Hopkins internist also involved with the pilot. (Pollack conducted a 2011 survey finding that many Johns Hopkins Community Physicians doctors recommend the test for reasons not related to outcomes, including patient expectations and fear of malpractice.)

Once this cancer screening tool is in place, it will continually become more accurate and nuanced with additional information, such as a patient’s genetic sequencing and medications, as well as the medical histories of men who receive PSA screenings and of those who don’t. 
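As a rough illustration of the kind of patient record such a tool would draw on, the sketch below collects the factors named above in one structure and reports which ones are still missing. It is not the pilot's algorithm or the Epic data model: the class name, field names, the `missing_inputs` helper and the sample values are all hypothetical, and the sketch makes no screening recommendation.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class PsaScreeningInputs:
    """Hypothetical container for the factors listed in the article.

    None means "not yet recorded"; an empty list means "recorded, none found".
    """
    age_years: Optional[int] = None
    chronic_illnesses: Optional[List[str]] = None
    life_expectancy_years: Optional[float] = None
    race: Optional[str] = None
    family_history: Optional[str] = None
    previous_psa_results: Optional[List[float]] = None  # units assumed, e.g. ng/mL


def missing_inputs(record: PsaScreeningInputs) -> List[str]:
    """Return the names of factors that have not been recorded yet."""
    return [f.name for f in fields(record) if getattr(record, f.name) is None]


# Made-up example values, for illustration only.
patient = PsaScreeningInputs(age_years=68, previous_psa_results=[2.1, 2.4])
print(missing_inputs(patient))
```

Keeping "not recorded" distinct from "none found" matters here, because a tool that treats a missing family history as a negative one would quietly understate risk.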

Just the Beginning

The value of big data is expected to grow as more material becomes available and computing power increases. “I have a vision that within 20 years, maybe sooner, everybody, from birth to death, will have all the information they need relative to their genome available to them and to their health care providers,” says Steven Salzberg, director of The Johns Hopkins University’s Center for Computational Biology.

The rollout of the Epic electronic medical record system across Johns Hopkins Medicine is creating a vast repository of information about patients, from blood pressure and weight to disease treatments and results.

And technology is rapidly lowering the cost and increasing the availability of biological information. When the National Institutes of Health announced in June 2000 that it had completed a first “rough draft” of the human genome, the price tag to sequence a genome—to obtain an individual’s complete set of genetic information—was $30 million. Today, sequencing a genome costs less than $3,000, and Salzberg says it may decrease even more. Other information that can now be loaded into databases for search and analysis includes reports of gene activity and MRI images of healthy and unhealthy brains, which can help doctors zero in on a diagnosis.
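To put the price drop in perspective, the short calculation below works out the ratio between the two figures quoted above. It treats $3,000 as an upper bound on today's cost and assumes 14 years elapsed between June 2000 and this article's June 2014 date; the variable names are illustrative only, and the halving time is a back-of-the-envelope estimate that assumes a steady decline.

```python
import math

cost_2000_usd = 30_000_000   # price quoted for the 2000 "rough draft" era
cost_2014_usd = 3_000        # "less than $3,000", so treated as an upper bound
years_elapsed = 2014 - 2000  # assumption: June 2000 to June 2014

# How many times cheaper sequencing has become (at least this much).
fold_decrease = cost_2000_usd / cost_2014_usd

# If the price had fallen at a steady rate, how often would it have halved?
halvings = math.log2(fold_decrease)
years_per_halving = years_elapsed / halvings

print(f"Cost fell by a factor of at least {fold_decrease:,.0f}")
print(f"Roughly one halving every {years_per_halving:.2f} years")
```

On these figures the cost fell by a factor of at least 10,000, which works out to the price halving about once a year.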

Johns Hopkins leadership is enthusiastic about this new way of thinking. In 2012, the Center for Computational Biology was established within the McKusick-Nathans Institute of Genetic Medicine to help faculty researchers study genes and how they function.

The High Performance Research Computing Facility, a joint project with the University of Maryland, College Park, is scheduled to open on the Johns Hopkins Bayview Medical Center campus in September. Funded with $30 million from the state of Maryland, the center will be equipped with a supercomputer that will be the “most powerful computing system at Johns Hopkins,” according to Scott Zeger, director of Hopkins inHealth. Its projected storage capacity of 20 petabytes (20 million gigabytes) will give Johns Hopkins researchers new power to collect and analyze complex biological information.

Of course, simply gathering massive amounts of data can’t solve problems, Zeger cautions. Not only do high-powered computers have to sort through information, but skilled researchers and clinicians have to know what to ask the computers and how to interpret the results. “Just because you know your own genetic sequence doesn’t mean you know anything clinically relevant,” he notes.

Individualized Treatment for Cystic Fibrosis

For people suffering from cystic fibrosis, big data is beginning to make a difference.   

“We are at the threshold of being able to customize treatment for each patient,” says Garry Cutting, a professor of pediatrics in the Institute of Genetic Medicine and director of the CFTR2 project, which developed a searchable database of information from 40,000 people with cystic fibrosis.

Though the life expectancy of people with cystic fibrosis is increasing, it remains below age 40. The devastating disease, which clogs organs with thick mucus, is caused by a mutation in the CFTR gene, which regulates how fluids and salts move across a cell membrane.

Nearly 2,000 mutations in that particular gene have been discovered, and 129 have so far been verified as causing cystic fibrosis. Each mutation manifests itself with different symptoms, such as varying lung capacities or ability to digest food, so the more that is known about commonalities among mutations, the clearer the course of treatment, Cutting says. He points to the recent introduction of a drug that provides dramatic relief to the 4 percent of people with cystic fibrosis who have a specific genetic mutation.

All patients with cystic fibrosis at Johns Hopkins are asked to participate in genetic sequencing and to add their results to the growing trove of information about the disease. Nearly everyone agrees, says Cutting, because they want to improve their own treatment and help others. 

But genetic information is just part of the story. Because people with cystic fibrosis are in treatment their entire lives, the information available about them is unusually rich, particularly in a second database of 2,100 people in families that have at least two children with the disease.

Cutting says the study of cystic fibrosis is an outstanding starting point for Hopkins inHealth because the disease is caused by mutations in a single gene. But the idea of using tremendous amounts of information to inform the treatment of each individual is likely to become universal, he adds.

“Someday in the near future, we’re going to say, ‘Gosh, I can’t believe they used to give the same antibiotic to everybody.’”

— Karen Nitkin

