When a sore throat and sinus congestion warrant a visit to the doctor, your physician will attempt to determine whether a cold virus or bacterial infection is to blame—oftentimes without success. So, just to be safe, they might write a potentially unnecessary script for an antibiotic.
But what if a nurse could swipe your saliva and run a quick genetic test for bacteria? If the test results are negative, you get a prescription for a decongestant and orders to get some rest, rather than contributing to the widespread overuse of antibiotics.
Rapid genetic screening on a personal level can take the guesswork out of a doctor visit. But on a grander scale, quickly analyzing genetic data stands to revolutionize research into everything from the mutations causing various cancers to the “Second You,” your microbiome, or the bacteria living inside you. Genomics can also improve understanding of a range of diseases — Alzheimer’s, irritable bowel syndrome, Crohn’s disease, for instance — as well as how to grow algae to best produce oil to make gasoline. In medicine, genetic screening can tell hospital staff what pathogens inhabit the hospital environment. In environmental research, it can clarify how communities of microorganisms fix carbon from the atmosphere and how their populations adapt to less rain and hotter summers.
Decreasing costs for sequencing instruments are driving access for new users, making the instruments available to the common scientist. Today, you’ll find sequencers not only in most universities and other large research institutions, but also in hospitals, individual clinics and the small labs of individual researchers.
A Data Deluge
All this rapidly generated data has created a new bottleneck: The sheer volume of data is swamping the ability to analyze it. Bioinformatics tools rely on computers to pull together, classify, store, process and analyze molecular genetic and genomic data to make use of it. Unfortunately, the current tools are not entirely user-friendly or accessible to those whose expertise lies in biology rather than crunching data.
Seeing a need, a team in the Biosecurity and Public Health group at the Los Alamos National Laboratory, collaborating with the Naval Medical Research Center, developed a computational and web-based tool called EDGE Bioinformatics to help channel the data deluge and fulfill the promise of genomics.
Funded by the Department of Defense’s Defense Threat Reduction Agency, the work comes out of the lab’s decades of genetics and life sciences research. Long interested in the link between radiation and genetic mutations, the U.S. Department of Energy and the National Institutes of Health received federal funding in 1988 to begin the Human Genome Project to sequence, or map, the genome of Homo sapiens.
Los Alamos was a key player, contributing its expertise in life sciences, particularly genetics, and its computing resources to the task of unraveling the human genetic code. By June 2003, the map was mostly complete. Since then, the lab has taken on new challenges, such as illuminating the causes of cancer and perfecting algae for biofuel production.
The Los Alamos EDGE team created a web-based computer program with a variety of bioinformatics analysis tools that a non-data cruncher can use. With a few mouse clicks in EDGE, a novice in bioinformatics can create sophisticated analyses of a sample in minutes instead of days or weeks.
EDGE has already helped streamline data analysis for groups in multiple countries as well as within several government laboratories in the United States. Because the program is “open source,” anyone can use it or even modify it to suit their needs, bringing the power of big data analysis to even the smallest research lab or doctor’s office.
Genomics researcher Patrick Chain is the EDGE team leader in the Biosecurity and Public Health group at Los Alamos National Laboratory. With a background in microbial ecology, evolution, genomics and bioinformatics, Chain has spent the past 20 years using genomics to study various microbial systems.
He currently leads a team of researchers whose charge is to devise novel methods, algorithms and strategies for the biological interpretation of massively parallel sequencing data.