DNA sequencing, introduced 35 years ago, was revolutionized after completion of the Human Genome Project (HGP) in 2003, which had a huge impact on the research community. The HGP took 13 years to decode an entire genome for $3.8 billion. Now researchers can sequence an entire genome in one day for $1,000.
Sequencing costs have fallen remarkably fast, outpacing even the rate that Moore’s law describes for the hardware industry, thanks to the implementation of parallel sensors on silicon chips. The new generation of sequencing platforms has brought increased sequencing capacity, greater processing speed, and a higher level of efficiency.
These next-generation sequencing (NGS) technologies have been a critical driver of advances toward personalized medicine. This January, Life Technologies and Illumina announced their new products, the Benchtop Ion Proton™ and HiSeq® 2500 sequencers, respectively. Additional players in the NGS segment are Roche, Oxford Nanopore, Pacific Biosciences, and many others. These technologies have promising clinical applications in the diagnostic space.
The major challenges that NGS faces are the processing and interpretation of the sequenced data and the need for skilled personnel for data analysis. Even as the sequencing price moves toward $1,000 per genome, there are indirect overhead costs associated with data processing, storage, and bioinformatics software that must be identified in order to predict the future costs associated with NGS. The difficulty lies in calculating the costs of data processing and management as well as downstream analyses of the processed sequences.
What Big Data Will Mean in 2017
Data handling and analysis are a bottleneck for NGS applications. These challenges are creating opportunities for bioinformatics software and for skilled personnel such as bioinformaticians, statisticians, biologists, physicians, and geneticists who can interpret large amounts of data.
The bioinformatics market size was $2.4 billion in 2011 and is predicted to increase at a compound annual growth rate of 25%, according to a report from Research and Markets released in March called Bioinformatics Market Outlook. By 2017, the bioinformatics market size is expected to reach $9.1 billion. Considering genomic interpretation makes up the largest segment of bioinformatics, we assume that 85%, or $7.7 billion, will be spent in this specific area in 2017.
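The projection above can be checked with a simple compound-growth calculation. The following is a minimal Python sketch, assuming six annual compounding periods from 2011 to 2017; the function name `project_market` is illustrative and not taken from the cited report.

```python
def project_market(base_billion, cagr, years):
    """Compound a base market size forward at a constant annual growth rate."""
    return base_billion * (1 + cagr) ** years


# Figures from the text: $2.4 billion in 2011, growing at 25% per year.
bioinformatics_2017 = project_market(2.4, 0.25, 2017 - 2011)

# Assumption stated in the text: 85% of the market is genomic interpretation.
genomic_interpretation_2017 = 0.85 * bioinformatics_2017

print(f"Bioinformatics market, 2017: ${bioinformatics_2017:.2f} billion")
print(f"Genomic interpretation, 2017: ${genomic_interpretation_2017:.2f} billion")
```

The results land close to the $9.1 billion and $7.7 billion quoted above; the small differences come from rounding the market size before taking 85% of it.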
In 2011, the gap between the bioinformatics and NGS markets was $1.55 billion, assuming sequencing costs of $6,000–$10,000 per genome and a need for bioinformatics tools and personnel to interpret the sequenced data. The next-generation sequencing market is expected to grow at 22.7% per year until 2016, according to a report released by MarketsandMarkets on May 4. Hence the difference between the two growth rates will be roughly 2.3 percentage points from 2012 to 2017. This will widen the gap between the two market sizes to $6.3 billion by 2017.
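The widening gap can be reproduced from the figures in the text. This sketch rests on two assumptions that are not stated outright in the article: the 2011 NGS market size is inferred as $2.4 billion minus the $1.55 billion gap, and the 22.7% NGS growth rate is extended through 2017 even though the cited report covers the period to 2016.

```python
def project_market(base_billion, cagr, years):
    """Compound a base market size forward at a constant annual growth rate."""
    return base_billion * (1 + cagr) ** years


YEARS = 2017 - 2011

bioinformatics_2011 = 2.4                   # $ billion, from the text
gap_2011 = 1.55                             # $ billion, from the text
ngs_2011 = bioinformatics_2011 - gap_2011   # inferred, not quoted in the text

bioinformatics_2017 = project_market(bioinformatics_2011, 0.25, YEARS)
ngs_2017 = project_market(ngs_2011, 0.227, YEARS)  # assumes 22.7% holds to 2017

gap_2017 = bioinformatics_2017 - ngs_2017
print(f"Gap between the two markets, 2017: ${gap_2017:.1f} billion")
```

Under these assumptions the gap comes out at roughly the $6.3 billion cited above, which shows how a small difference in growth rates compounds into a large absolute difference.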
The cost per genome for whole genome sequencing is expected to fall below $100 in 2017 from the $6,000–$10,000 range in 2011, according to a report from Market Research released in 2009 called DNA Sequencing Equipment and Services Market. Currently, a team of one research scientist, two bioinformaticians, and two technicians is required for data handling and downstream analysis of one genome.
We predict that large sums of money will be invested in recruiting highly trained and skilled personnel for data handling and downstream analysis. Physicians, bioinformaticians, biologists, statisticians, geneticists, and scientific researchers will all be required for genomic interpretation because of the ever-increasing volume of data.
Hence, for cost estimation, it is assumed that at least one bioinformatician (at $75,000), one physician (at $110,000), one biologist (at $72,000), one statistician (at $70,000), one geneticist (at $90,000), and one technician (at $30,000) will be required for interpretation of one genome. The number of technicians required will decrease in the future as processes are predicted to be automated. Bioinformatics software costs will also plummet as computing costs fall in line with Moore’s law.
Thus, the cost of data handling and downstream processing was $285,000 per genome in 2011 and is estimated to reach $517,000 per genome in 2017. These costs are calculated by tallying the salaries of each person involved as well as the software costs.
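The 2017 tally can be sketched as follows, using only the salary figures listed above. The article does not itemize the software costs, so the remainder computed here is an inference from the stated total, not a quoted figure; the variable names are illustrative.

```python
# Assumed 2017 team for interpreting one genome, salaries in US dollars (from the text).
salaries_2017 = {
    "bioinformatician": 75_000,
    "physician": 110_000,
    "biologist": 72_000,
    "statistician": 70_000,
    "geneticist": 90_000,
    "technician": 30_000,
}

stated_total_2011 = 285_000  # per genome, from the text
stated_total_2017 = 517_000  # per genome, from the text

salary_total_2017 = sum(salaries_2017.values())

# Inferred: whatever the stated total includes beyond salaries
# (software and any other costs the article does not break out).
non_salary_remainder = stated_total_2017 - salary_total_2017

growth_ratio = stated_total_2017 / stated_total_2011

print(f"Salaries, 2017: ${salary_total_2017:,}")
print(f"Remainder not itemized in the text: ${non_salary_remainder:,}")
print(f"2017 cost relative to 2011: {growth_ratio:.2f}x")
```

The 2011 figure cannot be broken down the same way because the article gives no individual salaries for that team, so only the stated totals are compared.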
The projected market for bioinformatics in genomic interpretation is large and will keep rising as sequencing experiments produce more data to interpret. Although software costs will decrease to some extent, growing personnel requirements will add substantially to the interpretation cost per genome. The data handling and downstream processing costs are estimated to almost double by 2017 as compared to 2011.
Thus, it can be concluded that despite the rapidly decreasing cost of sequencing, the overhead costs for NGS will increase considerably, as the estimates above show.