Surveillance Perspectives – Part II

Infectious lower respiratory diseases and diarrheal diseases are among the top causes of death worldwide, accounting for 4.1 million deaths in 2019. The current COVID-19 pandemic is yet another reminder of the importance of using advanced scientific tools to preemptively survey for early signs of infectious disease outbreaks.

Traditional microbial diagnostic techniques include the use of culture (media and agar), serological detection of pathogen-associated antibodies, and detection of microbial genetic materials (DNA or RNA) using PCR. However, these techniques suffer from a major limitation in coverage. Because they rely on specific culture conditions, antigens, or primers, only a subset of pathogens may be detected. Furthermore, methods like microbial culture may not be sensitive enough to capture low microbial loads, and they are unsuitable for microorganisms that cannot be cultured outside of the human body.

To overcome these challenges, there has been rising interest in using metagenomic approaches to profile all DNA or RNA in a patient sample, including the entire microbiome and the human host genome or transcriptome.1 Metagenomic approaches are not new. They have been extensively used to characterize environmental and forensic samples. Here, we will explain how metagenomic approaches can be used to enhance infectious disease surveillance, and discuss how these approaches have been acquiring clinical robustness.

Use of metagenomics in infectious disease surveillance

Currently, metagenomic analyses are performed using next-generation sequencing (NGS) technologies, which offer greater throughput than microarray methods. NGS generates millions to billions of reads per run, enabling comprehensive analysis of clinical samples, and its costs have been falling. With these advantages, NGS has become the predominant technique for metagenomics, and it has played pivotal roles in identifying causes of antibiotic resistance2 and infectious disease outbreaks.3

A typical metagenomic next-generation sequencing (mNGS) workflow is as follows. First, total nucleic acid (DNA and RNA) is extracted from the sample. Next, RNA is reverse transcribed into complementary DNA (cDNA). The DNA and cDNA are then sequenced using instruments such as those from Illumina and Oxford Nanopore Technologies, and the reads are assigned to reference genomes to determine which microbial populations are present and in what relative abundance. To enhance sensitivity toward low-abundance pathogens, a capture probe enrichment step using magnetic beads can be added.
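The final step of this workflow, turning per-read assignments into a relative abundance profile, can be sketched in a few lines of Python. This is a minimal illustration rather than any group's actual pipeline: the function name, the input format (one taxon label per read, or None for an unassigned read), and the example reads are all assumptions made for the sketch.

```python
from collections import Counter


def relative_abundance(read_assignments):
    """Return {taxon: fraction of assigned reads}.

    `read_assignments` is a hypothetical list with one entry per read:
    the name of the reference genome the read was assigned to, or None
    if the read could not be assigned.
    """
    counts = Counter(t for t in read_assignments if t is not None)
    total = sum(counts.values())
    if total == 0:
        return {}  # nothing was assigned, so there is no profile to report
    return {taxon: n / total for taxon, n in counts.items()}


# Hypothetical assignments for four reads (illustrative only)
reads = [
    "Streptococcus pneumoniae",
    "Streptococcus pneumoniae",
    "Influenza A virus",
    None,
]
profile = relative_abundance(reads)
```

Read fractions computed this way are not normalized for genome size, so they indicate which organisms dominate the sequenced material rather than exact cell or particle counts.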

PCR-based molecular diagnostic assays are fairly cost-effective and fast (with a turnaround time of about 2 hours) compared with mNGS (which requires close to 20 hours). However, mNGS enables a much broader range of pathogens, including viruses, bacteria, fungi, and parasites, to be identified directly from clinical samples based on uniquely identifiable DNA or RNA sequences.4 Importantly, mNGS data can also be integrated with other characterization techniques, including RNA sequencing of human host responses, to better understand the progression of disease and the response to pathogens and treatments.

Recently, researchers affiliated with the Chinese People’s Liberation Army General Hospital and the Vision Medicals Center for Medical Research used mNGS to identify the pathogens behind community-acquired pneumonia (CAP), which may be caused by over 100 viral, bacterial, fungal, and parasitic species.5 Due to the scarcity of diagnostic assays for rare pathogens and the poor suitability of culture-based testing for fastidious microorganisms, the causative pathogen goes undetected in up to 62% of CAP cases.

The team recruited a total of 159 patients: 100 tested using conventional culture and 59 tested using mNGS. There was no statistically significant difference between the two groups across multiple health indicators. mNGS was able to detect a broader range of pathogens. Specifically, mNGS detected 179 pathogens (113 bacteria, 32 fungi, and 34 viruses), which is about 70% more than the 105 pathogens (78 bacteria and 27 fungi) detected in the control group.
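The "about 70% more" figure follows directly from the counts reported above. The short check below reproduces the arithmetic; the variable names are chosen for illustration only.

```python
# Pathogen counts as reported in the text for each study arm
mngs = {"bacteria": 113, "fungi": 32, "viruses": 34}
culture = {"bacteria": 78, "fungi": 27}

mngs_total = sum(mngs.values())        # 113 + 32 + 34 = 179
culture_total = sum(culture.values())  # 78 + 27 = 105

# Relative increase of mNGS detections over culture detections, in percent
percent_more = (mngs_total - culture_total) / culture_total * 100

print(f"mNGS: {mngs_total}, culture: {culture_total}, "
      f"difference: {percent_more:.1f}% more")
```

Note that the two arms differ in size (59 versus 100 patients), so the comparison is of total detections rather than detections per patient.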

This is a key advantage of mNGS because in more than 50% of the cases, there were co-infections with bacteria, fungi, and viruses in CAP-positive patients. Most importantly, with the mNGS method, there was significantly higher disease resolution (63%) compared to the control group (7%) and reduced mortality rate at hospitals (8.6% versus 26%) due to changes in treatment plans after mNGS detection.

Despite the promise of mNGS for infectious disease surveillance, there are still challenges associated with this method. For instance, typically fewer than 1% of the reads are of nonhuman origin, so only a small subset of the reads corresponds to the genetic material of pathogens. The sensitivity of detection therefore depends heavily on the level of host genomic background. In general, cell-free fluid samples offer higher sensitivity than tissue samples.6 It is also more challenging to define microbial populations in nonsterile samples such as stool because these samples can be contaminated by environmental microorganisms.
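To see why host background dominates sensitivity, it helps to count how many reads are left for microbial detection at a given host fraction. The sketch below is illustrative only: the function name, the run size of 20 million reads, and the 90% host fraction after a hypothetical depletion step are assumptions, not figures from the studies cited here.

```python
def nonhuman_reads(total_reads, host_fraction):
    """Number of reads that are not of host origin.

    `host_fraction` is the share of reads derived from the human host,
    expressed as a value between 0 and 1.
    """
    if not 0.0 <= host_fraction <= 1.0:
        raise ValueError("host_fraction must be between 0 and 1")
    return round(total_reads * (1.0 - host_fraction))


total = 20_000_000  # hypothetical run size

# At 99% host reads, only 1% of the run is available for microbial detection
typical = nonhuman_reads(total, 0.99)

# If a host depletion step lowered the host share to 90% (an assumed value),
# the same run would yield ten times as many nonhuman reads
depleted = nonhuman_reads(total, 0.90)
```

Under these assumed numbers, a modest drop in host fraction gives a tenfold gain in usable reads, which is the reasoning behind host depletion and pathogen enrichment steps.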

Faster and cheaper workflows

Another limitation of mNGS is that the process takes at least 20 hours and may last a few days, involving multiple steps of nucleic acid extraction and reverse transcription for RNA material. The manpower requirements and consumable costs also make mNGS prohibitive for infectious disease surveillance in resource-scarce communities.

To overcome this problem, researchers at the Shanghai Public Health Clinical Center created a streamlined clinical metagenomic sequencing protocol.7 The team used a chaotropic salt–based buffer with bead beating to enhance breakdown and magnetic bead extraction of DNA, followed by generation of Illumina sequencing libraries. This process did not make use of expensive nucleic acid extraction kits or other sample preprocessing, host depletion, and pathogen enrichment steps.

Clinical metagenomics, the comprehensive analysis of patient samples for host and microbial DNA and RNA, is moving from research to clinical laboratories. Now that clinical metagenomics is taking advantage of powerful tools such as next-generation sequencing, it is better able to detect pathogens of all kinds—bacteria, viruses, fungi, and parasites—while simultaneously assessing host responses. Valuable metagenomic next-generation sequencing applications include infectious disease diagnostics. [KATERYNA KON/SCIENCE PHOTO LIBRARY/Getty Images]

The whole process (for 20 samples) took about eight hours with coverage of 100% for samples such as hepatitis B–positive serum and influenza A virus stock. This high coverage is important to ensure that the sequenced DNA can be mapped to known reference bases. The influenza A sample was serially diluted up to 10,000 times, and while this reduced the coverage to 26%, it was still sufficient for reliable identification of the virus at a low viral load. The 50% embryo infectious dose (EID50) was 6.4 × 10²/mL.
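Coverage in this sense is the fraction of reference positions covered by at least one read. A minimal sketch of that calculation follows. The function name, the half-open 0-based interval convention, and the toy genome are assumptions for illustration; real pipelines compute coverage from alignment files.

```python
def breadth_of_coverage(genome_length, alignments):
    """Fraction of reference positions covered by at least one read.

    `alignments` is an iterable of (start, end) pairs using 0-based,
    half-open coordinates on the reference genome.
    """
    if genome_length <= 0:
        raise ValueError("genome_length must be positive")
    covered = [False] * genome_length
    for start, end in alignments:
        # Clip each alignment to the bounds of the reference
        for pos in range(max(start, 0), min(end, genome_length)):
            covered[pos] = True
    return sum(covered) / genome_length


# Toy reference of 100 bases with two overlapping reads (illustrative only)
partial = breadth_of_coverage(100, [(0, 20), (10, 26)])

# A single read spanning the whole reference gives complete coverage
full = breadth_of_coverage(100, [(0, 100)])
```

Overlapping reads are counted once per position, so adding more reads over the same region raises depth but not breadth.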

The researchers then applied their mNGS workflow to clinical samples including 20 cerebrospinal fluid samples from patients with meningitis/encephalitis, that is, with microbial infections of the brain and/or spinal cord. Toxoplasma gondii sequences were detected in one patient, who later had samples tested by an antibody assay that delivered a positive result.

The findings obtained with streamlined mNGS suggest that the optimized workflow to expedite DNA extraction and denaturation can be as sensitive as conventional mNGS while offering diagnosis at a high throughput and lower cost.

“Clinical metagenomic sequencing as a diagnostic tool for unbiased identification of pathogen has been increasingly used not just for research, but also for clinical purposes,” says Xiaonan Zhang, PhD, a principal investigator at the Shanghai Public Health Clinical Center and an associate professor at the University of Canberra. “However, to implement this technology in resource-poor countries, the complexity of the protocol and total cost of this procedure need to be significantly decreased.

“We aimed to cut the expenditures on nucleic acid extraction and library preparation by utilizing the most simple and low-cost elements in the literature and combining them in a streamlined fashion. This work only partially solved the cost and logistics problems that plague clinical metagenomics. Many innovations are needed to finally achieve automatic procedures from raw material to sequencing-ready libraries, and from raw sequence to diagnostic report. We are further developing new tools to help achieve this goal.”

Overcoming false positives

Infections in the bloodstream can lead to sepsis and septic shock, dire conditions that are associated with greater risks of morbidity and mortality. The mNGS approach can facilitate identification of causative agents and point to antibiotic therapies while improving understanding of how each patient may respond to different treatments. Despite its success in pathogen detection from blood, mNGS suffers from high false-positive rates due to contamination during specimen collection, specifically, from human flora and background microbes in laboratory reagents and the environment.6

At Peking University People’s Hospital, researchers sought to overcome the high false-positive rate of mNGS by using data from multiple control groups as data filters.8 The team had three control groups: samples with negative results, samples from healthy people, and external negative controls that had been under long-term surveillance to identify common microbial contaminants. The team also introduced two markers to filter out false-positive calls introduced by closely related taxa.
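The general idea of using controls as data filters can be sketched as follows. This is not the authors' published method: the fold-change rule, the threshold of 10, the reads-per-million (RPM) inputs, and all names and values are assumptions chosen to illustrate how background signal from controls might be used to reject likely contaminants.

```python
def filter_with_controls(sample_rpm, control_rpms, min_fold=10.0):
    """Keep taxa whose signal clearly exceeds the background in controls.

    `sample_rpm` maps taxon name to reads per million in the patient sample.
    `control_rpms` is a list of such mappings, one per control sample.
    A taxon is kept if it is absent from every control, or if its RPM is
    at least `min_fold` times the highest RPM seen in any control.
    """
    kept = {}
    for taxon, rpm in sample_rpm.items():
        background = max(
            (control.get(taxon, 0.0) for control in control_rpms),
            default=0.0,
        )
        if background == 0.0 or rpm >= min_fold * background:
            kept[taxon] = rpm
    return kept


# Hypothetical values: a common skin commensal appears in both controls
sample = {"Klebsiella pneumoniae": 50.0, "Cutibacterium acnes": 4.0}
controls = [{"Cutibacterium acnes": 3.0}, {"Cutibacterium acnes": 5.0}]

calls = filter_with_controls(sample, controls)
```

In this toy case only Klebsiella pneumoniae survives the filter. A stricter threshold reduces false positives at the cost of missing true low-level infections.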

This improved system provided similar sensitivity and specificity to blood culture, which takes several days and cannot be used to detect fastidious microorganisms. Notably, the improved system also performed better than competing clinical methods to detect fungal pathogens.

The system, however, failed to detect blood infection in 12 samples. The authors explained that this could be due to low availability of cell-free DNA in early stages of infection, as well as to stringent filters. They also postulated that as they further improve their filters with more positive and negative samples, the performance of their proposed methodology would be enhanced.

Enhancing cross-institutional collaboration

While promising, mNGS delivers results that are limited to individual laboratories, experimental protocols and workflows, and reference standards. This poses a challenge when healthcare institutions wish to compare results while attending to tasks such as monitoring outbreaks of infectious diseases in different regions.

Researchers based at the National Institutes for Food and Drug Control (Beijing) led a multicenter assessment consisting of 17 independent laboratories to compare mNGS results using a common set of reference reagents and performance metrics. The researchers hoped their work would lead to new opportunities to establish performance standards, guide result interpretation, facilitate assay and workflow development, and improve the chances of regulatory approvals.9

To mimic infection of the central nervous system, the team constructed a panel of nine pathogen reference reagents covering 30 microorganisms spiked in human host cell background. Some of the bacterial species were also closely related to determine whether they could be discriminated by the mNGS approach. To determine the dynamic range, the researchers used microorganisms with wide-ranging genome sizes (from 0.7 Kb to 19.05 Mb) and GC contents (from 33.2% to 70.4%). Most of the pathogens could be detected reproducibly across sites, except for RNA viruses. Strategies including depletion of host cells and the use of contamination filters could boost the detection results.
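GC content, one of the two properties the panel was designed to span, is simply the share of G and C bases among the unambiguous bases of a genome. A minimal sketch follows; the function name and the short example sequences are illustrative assumptions.

```python
def gc_content(sequence):
    """Fraction of G and C among the A, C, G, and T bases of a sequence.

    Ambiguous symbols such as N are ignored, and case is not significant.
    """
    seq = sequence.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        raise ValueError("sequence contains no A, C, G, or T bases")
    return (seq.count("G") + seq.count("C")) / acgt


# Toy sequences (illustrative only)
balanced = gc_content("ATGC")
with_gaps = gc_content("ggNNcc")
```

Genomes at either extreme of GC content tend to be sequenced less evenly, which is one reason a reference panel benefits from spanning a wide range.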

The average turnaround time for the results was about 24 hours with sequencing as the major time-consuming step in the assay. Consequently, the authors recommended that read lengths should be 75 base pairs or less to ensure that metagenomic assays are developed more efficiently. The authors also expressed the hope that with development of more sophisticated identification algorithms made possible by better genome coverage, mNGS will show greater data specificity and become a means of establishing regulatory or technical references.

“Our work represented the most comprehensive benchmarking study on shotgun metagenomics for pathogen detection to assess reliability, key performance determinants, reproducibility, and quantitative potential,” says Donglai Liu, PhD, National Institutes for Food and Drug Control. “This work provided a unique resource comprising nearly 600 billion reads (>5 Tb) for technical evaluation in clinical and regulatory settings.

“We believe that our multicenter analyses could be valuable to drive further advances in shotgun metagenomics–related experimental techniques and the development of bioinformatics tools.”

Liu and colleagues found that their technology’s specificity is challenged by background microbes derived from both common and workflow-specific sources. In the future, laboratory automation would be a way to deplete host genomes or to enrich microbial genomes, procedures that could improve assay sensitivity and data efficiency.


Infectious diseases continue to be one of the largest causes of mortality worldwide. However, with improved surveillance, pathogens could be detected early enough to save many more lives.

Although mNGS technology still suffers from limitations such as long turnaround times and sensitivity limits, it is a more reliable method than the culture method for pathogen detection. Compared with the molecular PCR diagnostic technique, mNGS also covers a broader range of pathogens, and it has been shown to be useful in identifying co-infection by multiple pathogens.

With the development of improved sample processing methods and improved algorithms for filtering false-positive signals and cross-institutional differences, mNGS is expected to play an increasingly important role in clinical surveillance of infectious diseases.



  1. Chiu CY, Miller SA. Clinical metagenomics. Nat. Rev. Genet. 2019; 20(6): 341–355. DOI: 10.1038/s41576-019-0113-7.
  2. Stefan CP, Koehler JW, Minogue TD. Targeted next-generation sequencing for the detection of ciprofloxacin resistance markers using molecular inversion probes. Sci. Rep. 2016; 6: 25904. DOI: 10.1038/srep25904.
  3. Loman NJ, Constantinidou C, Christner M, et al. A Culture-Independent Sequence-Based Metagenomics Approach to the Investigation of an Outbreak of Shiga-Toxigenic Escherichia coli O104:H4. JAMA 2013; 309(14): 1502–1510. DOI: 10.1001/jama.2013.3231.
  4. Lefterova MI, Suarez CJ, Banaei N, Pinsky BA. Next-Generation Sequencing for Infectious Disease Diagnosis and Management: A Report of the Association for Molecular Pathology. J. Mol. Diagn. 2015; 17(6): 623–634. DOI: 10.1016/j.jmoldx.2015.07.004.
  5. Xie F, Duan Z, Zeng W, et al. Clinical metagenomics assessments improve diagnosis and outcomes in community-acquired pneumonia. BMC Infect. Dis. 2021; 21(1): 352. DOI: 10.1186/s12879-021-06039-1.
  6. Blauwkamp TA, Thair S, Rosen MJ, et al. Analytical and clinical validation of a microbial cell-free DNA sequencing test for infectious disease. Nat. Microbiol. 2019; 4(4): 663–674. DOI: 10.1038/s41564-018-0349-6.
  7. Jia X, Hu L, Wu M, et al. A streamlined clinical metagenomic sequencing protocol for rapid pathogen identification. Sci. Rep. 2021; 11(1): 4405. DOI: 10.1038/s41598-021-83812-x.
  8. Jing C, Chen H, Liang Y, et al. Clinical Evaluation of an Improved Metagenomic Next-Generation Sequencing Test for the Diagnosis of Bloodstream Infections. Clin. Chem. 2021; 67(8): 1133–1143. DOI: 10.1093/clinchem/hvab061.
  9. Liu D, Zhou H, Xu T, et al. Multicenter assessment of shotgun metagenomics for pathogen detection. EBioMedicine 2021; 74: 103649. DOI: 10.1016/j.ebiom.2021.103649.

