Using Metabolomics in Biomarker Discovery

September 15, 2008 (Vol. 28, No. 16)

Walter E. Gall

Mass Spectrometry Platform and Software Aid in the Development of Diagnostics

Metabolomics is the most recent systems-biology approach to complement the genomic, transcriptomic, and proteomic efforts to characterize an entire biological system. Since metabolites represent the end products of the genome and proteome, metabolomics holds the promise of providing an integrated physiological phenotype of a system.

Such metabolic profiling involves a comprehensive and accurate measurement of the types and concentrations of metabolites in a system, whether from cells, tissues, or some other biological sample. It also provides a sophisticated understanding of metabolic pathways and networks downstream of gene expression and enzymatic pathways.

This biochemical profiling technology has important applications in basic and clinical medical research, particularly biomarker discovery for diagnostics and drug development. In drug development, it provides insights into drug safety and efficacy, mechanisms of action, and bioprocess optimization; in diagnostics, it offers a paradigm shift for accelerating the discovery and validation of novel biochemical diagnostics.

Alongside increased academic and industry efforts in this area, the NIH Roadmap Metabolomics Technology Development Initiative is further evidence of the sustained interest in biochemical profiling.

Metabolomics is a tractable process because of the smaller number of analytes under study. Compared with roughly 25,000 genes or 1,000,000 proteins, the human body is estimated to contain only about 2,500 small-molecule biochemicals and metabolites (< 1,500 Da).

In addition, the wealth of biological information from the medical literature on biochemical pathways favors a more sophisticated interpretation of metabolomics data. Therefore, using a nontargeted, unbiased discovery approach, a comprehensive metabolic snapshot of a biological system can be obtained.

Focusing on small molecules in diagnostic development is a path well-trodden, with many historic examples of important, validated biochemical biomarkers such as cholesterol and glucose. Given the close relationship between biochemicals and disease phenotype, there is considerable opportunity to develop diagnostics using mechanism-based biochemical markers.

Metabolomics is ideally suited for diagnostic development, especially since biochemicals are proven biomarkers for disease. In addition, metabolomics can be applied to a wide variety of diseases. This provides the ability to first identify disease areas with a large unmet diagnostic need, such as cancer and diabetes, and then use metabolomics to find the relevant markers.

Finally, because metabolomics is inherently high throughput, the product-development cycle can be up to five times shorter than for traditional diagnostic development.

Biochemical Profiling

As with any systems-biology effort, metabolomics poses a challenge: the inherent biological noise in the system being analyzed, or the signal-to-noise conundrum.

Successfully identifying and quantifying readily detectable small molecule biochemicals and metabolites in a set of biological samples (e.g., disease vs. nondisease) requires a robust and repeatable process protocol. That protocol involves uniform collection and processing of samples and rigorous quality-control systems, so that the process coefficient of variation stays low and genuine increases and decreases in compound levels can be distinguished.
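To make the notion of a process coefficient of variation concrete, here is a minimal sketch of how it could be computed for one compound across replicate quality-control injections. The peak areas are hypothetical values chosen for illustration, and the function name is an assumption rather than part of any platform described in this article.

```python
from statistics import mean, stdev


def percent_cv(values):
    """Coefficient of variation in percent: sample standard deviation / mean * 100."""
    m = mean(values)
    if m == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return stdev(values) / m * 100.0


# Hypothetical peak areas for one compound in five replicate QC injections.
qc_peak_areas = [1020.0, 980.0, 1005.0, 995.0, 1000.0]
cv = percent_cv(qc_peak_areas)
print(f"Process CV: {cv:.2f}%")
```

A low CV on technical replicates suggests that differences observed between disease and nondisease samples are less likely to be process artifacts.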

Because mass spectrometry is such a sensitive bioanalytical process, discovery of disease biomarkers (subclinical disease, disease presence, or progression) by biochemical profiling requires well-powered clinical studies with appropriate controls as well as balanced demographic parameters. In addition to test samples, quality-control samples are prepared and incorporated into a LIMS tracking system.

Test and control samples are processed and analyzed side by side with a run-order randomization protocol, with QC and QA samples representing about 30% of the sample set. Such quality-control samples include process and solvent blanks, recovery and internal standards, derivatization standards, library reference compounds, and dilutions of test samples in the appropriate test matrix.
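The run-order idea can be sketched as follows, under stated assumptions: study samples are shuffled with a fixed seed, and QC injections are spaced evenly through the run so that they make up about 30% of all injections. The function name, the sample naming scheme, and the even-spacing rule are illustrative assumptions; the article does not specify how the QC samples are positioned.

```python
import random


def build_run_order(sample_ids, qc_fraction=0.30, seed=0):
    """Shuffle study samples and intersperse QC injections.

    qc_fraction is the share of QC injections in the *total* run
    (an assumption based on the "about 30%" figure in the text).
    """
    rng = random.Random(seed)
    shuffled = list(sample_ids)
    rng.shuffle(shuffled)

    # Solve n_qc / (n_samples + n_qc) = qc_fraction for n_qc.
    n_qc = round(len(shuffled) * qc_fraction / (1.0 - qc_fraction))
    total = len(shuffled) + n_qc

    # Evenly spaced QC slots; the spacing is at least 1, so slots are unique.
    qc_positions = {int(k * total / n_qc) for k in range(n_qc)}

    remaining = iter(shuffled)
    run_order = []
    qc_count = 0
    for position in range(total):
        if position in qc_positions:
            qc_count += 1
            run_order.append(f"QC_{qc_count:02d}")
        else:
            run_order.append(next(remaining))
    return run_order


# Hypothetical sample identifiers for 14 test and control samples.
samples = [f"S{i:02d}" for i in range(14)]
order = build_run_order(samples)
print(order)
```

Fixing the seed keeps the randomized order reproducible, which helps when the run order has to be recorded in a tracking system.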

For consistent isolation of the small molecule biochemicals in each sample, sample preparation is a straightforward protein precipitation with an organic solvent. After pelleting the proteins, the supernatant is divided into aliquots that are subsequently dried down, with one reconstituted in an acidic buffer (UHPLC positive ionization), another in a basic buffer (UHPLC negative ionization), and another subjected to trimethylsilylation derivatization buffer (GC).

Basic molecules (e.g., some amino acids, sugars, nucleotides, carnitines, and phospholipids) ionize efficiently in positive ionization mode while acidic molecules (e.g., some amino acids, phosphates, sulfates, fatty acids, and steroids) ionize efficiently in negative ionization mode. GC covers compounds too hydrophobic or polar for the UHPLC (including small organic acids, diacyl lipids, some amino acids, and certain sugars).

As the separated ionic species elute off the column, the resulting peaks are integrated and quantified. Associated with each peak is a mass spectrum, which the software compares to a database containing thousands of standard biochemical mass spectra. Using sophisticated algorithms, the software filters the spectra to reduce noise and positively identifies each biochemical in the sample.
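The article does not describe the matching algorithm itself. As an illustration only, the sketch below scores a query spectrum against library spectra with cosine similarity, a commonly used spectral-matching measure, and accepts the best hit above a threshold. The spectra, compound names, and the 0.8 threshold are hypothetical.

```python
from math import sqrt


def cosine_similarity(spec_a, spec_b):
    """Cosine similarity of two spectra given as {nominal m/z: intensity}."""
    shared = set(spec_a) & set(spec_b)
    dot = sum(spec_a[mz] * spec_b[mz] for mz in shared)
    norm_a = sqrt(sum(v * v for v in spec_a.values()))
    norm_b = sqrt(sum(v * v for v in spec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_match(query, library, min_score=0.8):
    """Return (name, score) of the best library hit, or None below min_score."""
    scored = [(cosine_similarity(query, ref), name) for name, ref in library.items()]
    if not scored:
        return None
    score, name = max(scored)
    return (name, score) if score >= min_score else None


# Hypothetical reference spectra and a hypothetical measured spectrum.
library = {
    "compound_A": {73: 100.0, 147: 40.0, 204: 80.0},
    "compound_B": {73: 100.0, 116: 90.0, 218: 30.0},
}
query = {73: 95.0, 147: 35.0, 204: 85.0}
print(best_match(query, library))
```

In practice a match on the spectrum alone would typically be combined with other evidence such as retention time, which is one reason the manual curation step described next remains important.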

Data curation involves exclusion of false positives and confirmatory identification and relative quantitation of authentic, discrete biochemicals in each sample, referenced against the automated software calculations. Subsequent statistical analyses of the curated data are the next steps in the biomarker discovery and selection process, ultimately leading to biological interpretations and further mechanistic understanding of metabolic pathways under normal and diseased states.

Curated data with relative quantitation from the discovery screening process undergoes univariate statistical analysis to assess the statistical strength of individual small molecules as potential biomarker candidates.
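The specific univariate test is not named here. One assumption-light option, sketched below, is a two-sided permutation test on the difference in group means for a single metabolite. The relative abundance values are hypothetical, and the function name is illustrative.

```python
import random
from statistics import fmean


def permutation_p_value(group_a, group_b, n_permutations=5000, seed=0):
    """Two-sided permutation p-value for a difference in group means."""
    rng = random.Random(seed)
    observed = abs(fmean(group_a) - fmean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)

    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(fmean(pooled[:n_a]) - fmean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    # Add-one correction so the estimate is never exactly zero.
    return (extreme + 1) / (n_permutations + 1)


# Hypothetical relative abundances of one metabolite in two groups.
disease = [5.1, 5.4, 4.9, 5.6, 5.3, 5.2]
control = [4.1, 4.0, 4.3, 3.9, 4.2, 4.4]
p_value = permutation_p_value(disease, control)
print(f"p = {p_value:.4f}")
```

Because a discovery screen tests many metabolites at once, the resulting p-values would normally also be adjusted for multiple comparisons before candidates are ranked.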

To look for additive or synergistic biomarkers, subsequent multivariate statistical analyses are carried out on the data including variable selection procedures such as Random Forest and LASSO regression analyses. Top-ranked variables and models that appear the most frequently are selected.
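To show how LASSO performs variable selection, the following is a minimal coordinate-descent sketch that minimizes (1/2n)·||y − Xβ||² + α·||β||₁. It assumes the predictors are already centered and scaled and that the response is centered. The toy data are hypothetical, and a real analysis would rely on an established statistics package rather than this hand-rolled loop.

```python
def soft_threshold(z, t):
    """Shrink z toward zero by t; values within [-t, t] become exactly 0."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_coordinate_descent(X, y, alpha, n_iter=200):
    """Coordinate descent for (1/2n)*||y - X b||^2 + alpha*||b||_1."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # equals y - X @ beta while beta is all zeros
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]

    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of feature j with the residual, adding back its own fit.
            rho = sum(X[i][j] * residual[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new_beta = soft_threshold(rho, alpha) / col_sq[j]
            delta = new_beta - beta[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                beta[j] = new_beta
    return beta


# Hypothetical toy data: three orthogonal predictors, response driven by the first.
X = [[1.0, 1.0, 1.0], [-1.0, 1.0, -1.0], [1.0, -1.0, -1.0], [-1.0, -1.0, 1.0]]
y = [2.0, -2.0, 2.0, -2.0]
beta = lasso_coordinate_descent(X, y, alpha=0.5)
selected = [j for j, b in enumerate(beta) if b != 0.0]
print(beta, selected)
```

The penalty shrinks uninformative coefficients to exactly zero, which is what makes the method usable for selection. Repeating the fit across resampled data and counting how often each variable is retained is one way to arrive at the "most frequently appearing" variables mentioned above.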

Multiple biomarker candidates at this stage are selected for targeted data analysis by the three aforementioned mass spectrometric methods, resulting in analytical quantitation. Absolute quantitation includes running stable isotope-labeled internal standards and calibration standard samples in order to calculate calibration curves and quantify the biomarker concentrations in the samples.
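A calibration curve of this kind can be sketched as a straight-line fit of the analyte-to-internal-standard response ratio against known standard concentrations, which is then inverted for unknown samples. The concentrations, peak areas, and the assumption of a linear, unweighted fit are all illustrative; actual assays may use weighting or a different curve model.

```python
def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def quantify(analyte_area, internal_standard_area, slope, intercept):
    """Convert a measured response ratio into a concentration via the curve."""
    ratio = analyte_area / internal_standard_area
    return (ratio - intercept) / slope


# Hypothetical calibration standards: known concentrations (arbitrary units)
# and the measured analyte / labeled-internal-standard peak area ratios.
standard_conc = [1.0, 2.0, 5.0, 10.0]
response_ratio = [0.1, 0.2, 0.5, 1.0]
slope, intercept = fit_line(standard_conc, response_ratio)

# Hypothetical unknown sample: analyte area 300, internal standard area 1000.
concentration = quantify(300.0, 1000.0, slope, intercept)
print(f"Estimated concentration: {concentration:.2f}")
```

Dividing by the internal standard area corrects for sample-to-sample variation in recovery and ionization, which is the reason the labeled standard is spiked into every sample.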

The best biomarker candidates and biomarker algorithms are identified from confirmatory targeted data results using a multitude of statistical techniques including, but not limited to, multiple linear, logistic, and spline regression models. These models are evaluated and the best ones are rationally selected.

The biomarker algorithms derived from a training set must then be validated against a test set from the same clinical study cohort. Moreover, this final biomarker algorithm must be secondarily validated in independent study cohorts.
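The training and test partition within a single cohort might look like the sketch below, which splits subjects within each class so that both sets keep similar class proportions. The subject identifiers, class labels, and the 30% test fraction are hypothetical; the independent-cohort validation cannot be simulated this way and needs separately collected samples.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.3, seed=0):
    """Split subject IDs into train and test sets, class by class.

    labels maps subject ID -> class label.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for subject_id, label in sorted(labels.items()):
        by_class[label].append(subject_id)

    train, test = [], []
    for label in sorted(by_class):
        ids = by_class[label]
        rng.shuffle(ids)
        n_test = round(len(ids) * test_fraction)
        test.extend(ids[:n_test])
        train.extend(ids[n_test:])
    return train, test


# Hypothetical cohort: 10 insulin-sensitive and 10 insulin-resistant subjects.
labels = {f"subj_{i:02d}": ("IS" if i < 10 else "IR") for i in range(20)}
train_ids, test_ids = stratified_split(labels)
print(len(train_ids), len(test_ids))
```

The model is fit only on the training subjects; the held-out subjects are used once, to check that performance carries over.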

The algorithm may be a collection of biomarkers, each of which independently contributes a statistically significant and additive correlation to a reference gold-standard diagnostic process or procedure. Such gold-standard diagnostic procedures may be clinically impractical due to labor-, cost-, or time-intensive reasons.

Case Study

Insulin resistance (IR) is a subclinical condition that precedes chronic diseases such as type 2 diabetes mellitus (T2DM) and cardiovascular disease by more than 10 years. In combination with pancreatic beta cell dysfunction, IR is known to be a major mechanistic cause for the development of T2DM. Even though IR and beta cell dysfunction develop silently over many years, no practical or routine lab tests exist for measuring IR or beta cell function.

Metabolon is involved in studies that are being carried out to address these unmet needs in order to discover biomarkers of IR and beta cell function. Pathophysiological processes like IR are not measured or readily detected by conventional glycemic measures such as fasting plasma glucose tests.

The gold standard for measuring varying levels of insulin sensitivity in peripheral tissues (muscle and adipose) is a labor-intensive, costly, and time-consuming procedure called the hyperinsulinemic euglycemic clamp. This complex procedure can take several hours and is performed only in a clinical research setting.

Since it is known that a large fraction of hypertensive and dyslipidemic subjects are IR, just as observed in the majority of prediabetics, there is an unmet need to detect asymptomatic patients who are IR and in need of lifestyle and/or therapeutic intervention by their physician, due to the links of IR to macrovascular disease.

Therefore, baseline hyperinsulinemic euglycemic clamp samples were sourced from a nondiabetic European cohort and analyzed. The cohort was representative of the entire distribution of insulin sensitivity, comprising normal glucose tolerant insulin sensitive (NGT-IS), NGT insulin resistant (NGT-IR), and impaired glucose tolerant (IGT) subjects, with IGT being more IR than NGT.

The purpose of this study was to discover small molecule biomarkers in a fasted blood sample that reflect varying levels of insulin sensitivity. Such an IR test, with a measurement range from IS to varying degrees of IR, will enable physicians to detect and track dysmetabolic patients more effectively over time.

This clinical study is an example of a successful metabolic profiling program, resulting in the discovery of multiple biomarkers of IR. Currently, validation studies and product development efforts are under way to offer this IR test to the general population.

Walter E. Gall, Ph.D., is program manager for diagnostic development at Metabolon. Web: www.metabolon.com.