Scientists have adapted the algorithm used by Google’s web search engine to develop a computational program that can identify panels of biomarkers for cancer prognosis. Google’s PageRank algorithm takes into account the network of hyperlinks between web documents as well as the search terms themselves to determine which pages are most relevant to a particular search query. A Technische Universität Dresden-led team has adapted this concept to develop NetRank, a biomarker identification algorithm that ranks genes in expression datasets according to both the base expression data and gene interaction data. Effectively, the score assigned by NetRank to a gene is influenced by the scores of the most important genes linked to it.
Robert Grützmann, Ph.D., and colleagues applied their technology to identify prognostic biomarkers of pancreatic cancer. They collected and analyzed 30 tissue samples from patients with pancreatic ductal adenocarcinoma involved in a multicenter study in Germany. Applying NetRank to the genome-wide expression profiles identified a set of seven genes (STAT3, FOS, JUN, SP1, CDX2, CEBPA, and BRCA1), which demonstrated a prognostic accuracy of 72%.
The genes were subsequently validated by measuring protein levels in a second cohort of 412 samples, and used to determine two separate biomarker panels for predicting survival in pancreatic cancer patients according to whether they had undergone adjuvant therapy or not.
The team claims the NetRank biomarkers were on average 12% more accurate than other markers in the available literature. They report on their approach and results in PLoS Computational Biology in a paper titled “Google Goes Cancer: Improving Outcome Prediction for Cancer Patients by Network-Based Ranking of Marker Genes.”
NetRank first assigns a score for each gene according to the absolute correlation of its mRNA expression levels with the patient survival time in the dataset. The gene interaction network is then applied to spread this correlation out, and genes with the highest overall NetRank score are then selected as signature genes. Having used NetRank to identify the seven-gene biomarker panel based on gene-expression data from the original 30-sample cohort, the researchers validated the findings by analyzing protein levels in an independent set of tissue samples from 412 patients. “We wanted to test how well the proteins encoded by the marker genes are indicative for the survival of a patient when assessed by immunohistochemical staining of the patient’s tumor,” they write.
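The two-step scoring described above can be illustrated with a short Python sketch. This is not the authors' implementation. It assumes a PageRank-style update with a damping factor `d`, whose value here is an arbitrary illustrative choice, along with a Pearson correlation and a symmetric 0/1 interaction matrix. The function name `netrank_sketch` and all of its parameters are hypothetical.

```python
import numpy as np

def netrank_sketch(expression, survival, adjacency, d=0.3, n_iter=100, tol=1e-9):
    """Illustrative PageRank-style gene ranking (assumed form, not the published code).

    expression: (n_genes, n_patients) mRNA expression values
    survival:   (n_patients,) survival times
    adjacency:  (n_genes, n_genes) symmetric 0/1 gene interaction matrix
    d:          assumed damping factor controlling how much score comes from neighbours
    """
    expression = np.asarray(expression, dtype=float)
    survival = np.asarray(survival, dtype=float)
    adjacency = np.asarray(adjacency, dtype=float)

    # Step 1: absolute Pearson correlation of each gene with survival time.
    x = expression - expression.mean(axis=1, keepdims=True)
    y = survival - survival.mean()
    denom = np.sqrt((x ** 2).sum(axis=1) * (y ** 2).sum())
    # Genes with constant expression have zero variance; give them correlation 0.
    corr = np.abs((x @ y) / np.where(denom == 0, 1.0, denom))

    # Step 2: spread the correlation over the network. Each gene passes its
    # current score to its neighbours, split evenly by its number of links.
    degree = adjacency.sum(axis=1)
    transition = adjacency / np.where(degree == 0, 1.0, degree)[:, None]

    scores = corr.copy()
    for _ in range(n_iter):
        updated = (1 - d) * corr + d * (transition.T @ scores)
        converged = np.abs(updated - scores).max() < tol
        scores = updated
        if converged:
            break
    return scores
```

In this sketch, a gene whose own expression is uncorrelated with survival can still receive a nonzero score if it interacts with a strongly correlated gene, which is the behavior the article attributes to the network step. The highest-scoring genes would then be taken as the candidate signature.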
The immunohistochemical staining data enabled the identification of two separate subsets of the seven biomarkers that could predict survival in patients according to whether they underwent adjuvant therapy or not. When measuring protein levels rather than mRNA levels, a six-marker signature comprising STAT3, FOS, JUN, CDX2, CEBPA, and BRCA1 demonstrated a predictive accuracy of 65% for patients with adjuvant therapy. A five-marker signature comprising STAT3, JUN, SP1, CDX2, and BRCA1 demonstrated a 60% predictive accuracy for patients without adjuvant therapy.
However, the level of accuracy using immunohistochemical staining to identify proteins was about 7% lower than that achieved by measuring mRNA levels, the team points out. This observation is most likely related both to the fact that protein levels don’t always correlate with mRNA levels, and to the fact that protein levels were scored not on a continuous scale, but only as low or high staining.
Nevertheless, the authors conclude, when looking at gene-expression datasets, “the additional predictive value of the signature markers compared to clinical parameters is 9% for patients with and 6% for patients without adjuvant therapy (70% versus 61% for adjuvant, and 65% versus 59% for non-adjuvant therapy) ... Our signatures classify patients into high- and low-risk groups with one-year survival rates of 54% and 76%, respectively (adjuvant six gene signature) and 55% and 69%, respectively (non-adjuvant five-gene signature) ... Since these signatures could be used to stratify patients for adjuvant treatment of the disease, they are a potential additional piece of information in clinical decision making and can help to reduce costs, improve patient survival, and quality of life.”