Coping with Data Bottleneck
Continuing improvements in the sensitivity of extraction and amplification techniques have increased the amount of information available, because DNA profiles are now more likely to be recovered from degraded or extremely small samples.
One of the few drawbacks of high throughput—if it can be called a drawback—is the mountain of data that it generates. “Greater automation is a big trend in the field,” said Mike Cariola, svp, forensic operations at The Bode Technology Group.
“We’ve noticed that, as we’ve developed automation to process samples in the lab, the bottleneck has shifted to data analysis—and that has become the new bottleneck of casework.”
Bode provides high-throughput DNA testing services, casework analysis, missing person identification, private and CODIS databanking of convicted offenders or arrestees, as well as paternity and nonforensic identification. “We do collections from crime scenes, handling about 6,000 cases a year, and about 100–200 thousand convicted offenders samples,” said Cariola.
“There are six million files in the U.S. database, and states are gradually taking on more of the processing. Although there is an increase in the number of players in this field, overall our numbers have generally increased, due in part to an increase in the number of backlog cases we take on, and also in part because of the global reach our company has.”
Cariola’s talk focused on the development of a forensic DNA case-management system to address these bottlenecks, as well as the increase in cold hits. He noted that the demands of the field put automation processes to the test.
“We deal with very individual samples of varying sizes and quality,” he explained. “Degraded and challenged samples are a problem to a laboratory information management system (LIMS). You put in 2,000 database samples, you have one application. Two thousand forensic casework samples all need different amplifications, are of varying quality, and have individual requirements. None of the LIMS providers out there meet this need, so we made our own.”
Bode developed Bode-SIMS, which helps standardize processes and integrates with robots to automate sample processing, analysis, and management. The product, customized to meet the varied needs of law-enforcement crime laboratories, enhances the quality control and efficiency of forensic DNA sample processing.
“This is the template most organizations are working from—building better analysis capability into software,” said Cariola. “A number of talks at the conference were on the software side, and one of the big trends we see emerging in casework is automating the process to interpret mixtures. And when you have complications in the sample, it’s a challenge for analysts to interpret that mixture.”
Building a Better Algorithm
Advances in automated mixture interpretation have been a key area of focus in the field by dint of necessity. Since higher throughput has generated more data, the pressure is on the human analyst to interpret the data as quickly as possible, which creates the bottleneck. Martin Bill, scientific manager of Forensic Science Service, and his team are looking to alleviate that problem with new algorithms and software technologies that enable automated processing of this information.
“The current search/interpretation approach requires DNA analysts to make decisions based on the presence or lack of certain alleles within a given profile,” said Bill. “DNA is visualized as a signal, and the role of the scientist is to translate that signal into a simple set of numbers that defines the DNA profile. The unnecessary use of binary decisions during the analysis of DNA profiles is not only wasteful, but susceptible to errors.”
The challenges are not insubstantial. The binary approach requires the scientist to make an absolute decision on something that is not absolute. Bill’s group has developed a continuous DNA-interpretation model that enables automated interpretation of complex DNA profiles, thus avoiding the need to make decisions in the early stages of the search and interpretation processes.
“It allows for improved precision and accuracy of databases for these particular profiles and permits a meaningful assignment of the evidential value of complex DNA evidence,” said Bill.
The algorithm uses probabilities, rather than early yes-or-no decisions, to drive databasing and evidential interpretation, which further standardizes DNA interpretation. It can also be used to improve the precision and accuracy of database searching, increase the percentage of data suitable for databasing, and improve court assessment.
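The contrast between a binary call and a continuous interpretation can be illustrated with a toy example. The sketch below is not the Forensic Science Service model, whose details the talk did not spell out here. The peak heights, the 50 RFU threshold, the logistic curve, and the function names are all assumptions chosen only to show how an absolute decision discards information that a graded weight keeps.

```python
import math

# Hypothetical peak heights (in RFU) at a single locus; illustrative values only.
peaks = {"12": 48.0, "14": 52.0, "15": 900.0}

THRESHOLD_RFU = 50.0  # assumed analytical threshold, not taken from the article


def binary_calls(peaks, threshold=THRESHOLD_RFU):
    """Absolute decision: each allele is either called or discarded."""
    return {allele: height >= threshold for allele, height in peaks.items()}


def continuous_weights(peaks, midpoint=THRESHOLD_RFU, scale=10.0):
    """Graded decision: map each peak height to a weight in (0, 1).

    A logistic curve is used purely for illustration; a real continuous
    model would be built from validated peak-height distributions.
    """
    return {
        allele: 1.0 / (1.0 + math.exp(-(height - midpoint) / scale))
        for allele, height in peaks.items()
    }


if __name__ == "__main__":
    print(binary_calls(peaks))
    print(continuous_weights(peaks))
```

In this toy data, the peaks at 48 and 52 RFU are nearly identical, yet the binary rule drops one and keeps the other. The graded weights for the two stay close to each other, so the uncertainty carries forward into later search and interpretation steps instead of being resolved up front.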
Bill presented some case studies of this algorithm in action. “The technology has come a long way—the degree to which we can get a signal from a very small or degraded sample has increased tremendously,” he reported. “However, we’re at the point where we need to improve the ability to interpret those signals, and we’re in a really good position to create those solutions. There is still more we can do to improve interpretation, and our group is working on an advanced theory for reading low level and compromised data. Our current portfolio is excellent, but this is not the endpoint. We are still looking for ways to do it better, faster, and cheaper.”