Project-Specific Technology and Development
There are nearly as many applications for high-content analysis as there are projects. MAIA Scientific (www.maia-scientific.com) is developing what it calls an intuitive data-mining application for use with its high-content fluorescence and bright-field imaging for high-throughput screening.
In another example, researchers at the National Changhua University of Education in Taiwan are combining multiparametric data mining with case-based reasoning to build a system for the diagnosis and prognosis of chronic diseases.
Off-the-shelf solutions aren’t necessarily optimal or available for all disciplines. Consequently, some researchers are building their own. Pfizer Research Technology Center (www.pfizerrtc.com) is using high-content data mining to predict drug-induced hepatotoxicity. Scientist Arthur Smith, Ph.D., and colleagues developed a database of drugs that were marketed and safe, along with therapies that failed because of toxicity.
Then, using text mining, high-content biology, and primary cells, they developed a database of toxicological and pharmacokinetic content. Multivariate analysis was used to develop a decision-tree algorithm to identify toxic drugs. The result provides a highly accurate, early toxicological screen, according to Dr. Smith. Savings have been substantial enough for the program to be expanded to other areas.
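To make the idea of a decision-tree screen concrete, here is a minimal sketch of how such a classifier can be grown from labeled compounds. It is a generic illustration, not the Pfizer group's algorithm, which the article does not describe in detail. The two features (a mitochondrial signal and a nuclear area ratio), every number in the training set, and all function names are hypothetical and chosen only for the example.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure group)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels):
    """Return (feature_index, threshold) that lowers weighted Gini, or None."""
    best, best_score = None, gini(labels)
    n = len(rows)
    for f in range(len(rows[0])):
        values = sorted({r[f] for r in rows})
        # Candidate thresholds are midpoints between adjacent observed values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score:
                best, best_score = (f, t), score
    return best

def build_tree(rows, labels, depth=0, max_depth=2):
    """Grow a tree: internal nodes are tuples, leaves are class labels."""
    split = best_split(rows, labels) if depth < max_depth else None
    if split is None:
        return Counter(labels).most_common(1)[0][0]  # leaf: majority class
    f, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return (
        f,
        t,
        build_tree([r for r, _ in left], [y for _, y in left], depth + 1, max_depth),
        build_tree([r for r, _ in right], [y for _, y in right], depth + 1, max_depth),
    )

def predict(tree, row):
    """Walk from the root to a leaf and return its class label."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree

# Hypothetical training data: [mitochondrial_signal, nuclear_area_ratio].
# These values are invented for illustration and are not measurements.
rows = [
    [0.90, 1.0], [0.80, 1.1], [0.85, 0.9],   # compounds labeled safe
    [0.30, 1.0], [0.40, 1.6], [0.20, 1.2],   # compounds labeled toxic
]
labels = ["safe", "safe", "safe", "toxic", "toxic", "toxic"]

tree = build_tree(rows, labels)
print(predict(tree, [0.25, 1.3]))
print(predict(tree, [0.95, 1.0]))
```

A tree of this kind is easy to inspect, since each branch is a readable threshold on one measured parameter, which is one reason decision trees are a common choice when the output must be explained to bench scientists.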
Seth Harris, Ph.D., research scientist II at Roche (www.roche.com), is another case in point. He is developing a multistructure data-mining application for x-ray crystallography. Traditionally, he says, structural biology would provide one or two structures in an area. Now, it’s feasible to determine 100 or more structures of a target complexed with various small molecules.
His application is “somewhere between back of the envelope and preliminary implementation,” he reports. The focus right now is to understand what’s important in the structure. Currently, computational chemists and crystallographers get together and analyze the structure, identifying the properties that are important in a given development project. “I want the computer to further facilitate that.”
Dr. Harris’ intention is to push the conceptual framework beyond simple distance-based analysis so that it yields increasingly sophisticated metrics, for example, a tabulation of electrostatic interactions between the protein and the ligand. Because the significance of similar interactions varies according to the protein environment in which they occur, determining the most important parameters is difficult, he explains. Data like that “is hard to tabulate into numbers.”
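The difference between a distance-based metric and an electrostatic one can be sketched in a few lines. The example below is an illustrative assumption, not Dr. Harris' application: the atom names, coordinates, partial charges, the 4.0 Å cutoff, and the fixed dielectric constant are all hypothetical. The electrostatic term is a plain Coulomb sum using the conversion factor commonly applied in molecular mechanics (about 332.06 kcal·Å/(mol·e²)).

```python
import math

# Approximate Coulomb conversion factor, kcal*Angstrom/(mol*e^2).
COULOMB_KCAL = 332.06

def contacts(protein, ligand, cutoff=4.0):
    """Distance-based metric: atom pairs within `cutoff` Angstroms."""
    pairs = []
    for p in protein:
        for l in ligand:
            d = math.dist(p["xyz"], l["xyz"])
            if d <= cutoff:
                pairs.append((p["name"], l["name"], d))
    return pairs

def coulomb_sum(protein, ligand, dielectric=4.0):
    """Electrostatic metric: pairwise Coulomb energy in kcal/mol.

    A single fixed dielectric is a crude assumption; it ignores the
    local protein environment that the text notes is hard to capture.
    """
    total = 0.0
    for p in protein:
        for l in ligand:
            d = math.dist(p["xyz"], l["xyz"])
            total += COULOMB_KCAL * p["q"] * l["q"] / (dielectric * d)
    return total

# Hypothetical atoms: coordinates in Angstroms, partial charges in e.
protein = [
    {"name": "ASP:OD1", "xyz": (0.0, 0.0, 0.0), "q": -0.8},
    {"name": "LEU:CD1", "xyz": (6.0, 0.0, 0.0), "q": 0.0},
]
ligand = [
    {"name": "N1", "xyz": (3.0, 0.0, 0.0), "q": 0.5},
    {"name": "C7", "xyz": (6.0, 0.0, 3.5), "q": 0.0},
]

close_pairs = contacts(protein, ligand)
energy = coulomb_sum(protein, ligand)
print(len(close_pairs), "contacts within 4.0 A")
print(f"Coulomb sum: {energy:.2f} kcal/mol")
```

In this toy case the distance metric reports three contacts of similar length, while the electrostatic sum is dominated by the single charged pair, which shows why contact counts alone can hide the interactions that matter most. Run over 100 or more structures of the same target, a table of such values per ligand is the kind of output a multistructure tool could compare.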
The application is conceived as a guide for chemists engaged in drug design, but it could also have merit as a data organizer. It would be particularly advantageous for those who are new to a program or who work on multiple projects, helping them discover and track the most pertinent or novel structures.