Almost all respondents to a recent GEN poll think a predictive data-modeling contest would be of at least some use to them. With 60% saying it would be very helpful and another 33.3% saying somewhat helpful, respondents are clearly attracted to open competitions like the one hosted by Kaggle to develop new algorithms for predicting the progression of HIV.
In that competition, a self-taught data miner from Baltimore outsmarted a team from IBM’s Thomas J. Watson Research Center to capture the $500 prize. The contest shows how predictive data modeling could, and should, serve as a model for tackling some of the toughest problems in bioinformatics. Yet just 8% of the Kaggle scientific community has a background in bioinformatics, biostatistics, or computational biology. And biopharma firms and institutes remain concerned about making data public and about not being able to fully capitalize on the outcome of such a contest.