Kaggle generates revenue from sponsors who pay to host competitions. “If you think about where we’re most likely to make the majority of our money, it’s probably in financial services,” Goldbloom stated. That makes sense: The industry has the money to fund sophisticated number-crunching that can produce relevant predictive models.
“That’s probably where Kaggle will be most commercially successful, but we’re very much interested in doing biotech and life sciences,” he added. “Our community isn’t going to be very satisfied just solving problems for insurance companies or hedge funds all day long. They’re going to want to do meaningful work as well.”
The most lucrative competition award is the Heritage Health Prize. Its sponsor, Heritage Provider Network, will award a $3 million grand prize to the player who develops a breakthrough algorithm that uses patient data to predict which patients are most at risk of an in-patient stay over the next year, with the aim of preventing unnecessary hospitalizations. The competition ends April 3, 2013.
Kaggle also enjoyed the prestige of being selected by NASA and the Royal Astronomical Society for a competition that offered an all-expenses-paid trip to the Jet Propulsion Laboratory, valued at $3,000, to the player who could develop new algorithms for measuring the distortions in galaxy images caused by dark matter.
“The society and NASA have both come back to us and said they want to do more,” Goldbloom said. “We’re in discussions with them about running a fellowship where an astronomer will spend three months with Kaggle and a Ph.D. student. Given the success of this project, I think we’ll see much more happen in astronomy than in other scientific fields.”
Kaggle, which moved earlier this year from Melbourne, recently completed an $11.25 million Series A financing round led by Index Ventures and Khosla Ventures. Goldbloom said the proceeds would be used to scale up operations. “We want to get to the situation where we’re hosting 10,000 competitions a year.”
Kaggle also wants to grow its researcher community and its staff, which now stands at seven employees. “We would like to be roundabout, I’d say, 20 to 25 in a year’s time,” Goldbloom noted. “That will be a mix of technical, sales, data scientists, and developers.”
To reach its 10,000-competition goal, Kaggle will need not only to expand in lucrative areas like banking but also to make itself much better known in life science circles, a challenge Goldbloom recognizes. Research institutes and universities with a strong bioinformatics bent would likely help Kaggle find new players and new problems to solve.
Kaggle will also have to avoid missteps like the one in Wikipedia’s Participation Challenge, in which the training data was not re-sorted and exported to a new file. That challenge asked players to build a model predicting the number of edits an editor will make in the five months after the end date of the training dataset. Kaggle’s ability to learn from that misstep and expand its business will determine growth not only for the company but also for its field of predictive data modeling.