Stretching the Pipeline
With the internet as your pipeline, anyone, anywhere can do the refining, distilling a model from the data using their particular skillset. The implications are tremendous—no matter the problem, specialized help is available and cheap!
Crowdsourcing solutions to technical problems through public contests has become very popular in recent years. While the million-dollar Netflix Prize for predicting movie ratings might be the most visible application of crowdsourced brilliance, it wasn't the first. In 2001, before "crowdsourcing" was even a word, Eli Lilly launched Innocentive to farm out difficult chemical syntheses. Innocentive has since spun off and expanded its challenges and prizes; though you'll still find things with lots of aromatic rings, the company also trumpets successful solutions in oil spill recovery, water purification, ALS biomarkers (a $1 million prize), and even flashlights that make life without electricity a little easier.
Competing in an Innocentive challenge is a lot like answering a call for proposals, because that's what it is. You don't know how you're doing until the judges look over the submissions, but such is the nature of these problems, which are mostly about design. Only data and modeling challenges lend themselves to automatic evaluation and regularly updated leaderboards, and that's where Netflix, and later Kaggle, come in.
The Netflix Prize ran only once (the winning entry beat the company's in-house algorithm by over 10%), but Kaggle, formed in 2010, has already closed 22 competitions and has six others running. Kaggle has good results to show off too: its users beat state-of-the-art models in every competition, including mapping dark matter for NASA and predicting HIV progression in patients.
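To make "automatic evaluation" concrete, here is a minimal sketch of how a leaderboard for a prediction contest could be scored. It assumes entries are ranked by root-mean-square error (RMSE) against held-out answers, which is the metric the Netflix Prize used; Kaggle competitions each define their own metric. The team names, ratings, and function names below are made up for illustration and do not come from any real contest or API.

```python
import math

def rmse(predicted, actual):
    """Root-mean-square error between two equal-length sequences."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

def leaderboard(submissions, actual):
    """Score every submission and rank by RMSE, lowest (best) first."""
    scores = {name: rmse(preds, actual) for name, preds in submissions.items()}
    return sorted(scores.items(), key=lambda item: item[1])

def percent_improvement(baseline_score, new_score):
    """How much lower (better) new_score is than baseline_score, in percent."""
    return 100.0 * (baseline_score - new_score) / baseline_score

# Hypothetical held-out ratings that only the organizer can see.
held_out_ratings = [4, 3, 5, 2, 1]

# Hypothetical entries: an in-house baseline plus two competing teams.
submissions = {
    "baseline": [3, 3, 3, 3, 3],
    "team_a": [4, 3, 4, 2, 2],
    "team_b": [5, 2, 5, 3, 1],
}

ranked = leaderboard(submissions, held_out_ratings)
baseline_score = dict(ranked)["baseline"]
for name, score in ranked:
    gain = percent_improvement(baseline_score, score)
    print(f"{name}: RMSE {score:.4f} ({gain:.1f}% better than baseline)")
```

Because scoring is just a function of the submitted predictions and the hidden answers, the organizer can rerun it every time a new entry arrives. A design proposal, by contrast, has no such function and has to wait for human judges.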