Improving on Intel
In March 2010, CLC Bio released a de novo assembler that constructs whole genomes of any size, including human and plant genomes, on a single workstation computer. CLC Bio says that its de novo assembler algorithm runs 50 times faster than existing products, and deciphers complete datasets in just a few hours.
The company adds that its assembler requires 48 gigabytes of RAM, compared with the 300 gigabytes that other assemblers need. CLC Bio's software engineers accomplished this by creating new data-compression algorithms that take advantage of computing power that is built into Intel microprocessors but generally lies dormant.
All Intel microprocessors contain MMX™ technology, which runs many calculations in parallel. MMX, added in 1996, was intended for handling complex graphics but never caught on. However, “our skilled computer programmers realized it was an ideal technology for bioinformatics, which also runs lots of parallel calculations,” says Knudsen. The company then harnessed this built-in function to speed up bioinformatics calculations. “If you use our algorithm, you can crunch a lot more data,” he says.
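To give a sense of what "many calculations in parallel" means here, the short Python sketch below imitates the packed arithmetic that MMX performs in hardware: eight 8-bit values sit side by side in one 64-bit word and are all added in a single pass. This is only an illustration of the general idea and not CLC Bio's code; the function names (pack, unpack, packed_add) are hypothetical, and the software trick shown stands in for what the processor would do with one instruction.

```python
LANES = 8                      # eight 8-bit lanes, the width of one 64-bit MMX register
LOW7 = 0x7F7F7F7F7F7F7F7F      # mask for the low 7 bits of every lane
HIGH = 0x8080808080808080      # mask for the top bit of every lane


def pack(values):
    """Pack eight integers in 0..255 into one 64-bit word (lane 0 in the low byte)."""
    if len(values) != LANES:
        raise ValueError("expected exactly 8 values")
    word = 0
    for i, v in enumerate(values):
        if not 0 <= v <= 255:
            raise ValueError("each value must fit in 8 bits")
        word |= v << (8 * i)
    return word


def unpack(word):
    """Split a 64-bit word back into its eight 8-bit lanes."""
    return [(word >> (8 * i)) & 0xFF for i in range(LANES)]


def packed_add(a, b):
    """Add all eight lanes at once; each lane wraps around modulo 256."""
    # Adding only the low 7 bits means a carry can never spill into the next lane.
    low = (a & LOW7) + (b & LOW7)
    # Fold the top bit of each lane back in with XOR, which does not carry.
    return low ^ ((a ^ b) & HIGH)


if __name__ == "__main__":
    x = pack([1, 2, 3, 4, 250, 255, 128, 0])
    y = pack([10, 20, 30, 40, 10, 1, 128, 0])
    print(unpack(packed_add(x, y)))
```

One call to packed_add handles eight independent additions, which is the kind of saving that matters when the same small operation has to be repeated across very large sequence datasets.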
CLC Bio’s de novo assembler works through an intuitive, user-friendly graphical interface as well as a command-line interface, according to Knudsen. The assembler also combines datasets generated by different HTS instruments, including those sold by Illumina, Roche, and Applied Biosystems.