Fifteen tech-based companies and institutions have formed an alliance aimed at advancing DNA data storage by agreeing upon a “roadmap” of definitions and standards to help the industry achieve interoperability between solutions.
Illumina, Microsoft, Twist Bioscience, and data storage giant Western Digital are leading the effort as founding members of the DNA Data Storage Alliance. The four founding members and 11 other member companies and institutions have committed to addressing the explosive growth of digital data by establishing the foundations of a cost-effective commercial archival storage ecosystem.
That ecosystem, the Alliance asserts, could deliver a low-cost archival alternative to current storage technologies, which have limited longevity and require data migration to achieve long-term data storage.
By contrast, DNA provides a stable storage medium that is durable for thousands of years when properly stored, the Alliance says. DNA enables cost-effective and rapid duplication within a tiny space: Ten full-length digital movies can be stored within the equivalent volume of a single grain of salt. Digital data preserved in DNA can be encased in glass beads or stored in capsules or pellets.
The Alliance cited a Gartner projection that by 2024, 30% of digital businesses will mandate DNA storage trials to address the exponential growth of data that is poised to overwhelm existing storage technology.
“In collaboration with University of Washington, we have demonstrated a fully automated end-to-end system capable of storing and retrieving data from DNA, and we have separately stored 1GB of data in DNA synthesized by Twist and recovered data from it,” Karin Strauss, PhD, senior principal research manager at Microsoft, said in a statement.
“We’re encouraged by the potential for more sustainable data storage with DNA and look forward to collaborating with others in the industry to explore early commercialization of this technology,” Strauss added.
Strauss announced the formation of the Alliance today at the Flash Memory Summit (FMS) 2020 Virtual Conference & Expo, a three-day storage technology event ending today.
To store data in DNA, a data file is first converted from its digital sequence of 0’s and 1’s into a DNA sequence of A’s, C’s, T’s and G’s. The DNA data file is synthesized in short segments of DNA from 200 to 300 bases long, then stored. Each short segment contains an index to indicate its place within the overall data file. To retrieve the data, the segments are sequenced and decoded back into the original file.
The DNA indexing system allows part of the file to be biologically recovered, or “randomly accessed,” before sequencing, so only data of interest is sequenced. Error-correcting algorithms are used during the encode/decode process, enabling all data to be recovered error-free.
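The encode, index, and decode steps described above can be illustrated with a minimal Python sketch. It is not the Alliance's or any member's actual scheme: the two-bits-per-base mapping, the 8-base index, the 200-base payload, and every function name are assumptions chosen for illustration. The sketch also omits the error-correcting codes and the biological random-access step that real systems rely on.

```python
import random

# Assumed mapping, two bits per base: 00->A, 01->C, 10->G, 11->T.
BASES = "ACGT"
INDEX_BASES = 8      # hypothetical: 8 bases encode a 16-bit segment index
PAYLOAD_BASES = 200  # hypothetical payload length per segment


def bytes_to_bases(data: bytes) -> str:
    """Convert each byte into four bases, most significant bits first."""
    return "".join(
        BASES[(byte >> shift) & 0b11] for byte in data for shift in (6, 4, 2, 0)
    )


def bases_to_bytes(seq: str) -> bytes:
    """Inverse of bytes_to_bases; expects a length that is a multiple of 4."""
    out = bytearray()
    for i in range(0, len(seq), 4):
        byte = 0
        for base in seq[i:i + 4]:
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)


def encode(data: bytes) -> list[str]:
    """Split data into short segments, each prefixed with its index."""
    payload_bytes = PAYLOAD_BASES // 4
    segments = []
    for index, start in enumerate(range(0, len(data), payload_bytes)):
        chunk = data[start:start + payload_bytes]
        header = bytes_to_bases(index.to_bytes(2, "big"))
        segments.append(header + bytes_to_bases(chunk))
    return segments


def decode(segments: list[str]) -> bytes:
    """Read each segment's index, then reassemble the payloads in order."""
    indexed = {}
    for seg in segments:
        index = int.from_bytes(bases_to_bytes(seg[:INDEX_BASES]), "big")
        indexed[index] = bases_to_bytes(seg[INDEX_BASES:])
    return b"".join(indexed[i] for i in sorted(indexed))


message = b"DNA data storage demo. " * 10
segments = encode(message)
random.shuffle(segments)  # stored segments have no inherent order
assert decode(segments) == message
```

Because each segment carries its own index, the shuffled segments can still be reassembled into the original file, which is the role the article attributes to the index in each short segment.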
In addition to developing an industry roadmap, the Alliance plans to develop use cases in various markets and industries as well as promote adoption of this future solution through efforts to educate the broader data storage community.
Joining Illumina, Microsoft, Twist, and Western Digital as members of the Alliance are:
- Ansa Biotechnologies, a DNA synthesis service provider for synthetic biology research.
- CATALOG, developer of what it says is the world’s first DNA-based digital data storage and computation platform.
- The Claude Nobs Foundation, focused on digital preservation of the audiovisual collection of its namesake, the founder of the Montreux Jazz Festival.
- DNA Script, developer of SYNTAX™, the world’s first benchtop DNA printer powered by enzymatic technology.
- EPFL (École Polytechnique Fédérale de Lausanne) – Cultural Heritage & Innovation Center (Montreux Jazz Digital Project)
- ETH Zurich – The Swiss Federal Institute of Technology
- Interuniversity Microelectronics Centre (Imec), an R&D hub for nano- and digital technologies
- Iridia, established in 2016 to develop the world’s first commercially attractive, DNA-based data storage solution
- Molecular Assemblies, developer of an enzymatic DNA synthesis technology designed to power DNA-based products for industrial synthetic biology, precision medicine, and emerging applications that include DNA for data storage
- Molecular Information Systems Lab at the University of Washington (UW), a partnership between UW Computer Science, Electrical Engineering, and Microsoft Research
“DNA is an incredible molecule that, by its very nature, provides ultra-high-density storage for thousands of years,” said Emily M. Leproust, PhD, CEO and co-founder of Twist Bioscience. “By joining with other technology leaders to develop a common framework for commercial implementation, we drive a shared vision to build this new market solution for digital storage.”