2022 journal article

DINOS: Data INspired Oligo Synthesis for DNA Data Storage

ACM JOURNAL ON EMERGING TECHNOLOGIES IN COMPUTING SYSTEMS, 18(3).

By: K. Volkel, K. Tomek, A. Keung & J. Tuck

author keywords: DNA information storage; DNA synthesis; DNA assembly algorithms
TL;DR: The approach offers up to 80% greater density than a prior general-purpose gene assembly technique, and an analysis of synthesis costs estimates that DINOS is as much as 105× cheaper than de novo synthesis. (via Semantic Scholar)
Source: Web Of Science
Added: December 5, 2022

As interest in DNA-based information storage grows, the cost of synthesis has been identified as a key bottleneck. A potential direction is to tune synthesis for data. Data strands tend to be composed of a small set of recurring code word sequences, and they contain longer sequences of repeated data. To exploit these properties, we propose a new framework called DINOS. DINOS consists of three key parts: (i) A hierarchical strand assembly algorithm, inspired by gene assembly techniques, that can assemble arbitrary data strands from a small set of primitive blocks. (ii) A novel formulation, on which the assembly algorithm relies, for constructing primitive blocks from a set of code words and overhangs, spanning a variety of useful configurations. Each primitive block is a code word flanked by a pair of overhangs, and the overhang pairs are created by a cyclic pairing process that keeps the number of primitive blocks small. Using these primitive blocks, a data strand of arbitrary length can, in theory, be assembled. We show a minimal system for a binary code with as few as six primitive blocks, and we generalize our processes to support an arbitrary set of overhangs and code words. (iii) An optimization that exploits the hierarchical assembly approach to identify redundant sequences and coalesce the reactions that create them, making assembly more efficient.
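
The three parts described in the abstract can be illustrated with a small sketch. It is not the authors' implementation, and it rests on assumptions the abstract does not state: three overhangs (labelled A, B, C here) paired cyclically as (A, B), (B, C), (C, A), and two code words (labelled W0, W1) for a binary code, which is one configuration that yields six primitive blocks. The labels, the function names, and the pairwise merge order are all hypothetical, and no chemistry (ligation, overhang complementarity) is modelled.

```python
from itertools import product

def primitive_blocks(code_words, overhangs):
    """Flank every code word with each cyclic overhang pair (o[i], o[(i+1) % n])."""
    n = len(overhangs)
    pairs = [(overhangs[i], overhangs[(i + 1) % n]) for i in range(n)]
    return [(left, word, right) for (left, right), word in product(pairs, code_words)]

def assemble(symbols, code_words, overhangs):
    """Choose one block per position so that adjacent overhangs line up."""
    n = len(overhangs)
    return [
        (overhangs[pos % n], code_words[s], overhangs[(pos + 1) % n])
        for pos, s in enumerate(symbols)
    ]

def hierarchical_reactions(strand):
    """Merge neighbouring fragments pairwise, level by level.

    Returns (total merges, distinct merge products). Identical products
    are candidates for coalescing into a single reaction.
    """
    level = [(block,) for block in strand]
    total, distinct = 0, set()
    while len(level) > 1:
        merged = []
        for i in range(0, len(level) - 1, 2):
            joined = level[i] + level[i + 1]
            total += 1
            distinct.add(joined)
            merged.append(joined)
        if len(level) % 2:  # an odd fragment is carried up unchanged
            merged.append(level[-1])
        level = merged
    return total, len(distinct)

# Assumed labels; real code words and overhangs would be nucleotide sequences.
CODE_WORDS = ["W0", "W1"]
OVERHANGS = ["A", "B", "C"]

blocks = primitive_blocks(CODE_WORDS, OVERHANGS)
strand = assemble([0, 1] * 6, CODE_WORDS, OVERHANGS)
total, distinct = hierarchical_reactions(strand)
print(len(blocks), total, distinct)
```

Under these assumptions the block set has six entries, and every block in the assembled strand is drawn from it no matter how long the input is. For the repeating input used here, the number of distinct merge products is smaller than the total number of merges, which is the kind of redundancy that part (iii) proposes to coalesce.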