Friday 17 August 2012.

A group of US scientists have encoded a 53,000-word e-book - including 11 images and a computer program - entirely in DNA. The result suggests DNA could become an option for storing large amounts of data.


This is the largest amount of data yet stored artificially in chemically synthesised deoxyribonucleic acid, or DNA.

The book, program and images had to be converted to a sequence of 5.27 million zeroes and ones, which ended up as 54,898 strands of nucleotides, the building blocks of DNA.
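As a rough check on those figures, the sketch below divides the quoted bit count by the quoted strand count. It assumes the data was spread evenly across the strands and that the rounded 5.27 million figure counts only the stored data; the article confirms neither, and the variable names are illustrative.

```python
# Figures quoted in the article (the bit count is rounded).
TOTAL_BITS = 5_270_000   # "5.27 million zeroes and ones"
STRANDS = 54_898         # number of synthesised strands

# Assumption: every strand carries the same share of the data.
bits_per_strand = TOTAL_BITS / STRANDS

print(f"about {bits_per_strand:.1f} data bits per strand")
```

On these assumptions each strand would carry roughly 96 bits of the book, which is why tens of thousands of short strands were needed.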

After the information was encoded, drops of DNA were attached to a solid surface known as a microarray chip. These chips were kept at four degrees Celsius for three months before being dissolved and sequenced to re-read the data.

Each copy of each strand was sequenced up to 3,000 times to check the reliability of the information. There were only 10 read errors, and multiple copies of each block of data were synthesised as part of the project to help correct such errors.


The book can be decoded using DNA sequencing techniques commonly available in university-level science laboratories and hospitals.

The three researchers from Harvard Medical School stated in an article published in Science that "DNA is among the most dense and stable information media known". The biological molecules that make up DNA will always remain readable and will not become obsolete.

Because DNA sequencing is slow and costly, the method is not suitable for data which has to be accessed and changed repeatedly, but it could be an effective long-term option for archiving large amounts of information.

Scientists estimate that one gram of DNA can store up to 455 billion gigabytes of data: more than 100 billion DVDs.
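The DVD comparison can be sanity-checked with one division. The sketch below uses only the two figures quoted above and works out the per-disc capacity they imply; the variable names are illustrative.

```python
# Figures quoted in the article.
GIGABYTES_PER_GRAM = 455e9   # "455 billion gigabytes"
DVD_COUNT = 100e9            # "more than 100 billion DVDs"

# Capacity per disc implied by the comparison.
implied_gb_per_dvd = GIGABYTES_PER_GRAM / DVD_COUNT

print(f"implied capacity: {implied_gb_per_dvd:.2f} GB per DVD")
```

The implied figure of about 4.55 GB is close to the nominal 4.7 GB of a single-layer DVD, so the comparison is in the right range.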

But the Harvard project did not involve any living organisms, which would have introduced a myriad of complex factors and risks, so carrying around an armful of data - literally - is still some way from becoming a reality.


Book written in DNA code

Scientists who encoded the book say it could soon be cheaper to store information in DNA than in conventional digital devices

Geraint Jones, Thursday 16 August 2012

Book of life: DNA is the ultimate compact storage medium. Photograph: Alamy

Scientists have for the first time used DNA to encode the contents of a book. At 53,000 words, and including 11 images and a computer program, it is the largest amount of data yet stored artificially using the genetic material.

The researchers claim that the cost of DNA coding is dropping so quickly that within five to 10 years it could be cheaper to store information using this method than in conventional digital devices.

Deoxyribonucleic acid or DNA – the chemical that stores genetic instructions in almost all known organisms – has an impressive data capacity. One gram can store up to 455bn gigabytes: the contents of more than 100bn DVDs, making it the ultimate in compact storage media.

A three-strong team led by Professor George Church of Harvard Medical School has now demonstrated that the technology to store data in DNA, while still slow, is becoming more practical. They report in the journal Science that the 5.27 megabit collection of data they stored is more than 600 times bigger than the largest dataset previously encoded this way.

Writing the data to DNA took several days. "This is currently something for archival storage," explained co-author Dr Sriram Kosuri of Harvard's Wyss Institute, "but the timing is continually improving."

DNA has numerous advantages over traditional digital storage media. It can be easily copied, and is often still readable after thousands of years in non-ideal conditions. Unlike ever-changing electronic storage formats such as magnetic tape and DVDs, the fundamental techniques required to read and write DNA information are as old as life on Earth.

The researchers, who have filed a provisional patent application covering the idea, used off-the-shelf components to demonstrate their technique.

To maximise the reliability of their method, and keep costs down, they avoided the need to create very long sequences of code – something that is much more expensive than creating lots of short chunks of DNA. The data was split into fragments that could be written very reliably, and was accompanied by an address book listing where to find each code section.

Digital data is traditionally stored as binary code: ones and zeros. Although DNA offers the ability to use four "numbers": A, C, G and T, to minimise errors Church's team decided to stick with binary encoding, with A and C both indicating zero, and G and T representing one.
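The mapping described above can be sketched in a few lines of Python. This is an illustration, not the team's software: the function names are hypothetical, and picking at random between the two bases allowed for each bit is an assumption, since the article does not say how that choice was made.

```python
import random

ZERO_BASES = "AC"  # A and C both stand for 0
ONE_BASES = "GT"   # G and T both stand for 1


def bits_to_dna(bits, rng=None):
    """Turn a string of '0'/'1' characters into a string of DNA letters."""
    if rng is None:
        rng = random.Random(0)  # fixed seed so the sketch is repeatable
    letters = []
    for bit in bits:
        if bit not in "01":
            raise ValueError(f"not a bit: {bit!r}")
        # Assumption: choose at random between the two bases for this bit.
        letters.append(rng.choice(ONE_BASES if bit == "1" else ZERO_BASES))
    return "".join(letters)


def dna_to_bits(dna):
    """Recover the bit string: A/C read as 0, G/T read as 1."""
    bits = []
    for base in dna:
        if base in ZERO_BASES:
            bits.append("0")
        elif base in ONE_BASES:
            bits.append("1")
        else:
            raise ValueError(f"not a DNA base: {base!r}")
    return "".join(bits)


def text_to_bits(text):
    """Encode text as UTF-8 and write each byte as eight bits."""
    return "".join(f"{byte:08b}" for byte in text.encode("utf-8"))


def bits_to_text(bits):
    """Inverse of text_to_bits."""
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return data.decode("utf-8")


strand = bits_to_dna(text_to_bits("DNA"))
print(strand, "->", bits_to_text(dna_to_bits(strand)))
```

Because two letters are available for every bit, many different DNA strings decode to the same data. The scheme gives up half of DNA's theoretical capacity in exchange for that freedom.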

The sequence of the artificial DNA was built up letter by letter using existing methods, with the string of As, Cs, Ts and Gs coding for the letters of the book.

The team developed a system in which an inkjet printer embeds short fragments of that artificially synthesised DNA onto a glass chip. Each DNA fragment also contains a digital address code that denotes its location within the original file.

The fragments on the chip can later be "read" using standard techniques of the sort used to decipher the sequence of ancient DNA found in archaeological material. A computer can then reassemble the original file in the right order using the address codes.
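The addressing idea can be illustrated with a minimal sketch that works on bit strings rather than real DNA. The block size, address length and function names are all hypothetical; the article does not give the values the team used.

```python
import random


def split_with_addresses(bits, block_size=8, address_bits=4):
    """Cut a bit string into blocks and prefix each with its binary index."""
    if len(bits) > block_size * 2 ** address_bits:
        raise ValueError("message too long for this address length")
    fragments = []
    for index, start in enumerate(range(0, len(bits), block_size)):
        address = format(index, f"0{address_bits}b")
        fragments.append(address + bits[start:start + block_size])
    return fragments


def reassemble(fragments, address_bits=4):
    """Sort fragments by address, then strip the addresses and join."""
    ordered = sorted(fragments, key=lambda frag: int(frag[:address_bits], 2))
    return "".join(frag[address_bits:] for frag in ordered)


message = "0100010001001110010000010010000100000000"  # arbitrary 40 bits
fragments = split_with_addresses(message)

# Sequencing returns fragments in no particular order; simulate with a shuffle.
random.Random(1).shuffle(fragments)

print(reassemble(fragments) == message)
```

Since every fragment carries its own position, the order in which fragments come off the sequencer does not matter.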

The book – an HTML draft of a volume co-authored by the team leader – was written to the DNA with images embedded to demonstrate the storage medium's versatility.

DNA is such a dense storage system because it is three-dimensional. Other advanced storage media, including experimental ones such as positioning individual atoms on a surface, are essentially confined to two dimensions.

The work did not involve living organisms, which would have introduced unnecessary complications and some risks. The biological function of a cell could be affected and portions of DNA not used by the cell could be removed or mutated. "If the goal is information storage, there's no need to use a cell," said Kosuri.

The data cannot be overwritten but, given the storage capacity, that is seen as a minor issue. The exercise was not completely error-free, but of the 5.27m bits stored, only 10 were found to be incorrect. The team suggests common error-checking techniques could be implemented in future, including multiple copies of the same information so mistakes can be easily identified.
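One common way to use multiple copies is a majority vote at each position. The sketch below shows the idea on bit strings; it assumes the copies are already aligned and of equal length, and it is not a description of the team's own pipeline. The function name and sample reads are made up for illustration.

```python
from collections import Counter


def consensus(reads):
    """Return the most common symbol at each position across all reads."""
    if not reads:
        raise ValueError("need at least one read")
    length = len(reads[0])
    if any(len(read) != length for read in reads):
        raise ValueError("reads must all have the same length")
    # zip(*reads) yields one column of symbols per position.
    return "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*reads)
    )


# Five copies of the same block; two contain a single wrong bit each.
reads = [
    "0110100",
    "0110100",
    "0100100",  # error in the third position
    "0110101",  # error in the last position
    "0110100",
]

print(consensus(reads))
```

With an odd number of copies of a binary message there is always a clear winner at each position. With an even number a tie is possible, and this sketch would then simply pick the symbol it saw first.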

The costs of DNA-handling tools are not yet competitive enough to make this a large-scale storage medium. But the costs and scale of the tools are dropping much more quickly than their electronic equivalents. For example, handheld DNA sequencers are becoming available, which the authors suggest should greatly simplify reading information stored in DNA.

Kosuri foresees this revolution in DNA technologies continuing. "We may hit a wall, but there's no fundamental reason why it shouldn't continue."


Soon you’ll be backing up your hard drive using DNA

Think the memory card in your camera is high-capacity? It’s got nothing on DNA. With data accumulating at a faster rate now than at any other point in human history, scientists and engineers are looking to genetic code as a form of next-generation digital information storage.

