Exploring the Future of Big Data Storage

Floppy disks, CDs, USBs and memory cards. What’s next for storage?


This painting is of St. Jerome in his study by Vincenzo Catena, painted in 1510. Catena depicts Jerome seated at a tidy desk, accompanied by three books, a bookstand, and a simple candlestick. The study reflects his peaceful retreat from Rome to the desert for prayer and meditation. Unlike other depictions, the scene is uncluttered, emphasizing Jerome's focus and tranquillity as he reads the Bible. He reads comfortably, even removing his slippers. The image invites us into the room, with an open foreground and welcoming elements like the desk, space, cupboards, and open windows. What is clear here is openness and an invitation to study, learn and make the Bible more accessible through translation.

St. Jerome's efforts mirror the work scientists, engineers and educators do every day when handling complex, mountainous and often mysterious volumes of information and data.

Information about the world is being compressed further and further. Revolutions in AI require scientists and engineers to collect ever more data to train these systems. How can this future of big data be realised? What are the methods and the means for storing data? And will CDs, USBs and magnetic storage units become redundant?

We have been imprinting data onto materials since we were first able to carve and make tools. We have drawn on cave walls, printed on wood sheets, punched holes, and developed machines that read 1s and 0s to store, transcribe and translate information. Today, scientists across the world are exploring new materials to punch our data into, some of which have already been storing information for millions of years.


DNA, the oldest and quite possibly the first data storage unit, has emerged as a promising method for data storage due to its remarkable properties. Unbeknownst to the organisms that depend upon it, DNA has evolved over millions (possibly billions) of years into a robust storage unit. While DNA is primarily known as the molecule that carries genetic information in living organisms, its ability to store vast amounts of data has attracted the attention of researchers and technologists.

DNA possesses several characteristics that make it an attractive medium for storing digital information. Firstly, DNA is incredibly dense. It can store information at a density of around one petabyte (1 million gigabytes) per cubic millimetre. By this estimate, DNA is around a million times denser than traditional digital media such as hard drives or magnetic tapes.

Secondly, DNA has exceptional longevity. When stored in appropriate conditions, DNA can remain stable for thousands of years. This durability surpasses the lifespan of most current storage technologies, which are subject to degradation over time.

Additionally, DNA has the potential for massive scalability. The world's DNA synthesis capacity is continuously improving, allowing large amounts of data to be synthesized and stored in DNA strands. As technology advances, the cost of DNA synthesis is expected to decrease, making it more accessible for data storage purposes.

Schematic diagram of coding data onto DNA [1]

The process of storing data in DNA involves converting digital information into the four nucleotide bases that make up DNA: adenine (A), cytosine (C), guanine (G), and thymine (T). In the simplest schemes, each base represents a single bit (for example, A or C for 0 and G or T for 1). Because there are four bases, a single base can in principle carry up to two bits. By encoding data into a sequence of these bases, information can be stored in DNA. Writing the data relies on DNA synthesis, and reading it back relies on DNA sequencing.
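As a minimal sketch of the encoding idea, the Python below maps every pair of bits to one of the four bases and back again. The particular mapping (00 to A, 01 to C, 10 to G, 11 to T) and the function names are assumptions made for illustration only. Practical schemes add further constraints and error correction on top of a mapping like this.

```python
# Hypothetical two-bits-per-base mapping, chosen only for illustration.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode(data: bytes) -> str:
    """Turn bytes into a string of DNA bases, two bits per base."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(strand: str) -> bytes:
    """Turn a string of DNA bases back into the original bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


strand = encode(b"Hi")
print(strand)          # CAGACGGC
print(decode(strand))  # b'Hi'
```

Each byte becomes four bases, so the two-character message above turns into a strand of eight bases.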

Despite its potential, there are several challenges associated with using DNA as a data storage medium. One significant challenge is the high cost and time required for synthesizing and sequencing DNA. Currently, the process is relatively expensive and time-consuming, limiting its practical applications for large-scale data storage. However, ongoing research aims to develop more efficient and cost-effective methods for DNA synthesis and sequencing.

Another challenge is the error rate during DNA synthesis and sequencing. While modern DNA synthesis and sequencing techniques have low error rates, errors can still occur, potentially leading to data loss or corruption. Improving the accuracy of DNA read and write processes is an area of active research.
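One simple way to picture how redundancy can absorb such errors is to read the same strand several times and take a majority vote at each position. The sketch below assumes the reads are already aligned, have equal length, and contain only substitution errors. Real pipelines must also handle inserted and deleted bases, and the function name is hypothetical.

```python
from collections import Counter


def consensus(reads: list[str]) -> str:
    """Pick the most common base at each position across equal-length reads."""
    return "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*reads)
    )


# Three reads of the same strand; the second has one substitution error.
reads = ["CAGACGGC", "CAGTCGGC", "CAGACGGC"]
print(consensus(reads))  # CAGACGGC
```

With three reads, a single wrong base at any position is outvoted by the two correct ones.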

Furthermore, the process of retrieving specific data from DNA storage is currently slow and complex. The sequencing process can take a significant amount of time, and specialized algorithms are required to reconstruct the original data from the DNA sequence. Developing faster and more efficient retrieval methods is an ongoing area of research.

In conclusion, DNA has the potential to revolutionize data storage by offering unprecedented data density and longevity. While there are still challenges to overcome, ongoing research and technological advancements are bringing us closer to realizing the full potential of DNA as a reliable and efficient data storage method. As the field progresses, DNA-based storage may become a viable solution for long-term and high-density data storage needs.

Holographic Crystal Storage

Holographic crystals have been explored as a potential method for data storage, leveraging the principles of holography to store and retrieve information. Holography is a technique that allows the storage and reconstruction of three-dimensional images using the interference patterns of light waves. By adapting this concept to data storage, holographic crystals offer the possibility of high storage density and fast data access.

Microsoft Research's lithium-niobate crystal

In holographic data storage, a laser beam is split into two parts: the signal beam and the reference beam. The signal beam, which carries the data to be stored, is directed onto a photosensitive material, typically a crystal such as lithium niobate, or a photopolymer. The reference beam is directed onto the same region of the material. Where the two beams meet, they create an interference pattern that is recorded in the material as a hologram.

Apparatus of the holographic storage process. [2]

One of the key advantages of holographic data storage is its potential for high storage density. Holographic crystals can store multiple holograms in different regions or angles within the same volume of material. This enables a significant increase in storage capacity compared to traditional storage media. Additionally, holographic storage allows for parallel access to data, enabling fast retrieval times.

Holographic data storage is a promising technology for high-capacity data storage. Unlike magnetic and optical storage devices that store bits as distinct changes on the surface, holographic storage records information throughout the medium's volume. It can store multiple images in the same space by utilizing light at various angles.

Furthermore, while traditional storage methods record data sequentially, holographic storage allows for the simultaneous recording and reading of millions of bits. This parallel approach enables faster data transfer rates than conventional optical storage. Overall, holographic data storage has the potential to provide greater storage capacity and improved data access speed.

Brian Roemmele is a key advocate for advancing this technology in the development of personalised AI devices. His work aims to create AI chat devices that store the user's personal information locally rather than in the cloud, where privacy and security may be jeopardised.

In theory, holographic storage has the capability to store one bit of data per cubic block, where the size of the block corresponds to the wavelength of the writing light. To illustrate, using a helium-neon laser that emits red light with a wavelength of 632.8 nm, the ideal holographic storage system could potentially accommodate 500 megabytes of information within a cubic millimetre of space.
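The arithmetic behind that figure can be checked with a short calculation. This sketch assumes exactly one bit per cube whose side equals the wavelength, as described above. The function name is hypothetical.

```python
def ideal_capacity_bytes_per_mm3(wavelength_nm: float) -> float:
    """Bytes per cubic millimetre at one bit per wavelength-sized cube."""
    cubes_per_side = 1e6 / wavelength_nm  # 1 mm = 1e6 nm
    bits = cubes_per_side ** 3            # cubes in one cubic millimetre
    return bits / 8                       # 8 bits per byte


capacity = ideal_capacity_bytes_per_mm3(632.8)
print(f"{capacity / 1e6:.0f} MB per cubic millimetre")
```

About 1,580 wavelength-sized cubes fit along each millimetre, which gives roughly 3.9 billion bits, or a little under 500 megabytes, per cubic millimetre.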

However, holographic crystals are not without limitations. One crucial aspect is their degradation over time, which can impact the long-term stability and reliability of stored data. Holographic media, particularly photopolymer-based ones, can suffer from a phenomenon known as "hologram decay" or "fading." This degradation can lead to a loss of stored information or a decrease in the quality of the reconstructed data. As of today, it leaves the usable long-term storage capacity of the crystal uncertain, whereas the CD remains a reliable short-term storage method.

The degradation half-life refers to the time it takes for the stored information to degrade by half. In the case of holographic crystals, the degradation half-life can vary depending on the specific material, recording parameters, and storage conditions. It is an important consideration when evaluating the suitability of holographic crystals for long-term data storage.
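To make the idea of a half-life concrete, the sketch below computes the fraction of stored signal remaining after a given time. It assumes simple exponential decay, which is an idealisation, and the ten-year half-life in the example is a made-up value rather than a measured one.

```python
def fraction_remaining(elapsed_years: float, half_life_years: float) -> float:
    """Fraction of the stored signal left, assuming exponential decay."""
    return 0.5 ** (elapsed_years / half_life_years)


# Hypothetical half-life of 10 years, for illustration only.
for years in (10, 20, 30):
    print(years, fraction_remaining(years, half_life_years=10))
```

Under this assumption the signal halves with every half-life that passes: one half remains after 10 years, one quarter after 20, and one eighth after 30.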

To mitigate degradation and improve the longevity of holographic crystals, researchers are investigating various approaches. These include optimizing the composition and structure of the crystal materials, refining the recording and reading processes, and developing protective layers or encapsulation techniques to shield the crystals from environmental factors that contribute to degradation.

It's worth noting that holographic data storage is still an evolving technology, and research efforts are ongoing to address its challenges and improve its performance. While holographic crystals hold promise for high-density data storage, achieving long-term stability and mitigating degradation remain active areas of investigation.

Overall, the journey towards realizing the future of big data storage involves continued exploration, innovation, and refinement of storage technologies. By leveraging the strengths of emerging methods and addressing their limitations, we can pave the way for efficient, secure, and long-lasting data storage solutions.


  1. De Silva PY, Ganegoda GU. New Trends of Digital Data Storage in DNA. Biomed Res Int. 2016;2016:8072463. doi: 10.1155/2016/8072463. Epub 2016 Sep 5. PMID: 27689089; PMCID: PMC5027317.

  2. Nobukawa T, Barada D, Nomura T, Fukuda T. Orthogonal polarization encoding for reduction of interpixel cross-talk in holographic data storage. Opt Express. 2017;25:22425. doi: 10.1364/OE.25.022425.