Chemical code used to store Jane Austen quote in plastic molecules

Written words and other information could be encoded in synthetic molecules and recovered by analysing the chemicals.

This means that microscopic pieces of plastic could potentially hold far more data than today’s computer hard drives, which use cumbersome codes and relatively large magnetic particles to store information, says Eric Anslyn of the University of Texas at Austin.

Currently, data is stored using binary code – long strings of 0s and 1s. Its simplicity makes the code easy to decipher, but it takes up significant space on a hard drive, says Anslyn.

His approach may be a space saver, although the original aim wasn’t to encode data at all. Anslyn had been trying to create complex molecules that could improve products such as pharmaceuticals and dishwasher detergents.

However, when he discussed his work with computer programmer friends, Anslyn realised that the compounds he was working with – made from components including hydrogen, nitrogen, oxygen and the hydrogen isotope deuterium – could each represent symbolic values for storing information.

Various molecules built from these could form their own code language based on a rich “molecular alphabet” of 16 characters – a hexadecimal code. That’s eight times as many characters as the binary system uses, making the approach particularly efficient for storing data.
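To put a number on that efficiency, here is a minimal sketch of the arithmetic. It assumes each text character occupies one 8-bit byte, which the article does not specify, and the function name is illustrative only. A 16-character alphabet has eight times as many symbols as binary, but each symbol carries four bits rather than one, so a message needs a quarter as many symbols.

```python
import math


def symbols_needed(message: str, alphabet_size: int) -> int:
    """Count the symbols needed to store a message in a given alphabet.

    Assumption: the message is stored as UTF-8 bytes (8 bits each).
    """
    bits_per_symbol = math.log2(alphabet_size)  # 1 for binary, 4 for hexadecimal
    total_bits = 8 * len(message.encode("utf-8"))
    return math.ceil(total_bits / bits_per_symbol)


message = "Hello World!"
binary_count = symbols_needed(message, 2)
hex_count = symbols_needed(message, 16)
print(binary_count, hex_count)  # prints: 96 24
```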


What is more, the liquid chromatography-mass spectrometry (LC/MS) analytical system he had been using could easily analyse and sequence such complex substances.

Inspired by the possibilities, Anslyn’s team developed software that would encode ordinary text characters into a hexadecimal “molecular language”. Then they created molecules representing the code needed to write a simple statement: “Hello World!”.
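The article does not describe how the team’s software works, but the general idea of turning text into a 16-symbol code can be sketched as follows. This is an illustration under stated assumptions, not the researchers’ program: it assumes UTF-8 text with two hexadecimal digits per byte, and the labels M0 to M15 are hypothetical stand-ins for the 16 molecular building blocks.

```python
HEX_DIGITS = "0123456789abcdef"

# Hypothetical labels for the 16 building blocks; the real chemistry differs.
MONOMERS = [f"M{i}" for i in range(16)]


def text_to_monomers(text: str) -> list[str]:
    """Encode text as a sequence of building-block labels, one per hex digit."""
    hex_string = text.encode("utf-8").hex()  # e.g. "H" -> "48"
    return [MONOMERS[int(digit, 16)] for digit in hex_string]


def monomers_to_text(monomers: list[str]) -> str:
    """Decode a sequence of building-block labels back into text."""
    hex_string = "".join(HEX_DIGITS[MONOMERS.index(m)] for m in monomers)
    return bytes.fromhex(hex_string).decode("utf-8")


encoded = text_to_monomers("Hello World!")
print(encoded[:4])                # ['M4', 'M8', 'M6', 'M5'], i.e. "H" then "e"
print(monomers_to_text(encoded))  # Hello World!
```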

Several molecules were needed to store the message, so to keep them in the right order when reading it back, the team used a plate containing a regular array of wells and placed the molecules in the wells sequentially – a bit like the way a mechanical hard drive uses physical location to store a computer’s data.
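The role of physical position can be illustrated with a short sketch. The plate size (8 rows by 12 columns) and the number of symbols per molecule (8) are assumptions chosen for the example, not figures from the study. The message is split into chunks, each chunk stands for one molecule, and the wells are filled and read back in a fixed order.

```python
from string import ascii_uppercase


def well_names(rows: int = 8, cols: int = 12) -> list[str]:
    """List well names row by row: A1, A2, ..., A12, B1, ... (assumed layout)."""
    return [f"{ascii_uppercase[r]}{c}" for r in range(rows) for c in range(1, cols + 1)]


def layout_on_plate(hex_string: str, symbols_per_molecule: int = 8) -> dict[str, str]:
    """Split the code into chunks (one per molecule) and assign wells in order."""
    chunks = [
        hex_string[i:i + symbols_per_molecule]
        for i in range(0, len(hex_string), symbols_per_molecule)
    ]
    wells = well_names()
    if len(chunks) > len(wells):
        raise ValueError("message needs more wells than the plate has")
    return dict(zip(wells, chunks))


def read_plate(plate: dict[str, str]) -> str:
    """Read the wells back in plate order and join the chunks."""
    return "".join(plate[w] for w in well_names() if w in plate)


hex_message = "Hello World!".encode("utf-8").hex()
plate = layout_on_plate(hex_message)
print(plate)  # {'A1': '48656c6c', 'A2': '6f20576f', 'A3': '726c6421'}
print(bytes.fromhex(read_plate(plate)).decode("utf-8"))  # Hello World!
```

Because the order lives in the well positions rather than in the molecules themselves, no extra address information has to be encoded chemically in this sketch.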

Encouraged by how easily the program reconstructed the words after the molecules had been sequenced with LC/MS, the researchers moved on to a more complex sentence.

An avid Jane Austen fan, Anslyn chose what he describes as an “apt but timeless quote” from his favourite author’s 1814 novel Mansfield Park: “If one scheme of happiness fails, human nature turns to another; if the first calculation is wrong, we make a second better: we find comfort somewhere.”

The researchers gave a chemically encoded version of the sentence to a colleague who wasn’t involved in the project. Armed with the new software, the colleague successfully read the Austen quote in full.

Other teams have previously developed prototypes of molecular storage, but using binary. Anslyn says the hexadecimal version has “mind-blowing” potential for storing data in a smaller physical space – partly because the basic idea of the molecular code itself is so simple and familiar.

“We always write in symbols, and molecules are just another group of symbols that people can assemble – not merely for building molecules analogous to those within nature, but to create our very own inventions,” he says. And, apparently, to store the literary inventions of 19th-century novelists.

Journal reference: Cell Reports Physical Science, DOI: 10.1016/j.xcrp.2021.100393
