Skip to content

spaceth/goldenrecord

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

41 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

The Golden Record

or (DNA Encoding Algorithm for the Project Molecular Encoded Storage for Space Exploration)

Project Overview

The Molecular Encoded Storage for Space Exploration (MESSE) explores the possibility of preserving human knowledge in space DNA by implementing an algorithm to encode the musical notation into DNA storage and launch the pieces of DNA into space on a sub-orbital flight.

Process

Text-based Musical Notation Scheme

The notation is a concatenation of musical notes defined using two fundamental music note conditions, i.e., pitch and duration, in the form of string. The string is then compressed using the tendency of repeat patterns in a musical score which gives the capability to define a symbol used in the compressed result instead of long repeats chunks. Subsequently, The musical notation string is mapped to a binary string.

Optimized Musical Notation to Binaries

Similar to character encoding schemes in computers, such as ASCII or UTF-8, six bits are used to define one musical notation element. Since it is enough to store four-octaves notes, and essential musical components, which are enough for a proof of concept experiment and can easily be referred to as and converted into three nitrogenous bases. The larger n-bits size notation can be discussed in the future.

Error Correction Algorithm

By conditions and limitations of DNA Storage methods, there is a probability of mutations in the strand. Therefore, the usage of the error correction scheme is advised by a number of recent papers. As Reed-Solomon algorithm was utilized in the nature protocol earlier this year, we also employ this correction scheme using cho45's Javascript implementation of ZXing Reed-Solomon library.

Randomization

To strengthen the bond of the strand, we average the content of four types of nitrogenous bases by XOR the binaries content with another pre-created binary string. The Pseudorandom Number Generator (PRNG) is used to create a binary string, we select the xorshift algorithm to do the procedure, in contrast to the Mersenne Twister in the aforementioned protocol.

Conversion to Nitrogenous Bases

The binaries will be mapped into four types of the nitrogenous base in DNA: Adenine, (A) Thymine (T), Cytosine (C), and Guanine (G). Two bits of string are converted to one base.

Sample

In the MESSE project, we stored the song ความฝันกับจักรวาล (The dream and the universe) by a Thai famous rock band Bodyslam. By using such procedures, we successfully encoded the musical notation of the song within 640 Base Pairs of a DNA strand.

We use 9111934 and 20121996 which can be referred to as Carl Sagan's Birthdate and Date of Death as a seed for a randomization algorithm. As a tribute to him, in which many of his projects and writings ideate this work, and also have great influences as an inspiration and a motivation for our team.

(It may be noted that the distribution of 01 content the seed generated is tested before the usage in the final product, and the result are in acceptable rate)

The final algorithmic output was cut into three smaller DNA strands with the addition of an overlapping part to thereafter conduct biomolecular experiments such as the Gibson Assembly, prior to and following the sub-orbital launch which will take place on Blue Origin NS-13 in which DNA strands will be exposed to microgravity, radiation, and another space environment variables.

References

About

one aesthetic way to store musical notes into nucleic acid sequence

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published