Reference Genome

Script(s)/bots that do something with the Human Genome Project's nucleotide sequences

I've downloaded the Genome Reference Consortium Human Build 38 patch release 1 from the National Center for Biotechnology Information (thanks @vogon for pointing me to this!). I was using the one from Project Gutenberg, but that was only build 34. It is in the FASTA format.

I know very little about DNA and the Human Genome Project, but since Project Gutenberg has nucleotide sequences from the Genome Project, I thought I'd try to come up with interesting ways to look at them.

As best I can tell, they're in the FASTA format. I've taken a file (starting with Chromosome 1) and stripped out the Project Gutenberg text at the top, as well as the first identification line, so that I'm left with only the nucleotides. There are large sections that contain only the letter N, which (according to the FASTA format) seems to mean an unknown nucleotide. The other characters map to Adenine, Cytosine, [Guanine](http://en.wikipedia.org/wiki/Guanine), and [Thymine](http://en.wikipedia.org/wiki/Thymine).
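The stripping step described above can be sketched roughly as follows. This is only an illustration, not the code used in this repository: the function names `iter_bases` and `count_bases` and the sample record are made up for the example. It skips any line starting with `>` (the FASTA identification line) and upper-cases each letter, on the assumption that some downloads use lower-case letters for parts of the sequence.

```python
from collections import Counter
from typing import Iterable, Iterator


def iter_bases(lines: Iterable[str]) -> Iterator[str]:
    """Yield one upper-case letter at a time, skipping FASTA header lines."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith(">"):
            continue  # identification line or blank line, not sequence data
        for base in line.upper():
            yield base


def count_bases(lines: Iterable[str]) -> Counter:
    """Tally how often each letter (A, C, G, T, N, ...) occurs."""
    return Counter(iter_bases(lines))


# Hypothetical miniature record standing in for a real chromosome file.
sample = [
    ">chr1 hypothetical example record",
    "NNNNNNNNNN",
    "acgtACGTNN",
]
counts = count_bases(sample)
print(counts["N"], counts["A"], counts["C"], counts["G"], counts["T"])
```

For a real file, the same functions should work on an open file handle, e.g. `count_bases(open(path))`, since they read line by line instead of loading the whole chromosome into memory.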

~~All of the sequences can be downloaded from Project Gutenberg so I'm excluding them from this repository since they're rather large.~~

Twitter Image Bot

The first thing I've tried to do is build a Twitter bot that tweets images of portions of the DNA sequence. It takes 28,419 nucleotides at a time and builds an 840x840 image. Each nucleotide it finds maps to a colored 5x5 square. The bot tweets a section every hour. At that rate it will take about a year to finish all 248,956,422 nucleotides (8,760 images).
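The numbers in that paragraph can be checked with a few lines of arithmetic. This is a sketch using only the figures quoted above; the helper `images_needed` is a name invented for the example. One thing it brings out: an 840x840 image made of 5x5 squares has room for 168 x 168 = 28,224 squares, slightly fewer than 28,419, so how the remainder is handled depends on the actual script and is not covered here.

```python
HOURS_PER_YEAR = 365 * 24   # one tweet per hour for a non-leap year
BASES_PER_IMAGE = 28_419    # chunk size quoted in the text
IMAGE_SIDE_PX = 840         # the image is 840x840 pixels
SQUARE_SIDE_PX = 5          # each letter becomes a 5x5 square

squares_per_side = IMAGE_SIDE_PX // SQUARE_SIDE_PX
grid_capacity = squares_per_side ** 2
bases_per_year = HOURS_PER_YEAR * BASES_PER_IMAGE


def images_needed(total_bases: int, bases_per_image: int = BASES_PER_IMAGE) -> int:
    """Number of images needed to cover total_bases, rounding up."""
    return -(-total_bases // bases_per_image)  # integer ceiling division


print(HOURS_PER_YEAR)    # 8760
print(squares_per_side)  # 168
print(grid_capacity)     # 28224
print(bases_per_year)    # 248950440
```

So 8,760 hourly images of 28,419 letters each cover roughly 248.95 million letters, which is where the "about a year" estimate comes from.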

This is the image.py script.
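To give a feel for the idea without reproducing image.py, here is a minimal, dependency-free sketch of turning a run of letters into an image of colored squares. Everything in it is an assumption made for illustration: the palette, the function name `render_ppm`, and the choice to write a binary PPM file instead of using an imaging library (the real script may well do it differently).

```python
# Hypothetical colors; the palette actually used by image.py is not stated here.
PALETTE = {
    "A": (0, 160, 0),
    "C": (0, 0, 200),
    "G": (230, 180, 0),
    "T": (200, 0, 0),
    "N": (128, 128, 128),
}
BACKGROUND = (0, 0, 0)


def render_ppm(bases: str, side_px: int = 840, square_px: int = 5) -> bytes:
    """Return a binary PPM (P6) image with one square per letter, row by row."""
    per_row = side_px // square_px
    # Background-filled pixel buffer, 3 bytes (RGB) per pixel.
    pixels = bytearray(bytes(BACKGROUND) * (side_px * side_px))
    # Letters beyond the grid capacity are ignored in this sketch.
    for index, base in enumerate(bases[: per_row * per_row]):
        row, col = divmod(index, per_row)
        color = bytes(PALETTE.get(base.upper(), BACKGROUND))
        for y in range(row * square_px, (row + 1) * square_px):
            start = (y * side_px + col * square_px) * 3
            pixels[start : start + square_px * 3] = color * square_px
    header = f"P6\n{side_px} {side_px}\n255\n".encode("ascii")
    return header + bytes(pixels)


# Tiny 20x20 example: a 4x4 grid of 5x5 squares.
image = render_ppm("ACGTN" * 4, side_px=20, square_px=5)
print(len(image))
```

Writing the returned bytes to a file opened in `"wb"` mode should give a `.ppm` file that most image viewers and converters can open; converting it to PNG before tweeting would be a separate step.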
