Wednesday, September 15, 2010

How Many Gigabytes are in a Human Finger?

I often like to marvel at how efficient DNA is at storing and copying information.

There are about 236 million base pairs in the honeybee genome, and at 2 bits for each base pair* it could contain 56 megabytes of information**. There are about 950,000 neurons in its brain, and each one has a copy of the entire genome. When we do the multiplication, we see that the honeybee brain can hold about 52,000 gigabytes of information!
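For anyone who wants to check the multiplication, here is a rough sketch in Python. The base pair and neuron counts are the ones quoted above; the function name is made up for illustration, and the binary units (1 MB = 2^20 bytes, 1 GB = 2^30 bytes) are an assumption, chosen because they reproduce the 56 MB figure.

```python
BASE_PAIRS = 236_000_000   # honeybee genome size quoted in the post
BITS_PER_BASE_PAIR = 2     # 4 possible bases, so 2 bits each
NEURONS = 950_000          # neuron count quoted in the post

MB = 2**20  # bytes per megabyte (binary units, an assumption)
GB = 2**30  # bytes per gigabyte (binary units, an assumption)


def genome_bytes(base_pairs, bits_per_base_pair=BITS_PER_BASE_PAIR):
    """Raw storage for one copy of the genome, in bytes."""
    return base_pairs * bits_per_base_pair / 8


genome_mb = genome_bytes(BASE_PAIRS) / MB
brain_gb = genome_bytes(BASE_PAIRS) * NEURONS / GB

print(f"one genome:  {genome_mb:.1f} MB")
print(f"whole brain: {brain_gb:,.0f} GB")
```

With these inputs it lands at roughly 56 MB per genome and a little over 52,000 GB for the whole brain.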

But how fast is this information being copied? It takes a male worker bee about 20 days to reach adulthood, so, not counting the first cell, that's a rate of about 30 megabytes/s. That's not so fast considering that a high-speed hard drive can go over 1 gigabyte/s, and technically CD/DVD technology can copy great quantities of information even faster.
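The copy rate is just the total divided by the development time. A quick sketch, assuming the 52,000 GB total and 20 days from above and the same binary units (1 GB = 1024 MB); the function name is hypothetical:

```python
SECONDS_PER_DAY = 24 * 60 * 60


def copy_rate_mb_per_s(total_gb, days):
    """Average copy rate in MB/s, assuming 1 GB = 1024 MB."""
    return total_gb * 1024 / (days * SECONDS_PER_DAY)


rate = copy_rate_mb_per_s(52_000, 20)
print(f"average copy rate: {rate:.1f} MB/s")
```

This comes out to a bit over 30 MB/s, averaged over the whole 20 days.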

All that information is packed into a volume I estimate to be less than 8 cubic millimeters. The following is not possible, but if we could extend that density of information to the volume of a typical flash drive, 9 cm^3, the brain matter could hold about 60 million gigabytes.
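The flash drive comparison is a straight volume scaling. Here is a minimal sketch, assuming the 52,000 GB and 8 mm^3 estimates above and 1 cm^3 = 1000 mm^3; the function name is my own:

```python
def scaled_capacity_gb(capacity_gb, volume_mm3, target_volume_cm3):
    """Scale a capacity to a new volume at the same information density."""
    target_volume_mm3 = target_volume_cm3 * 1000  # 1 cm^3 = 1000 mm^3
    return capacity_gb * target_volume_mm3 / volume_mm3


flash_drive_gb = scaled_capacity_gb(52_000, volume_mm3=8, target_volume_cm3=9)
print(f"flash-drive-sized bee brain: {flash_drive_gb / 1e6:.1f} million GB")
```

That gives about 58.5 million GB, which rounds to the 60 million quoted. Since 8 mm^3 is an upper bound on the brain volume, the real density would be at least this high.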

It turns out that all this information is more than Nature really needs. In the human genome, it's estimated that 98% of the DNA doesn't even code for proteins. This means that the biological cost of extra DNA is almost zero. It's so cheap, so fast and so small that more compact genomes are hardly selected for!

The human genome is even bigger than a honeybee's. I wonder how much data is in a human finger . . . If anyone can hazard a guess for the number of cells in a finger, post a comment!

* 2 bits/base pair because there are 4 possible bases, and 4 = 2^2, so each base takes 2 bits
** note it could contain less actual information if the entropy is low, but that is irrelevant to my point
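To illustrate the second footnote, here is a small sketch of how entropy lowers the bits per base. It only looks at single-base frequencies (a simplifying assumption, since real genomes also have longer-range patterns), and the two test sequences are made up, not real DNA.

```python
from collections import Counter
from math import log2


def bits_per_base(sequence):
    """Shannon entropy of the single-base frequencies, in bits per base."""
    counts = Counter(sequence)
    total = len(sequence)
    return -sum((n / total) * log2(n / total) for n in counts.values())


# Made-up sequences: one with all four bases equally common, one skewed.
uniform = bits_per_base("ACGT" * 25)
skewed = bits_per_base("A" * 70 + "C" * 10 + "G" * 10 + "T" * 10)

print(f"uniform bases: {uniform:.2f} bits/base")
print(f"skewed bases:  {skewed:.2f} bits/base")
```

When all four bases are equally likely you get the full 2 bits per base; any skew in the frequencies pushes the number below 2.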


  1. A cell ~10um diameter, so ~10^3 um^3 volume. My pointer finger is, I'd estimate ~5 cm^3. So, ~5 x 10^9 cells? If 3 x 10^9 base pairs in human genome, that's around 715 MB. So, 3.4 x 10^6 TB, or 3.25 exabytes? Yikes.

  2. This comment has been removed by the author.

  3. Congratulations Yoni, you have now discovered an amazing way for me to store 715 MB of data in my finger, ~5 x 10^9 times. Although, I can almost fit my entire genome on a CD. However, if you found a way to encode cells with unique data and combined them into an organic hard drive, and this hard drive was about the size of an average human, it would store 6.35 zettabytes or around 6.8 billion terabytes. This is also approximately twice the amount of data observed by the entire US population in 2008 (not unique data). As an estimate, if you take the 3 Lansey Brothers and convert them to organic storage media, we could probably store all the unique information of all mankind.

  4. There are certain times when shorter strands of DNA do get selected for. Viruses have very small genomes, and in some viruses those code for more proteins than genes. The genome is transcribed forwards, backwards and forwards again from some point in the middle, generating three proteins.

    Transcribing DNA is actually extremely energy-intensive. The really interesting thing is that those things with the shortest genomes are precisely the parasites that don't actually have to pay the complete cost of copying it.

  5. Worker bees are uniformly female. DNA encodes for much more than proteins as we're discovering now, the information is stored in codons which don't have a one-to-one correlation with the data and certainly don't have a two bit per pair correlation, and those viruses which have short genomes need them to be short precisely because they're hijacking cellular machinery. All of them need to escape detection, some of them need to be inserted into the host genome, and in any event they don't need to code for all the information they use, since they make use of existing cellular machinery.
    Entertaining, though.

  6. My pinkie is smarter than your pinkey. Nyah, nyah, nyah.

  7. I'll go with a bajjillion.

  8. I'd tell you, but then I'd also have to bill you by the hour...