So we have DNA.
DNA consists of a string of base pairs, which we describe using one of 4 letters: A, G, C, and T. Every 3 letters is a codon, which maps to one of 20 amino acids. Those of you who are computer scientists may have noticed that 4×4×4 multiplies to 64, not 20. That’s because the code is inherently redundant (several different codons map to the same amino acid), so we can go out in the sunlight without it turning us into mutants.
Now, you’ve hopefully heard that we’ve sequenced the genome. What that means is that we’ve read out all the letters in the 23 chromosomes. Then we can take those letters by threes and get the amino acids.
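For the programmers in the room, here is a minimal Python sketch of that "letters by threes" step. It assumes the standard genetic code table and a made-up 9-letter sequence; the names `CODON_TABLE` and `translate` are just illustrative, and real genes are messier than this (they have to be located and spliced before anything gets translated).

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
# Each character is the one-letter amino acid code; "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Build all 4 * 4 * 4 = 64 codons and pair each with its amino acid.
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Read a DNA string three letters at a time until a stop codon."""
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":  # stop codon: the string of amino acids ends here
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    print(len(CODON_TABLE))                 # number of distinct codons
    print(len(set(AMINO_ACIDS) - {"*"}))    # number of distinct amino acids
    print(translate("ATGGCCTAA"))           # hypothetical toy sequence
```

The table has 64 entries but only 20 distinct amino acids plus the stop signal, which is the redundancy mentioned above.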
Then we stopped. In the body, those amino acids get assembled into long strings, and those strings fold up into proteins, the little molecular-sized machines that run our body. It turns out it’s not obvious how you go from these strings of amino acids to a final machine. It’s like we took the Lego Death Star at 4016 pieces, and we don’t have the 190 MB building instructions. Instead, every Lego is on a fishing line to the Lego next to it, and we have to figure out how the pieces snap together.
But we could! We learned a lot from the Human Genome Project:
- There are about 22,300 genes producing these proteins.
- We’re 93% the same as a tree; we both have circulatory systems, etc.
- The junk DNA (the space between the genes) is important too.
The next step for the human race is to start getting the structure for the proteins produced by each gene. If we can do that, it will tell us a huge amount about how the body works. That’s not trivial, but it’s not hard, just wet and icky because it’s test tube work. To do that, we build one of these amino acid strings, warm it up to 98.6 °F (body temperature), and shake it up a bit. The proteins self-assemble, because that’s what they do. Then we cool it down until the proteins form crystals, and we take x-rays to figure out the structure.
We could do this. Like the Human Genome Project, we could divide up the roughly 22,300 genes across the labs in the world, until we had the structure of all the proteins.