Nature's search scheme

Go to interactive animation of  Nature's search scheme
This interactive animation was done using paperjs.  It will only work on browsers supporting HTML5 canvas




Here is nature's search problem. Imagine a box big enough to pack a moderately sized television (a 3x3x3 feet box, 27 cubic feet), filled with a liquid that is a little syrupy, thicker than water. Within that box, imagine a blob that is one-eighth the size of a die (0.125 cc, a .5x.5x.5 centimeter cube; a die is roughly 1 cc). This little blob has grooves etched into its body - so it's almost like a key. Now imagine a coiled tape almost 3 times the height of the Empire State Building (yes, nature can fit a 4000 feet long coiled tape in a 3x3x3 feet box). It's a tape with grooves etched on its sides that function like keyholes. The little blob can latch onto this tape wherever the grooves on the blob perfectly complement the grooves on the tape, like a key fitting a lock.

Nature's search problem is how to find the right "keyhole" match for the little blob on the 4000 feet coiled tape that is packed inside the television box. The little blob is randomly zipping through the syrupy medium inside the box at approximately 30 feet/sec - this movement is driven by thermal fluctuations (think blobs zipping, not just moving around, in a lava lamp). At this speed it can go from one end of the box to the other in .1 seconds (100 milliseconds). The gravitational force on this little blob is negligible - the viscous force from the syrupy liquid filling the box is a billion times stronger than gravity.

Nature can finish this search and find the right spot to dock the little blob on the 4000 feet tape, on average in 3-5 minutes. I skipped an important detail for effect. It cannot do that for just one little blob, but if you have enough of them zipping around the box randomly, then one or more of them will complete a successful search and dock (assuming there are multiple matching key slots on the tape), on average in 3-5 minutes. Nature's search is not unlike a Google search, except nature has no index of documents like Google to quickly access a document! The search has to be done each time on the entire 4000 feet coiled tape.

The 3x3x3 feet box above is a single cell organism - a bacterium. The .5x.5x.5 cm little blob is a protein molecule. The 4000 feet long tape is the bacterial genome. All dimensions are magnified a million times to visualize the scope of the problem in "our world" dimensions.

A model was proposed in 1981 for how nature solves this search problem. This year, in June 2012, a paper published in Science confirmed key elements of the proposed model in a living single cell organism - a bacterium. It still remains to be seen if this search method is used in all living organisms, although proposed models claim it is, with some variations.

The search method is as follows. A little blob that is "randomly" zipping through the cell comes across the tape with some probability and does a "loose docking" onto it, aided by an "on-board" loose docking machinery. It is a loose docking because it enables the blob to slide along the tape. So once it docks loosely, it slides for some length, driven again by thermal fluctuations, before it "randomly" disengages from the tape. But while it is sliding along the tape, it will test, "randomly" at some spots, if the grooves match. If there is a match, it docks tightly using an "on-board" tight docking machinery. The search is complete. If it disengages from the tape before finding a match, it may, with some probability, dock loosely again at some position on the coiled tape and perform the "slide and test" search again.

There is a small hitch while it slides along the tape though. There may be stumbling blocks - which are nothing but other blobs just like this one that are performing or have completed a search! So our little blob may hit an obstruction and disengage from the tape, since it is only loosely docked in sliding mode (remember it docks tightly only after the search succeeds). As we saw earlier, it may also, with some probability, return and dock again to continue its sliding search, perhaps this time past the obstruction. Interestingly, the blob has also been observed to slide over the target match site on the tape several times before binding tightly - almost like a helicopter hovering over a landing site. So it appears nature has converged on a trade-off between rapid loose docking search (on non-matching areas, so it can slide and potentially return and reengage once it is past an obstruction) and tight docking (where a match occurs).
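The "land, slide and test" loop described above can be sketched as a toy simulation. This is only an illustration under simplifying assumptions that are not from the papers: the tape is a straight line of numbered sites, the blob tests every site it visits (rather than testing "randomly" at some spots), there are no obstructions, and the names `search_tape`, `tape_length`, `target` and `mean_slide_steps` are made up for this sketch.

```python
import random

def search_tape(tape_length, target, mean_slide_steps, rng):
    """Toy 'land, slide and test' search on a 1D tape (illustrative assumption).

    Returns (landings, steps): how many loose dockings and how many
    sliding steps happened before the blob reached the target site.
    """
    landings = 0
    steps = 0
    while True:
        # Loose docking: land at a random position on the tape.
        position = rng.randrange(tape_length)
        landings += 1
        while True:
            # Test whether the grooves match at the current spot.
            if position == target:
                return landings, steps
            # Randomly disengage, on average after mean_slide_steps steps.
            if rng.random() < 1.0 / mean_slide_steps:
                break
            # Slide one site left or right, driven by thermal noise,
            # without sliding off either end of the tape.
            position += rng.choice((-1, 1))
            position = max(0, min(tape_length - 1, position))
            steps += 1

rng = random.Random(2012)  # fixed seed so the run is repeatable
landings, steps = search_tape(tape_length=10_000, target=4_321,
                              mean_slide_steps=100, rng=rng)
print(f"found target after {landings} landings and {steps} sliding steps")
```

Playing with `mean_slide_steps` gives a feel for the trade-off: very short slides mean many fresh landings, while very long slides mean the blob keeps re-testing the same neighborhood.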

You may wonder, how important is this search? This search is central to the functioning of a single cell - a cell won't exist if this search does not work. All organisms are made up of cells, from single celled bacteria to us humans - we have around 100 trillion cells.

Why do cells perform this search? Cells perform this search to make proteins. This search is happening right now, in almost every cell in your body. Each cell is a remarkable computing machine. The coiled tape is the genetic code, full of recipes to make different proteins. Somewhere on this coiled tape is the recipe to make a particular protein - this recipe first has to be searched for and found to make that protein. What do protein molecules do? Protein is perhaps nature's most ingenious and elegant design solution, both from a hardware and software standpoint. We shall look at these magnificent "nano machine" molecules separately, but for now, let's just say protein molecules come in different shapes and sizes and perform a wide array of functions: they serve as raw material for creating biological hardware (our bodies are held together by a protein - 25% to 35% of the protein in our bodies is this binding protein - and our bones get their strength to withstand stretching from it), transport "stuff" around (oxygen is transported by a protein), act as messengers initiating growth of the body, accelerate reactions (they can make reactions happen a million times per second), act as sensory input transducers that convert sensory input into signals to our brain (protein molecules in the eye capture light and convert it into a signal to our brain), and control software execution (some proteins can control the rate of their own "recipe reading" and the recipe reading of other proteins) - the blob we saw earlier is itself a protein.

So how do we create a protein? To create a protein, its recipe has to be read out from the tape. Under normal conditions, the recipe reading machinery does not dock successfully on the tape to read the recipe (it does at times, but at a very low rate of success). However, when certain conditions are met, such as the docking of the blob we saw above, the docked blob helps the reading machinery attach to the coiled tape and read the recipe. There are also blobs that prevent the tape reading machinery from attaching to the coiled tape, thereby preventing the reading of a recipe completely.

For those of us who know programming, it is just like the conditional expression that precedes a block of code. If the condition is met, the block of code that follows the condition executes - in the case of nature, the criterion for satisfying a condition is the presence, or in some cases even the absence, of a docked blob.
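To make the analogy concrete, here is a minimal sketch of such a condition. It is a deliberately crude all-or-nothing rule: it ignores the low background rate at which the reading machinery docks on its own, and the names `recipe_is_read`, `activator_docked` and `repressor_docked` are labels invented here for the helping blob and the blocking blob.

```python
def recipe_is_read(activator_docked, repressor_docked):
    """Toy rule (an assumption for illustration): the recipe is read only if
    a helping blob is docked and no blocking blob is docked."""
    return activator_docked and not repressor_docked

# Walk through all four combinations of docked blobs.
for activator in (False, True):
    for repressor in (False, True):
        outcome = "read" if recipe_is_read(activator, repressor) else "not read"
        print(f"helper docked={activator}, blocker docked={repressor}: recipe {outcome}")
```

Real recipes differ in which blobs they listen to; some need only the absence of a blocking blob, which would simply be a different condition in the `if`.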

Let's look at a real life example of the need for this search. Take a single cell life form such as a bacterium. Let's say it can "digest" two types of food - sugar and milk. Given a choice of sugar and milk, it would prefer sugar, only because it is easier to digest sugar than it is to digest milk. Digestion, in the case of a single cell bacterium, is the ability to break down a molecule of sugar or milk so that it can extract energy from the broken down molecule. So when both sugar and milk are present, it shuts down its milk breaking down machinery, which is nothing but turning off the portion of the genetic tape that creates the milk digesting protein. This turning off requires a search for the milk breaking down recipe just like the one described above.

Let's take another real life example - "us". Pretty much every functioning cell in our bodies performs this search to create proteins for different tasks - the little blob that docks on the tape and blocks or enables the recipe reading is itself a protein. In the interactive animation, a particular blob is shown in detail. It plays a central role in determining the life span of our cells. In more than 50% of all human cancers, this blob has been found to not function properly. The malfunction has been attributed to its inability to complete a successful search.

So if the protein recipe search takes 3-5 minutes, how long does full protein production take? In mammals it takes on average about 30 minutes to read the recipe from the gene and another 30 minutes to make a protein from the recipe copy (in single cell bacteria it takes about a minute to read the recipe and about 2 minutes to make the protein from it). If it takes so long, protein production from scratch is clearly not a viable strategy for quick responses to external stimuli, particularly for us humans (bacteria can get by - they can create proteins in minutes).

Nature has other, faster response methods. One of them is based on the switching of a protein between active and passive states - it takes about 1-100 microseconds for proteins to switch states. It is this rapid switching that enables nature to respond quickly to input stimuli. For instance, if a picture is flashed at you, you can consciously perceive it in around 100-200 milliseconds. This rate of communication is made possible by the fast switching of proteins that relay the stimulus to the brain: protein sensors in your eye sense and switch on the scale of microseconds, protein molecules then switch to communicate what you saw to your brain (again on the scale of microseconds), and the result is conscious perception (on the scale of milliseconds).

Memory shows the slower time scale. Short term memory retention of this article involves switching of proteins that have already been produced and are ready for use. If you forget this article, it is only because that switching-based memory was lost - the contents didn't capture your interest enough to be converted into a long term memory. If you still remember this article, say five years from now, then that long term storage required the production of new proteins, which happens on the order of hours.

Notes and references

Sizes and scales
Size of bacterial cell - roughly 1 micrometer across (about 1 cubic micrometer in volume). Scaled a million times - 1 meter or ~3 feet.

Average size of protein - 5 nanometers. Scaled a million times - .5 cm.

Size of bacterial genome - 4.6 million base pairs. The distance between bases is .3 nm. Scaled a million times - ~4000 feet.

It takes less than 100 milliseconds for a protein molecule to traverse a single cell organism (a bacterium) that is roughly a micrometer in length.

The numbers above are from the following sources:
An Introduction to Systems Biology: Design Principles of Biological Circuits - Uri Alon
Bionumbers - database of numbers created by Harvard and the Weizmann Institute
How big are genomes? - an FAQ on numbers in biology created by Harvard and the Weizmann Institute
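The scaling arithmetic behind these sizes can be checked with a few lines of code. This is just a worked sketch using the rounded inputs quoted above; the variable names are invented here. With these inputs the scaled genome comes out at roughly 4,500 feet, the same ballpark as the ~4000 feet used in the post.

```python
SCALE = 1_000_000          # the "million times" magnification used in the post
FEET_PER_METER = 3.28084

cell_length_m = 1e-6       # ~1 micrometer bacterial cell
protein_size_m = 5e-9      # ~5 nanometer protein
base_pairs = 4.6e6         # bacterial genome length in base pairs
base_spacing_m = 0.3e-9    # ~0.3 nanometers between bases

# Scale each length up a million times and convert to "our world" units.
scaled_cell_ft = cell_length_m * SCALE * FEET_PER_METER
scaled_protein_cm = protein_size_m * SCALE * 100
genome_length_m = base_pairs * base_spacing_m
scaled_genome_ft = genome_length_m * SCALE * FEET_PER_METER

# Time to cross the scaled box, using the post's rounded 3 feet and 30 feet/sec.
box_crossing_s = 3 / 30

print(f"cell:    {scaled_cell_ft:.2f} feet across")
print(f"protein: {scaled_protein_cm:.2f} cm across")
print(f"genome:  {scaled_genome_ft:.0f} feet long")
print(f"box crossing time: {box_crossing_s:.1f} seconds")
```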


The viscous force on molecules inside cells is in the range 1-1000 pN (piconewtons). The other appreciable forces on a molecule are covalent bonding force ~10,000 pN, thermal force 100-1000 pN and electrostatic/Van der Waals force 1-1000 pN. Gravitational force is negligible in comparison - a billionth of 1 pN.
Mechanics of Motor Proteins & the Cytoskeleton - Jonathan Howard

The 3-5 minute search completion time was observed in a living single cell organism - a bacterium. The June 2012 Science paper reports these findings.

Time to read the coiled tape (gene) and make a copy (transcription) is around a minute for single cell bacteria and about 30 minutes in mammals. Note these are average times - there are recipes that require 17 hours to read in humans, due to the length of the recipe (the dystrophin gene). The recipe copy has a lifetime of 2-5 minutes in bacteria. In mammals the recipe copy has a lifetime of 10 minutes to over 10 hours. During this time, multiple copies of a protein can be created from a single copy of the recipe - this seems to be a "natural" optimization for nature to have converged on, given the time and energy expended to copy the recipe. Time to create a protein from the recipe copy (translation) is around 2 minutes for bacteria and about 30 minutes for mammals.
An Introduction to Systems Biology: Design Principles of Biological Circuits - Uri Alon

Memory storage mechanisms
Long term memory - a molecular framework - Nature 1986
Molecular mechanisms to maintain long term memory Nature Neuroscience 2011

Proteins - few examples of them performing various functions
Collagen - this protein is the main component of the connective tissue that holds our bodies together. It is found in bones too, giving bones their tensile strength, while a calcium based mineral gives bones their ability to withstand compression.

Hemoglobin - this multi protein molecule is responsible for carrying oxygen in our blood.

Proteins serve as sensory input transducers. For instance, a protein called rhodopsin is involved in capturing light and converting it into a signal to the brain. These sensory input transducers reside on cell membranes and capture external input such as molecules (taste buds sense food, smell receptors sense odor) or even light, and convert it into signals for further cellular processing.

Publications
Science Vol 336, June 2012 - The lac Repressor Displays Facilitated Diffusion in Living Cells - Petter Hammar, Prune Leroy, Anel Mahmutovic, Erik G. Marklund, Otto G. Berg, Johan Elf

Biophysical Journal, May 2012 - Generalized Facilitated Diffusion Model for DNA-Binding Proteins with Search and Recognition States - Maximilian Bauer and Ralf Metzler

PNAS, 2010 - A single-molecule characterization of p53 search on DNA - Anahita Tafvizi, Fang Huang, Alan R. Fersht, Leonid A. Mirny, and Antoine M. van Oijen

Biochemistry, 1981 - Diffusion-Driven Mechanisms of Protein Translocation on Nucleic Acids - Otto G. Berg, Robert B. Winter and Peter H. von Hippel

