Friday, November 30, 2007

Extreme Gene Transfer and Speciation, Part I

This is a bit off the topic of protein folding, but I feel it's an extremely important topic to think about, if you work in molecular biology.

Think of how you would explain the following riddle.

If you go in your back yard with a spoon and dig up a spoonful of topsoil, I can virtually guarantee that you will be able to find Pseudomonas aeruginosa somewhere in that soil sample.

I can do the same thing. My back yard has (some) topsoil, and I'm sure it contains Pseudomonas aeruginosa.

As it turns out, I live in Connecticut, about an hour from New York City. But if I go to any grassy area of Central Park and dig down an inch or two into the topsoil, I'm confident I will be able to find our good friend Pseudomonas aeruginosa.

Bear in mind, on a bacterial scale, my back yard is about as far from Central Park as Earth is from Venus.

Now then. If I travel 10,000 miles to Sydney, Australia, I will find topsoil there, too, and in the topsoil I am quite confident that I can (once again) find Pseudomonas aeruginosa. This time, on a bacterial scale, I have done the equivalent of traveling something like ten solar-system diameters.

One might ask how it is that an organism of Pseudomonas aeruginosa's size and limited ability to travel could possibly be found over such a wide area (encompassing my back yard in Connecticut, Central Park in Manhattan, and somebody's back yard in Sydney). True, Pseudomonas aeruginosa has a flagellum. But I don't think a flagellum helps, in this case.

Someone can say, "Well, you can explain the widespread distribution of Pseudomonas aeruginosa by the physical transport of dirt through the air, or by birds carrying the organism on their claws." This kind of answer is, if not particularly satisfying, at least within the bounds of plausibility.

But it gets more complex. If I dig through the topsoil in my back yard (or in Central Park, or in someone's back yard in Sydney), I will find not just Pseudomonas aeruginosa, but Clostridium tetani. The "wind-borne dust" and "travel-by-bird" theories suddenly aren't as plausible. Clostridium tetani is a strict anaerobe. Exposure to air kills it.

"Well," the dust/bird advocate will argue, "Clostridium tetani forms spores, and the spores can survive a journey like that."

So far, so good.

But now it gets harder. About a hundred meters from my house, there's a freshwater stream that leads to a large pond. If you dig a couple meters down in the mud at the bottom of that pond, you'll find various species of Methanobacterium. These are non-motile, non-spore-forming strict anaerobes that are killed immediately upon exposure to oxygen, and they grow only in deep sediments.

If I go to Linchuan, China (a remote village with many freshwater ponds) and dig down into the sedimentary mud at the bottom of a pond, I will find some of the same species of Methanobacterium. How did they get there?

We can (but won't) carry this sort of argument on, to include organisms that grow in deep igneous rock aquifers; thermophiles found inside rock in miles-deep mineshafts; and so on.

How did these species (many of which either have no plausible means of transport across large distances, or would be killed by transport) become so widely distributed?

(to be continued)

Thursday, November 29, 2007

Bioinformatics Tooling Textbook

I stumbled onto a nice resource at the Genome Canada web site: the Canadian Bioinformatics Help Desk Web Textbook, a 52-chapter online guide to nontraditional and/or lesser-known bioinformatics software tools. (HTML only. I couldn't find a PDF version.)

Now I just need to find time to go through it all. It's a lot of stuff. Nicely written, though.

Wednesday, November 28, 2007

Chaperone Genes in Viruses

It turns out mimivirus isn't the only virus whose genome encodes a heat shock protein. The Closteroviridae also have what looks like an Hsp70 gene. (This explains why things like "strawberry chlorotic fleck associated virus" and "raspberry mottle virus" show up in my BLAST searches when I compare Mycoplasma genitalium's dnaK against virus genomes.) In contrast to mimivirus, which is extremely large (1200 genes), the Closteroviridae are relatively compact RNA viruses with 8 to 12 genes.

As far as I know, no other viruses encode heat-shock proteins. One wonders why.

Tuesday, November 27, 2007

Hsp70: Mycoplasma vs. Mimivirus

I'm still trying to get over the fact that one percent of the genome of Mycoplasma genitalium (an organism with an extremely stripped-down genome) is devoted to genes for heat-shock proteins. (See earlier blog.) This is a prokaryote with a genome smaller than that of many viruses (just 580K base pairs).

For fun, I decided to do a BLAST-n search to see if the dnaK gene of M. genitalium has any homologs in the virus world. My search scored a weak hit (61.9 bits) on a gene in Acanthamoeba polyphaga mimivirus. The (enormous) 1.2 Mbp mimivirus genome is known to encode a heat-shock protein of the Hsp70 type. That's where my hit was.

I followed up with a protein (BLAST-p) comparison of the M. genitalium dnaK gene product against the mimivirus Hsp70 protein, which confirmed the match (452 bits; identities = 255/606; positives = 370/606).
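For anyone who prefers percentages to raw BLAST counts, here is a minimal sketch in Python that converts the identities and positives reported above into percent values. The function name `percent` is just an illustrative helper, not part of any BLAST tool, and the only inputs are the three numbers from the alignment summary.

```python
def percent(matches: int, aligned_length: int) -> float:
    """Return matches as a percentage of the aligned length."""
    if aligned_length <= 0:
        raise ValueError("aligned_length must be positive")
    return 100.0 * matches / aligned_length


# Counts taken from the BLAST-p summary quoted in the post.
identities, positives, aligned = 255, 370, 606

pct_identity = percent(identities, aligned)
pct_positive = percent(positives, aligned)

print(f"identity:  {pct_identity:.1f}%")   # about 42.1%
print(f"positives: {pct_positive:.1f}%")   # about 61.1%
```

Roughly 42% identity over some 600 aligned residues is a strong match for two proteins from such different sources.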

So not only does the smallest known prokaryotic organism have an Hsp70 gene, but the largest known virus has one as well.

Monday, November 26, 2007


After reading the Scripps press release about the "FoldEx" model proposed in "An Adaptable Standard for Protein Export from the Endoplasmic Reticulum" (Wiseman, et al., in Cell, Vol 131, 809-821, 16 November 2007), I decided to listen to the podcast with coauthors William Balch and Evan Powers. The press release had me thinking FoldEx was a folding model. It's actually a kinetics model. The essence of it can be seen in this graphic.

Sunday, November 25, 2007

Luminescent Discrimination of Prions

A recent paper in Nature Methods describes a technique that uses luminescent conjugated polymers (LCPs) to characterize prion strains. The LCPs emit conformation-dependent fluorescence spectra when applied to brain sections containing aggregated prions. Using the technique, the authors were able to discriminate between four immunohistochemically indistinguishable prion strains: sheep scrapie, chronic wasting disease (CWD), bovine spongiform encephalopathy (BSE), and mouse-adapted Rocky Mountain Laboratory scrapie prions.

I thought it was interesting that you can obtain conformation-dependent fluorescence spectra.

Wednesday, November 21, 2007

Implications of Tight Packing

Cell proteins are incredibly tightly packed in vivo. According to a paper due to be published in PNAS later this month, the estimated spatial density is around 43%. I can't help noting that this is not far from Kepler's sphere-packing optimum of about 74%, which (if my math is correct) ultimately means that proteins are separated by well under one protein-radius. That is, the average inter-protein distance is probably a few van der Waals radii: enough for a solvation shell and not much more. (Note to self: If proteins are packed shoulder-to-shoulder, what are the implications for nearest-neighbor hydrophobic interactions? Can neighbor proteins interfere with "hydrophobic collapse" of a protein that's in the process of folding?)
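A rough way to check that math, sketched in Python. The big assumption (mine, purely for illustration) is that proteins behave like equal hard spheres sitting on a close-packed lattice that has been uniformly expanded until the spheres fill only 43% of space. Real proteins are neither spherical nor uniform in size, so treat the result as an order-of-magnitude estimate. The function name `surface_gap_in_radii` is a made-up helper.

```python
import math

# Kepler's densest packing fraction for equal spheres, pi / sqrt(18), about 0.7405.
KEPLER_MAX = math.pi / math.sqrt(18)


def surface_gap_in_radii(volume_fraction: float) -> float:
    """Surface-to-surface gap between neighboring spheres, in sphere radii.

    Assumes equal spheres on a close-packed lattice that has been uniformly
    expanded so that the spheres occupy `volume_fraction` of space.
    """
    if not 0.0 < volume_fraction <= KEPLER_MAX:
        raise ValueError("volume_fraction must be in (0, pi/sqrt(18)]")
    # Volume fraction scales as (2r / d)^3, where d is the center-to-center
    # spacing, so d = 2r * (KEPLER_MAX / volume_fraction)^(1/3).
    spacing_in_radii = 2.0 * (KEPLER_MAX / volume_fraction) ** (1.0 / 3.0)
    return spacing_in_radii - 2.0


gap = surface_gap_in_radii(0.43)
print(f"gap between neighbors: {gap:.2f} protein radii")  # about 0.40
```

Under these assumptions the gap comes out to roughly 0.4 of a protein radius, which is consistent with "well under one protein-radius."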

The authors of the PNAS paper, who used apoflavodoxin in their study, note that crowding (as simulated in silico, as well as in vitro, in a separate experiment) "made the native state of the protein 20 degrees Celsius more resistant to thermal perturbations."

The authors also found that "The secondary structure of the folded protein increased by as much as 25 percent based on circular dichroism data."

Perhaps at least some MD-simulation studies of protein folding should be repeated with and without crowding?

Native State Always a Low-Energy State?

Note to self: Consensus view (reasonable in most cases) is that a folded protein is at a lower energy state than an unfolded one. But is this always the case? Or are some folded proteins "spring-loaded"?

Tight packing would favor the existence of at least some "spring-loaded" proteins.

Tuesday, November 20, 2007

Heat Shock Proteins in Mycoplasma

I was surprised to find that Mycoplasma genitalium has a rather elaborate heat-shock protein system. The details are in "Transcriptional Heat Shock Response in the Smallest Known Self-Replicating Cell, Mycoplasma genitalium," (J Bacteriol. 2006 April; 2845–2855) by Oxana Musatovova, Subramanian Dhandayuthapani, and Joel B. Baseman.

What's surprising about this is that Mycoplasma genitalium, an obligate parasite, lives in a carefully temperature-controlled environment (the human body) that rarely fluctuates more than a couple of degrees. Heat shock is not something Mycoplasma genitalium sees a lot of.

Because it is extraordinarily well-adapted to an unvarying habitat rich in nutrients, M. genitalium (like other Mycoplasma species) has shed many unneeded genes over its evolutionary history. The genome for M. genitalium is only 580K base-pairs long, with fewer than 500 open reading frames. Its genome is stripped to the bare minimum. For an organism this stripped-down to have a robust hsp system is remarkable.

It suggests that "heat shock proteins" play a crucial role even in the most minimalistic proteome.

Unfolding vs. Folding

I'm no expert, but I would expect that protein unfolding is not the reverse of folding, and thus MD simulations of unfolding are not particularly relevant to understanding how folding works. I don't know if that's the consensus view. But it agrees with Dinner and Karplus in "Is protein unfolding the reverse of protein folding? A lattice simulation analysis" (JMB Vol. 292, Issue 2, 17 September 1999, pp. 403-419).

Intuitively, if one subscribes to the view that protein folding is (typically) a highly mediated, highly orchestrated process, involving "supervisors" of various kinds, why would one expect such a process to run correctly in reverse?

Ribosomal Protein Model

Note to self: In prokaryotes, ribosomes constitute ~30% of cell mass. Ribosomal proteins are therefore some of the most-produced proteins in a cell. One can imagine that some ribosomes spend their entire lives producing ribosomal proteins.

Monday, November 19, 2007

The Protein-Folding Metasystem

I'm just beginning to read the literature on protein folding. It's all a bit boggling. When I was in grad school, we didn't know about chaperones. (Take that any way you want.) It was more-or-less assumed that proteins folded as they formed on the ribosome, and that's that.

It seems the protein-folding system has many components: ribosome, chaperones, enzyme substrates (perhaps), water, ions, post-processing enzymes, etc. I start to get a mental picture of a ribosome being encased in a tRNA pseudo-shell, inside a chaperone-foam, everything fairly tightly packed.

Friday, November 16, 2007

Inaugural Blog

I didn't fully appreciate until recently that so much work remains to be done on the Protein Folding Problem.

This blog is my way of making sure work still needs to be done. ;^)