Friday, December 14, 2007

Is Folding Influenced by Nearest-Neighbor Gene Products?

Question: On a given stretch of chromosome in which DNA encodes proteins A, B, and C (in that order), do proteins A and C aid (directly or indirectly) in the folding of B?

Note to self: This is something that could perhaps be tested in silico. Investigate folding of B in MD simulation, with/without gene products A and C present.

Wednesday, December 12, 2007

Nucleomorphs and Lateral Gene Transfer

Sharing of DNA between and across species, genus, family, and other lines can happen in many ways. I'm reminded of this by a forthcoming PNAS paper (Lane et al., below) that characterizes the genome of a nucleomorph found in the cryptophyte Hemiselmis andersenii. Nucleomorphs are small DNA-containing nuclei found in the plastids of certain cryptomonads (flagellated unicellular plants). They are thought to represent the remnants of ancient endosymbionts.

The authors of the PNAS paper explain: "The nucleomorphs of cryptophytes and chlorarachniophytes are derived from red and green algal endosymbionts, respectively, and represent a stunning example of convergent evolution: their genomes have independently been reduced and compacted to under one megabase pairs (Mbp) in size." The authors found that the two nucleomorph genomes they studied encoded no introns. Moreover, proteins encoded by nucleomorph DNA "are significantly smaller than those in their free-living algal ancestors."

I think a larger point that bears remembering here is that unicellular plants have no business having flagella in the first place. Not to put too fine a point on it, but: The existence of something like Hemiselmis andersenii is not easily explained in evolutionary terms without invoking a theory of lateral gene transfer.

Lane et al., "Nucleomorph genome of Hemiselmis andersenii reveals complete intron loss and compaction as a driver of protein structure and function" in PNAS, December 6, 2007, 10.1073/pnas.0707419104.

Tuesday, December 11, 2007

Another Example of Extreme Gene Transfer?

The only reason I have a question mark at the end of the title above is that the following study was not a genetic analysis but a protein-based analysis. Nevertheless it is suggestive of wholesale gene transfer having occurred between a retrovirus and mouse mitochondrial DNA.

Hayashida et al., "An integrase of endogenous retrovirus is involved in maternal mitochondrial DNA inheritance of the mouse" in Biochemical and Biophysical Research Communications (article in press), doi:10.1016/j.bbrc.2007.11.127.

Sunday, December 9, 2007

Extreme Gene Transfer: How Widespread?

A theme I've been developing (clumsily) in recent blogs is that in the real world, DNA is shared between organisms, particularly microorganisms, across species lines (maybe genus, family, and other boundaries as well), rather more frequently than most people are prepared to believe.

Note to self: How would one determine how much free DNA (extracellular, non-viral DNA) is present in a gram of topsoil? Or a milliliter of benthic mud?

Hypothesis: Promiscuous, freeform DNA-sharing is a default behavior of (nearly all) microorganisms. The cell wall is a specialized organelle that exists to rate-limit this process.

Why make such a hypothesis? Two reasons:

1. Because it explains speciation (in microorganisms, at least) better than point-mutation trial-and-error.

2. Because it explains certain novelties of nature that are hard to explain otherwise, such as the recent finding of an entire bacterial genome incorporated in the genome of Drosophila. See: Dunning-Hotopp, Clark, Oliveira, Foster, Fischer, Torres, Giebel, Kumar, Ishmael, Wang, Ingram, Nene, Shepard, Tomkins, Richards, Spiro, Ghedin, Slatko, Tettelin & Werren, "Widespread lateral gene transfer from intracellular bacteria to multicellular eukaryotes" in Science doi:10.1126/science.1142490.

Thursday, December 6, 2007

Autism: Autoimmunity to hsp90?

Today, a study published in Pediatrics confirms that autistic children experience an attenuation of characteristic symptoms (specifically: irritability, hyperactivity, stereotypy, and inappropriate speech) during periods of fever. See Curran et al., "Behaviors Associated With Fever in Children With Autism Spectrum Disorders" in Pediatrics Vol. 120 No. 6 December 2007, pp. e1386-e1392 (doi:10.1542/peds.2007-0360).

I note with interest that the authors of the study make no mention of earlier work showing that antibodies to heat shock protein 90 are significantly elevated in autistic individuals. See Evers et al., "Heat shock protein 90 antibodies in autism" in Molecular Psychiatry (2002) 7, S26–S28. doi:10.1038/

It's tempting to hypothesize that autoimmunity to hsp90 is the salient feature of autism, and that restoration of hsp90 to near-normal levels in the brain in the course of normal "heat-shock response" explains the salutary effect of fever observed by Curran et al.

Tuesday, December 4, 2007

Extreme Gene Transfer and Speciation, Part 3

The phylogeography of prokaryotes (indeed of microbial forms in general) has received scant study. Relatively little is known about the forces that shape microbial biogeography. Nevertheless, we do know that at the family and genus level, certain prokaryotic "regulars" are very widely distributed (geographically), despite obstacles to physical transport (and obstacles to survival during transport). For example, we can find methanogenic anaerobes belonging to the same family in anoxic lake sediments on different continents. Given the inaccessible habitats of these organisms (i.e., deep lake sediments), the fragility of the organisms with respect to exposure to air, and the unlikelihood of an organism the size of a methane bacterium migrating thousands of kilometers on its own, it's hard to explain the ubiquity of certain signature species of microorganisms around the world. Finding the same families of bacteria in the deep sediments of a lake in China, and a similar lake in North America, is tantamount to finding turtles on Mars.

The temporal dimension of the problem is just as baffling in its own way. Many landlocked microbial habitats ("disjunct refugia") have supported microbial populations for thousands, even millions of years. That's astronomical numbers of generations. Applying the concept of dog-years, we can imagine that a bacterial-year is on the order of a few human-minutes. To put it another way: in bacterial time, a month is eons. The potential for genetic drift is enormous.
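Note to self: the "astronomical numbers of generations" remark is easy to put rough numbers on. Here is a minimal sketch; the two doubling times are my own illustrative assumptions (a fast lab-style culture and a very slow sediment dweller), not measured values for any particular organism.

```python
# Rough count of bacterial generations over a long isolation period.
# The doubling times used below are illustrative assumptions, not measurements.

MINUTES_PER_YEAR = 365.25 * 24 * 60


def generations(years: float, doubling_time_minutes: float) -> float:
    """Number of generations elapsed in `years` at a fixed doubling time."""
    return years * MINUTES_PER_YEAR / doubling_time_minutes


if __name__ == "__main__":
    assumed_doubling_times = [
        ("fast culture (20 min, assumed)", 20.0),
        ("slow sediment dweller (30 days, assumed)", 30 * 24 * 60.0),
    ]
    for label, doubling in assumed_doubling_times:
        for years in (1_000, 1_000_000):
            count = generations(years, doubling)
            print(f"{label}, {years:,} years: {count:.3g} generations")
```

Even with the slow assumption, a refugium that has been isolated for a million years has seen millions of generations, which is the point of the paragraph above.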

And yet we find the same signature families of microorganisms over and over again, despite the huge time scales and distances involved.

Against this backdrop, it's a bit of a challenge to explain how speciation occurs in microbial flora and why the same species seem to emerge in the same types of habitats the world over. (We shouldn't get sidetracked on the precise meaning of the word "species" here. The point is that we can identify the same genomic and phenotypic motifs, packaged in readily identifiable cell types with familiar names, at different points in the biosphere.) Did today's species evolve from common ancestors who were somehow physically distributed uniformly around the world? What was the mechanism of that distribution? More to the point, what happened after the ancestral organisms were laid down? How do you get from there to today's ecosystem of commonly seen microbial communities, with its many self-similarities around the world?

I'll leave as an exercise for the reader the question of whether evolution occurred along parallel paths. I, for one, don't rule out that pseudomonads in Taiwan evolved to their present-day form independently of pseudomonads in Ohio.

I think the amazing taxonomic regularity seen in the microbial world demands flexible thinking when it comes to explaining the emergence of new species. Survival pressure keeps bacterial genomes from drifting very far outside an evolutionary "noise" zone. A substantial barrier has to be crossed in order to arrive at a new species. Accumulation of point mutations probably won't do the job. That just gives "noise." Transfection by viruses probably isn't an important mechanism, either, although the jury is certainly still out on what role (if any) viruses play in speciation.

My suspicion is that "extreme gene transfer" (including inter-species DNA transfer) plays a greater role in microbial speciation than is presently assumed. The bacterial genome inside Drosophila (see prior blog) is a clue that shouldn't be dismissed. DNA is probably more promiscuous than most of us are willing to consider.

Monday, December 3, 2007

Extreme Gene Transfer and Speciation, Part 2

A basic riddle of biology is how members of the same prokaryotic species can be found in so many far-removed places. For example, sulfate-reducing members of the genus Desulfotomaculum have been found in South African gold mines as well as deep basalt aquifers of Washington State. (See Baker et al., below.) On a bacterial scale, Washington State is about as far from South Africa as Earth is from Mars for you or me. Considering that the bacteria in question are bound in rock thousands of feet underground, it seems implausible that the Washington State bacteria somehow propagated to their current location from ancestors living in South Africa (or vice versa).

What are the possible explanations, then?

The easiest is creationism: A Higher Force created these organisms in situ, just as they are, when the Earth itself was created.

Another is panspermia: Some natural force (as yet unknown) caused all of Earth's microhabitats to be seeded with the same types of organisms, at the same time.

A third possibility is genetic convergence: All of the bacterial species we see today evolved independently, in separate locations, in parallel manner, starting from some unknown number of (possibly common) ancestors.

I say possibly common ancestors because yet another possibility exists, which is that given a sufficiently complex local ecosystem, a new member of the ecosystem can emerge on its own through mixing and matching of "borrowed genes" from existing species. Here's the thought-experiment: Imagine that we have a soil sample, and imagine that through some combination of suitable experimental techniques (remember, this is just a thought experiment) we can enumerate all of the different microbial species present in the soil sample. Homogenize the soil sample and divide it in two. Suppose there are 357 prokaryotic species in the sample, and 10 of them are Bacillus species. Now suppose you can completely eradicate all 10 Bacillus species from one of the two samples. (Pretty hard to do, but again, this is a thought experiment.)

Add water and nutrients to each soil sample (separately so as not to cross-contaminate them) on a daily basis. Prediction: After a sufficient period of time, one or more Bacillus species reappears in the soil that previously had none.

A bacteriologist will complain that this is not a terribly strict experiment, because even if a Bacillus cell were to evolve "out of nothing," it probably actually would come about through modification of a preexisting Clostridium species in the soil. (Clostridia are close relatives of Bacillus.)

Fair enough. Repeat the experiment with Pseudomonas instead of Bacillus.

The point is, if the environment favors the existence of Bacillus, the experiment will eventually find Bacillus emerging "from nothing." Or at least that's the hypothesis. A new organism, from borrowed genes.

Sounds a bit fanciful, doesn't it?

It does, until you start to read about things like an entire bacterial genome having been found within the genome of a fruit fly (Dunning-Hotopp et al., cited below).

(to be continued)


1. Brett J. Baker, Duane P. Moser, Barbara J. MacGregor, Susan Fishbain, Michael Wagner, Norman K. Fry, Brad Jackson, Nico Speolstra, Steffen Loos, Ken Takai, Barbara Sherwood Lollar, Jim Fredrickson, David Balkwill, Tullis C. Onstott, Charles F. Wimpee, David A. Stahl (2003): "Related assemblages of sulphate-reducing bacteria associated with ultradeep gold mines of South Africa and deep basalt aquifers of Washington State," Environmental Microbiology 5 (4), 267–277. doi:10.1046/j.1462-2920.2003.00408.x

2. Dunning-Hotopp, Clark, Oliveira, Foster, Fischer, Torres, Giebel, Kumar, Ishmael, Wang, Ingram, Nene, Shepard, Tomkins, Richards, Spiro, Ghedin, Slatko, Tettelin & Werren. Widespread lateral gene transfer from intracellular bacteria to multicellular eukaryotes. Science doi:10.1126/science.1142490

Friday, November 30, 2007

Extreme Gene Transfer and Speciation, Part 1

This is a bit off the topic of protein folding, but I feel it's an extremely important topic to think about, if you work in molecular biology.

Think of how you would explain the following riddle.

If you go in your back yard with a spoon and dig up a spoonful of topsoil, I can virtually guarantee that you will be able to find Pseudomonas aeruginosa somewhere in that soil sample.

I can do the same thing. My back yard has (some) topsoil, and I'm sure it contains Pseudomonas aeruginosa.

As it turns out, I live in Connecticut, about an hour from New York City. But if I go to any grassy area of Central Park and dig down an inch or two into the topsoil, I'm confident I will be able to find our good friend Pseudomonas aeruginosa.

Bear in mind, on a bacterial scale, my back yard is about as far from Central Park as Earth is from Venus.

Now then. If I travel 10,000 miles to Sydney, Australia, I will find topsoil there, too, and in the topsoil I am quite confident that I can (once again) find Pseudomonas aeruginosa. This time, on a bacterial scale, I have done the equivalent of traveling something like ten solar-system diameters.
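For anyone who wants to play with the "bacterial scale" comparison, here is a small sketch of the arithmetic. It simply counts how many body lengths a cell would have to cover and converts that into the distance a human would cover in the same number of body lengths. The human height and the 2-micrometer cell length are my own round-number assumptions; the answer is very sensitive to the assumed cell length, so the planetary comparisons should be read as order-of-magnitude figures only.

```python
# Scale a real-world distance by the ratio of human height to bacterial length.
# Both sizes are assumed round numbers chosen for illustration.

HUMAN_HEIGHT_M = 1.7      # assumed adult height, in meters
CELL_LENGTH_M = 2.0e-6    # assumed length of a rod-shaped bacterium, in meters


def human_equivalent_km(distance_km: float,
                        cell_length_m: float = CELL_LENGTH_M,
                        human_height_m: float = HUMAN_HEIGHT_M) -> float:
    """Distance a human would travel to cover the same number of body lengths."""
    body_lengths = distance_km * 1000.0 / cell_length_m
    return body_lengths * human_height_m / 1000.0


if __name__ == "__main__":
    miles_to_km = 1.609344
    trip_km = 10_000 * miles_to_km  # the 10,000-mile trip to Sydney
    print(f"Human-equivalent distance: {human_equivalent_km(trip_km):.3g} km")
```

Changing `cell_length_m` to 1 micrometer doubles the result, which shows how loose the analogy is.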

One might ask how it is that an organism of Pseudomonas aeruginosa's size and limited ability to travel could possibly be found over such a wide area (encompassing my back yard in Connecticut, Central Park in Manhattan, and somebody's back yard in Sydney). True, Pseudomonas aeruginosa has a flagellum. But I don't think a flagellum helps, in this case.

Someone can say, "Well, you can explain the widespread distribution of Pseudomonas aeruginosa by the physical transport of dirt through the air, or by birds carrying the organism on their claws." This kind of answer is, if not particularly satisfying, at least within the bounds of plausibility.

But it gets more complex. If I dig through the topsoil in my back yard (or in Central Park, or in someone's back yard in Sydney), I will find not just Pseudomonas aeruginosa, but Clostridium tetani. The "wind-borne dust" and "travel-by-bird" theories suddenly aren't as plausible. Clostridium tetani is a strict anaerobe. Exposure to air kills it.

"Well," the dust/bird advocate will argue, "Clostridium tetani forms spores, and the spores can survive a journey like that."

So far, so good.

But now it gets harder. About a hundred meters from my house, there's a freshwater stream that leads to a large pond. If you dig a couple meters down in the mud at the bottom of that pond, you'll find various species of Methanobacterium. These are non-motile, non-spore-forming strict anaerobes that are killed immediately upon exposure to oxygen, and they grow only in deep sediments.

If I go to Linchuan, China (a remote village with many freshwater ponds) and dig down into the sedimentary mud at the bottom of a pond, I will find some of the same species of Methanobacterium. How did they get there?

We can (but won't) carry this sort of argument on, to include organisms that grow in deep igneous rock aquifers; thermophiles found inside rock in miles-deep mineshafts; and so on.

How did these species (many of which either have no plausible means of transport across large distances, or would be killed by transport) become widely distributed?

(to be continued)

Thursday, November 29, 2007

Bioinformatics Tooling Textbook

I stumbled onto a nice resource at the Genome Canada web site: the Canadian Bioinformatics Help Desk Web Textbook, a 52-chapter online guide to nontraditional and/or lesser-known bioinformatics software tools. (HTML only. I couldn't find a PDF version.)

Now I just need to find time to go through it all. It's a lot of stuff. Nicely written, though.

Wednesday, November 28, 2007

Chaperone Genes in Viruses

It turns out mimivirus isn't the only virus whose genome encodes a heat shock protein. The Closteroviridae also have what looks like an Hsp70 gene. (This explains why things like "strawberry chlorotic fleck associated virus" and "raspberry mottle virus" show up in my BLAST searches when I compare Mycoplasma genitalium's dnaK against virus genomes.) In contrast to mimivirus, which is extremely large (1200 genes), the Closteroviridae are relatively compact RNA viruses with 8 to 12 genes.

As far as I know, no other viruses encode heat-shock proteins. One wonders why.

Tuesday, November 27, 2007

Hsp70: Mycoplasma vs. Mimivirus

I'm still trying to get over the fact that one percent of the genome of Mycoplasma genitalium (an organism with an extremely stripped-down genome) is devoted to genes for heat-shock proteins. (See earlier blog.) This is a prokaryote with a genome smaller than that of many viruses (just 580K base pairs).

For fun, I decided to do a BLAST-n search to see if the dnaK gene of M. genitalium has any homologs in the virus world. My search scored a weak hit (61.9 bits) on a gene in Acanthamoeba polyphaga mimivirus. The (enormous) 1.2Mbp mimivirus genome is known to encode a heat-shock protein of the Hsp70 type. That's where my hit was.
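A note on what "weak" means for a bit score: the standard Karlin-Altschul relation gives the expected number of chance hits as E = m * n * 2^(-S'), where S' is the bit score, m is the query length, and n is the database length. The sketch below just evaluates that formula. The query and database lengths in the usage example are hypothetical placeholders (I didn't record the actual search space), and real BLAST reports use "effective" lengths, so its E-values will differ somewhat.

```python
# Expected number of chance alignments for a given bit score,
# using the Karlin-Altschul relation E = m * n * 2**(-S').


def expected_hits(bit_score: float, query_length: int, database_length: int) -> float:
    """Return E for a bit score, given raw query and database lengths."""
    return query_length * database_length * 2.0 ** (-bit_score)


if __name__ == "__main__":
    # Hypothetical search space, for illustration only.
    assumed_query_length = 1_800              # nucleotides, placeholder
    assumed_database_length = 100_000_000     # nucleotides, placeholder
    e_value = expected_hits(61.9, assumed_query_length, assumed_database_length)
    print(f"E-value under these assumptions: {e_value:.2g}")
```

The useful property is that each additional bit halves the expected number of chance hits, so a hit in the low 60s is far less convincing than one in the hundreds.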

I followed up with a protein (BLAST-p) comparison of the M. genitalium dnaK gene product against the mimivirus Hsp70 protein, which confirmed the match (452 bits; identities = 255/606; positives = 370/606).
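For reference, the identities and positives fractions convert to percentages as follows. This is trivial arithmetic on the numbers reported above; the function name is my own.

```python
# Convert BLAST "identities" and "positives" counts into percentages.


def percent(matches: int, aligned_length: int) -> float:
    """Percentage of aligned positions counted as matches."""
    return 100.0 * matches / aligned_length


if __name__ == "__main__":
    aligned = 606
    print(f"identity:  {percent(255, aligned):.1f}%")   # 255/606
    print(f"positives: {percent(370, aligned):.1f}%")   # 370/606
```

That works out to roughly 42% identity and 61% similarity over the 606 aligned positions.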

So not only does the smallest known prokaryotic organism have an Hsp70 gene, but the largest known virus has one as well.

Monday, November 26, 2007


After reading the Scripps press release about the "FoldEx" model proposed in "An Adaptable Standard for Protein Export from the Endoplasmic Reticulum" (Wiseman et al., in Cell, Vol 131, 809-821, 16 November 2007), I decided to listen to the podcast with coauthors William Balch and Evan Powers. The press release had me thinking FoldEx was a folding model. It's actually a kinetics model. The essence of it can be seen in this graphic.

Sunday, November 25, 2007

Luminescent Discrimination of Prions

A recent paper in Nature Methods describes a technique for using luminescent conjugated polymers (LCPs) for characterizing prion strains. The LCPs emit conformation-dependent fluorescence spectra when applied to brain sections containing aggregated prions. Using the technique, the authors were able to discriminate between four immunohistochemically indistinguishable prion strains from sheep scrapie, chronic wasting disease (CWD), bovine spongiform encephalopathy (BSE), and mouse-adapted Rocky Mountain Laboratory scrapie prions.

I thought it was interesting that you can obtain conformation-dependent fluorescence spectra.

Wednesday, November 21, 2007

Implications of Tight Packing

Cell proteins are incredibly tightly packed in vivo. According to a paper due to be published in PNAS later this month, the estimated spatial density is around 43%. I can't help noting that this is not far from Kepler's ball-packing optimum of 74%, which (if my math is correct) ultimately means that proteins are separated by well under one protein-radius; viz., the average inter-protein distance is probably a few van der Waals radii; enough for a solvation shell and not much more. (Note to self: If proteins are packed shoulder-to-shoulder, what are the implications for nearest-neighbor hydrophobic interactions? Can neighbor proteins interfere with "hydrophobic collapse" of a protein that's in the process of folding?)
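Here is the back-of-envelope math behind the "well under one protein-radius" remark, as a sketch. It assumes (crudely) that proteins are identical hard spheres sitting on a close-packed lattice that has been uniformly stretched until the volume fraction drops from the Kepler maximum to the observed value. Under that assumption the center-to-center spacing scales as the cube root of (maximum fraction / actual fraction). The sphere model and the function name are my simplifications, not anything taken from the PNAS paper.

```python
import math

# Densest possible packing of equal spheres (Kepler): pi / (3 * sqrt(2)) ~ 0.7405
KEPLER_MAX_FRACTION = math.pi / (3.0 * math.sqrt(2.0))


def surface_gap_in_radii(volume_fraction: float) -> float:
    """Surface-to-surface gap between neighboring spheres, in units of sphere radius.

    Assumes identical spheres on a uniformly dilated close-packed lattice.
    """
    if not 0.0 < volume_fraction <= KEPLER_MAX_FRACTION:
        raise ValueError("volume fraction must be in (0, Kepler maximum]")
    spacing_over_diameter = (KEPLER_MAX_FRACTION / volume_fraction) ** (1.0 / 3.0)
    return 2.0 * (spacing_over_diameter - 1.0)


if __name__ == "__main__":
    gap = surface_gap_in_radii(0.43)
    print(f"Gap at 43% packing: {gap:.2f} protein radii")
```

At 43% packing this gives a gap of about 0.4 radii. For a protein with an assumed radius of 2 nm, that is roughly 0.8 nm between surfaces, which is room for a couple of layers of water and not much else.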

The authors of the PNAS paper, who used apoflavodoxin in their study, note that crowding (as simulated in silico, as well as in vitro, in a separate experiment) "made the native state of the protein 20 degrees Celsius more resistant to thermal perturbations."

The authors also found that "The secondary structure of the folded protein increased by as much as 25 percent based on circular dichroism data."

Perhaps at least some MD-simulation studies of protein folding should be repeated with and without crowding?

Native State Always a Low-Energy State?

Note to self: Consensus view (reasonable in most cases) is that a folded protein is at a lower energy state than an unfolded one. But is this always the case? Or are some folded proteins "spring-loaded"?

Tight packing would favor the existence of at least some "spring-loaded" proteins.

Tuesday, November 20, 2007

Heat Shock Proteins in Mycoplasma

I was surprised to find that Mycoplasma genitalium has a rather elaborate heat-shock protein system. The details are in "Transcriptional Heat Shock Response in the Smallest Known Self-Replicating Cell, Mycoplasma genitalium," (J Bacteriol. 2006 April; 2845–2855) by Oxana Musatovova, Subramanian Dhandayuthapani, and Joel B. Baseman.

What's surprising about this is that Mycoplasma genitalium, an obligate parasite, lives in a carefully temperature-controlled environment (the human body) that rarely fluctuates more than a couple of degrees. Heat shock is not something Mycoplasma genitalium sees a lot of.

Because it is extraordinarily well-adapted to an unvarying habitat rich in nutrients, M. genitalium (like other Mycoplasma species) has shed many unneeded genes over its evolutionary history. The genome for M. genitalium is only 580K base-pairs long, with fewer than 500 open reading frames. Its genome is stripped to the bare minimum. For an organism this stripped-down to have a robust hsp system is remarkable.

It suggests that "heat shock proteins" are playing a crucial role even in the most minimalistic proteome.

Unfolding vs. Folding

I'm no expert, but I would expect that protein unfolding is not the reverse of folding, and thus MD simulations of unfolding are not particularly relevant to understanding how folding works. I don't know if that's the consensus view. But it agrees with Dinner and Karplus in "Is protein unfolding the reverse of protein folding? A lattice simulation analysis" (JMB Vol. 292, Issue 2, 17 September 1999, pp. 403-419).

Intuitively, if one subscribes to the view that protein folding is (typically) a highly mediated, highly orchestrated process, involving "supervisors" of various kinds, why would one expect such a process to run correctly in reverse?

Ribosomal Protein Model

Note to self: In prokaryotes, ribosomes constitute ~30% of cell mass. Ribosomal proteins are therefore some of the most-produced proteins in a cell. One can imagine that some ribosomes spend their entire lives producing ribosomal proteins.

Monday, November 19, 2007

The Protein-Folding Metasystem

I'm just beginning to read the literature on protein folding. It's all a bit boggling. When I was in grad school, we didn't know about chaperones. (Take that any way you want.) It was more-or-less assumed that proteins folded as they formed on the ribosome, and that's that.

It seems the protein-folding system has many components: ribosome, chaperones, enzyme substrates (perhaps), water, ions, post-processing enzymes, etc. I start to get a mental picture of a ribosome being encased in a tRNA pseudo-shell, inside a chaperone-foam, everything fairly tightly packed.

Friday, November 16, 2007

Inaugural Blog

I didn't fully appreciate until recently that so much work remains to be done on the Protein Folding Problem.

This blog is my way of making sure work still needs to be done. ;^)