The last post about an operon (the lac operon) is the most viewed post on this blog, so I thought it might be helpful to follow it up with another operon, this time the trp (tryptophan) operon. This operon is another really elegant example of transcriptional regulation in E. coli, and the mechanism is pretty cool.
see the last post on their composition!) and cells synthesize amino acids using a variety of enzymes. When nutrients are plentiful, as they are for E. coli growing in nutrient broth in the laboratory, cells do not need to waste energy producing biosynthetic enzymes when they can use nutrients already in excess. The trp operon encodes several enzymes that are coordinately regulated and involved in the production of tryptophan. When tryptophan is present in the cell's environment, the cell doesn't need to make any of these enzymes, but if the cell needs tryptophan, the genes for these enzymes are transcribed and shortly thereafter translated. Control of this operon thus controls how much energy the cell puts into making tryptophan.
Similar to the lac operon, the trp operon contains an operator (O) sequence, within the promoter sequence, where a repressor protein binds and prevents transcription. In the presence of tryptophan, the repressor binds the operator and prevents RNA polymerase from transcribing the genes. In the absence of tryptophan, however, the repressor releases the operator and transcription can proceed. Sounds simple enough, right? Let's take it a step further and consider...
An important concept in gene regulation is that of attenuation, which is fine-tuning of gene expression. You might think that attenuation is mediated by protein factors that bind the DNA and affect gene expression; however, attenuation of the trp operon is a little different and, instead, depends on mRNA structure to modulate gene expression.
Before moving forward, let's look at the trp operon (diagrammed to the right). Briefly, there are four regions in the leader sequence at the start of the mRNA, and these four regions have differing levels of complementarity to each other. Thus, when the DNA is transcribed into mRNA, the mRNA can fold into different shapes as these regions pair with each other. The leader also contains two tryptophan codons in a row, which a ribosome translating the mRNA has to read.
In the presence of high amounts of tryptophan within the cell, the ribosome plows through these two tryptophan codons, adding in the appropriate amino acids, and continues through region 1 of the mRNA. This results in the region 1 and 2 mRNA sequences binding together, and regions 3 and 4 bind together as well. This interaction between regions 3 and 4 creates a transcription-termination hairpin, basically a structure in the mRNA that kicks out RNA polymerase and prevents further transcription of the mRNA. Thus, transcription (and then translation) is stopped.
In the absence of tryptophan, however, the ribosome cannot quickly add tryptophan during translation, and it stalls at the two tryptophan codons in region 1. The mRNA then folds such that regions 2 and 3 bind to each other. When this structure forms, no transcription-termination hairpin is formed, and mRNA synthesis continues. Thus, the entire mRNA sequence for the trp operon is made and can be translated into the enzymes that synthesize tryptophan.
Lots of tryptophan: Ribosome zooms through the mRNA, regions 1 & 2 and 3 & 4 bind (in pairs) and create a termination hairpin
End result: Transcription terminates and the tryptophan biosynthetic enzymes are not made (cell saves energy!)
Lack of tryptophan: Ribosome stalls at the tryptophan codons in region 1, regions 2 and 3 bind each other, no termination hairpin is formed
End result: Transcription continues and biosynthetic enzymes are eventually synthesized
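For readers who think in code, the summary above can be sketched as a tiny decision function. This is a toy model and not a simulation: the names `AttenuationOutcome` and `attenuate` are made up for illustration, and the tryptophan level is reduced to a single yes/no flag.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class AttenuationOutcome:
    """Hypothetical record of what happens at the trp leader mRNA."""
    ribosome_stalls: bool
    paired_regions: Tuple[Tuple[int, int], ...]
    terminator_forms: bool
    transcription_continues: bool


def attenuate(tryptophan_plentiful: bool) -> AttenuationOutcome:
    """Return the outcome described in the summary for a given tryptophan level."""
    # With little tryptophan, the ribosome stalls at the tryptophan codons.
    ribosome_stalls = not tryptophan_plentiful
    if ribosome_stalls:
        # Regions 2 and 3 pair, which leaves nothing for region 4 to pair with.
        paired = ((2, 3),)
    else:
        # The ribosome zooms through; regions 1 & 2 and 3 & 4 pair up.
        paired = ((1, 2), (3, 4))
    # The 3 & 4 pairing is the transcription-termination hairpin.
    terminator_forms = (3, 4) in paired
    return AttenuationOutcome(
        ribosome_stalls=ribosome_stalls,
        paired_regions=paired,
        terminator_forms=terminator_forms,
        transcription_continues=not terminator_forms,
    )


if __name__ == "__main__":
    for plentiful in (True, False):
        print("tryptophan plentiful:", plentiful, "->", attenuate(plentiful))
```

The only real logic is that the terminator exists exactly when regions 3 and 4 are paired, so everything else follows from whether the ribosome stalls.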
This scheme is similar for other operons encoding amino acid biosynthetic enzymes (in bacteria, that is). The trp operon is an elegant scheme to fine-tune transcription via mRNA structure and keep the cell from wasting energy.
Monday, October 17, 2011
Thursday, October 13, 2011
Brief aside: In my chemistry class in undergrad, my TA helped us remember the order of the chemical bonds following polymerization by saying N-H, C-H, C-O, N-H, C-H, C-O, ...
You may remember from any stint in chemistry class that a carbon atom that is covalently bound to four different chemical entities (in this case, a side chain, a hydrogen atom, a carboxyl group, and an amino group) can take two different configurations (mirror-image stereoisomers), depending on how these bonds are spatially oriented. In the case of amino acids, the vast majority found in our bodies and used to generate proteins are L stereoisomers. This is a result of the structure of the amino acid synthesis machinery, which exclusively generates L amino acids. There are exceptions, but we won't get into that.
As I mentioned, amino acids have a side chain: the part of the amino acid that endows it with its identity. These side chains can be broken into a few groups that we will explore now:
The next set is composed of the aromatic side chains, which includes phenylalanine, tyrosine, and tryptophan. These amino acids all contain an aromatic ring, which makes them relatively nonpolar; thus, they do not interact favorably with water. These amino acids are involved in mediating protein-protein interactions and are frequently found at the active sites of enzymes.
Next up: polar, uncharged side chains: asparagine, cysteine, glutamine, serine, and threonine. These amino acids contain hydroxyl, sulfhydryl, or amide groups that mediate interactions with water, but they carry no net charge. An amino acid of note in this family is cysteine: two cysteines can react with each other to form cystine, in which the two are joined by a disulfide bond. These disulfide bonds are important in protein structures.
We'll consider basic side chains next. These amino acids consist of arginine, histidine, and lysine, which can all carry a positive charge in solution. Of note, histidine is commonly found at the active site of enzymes, where it serves as a proton donor or acceptor.
Finally, we find acidic side chains: aspartate and glutamate. In solution, these amino acids carry a negative charge and are considered acidic.
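If it helps to see the groupings as data, here is a minimal sketch that stores only the four groups named above and looks up which one an amino acid falls into. The names `SIDE_CHAIN_GROUPS` and `side_chain_group` are mine, chosen for illustration, and anything outside these four groups simply comes back as "not listed".

```python
# Groups exactly as named in the paragraphs above (illustrative names).
SIDE_CHAIN_GROUPS = {
    "aromatic": {"phenylalanine", "tyrosine", "tryptophan"},
    "polar, uncharged": {"asparagine", "cysteine", "glutamine", "serine", "threonine"},
    "basic": {"arginine", "histidine", "lysine"},
    "acidic": {"aspartate", "glutamate"},
}


def side_chain_group(amino_acid: str) -> str:
    """Return the side-chain group for an amino acid name, ignoring case."""
    name = amino_acid.strip().lower()
    for group, members in SIDE_CHAIN_GROUPS.items():
        if name in members:
            return group
    # Amino acids outside the four groups above end up here.
    return "not listed"


if __name__ == "__main__":
    for aa in ("Tryptophan", "cysteine", "Histidine", "glutamate"):
        print(aa, "->", side_chain_group(aa))
```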
In the diagram at right, I've drawn up each of the amino acids along with their three-letter and one-letter codes. These codes are frequently used to abbreviate long lists of amino acids.
Another brief aside: Did you know that a single woman designated the amino acid abbreviations? She chose letters that made sense for most amino acids (as you can see above). For tryptophan, for example, she chose W because she envisioned saying tryptophan as twyptophan. Kind of cool, huh?
As a summary, here are the amino acid abbreviations:
- A, ala, alanine
- C, cys, cysteine
- D, asp, aspartate
- E, glu, glutamate
- F, phe, phenylalanine
- G, gly, glycine
- H, his, histidine
- I, ile, isoleucine
- K, lys, lysine
- L, leu, leucine
- M, met, methionine
- N, asn, asparagine
- P, pro, proline
- Q, gln, glutamine
- R, arg, arginine (think aRRRRginine)
- S, ser, serine
- T, thr, threonine
- V, val, valine
- W, trp, tryptophan (tWWWyptophan)
- Y, tyr, tyrosine
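The list above also works nicely as a lookup table. Here is a small sketch that spells out a sequence written in one-letter codes using the three-letter codes; the names `ONE_TO_THREE` and `to_three_letter` are just illustrative, and the lowercase codes follow the list as written here.

```python
# One-letter code -> three-letter code, copied from the list above.
ONE_TO_THREE = {
    "A": "ala", "C": "cys", "D": "asp", "E": "glu", "F": "phe",
    "G": "gly", "H": "his", "I": "ile", "K": "lys", "L": "leu",
    "M": "met", "N": "asn", "P": "pro", "Q": "gln", "R": "arg",
    "S": "ser", "T": "thr", "V": "val", "W": "trp", "Y": "tyr",
}


def to_three_letter(sequence: str) -> str:
    """Convert a one-letter sequence such as 'MWK' into 'met-trp-lys'."""
    codes = []
    for letter in sequence.upper():
        if letter not in ONE_TO_THREE:
            # Letters outside the 20 codes above are rejected rather than guessed.
            raise ValueError(f"unknown one-letter code: {letter!r}")
        codes.append(ONE_TO_THREE[letter])
    return "-".join(codes)


if __name__ == "__main__":
    print(to_three_letter("MWK"))
```

Rejecting unknown letters is a deliberate choice: a typo in a sequence is easier to catch as an error than as a silently skipped residue.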
So there you have it: 20 amino acids. In addition to these amino acids, our bodies contain several more, including selenocysteine (identical to cysteine but containing selenium rather than sulfur) and ornithine (remember this from the urea cycle?). Amino acids can also undergo modifications: for instance, lysine residues can be acetylated. More amino acids and their variants are always being discovered as well.
Now that we have the building blocks of proteins established, the next blog post will focus on how these amino acids can be combined (polymerized) into long structures that make up polypeptides and proteins.