Poor, Hungry, and Drowsy: 2010

Tuesday, June 22, 2010

Initiation of Eukaryotic Transcription IV

Changing Up the Chromatin

All of the factors that affect transcription that have been described are involved in changing the ability of RNA polymerase and transcription factors to bind to and initiate DNA transcription. An additional method of modulating transcription is via changes to the chromatin (see writings on that here). Histones and other proteins are able to change the packaging of DNA and, thereby, affect transcription. The compaction of the DNA directly affects the accessibility of it to RNA polymerase and transcription factors. Euchromatin is the active form of chromatin that is not fully compacted; in contrast heterochromatin is generally not active or accessible. The presence of histone H1 (the linker histone, and, let’s be honest, everyone’s favorite histone) also affects the compaction of the chromatin.

How exactly do histones affect transcription? Put simply, they prevent other proteins from binding the DNA, especially the DNA that is facing the histone core. Further, DNA is wrapped around histones, which changes the structure of the DNA. Such distortions can affect binding sites and preclude transcription factor binding. If the transcription factors and RNA polymerase are to bind to the DNA, the chromatin must be unraveled and the DNA must become accessible. One way to do this is to move the histones out of the way, opening up binding sites. Histones are incredibly dynamic and move around on the DNA frequently, wrapping and unwrapping different sequences. With the fluctuations of the chromatin, transcription factors can bind while the histones “breathe.”

An additional mechanism to allow access to the DNA is via histone modifications on the N-terminal domains. The N-termini of the histones contain lysine residues that affect DNA-histone interactions. Therefore, altering these charged residues via acetylation or methylation, for example, will change the interactions between the histones and DNA. Acetylation of a histone tail effectively neutralizes its charge such that its structure is altered and it no longer binds DNA as tightly. Modifications on histones can also form binding sites for transcription factors. Chromodomains bind to methylated lysine residues, while bromodomains bind acetylated lysines. There are a number of different modifications that affect transcription, and new effects are still being elucidated.

- Acetylation of lysine / arginine: transcription induction

- Phosphorylation of serine, threonine, or tyrosine: transcription induction; chromatin compaction

- Methylation of arginine: transcription induction

- Methylation of lysine: gene silencing

- Ubiquitination of H2A and H2B: degradation; transcription; growth regulation

- ADP-ribosylation: histone repelled from DNA at sites of repair
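The charge-neutralization idea behind acetylation can be made concrete with a rough count. This is only a sketch: the tail sequence is a made-up toy peptide (not a real histone sequence), and the model simply assumes +1 for every lysine (K) or arginine (R) and 0 for an acetylated lysine, ignoring all other residues.

```python
def tail_charge(tail, acetylated_positions=()):
    """Approximate net positive charge of a histone tail peptide.

    Assumption: K and R each contribute +1; an acetylated K contributes 0.
    Positions are 1-based, the way residues are usually numbered.
    """
    acetylated = set(acetylated_positions)
    charge = 0
    for position, residue in enumerate(tail, start=1):
        if residue == "K" and position in acetylated:
            continue  # acetylation neutralizes the lysine side chain
        if residue in "KR":
            charge += 1
    return charge


TOY_TAIL = "AKGKRSTKAGKR"  # hypothetical peptide, for illustration only

print(tail_charge(TOY_TAIL))          # unmodified tail
print(tail_charge(TOY_TAIL, [4, 8]))  # lysines 4 and 8 acetylated
```

Fewer positive charges means weaker attraction to the negatively charged DNA backbone, which is the intuition behind acetylation loosening the chromatin.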

Monday, June 21, 2010

Activation of Transcription Initiation III

Activation Domains

Activation domains are another important portion of the regulatory protein that is involved in altering the activity of a promoter. There are three main types of activation domains:

1. Acidic, such as Gal4

2. Glutamine-rich, such as Sp1

3. Proline-rich, such as CTF

These different types of activation domains have different mechanisms and may also be involved in allowing the regulatory elements to function at a distance. Importantly, many regulatory proteins may have multiple activation domains.

As mentioned previously, the mediator complex works to integrate the signals from multiple activation domains and passes this signal along to RNAPII. There are about 20 different subunits that bind to RNAPII and different activation domains.

To determine the functional domains of an activator, we can use reporter genes. Ideally, we would cotransfect a plasmid containing the protein of interest and a plasmid containing a reporter (such as lacZ) that is transcribed only when the activation domain of the protein of interest is transfected. In this way, we can examine different regions of proteins to determine the precise domains that are involved in activating transcription.

An additional way to determine where an activation domain is in a protein is to use a Gal4 hybrid assay. This method involves using a domain swap, which uses the DNA-binding domain of Gal4 and other domains from the protein of interest. By measuring the activity of a reporter gene, such as lacZ, we can determine if the domain that is bound to Gal4 is an important activation domain.

Co-activators

First, we have activators that are recruited to genes, which are involved in regulating transcription and also bind the DNA directly. Co-activators, in contrast, are recruited to the promoter but do not bind DNA. They form complexes and can assemble on the DNA-binding proteins. In this way, co-activators can interact with proteins essential for transcription, such as the machinery, histone modifiers, and chromatin-remodeling complexes. Important to note is that some co-activators, such as VP16, CBP, and GCN5, have acetyltransferase activity.

VP16 is a herpesvirus protein that contains an acidic activation domain and interacts with host cell factor (HCF). When VP16 binds HCF and OCT1, which is a DNA-binding activator that has no activation domain, it promotes the assembly of the PIC and helps to initiate transcription by targeting TBP, TFIIB, and TAF40.

GATA4 is another important co-activator that is a zinc finger DNA-binding protein that is involved in heart development. It works via the recruitment of TBX-5.

Architectural Factors that Affect Transcription

The main way that architectural factors affect transcription is via DNA bending. These proteins do not have a transactivation domain, as do other regulatory factors, but they do affect the interactions between activators, co-activators, and the PIC. This is often accomplished by bending the DNA and shortening the distance between cis-acting elements. Such bending of the DNA allows for transcriptional regulators to act at a distance.

The HMG proteins are small, abundant proteins that function to change the DNA architecture. These proteins do not have high sequence specificity and can bind the minor groove to induce a bend in the DNA. Bending of the DNA facilitates complex assembly and nucleosome remodeling, which may change the rate of transcription.

Tuesday, June 15, 2010

Activation of Transcription Initiation II

Does the Protein Bind DNA?

To determine what DNA sequence a specific protein binds, an electrophoretic mobility shift assay, or EMSA, is implemented. This method involves labeling DNA fragments with the sequence of interest and then mixing this DNA with the DNA-binding protein. We can then run an acrylamide gel with and without the DNA-binding protein, and if the protein does bind the DNA, we should note a shift in the mobility of the DNA, as detected by autoradiography. Namely, the DNA with the protein bound to it forms a larger complex, meaning that it will not travel through the gel as quickly. Therefore, we should see a shift up. If the protein does not bind the specific DNA of interest, we should note that the two bands migrate through the gel at the same rate. We can further test the specificity of the interaction by adding an excess of unlabelled, unrelated competitor DNA: if we still note a shift in the band, the DNA-binding protein is specific to the DNA sequence of interest.

Another method to determine if a protein binds to a DNA sequence is to perform a chromatin immunoprecipitation, which is an extension of an immunoprecipitation. This technique involves several steps, starting with crosslinking of proteins and DNA. If the DNA-binding protein is located on the DNA, it will be crosslinked with the DNA (via formaldehyde). Post-crosslinking, the DNA is sheared using sonication (enzymes can also be used), and the protein of interest is immunoprecipitated. Crosslinking is then reversed, and we can purify the DNA that was bound to this protein. Performing PCR on the purified DNA will indicate if a specific DNA region is associated with the DNA-binding protein.

Chromatin immunoprecipitation simplified

1. Formaldehyde crosslink the DNA and bound proteins

2. Sonicate to break up DNA

3. Perform immunoprecipitation with antibodies against the protein of interest

4. Reverse crosslinking (via high salt, for example)

5. Purify DNA that co-immunoprecipitated

6. Perform PCR with gene-specific primers

Friday, June 11, 2010

Activation of Transcription Initiation

We’ve briefly covered the initiation of transcription but the process is more complex than transcription factors floating to a promoter and starting up RNAPII. In thermodynamic terms (ouch, I know), the transcriptional activator proteins shift the equilibrium of free transcription factors to the formation of the preinitiation complex (PIC): these activators increase or decrease the association rate of proteins and affect the formation of the PIC. Proteins can affect transcription initiation by altering accessibility to the promoter or changing the stability of the PIC. Activators (also called transcription factors and gene regulatory proteins) bind to specific DNA sequences and promote transcription. Co-activators interact with these activators to promote transcription, without interacting directly with the DNA. Essentially, activators and co-activators function to recruit, position, and modify GTFs and RNAPII by altering the transcriptional machinery, bending the DNA, or changing the chromatin structure.

Regulatory elements are important in eukaryotes and come in several forms. The core promoter contains the start site of transcription and the TATA box. Here is where GTFs and RNAPII bind to form the PIC. The proximal promoter (or the upstream activator sequence in yeast) is located within 200 bp upstream of the start site and contains sites for regulatory factors to bind. Finally, enhancer sequences exist from 200 to 50,000 bp from the start site and can also bind regulatory factors. Enhancers act independently of orientation, and they can act at a distance due to DNA looping.

What are the components of a transcriptional activator protein?

TAD: trans-activation domain

DBD: DNA-binding domain

NLS: nuclear localization signal

Regulatory domains: catalytic function of the activator protein

Dimerization domain: for dimerization of activators (especially important for DNA-binding)

DNA-Binding Domains

The DNA-binding domain can read DNA sequences and has several structural motifs: the helix-turn-helix (HTH), homeodomain, zinc finger, basic leucine zipper, and helix-loop-helix. In general, these domains contain an alpha helix that fits snugly in the major groove of the DNA and makes specific contacts with the DNA. These DBDs can thereby recognize response elements in the DNA to carry out their functions.

The helix-turn-helix motif binds DNA as a dimer and recognizes DNA via a C-terminal helix, while the N-terminal helix positions the C-terminal helix in the major groove of the DNA. In contrast, the homeodomain binds DNA as a monomer and contains three helices, one of which binds the DNA while the other two bind other proteins or the DNA backbone.

The zinc finger motif uses a zinc ion to coordinate the structure of the protein. The Cys2/His2 zinc finger motifs act as a monomer or a dimer and use cysteine and histidine to coordinate the zinc and bind the DNA major groove. Additionally, Cys4 zinc finger motifs also act as monomers or dimers and coordinate the zinc with four cysteine residues to allow for DNA interactions. The basic helix-loop-helix domain and leucine zipper motifs are additional DNA-binding motifs that act as dimers and are commonly found in DNA-binding proteins.

Tuesday, June 8, 2010

RNA Polymerase and Basal Transcription

Part 3 of 3. Of part 1 of 4. So I guess it's like part 3 of 12, but that sounds too intimidating. Let's stick with 3 of 3.

RNA Polymerase II

RNA polymerases in general consist of about 10 subunits, making up a protein complex of greater than 500 kDa. Five subunits are common to all three polymerases. However, RNAPII contains the all-important C-terminal domain (CTD): YSPTSPS, which is repeated 52 times in mammals (26 times in yeast). RNAPII that can initiate transcription has a CTD that is unphosphorylated, but upon initiation and movement of the polymerase from the promoter, the CTD becomes phosphorylated. RNAPII alone, however, is not enough to initiate transcription, as it requires a number of other factors for transcription to actually begin. These include six general transcription factors (GTFs): TFIIA, TFIIB, TFIID, TFIIE, TFIIF, and TFIIH. Once these and RNAPII have assembled at the promoter, the pre-initiation complex (PIC) has formed, which allows for basal transcription. How often this PIC is formed is regulated by upstream activator and repressor proteins.

Motifs Required for Basal Transcription

A number of DNA sequences are necessary for the core promoter to actually lead to transcription of a gene:

The TATA box: located at about -25, it binds the TBP and is found mainly in tissue-specific genes. Consensus sequence of TATA(A/T)AA(G/A). This element is involved in positioning RNAPII to start transcription, so any mutations in this region can be devastating to transcriptional activity.

The BRE (TFIIB response element): located at about -32 to -35, binds TFIIB

The INR (initiator): located at -2, binds TFIID, and can stimulate TATA box activities, though weakly. Used by about 65% of genes in place of a TATA box.

The DPE (downstream promoter element): located roughly from +28 to +32; it stimulates gene transcription.
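To make the TATA box consensus concrete, here is a minimal sketch that scans a promoter sequence for TATA(A/T)AA(G/A) and reports where each hit starts relative to the transcription start site. The function name, the `tss_index` parameter, and the toy sequence are all my own inventions for illustration; real promoter prediction relies on weighted matrices rather than an exact pattern.

```python
import re

# Consensus from the list above: TATA(A/T)AA(G/A)
TATA_BOX = re.compile(r"TATA[AT]AA[GA]")


def find_tata_boxes(promoter, tss_index):
    """Return (start position relative to the TSS, matched sequence) per hit.

    `tss_index` is the 0-based index of the +1 nucleotide in `promoter`.
    Upstream positions are negative, and there is no position 0.
    Note: finditer reports non-overlapping matches only.
    """
    hits = []
    for match in TATA_BOX.finditer(promoter.upper()):
        offset = match.start() - tss_index
        position = offset if offset < 0 else offset + 1
        hits.append((position, match.group()))
    return hits


# Hypothetical toy promoter: the +1 nucleotide is the "A" at index 35.
toy_promoter = "GGGGG" + "TATAAAAG" + "C" * 22 + "ACGT"
print(find_tata_boxes(toy_promoter, tss_index=35))
```

In this toy sequence the box starts at -30 and ends at -23, so it covers the "about -25" region mentioned above.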

The Steps in Transcription Initiation

Formation of the preinitiation complex (PIC) is the initial step in transcriptional initiation and involves the assembly of GTFs on the gene:

TBP binds the minor groove of the TATA box, causing a bend in the DNA and promoting the binding of more factors
About 10 TAFs bind TBP to form TFIID
TFIIA binds TFIID complex
TFIIB binds the TFIID-TFIIA complex
TFIIF recruits RNAPII to the promoter
TFIIE and TFIIH join to form the functional PIC
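
The ordered assembly above can be written as a tiny checklist. This is just a sketch of the stepwise model as listed here; the step labels and the function name are my own shorthand, and in the cell the process is certainly less tidy than a list.

```python
# Step labels are shorthand for the assembly steps listed above.
PIC_ASSEMBLY_STEPS = [
    "TBP",            # binds the minor groove of the TATA box
    "TAFs",           # join TBP to form TFIID
    "TFIIA",
    "TFIIB",
    "TFIIF + RNAPII", # TFIIF brings in the polymerase
    "TFIIE + TFIIH",  # completes the functional PIC
]


def next_step(completed_steps):
    """Return the next expected step, or None once the PIC is complete.

    Raises ValueError if `completed_steps` does not follow the listed order.
    """
    completed = list(completed_steps)
    if completed != PIC_ASSEMBLY_STEPS[:len(completed)]:
        raise ValueError("steps do not follow the stepwise assembly order")
    if len(completed) == len(PIC_ASSEMBLY_STEPS):
        return None
    return PIC_ASSEMBLY_STEPS[len(completed)]


print(next_step([]))
print(next_step(["TBP", "TAFs", "TFIIA"]))
```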

TFIIH acts as a helicase to promote initiation and also has kinase activity to phosphorylate the CTD of RNAPII for promoter clearance.

TAFs are a diverse set of proteins that affect the ability of TBP to interact with the promoter, and these TAFs are particularly important when there is no TATA box on the gene. These proteins can act as co-activators, functioning to recruit TFIID or interact with other transcription factors, for example. Additionally, other TAFs have acetyltransferase, kinase, and ubiquitin-conjugating activities.

Mediator is a large protein complex that stimulates or inhibits the activity of RNAPII. Other activators and inhibitors of transcription interact with mediator, sometimes at a long distance, and these signals are integrated to promote or inhibit RNAPII activity. While not all subunits of mediator are necessary for transcription, some are required.

After the formation of the PIC, transcription begins and the promoter is cleared, at which point the CTD on RNAPII is phosphorylated and the GTFs are released, except for TBP.

Monday, June 7, 2010

More about Eukaryotic Transcription Initiation

Is it just me or does eucaryotic just look funny without the k?

Compared to prokaryotes (discussed previously), transcription in eukaryotes is complicated due to chromatin, multiple complexes, regulatory proteins, and a lack of transcription-translation coupling (one is in the nucleus; one is in the cytoplasm). The complex that generally transcribes genes into RNA is RNA polymerase II, which binds to specific sequences on the eukaryotic genome. Genes in eukaryotes have several components, including enhancers, promoters and proximal elements, the TATA box, and the exons and introns of the gene. The regulatory sequences surrounding a gene determine its transcription and utilization, accounting for temporal and spatial regulation of gene transcription.

Specific factors are involved in the initiation of eukaryotic transcription. Basal transcription factors (GTFs) are required for transcription from all promoters, regardless of tissue-specificity. RNA polymerase II (discussed above) is also required for transcription. TATA-binding protein (TBP), which binds the TATA box, is also important in initiation of transcription, as are TBP-associated factors (TAFs) and coactivators of transcription.

How do we analyze the activity of regulatory regions of genes? Reporters, such as luciferase, GFP, or β-galactosidase, can be used to measure the amount of transcription from a promoter or regulatory element. By placing a promoter or enhancer upstream of one of these reporters on a plasmid, introducing this plasmid into a cell, and measuring the amount of reporter gene expressed, one can analyze the promoter activity. The total amount of reporter protein that is synthesized is directly related to the activity of the promoter.

Before being able to perform these reporter assays, however, the DNA sequences that regulate transcription must be putatively identified. This can be accomplished via 5’ deletion analysis, in which DNA fragments upstream of the 5’ untranslated region (UTR) of a gene are introduced to a reporter vector. As described above, the activity of the promoter is measured as a function of the reporter protein, such as luciferase.
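
As a rough illustration of how a 5’ deletion series is read, the sketch below compares neighboring constructs and flags the stretch of DNA whose removal changes reporter activity. The construct endpoints, the activity values, and the two-fold cutoff are all hypothetical numbers chosen for the example, not data from any experiment.

```python
def map_elements(constructs, min_fold_change=2.0):
    """Infer regulatory regions from a 5' deletion series.

    `constructs` is a list of (five_prime_end, reporter_activity) pairs,
    where five_prime_end is the position relative to the start site
    (e.g. -500). Returns (distal_end, proximal_end, effect) tuples.
    """
    ordered = sorted(constructs)  # longest (most upstream) construct first
    findings = []
    for (far_end, far_act), (near_end, near_act) in zip(ordered, ordered[1:]):
        if far_act >= near_act * min_fold_change:
            # Activity dropped when this stretch was deleted
            findings.append((far_end, near_end, "activating"))
        elif near_act >= far_act * min_fold_change:
            # Activity rose when this stretch was deleted
            findings.append((far_end, near_end, "repressing"))
    return findings


# Hypothetical luciferase readings (arbitrary units)
series = [(-500, 100), (-300, 95), (-100, 20), (-50, 18)]
print(map_elements(series))
```

With these made-up numbers, the only large change comes from deleting the DNA between -300 and -100, which is where one would then look for an activator binding site.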

Additionally, one can perform linker scanning analysis, in which regions of the DNA are mutated with synthetic linker DNA. These mutations should abolish the activity of the particular region of DNA that they “cover,” and the changes due to these mutations can be analyzed with reporter genes. Now, with so many regions mapped and analyzed, bioinformatics can be used more frequently to identify shared enhancer sequences.

The core promoter of a gene consists of the site at which RNA polymerase II (RNAPII) binds and initiates transcription. This site is approximately 35 bp upstream or downstream of the transcription initiation site, which allows RNAPII to interact with the basal transcription machinery.

(Note: What about RNA polymerases I and III? RNAPI is involved in the production of rRNA, and RNAPIII with tRNA. RNAPII is highly abundant and is inhibited by α-amanitin, which interferes with the translocation of RNA and DNA and is found in poisonous mushrooms. RNAPII synthesizes approximately 50% of the RNA in an active cell.)

Sunday, June 6, 2010

Initiation of Eukaryotic transcription

We're eukaryotes, right? So let's delve into eukaryotic processes, starting with transcription. Bacterial transcription is also pretty important, so if you would like a refresher on that topic, check out that post. Since eukaryotes are more complex, transcription will be covered in several posts in order to hammer out all the important details.

Transcription is essential for differentiating cells: all cells contain the same genome but have different expression patterns of the genome. The differential expression of the genome creates unique protein compositions in each cell type. While a number of functions are the same in cells and they, therefore, express many of the same proteins, cell specialization is dependent on different protein expression patterns. For example, some proteins are abundant in specialized cells but not other types of cells. One method for detecting which proteins are expressed in a cell is via 2D electrophoresis, which separates proteins based on their pI and molecular weight.

The genes of the human genome are regulated temporally and quantitatively: only approximately 10,000 genes are expressed in any single cell type. In fact, the expression levels of nearly every active gene are different in different cell types. Temporally, expression is regulated by many factors, including cell cycle, external stimuli, tissue types, and embryogenesis. Genes are also regulated quantitatively, which is determined by the rate of transcription initiation and elongation, as well as the actions of activators and repressors. Constitutive genes are those that are expressed throughout the cell cycle; inducible genes are transcribed at different levels depending on the position in the cell cycle or developmental stage.

Modulation of gene expression can occur at several stages:

When and how frequently a gene is transcribed
Splicing
Exporting and localization of mRNA
Translation initiation
Stability of mRNA in the cytoplasm
Activation, degradation, and compartmentalization of proteins

The main method of modulating gene expression is via selective transcription. Studying transcription can occur via analysis of cDNAs, which can be probed to determine expression levels. Microarrays have become important for comparing the expression levels of nearly every gene in a genome. Analysis of microarrays provides a characteristic expression pattern that can even be diagnostic.

Regulation of transcription at the level of initiation

Transcription control can be the result of different expression patterns in different cell types, developmental stages, or in response to stimuli. As mentioned previously, changing the rate of mRNA transcription initiation is the main mechanism by which the cell regulates gene expression. Additionally, control at the stage of transcription initiation ensures that the cell does not waste energy producing mRNAs that are not useful to it.

How do we determine transcription rate? Via run-on transcription analysis. To perform this type of assay, we must isolate nuclei from the cells of interest, incubate them with radiolabeled ribonucleoside triphosphates (³²P), and then allow the transcriptional machinery to work. The processes are allowed to continue for only a short while in order to ensure that only RNA elongation, and not new transcription initiation, is measured. Hybridization of the labeled RNA to specific DNA allows for quantification of its relative transcription rate, compared to standards or other genes.

Saturday, May 8, 2010

Bacterial Transcription: Induction and The Lac Operon

A continuation from previous transcription posts: Regulation and Attenuation and Initiation, Elongation, and Termination.

Induction

Gene induction is a phenomenon that is incredibly fast, taking only two to three minutes for a cellular response. During induction, the actual enzyme levels in the cell rise, and inhibitors of protein synthesis prevent induction (providing further evidence that it is, in fact, protein synthesis that is necessary for induction). The lac operon is the most commonly studied gene that uses induction for regulation.

During initial studies of the lac operon, there were two types of genes considered: structural and regulatory genes. Structural genes are those that encode the actual metabolic enzymes; regulatory genes are involved in controlling the expression of the structural genes. The lac operon was convenient for study because it had an observable phenotype (the production of a gene product in the presence of glucose or lactose) and because mutants could be generated that had different phenotypes. After mutagenizing bacteria, the researchers screened E. coli on plates with glucose and X-gal. Colonies of bacteria that were inducible turned white and did not express β-galactosidase, and regulatory gene mutants would be able to express β-galactosidase in the absence of lactose (and turn blue). Mapping of the genes that were responsible for these phenotypes led to the identification of the o and i regions. Mutations in either of these regions resulted in a constitutive phenotype.

Further analysis of the lac operon and induction led to the creation of a model: the i region codes for the repressor, which binds the lac operon DNA in the promoter region (identified via DNA-binding assays and footprint analysis). Later structural studies showed that the i protein contains an HTH motif, as well as IPTG-binding domains. The product of the i gene, called the repressor, can bind the promoter region of the lac operon and prevent RNA polymerase from binding. With the presence of glucose but not lactose, the lac repressor binds the operator sequence of the genome and blocks RNA polymerase and its helper protein CAP (bound to cAMP). CAP-cAMP induces a bend in the DNA, which allows RNA polymerase to bind, and CAP has its own binding site on the DNA that helps to position the polymerase. With the presence of lactose, the repressor binds to the lactose and no longer binds the operator. Therefore, RNA polymerase can bind the promoter sequence and promote transcription of the lac genes. Moreover, in the absence of glucose, which is indicative of a high concentration of cAMP (low ATP), CAP binds cAMP and promotes stronger interaction of polymerase and the lac promoter.
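
The model boils down to two switches, which makes it easy to sketch as a truth table. This is a deliberately simplified picture: the function name and the "off"/"low"/"high" labels are mine, and the sketch ignores the low basal (leaky) expression that real cells show.

```python
def lac_operon_expression(lactose_present, glucose_present):
    """Qualitative lac operon output under the repressor/CAP model.

    Simplifying assumptions: lactose fully releases the repressor, and
    low glucose always means high cAMP and therefore CAP-cAMP bound.
    """
    repressor_on_operator = not lactose_present
    cap_camp_bound = not glucose_present

    if repressor_on_operator:
        return "off"  # polymerase is blocked regardless of CAP
    # Repressor is gone; CAP-cAMP decides how strongly polymerase binds
    return "high" if cap_camp_bound else "low"


for lactose in (False, True):
    for glucose in (False, True):
        state = lac_operon_expression(lactose, glucose)
        print(f"lactose={lactose}, glucose={glucose}: {state}")
```

Strong expression needs both conditions at once: lactose to remove the repressor and low glucose to bring in CAP-cAMP.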

Friday, May 7, 2010

Bacterial Transcription: Control and Attenuation

A continuation from yesterday's post: Bacterial Transcription Initiation

Transcriptional Control and Attenuation

Like eukaryotes, bacteria must be able to control their gene activity. Gene expression can be controlled at the transcriptional level in a few ways. One is via alternative sigma factors (the protein that binds the -10 and -35 sites and positions RNA polymerase), which are involved in controlling expression of specialized operons. For example, σ32 is involved in regulating heat shock genes, σ28 is for genes involved with motility and chemotaxis, σ54 is involved in nitrogen metabolism, and σ70 helps transcription of most genes.

Another method of transcriptional control is via attenuation. The most frequently cited example of attenuation is the trp operon, which has been studied extensively. Initial observations indicated that when tryptophan was present for the bacteria, the mRNAs corresponding to the trp genes were short. However, when tryptophan was limiting in the media, the mRNA transcript was longer. If the researchers removed a short sequence of DNA, the mRNA was transcribed in full and genes were fully expressed. This short sequence of DNA was termed the attenuator, or premature transcriptional stop.

The trp operon codes for an mRNA with four different regions that can differentially bind to each other: the second region can bind the first or third; the third can bind the second or fourth. The first region contains two successive codons for tryptophan incorporation, which is important for determining how the transcript is formed. With high tryptophan, the ribosome moves along through the first region without stopping at the successive tryptophan codons. Because the ribosome translates through region one before RNA polymerase has had time to finish the transcript, regions three and four bind, polymerase is forced off the DNA, and transcription is prematurely stopped. This results in a shortened transcript when the cell has sufficient tryptophan. In contrast, with low tryptophan levels in the cell, the ribosome will stall at the successive tryptophan codons because it is not able to quickly translate the mRNA. This stalling allows RNA polymerase to continue on its merry way and finish the full transcript because regions two and three (not three and four) bind.
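
The attenuation decision can be sketched as a single branch on whether the ribosome stalls. This is a cartoon of the model described above, with names of my own choosing; it treats tryptophan as simply abundant or scarce rather than as a concentration.

```python
def trp_attenuation(tryptophan_abundant):
    """Return (paired regions, transcript outcome) for the trp leader.

    Assumption: the ribosome stalls at the tandem Trp codons in region 1
    only when tryptophan is scarce.
    """
    ribosome_stalls_in_region_1 = not tryptophan_abundant

    if ribosome_stalls_in_region_1:
        # Regions 2 and 3 pair, so the 3-4 terminator cannot form
        return (2, 3), "full-length"
    # The ribosome moves on quickly, and regions 3 and 4 pair
    return (3, 4), "attenuated"


print(trp_attenuation(tryptophan_abundant=True))
print(trp_attenuation(tryptophan_abundant=False))
```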

Thursday, May 6, 2010

Bacterial Transcription: Initiation, Polymerization, Termination

After the last post, I thought I would be updating more often, but that just didn't happen. Either way, here's a post about bacterial transcription!

Bacteria as an Experimental System

Bacteria are a common genetic system, but why exactly do we use these tiny organisms to perform so many experiments? Because they’re easy to use, of course. There are a number of benefits to using bacteria, including:

- Establishing basic biological principles

- Genetic manipulation

- Short generation time

- Simple growth conditions

- High population density

- Ability to witness rare events

- Ability to select for rare variants

Bacterial Transcription

Transcription is a function that has been heavily studied in bacteria. This first step in gene expression is facilitated by a single RNA polymerase of six subunits. Eukaryotes, in contrast, have four polymerases (I, II, and III, as well as a mitochondrial or chloroplast polymerase). In bacteria and eukaryotes, the initiation of transcription requires a complex of proteins to assemble and facilitate polymerization of RNA from DNA templates.

Bacterial RNA polymerase consists of six subunits, as mentioned previously. The β and β’ subunits perform the polymerization reaction. Two α subunits regulate the frequency of initiation. The ω subunit is involved in stability and assembly of the polymerase enzyme.

RNA polymerase first binds to the promoter region of DNA using a σ factor, which binds two specific regions of the promoter. The core polymerase and σ factor slide along DNA until they come upon a promoter. This closed complex that finds the promoter converts to an open complex (not requiring any ATP for this action), which favors the separation of the DNA strands. At this point, RNA polymerase begins to make short RNA segments, as it “stutters” along the DNA. Small RNA oligos are formed, and σ factor begins to dissociate from the polymerase enzyme. At this point, elongation of RNA transcripts can occur, which results from a tightening of the clamp and the formation of the RNA exit channel. During elongation, RNA polymerase adds nucleotides to the growing RNA transcript at a rate of about 50 per second. With σ factor dissociated, the “rudder” of RNA polymerase pries the DNA/RNA hybrid apart.

When RNA polymerase is to stop the transcription of a gene, it has a few options. First, the gene itself may have an AT-rich region that forms secondary structures that inhibit transcription after they have been copied. These hairpin secondary structures may open the exit channel, and due to the less stable A-U base-pairing between the DNA and RNA, the transcript is released. Additionally, there is a rho-dependent transcriptional termination method. Rho is a hexameric protein that wraps approximately 60 bp of mRNA. Rho, once bound to mRNA, activates and uses its ATPase activity to move as an RNA-DNA helicase. Once it has become active, rho begins to unwind the RNA from the DNA, and when it approaches the active site of RNA polymerase, the transcript is released from the DNA.

Bacterial genes in general can be found in either direction on the genome and are very rarely overlapping. RNA polymerase recognizes a distinct region on the chromosome to initiate transcription. To identify this site, DNA footprinting is used.

DNA footprinting:

1. Bind RNA polymerase to a DNA strand of known length

2. Randomly cleave the DNA by nuclease or chemical agents

3. Remove RNA polymerase from the DNA

4. Separate the DNA strands on an agarose gel.

By DNA footprinting, it was recognized that there is a specific region that is “empty” (the footprint) on the agarose gel corresponding to where RNA polymerase binds. Genetic analysis has identified two regions where RNA polymerase binds: at -35 and -10 relative to the initiation site. The consensus sequences are TTGACA and TATAAT, respectively.
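
With the two consensus hexamers in hand, a simple scan for candidate promoters might look like the sketch below. The function names and the toy sequence are hypothetical, and the 16 to 19 bp spacer between the hexamers is my own assumption (it is not stated in this post), as is the default of one allowed mismatch per hexamer.

```python
MINUS_35 = "TTGACA"
MINUS_10 = "TATAAT"


def mismatches(a, b):
    """Count positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def find_promoters(dna, max_mismatches=1, spacer_range=(16, 19)):
    """Return (index of -35 hexamer, index of -10 hexamer) pairs (0-based).

    Assumption: the hexamers are separated by 16-19 bp; this spacing is
    not given in the text above.
    """
    dna = dna.upper()
    hits = []
    for i in range(len(dna) - 5):
        if mismatches(dna[i:i + 6], MINUS_35) > max_mismatches:
            continue
        for spacer in range(spacer_range[0], spacer_range[1] + 1):
            j = i + 6 + spacer
            box = dna[j:j + 6]
            if len(box) == 6 and mismatches(box, MINUS_10) <= max_mismatches:
                hits.append((i, j))
    return hits


# Hypothetical toy sequence with a 17 bp spacer between the hexamers
toy = "GG" + "TTGACA" + "C" * 17 + "TATAAT" + "GGGGGGA"
print(find_promoters(toy))
```

Real promoters rarely match both hexamers perfectly, which is why the sketch tolerates a mismatch; tightening or loosening `max_mismatches` trades missed promoters against false hits.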

Friday, April 23, 2010

More about the Nucleus: Matrix, Envelope, Pores, Lamins

I needed that little break. Exams have calmed temporarily and I have begun studying in earnest yet again. Now I plan to take a slightly different approach with these updates by moving through the material chronologically, as it was taught. Maybe using this method, I'll be able to refer back and interlink posts more efficiently.

The Nuclear Matrix

As mentioned, DNA wrapped in nucleosomes loops and attaches to the nuclear matrix, but what exactly is the nuclear matrix? Technically, the nuclear matrix consists of what is left after the DNA, lipid, and protein content of the nucleus has been cleared. It consists of the nucleolus, the nuclear pore complex and lamina, and the internal nuclear matrix. Many have argued that the nuclear matrix is simply an artifact of the extraction process.

The nucleolus in the nuclear matrix consists of ten chromosomes that converge to form a small compartment. These ten chromosomes all contain genes for rRNA, and the nucleolus is where the rRNA is synthesized. The rRNA is synthesized by RNA pol I and III, and after synthesis, proteins are added to the rRNA while it is still in the nucleus. Various subcompartments of the nucleolus have also been identified: the fibrillar center contains the nucleolar organizer and the rDNA genes; the dense fibrillar component (pars fibrosa) contains the sites of transcription; and the granular component (pars granulosa) contains the assembling ribosome subunits.

The Nuclear Envelope and Pore Complex Lamina

The nuclear envelope consists of a double membrane that connects to the ER on the outside and to the nuclear lamina (and heterochromatin) on the inside. The lamin B receptor (LBR), an integral membrane protein, is found in the nuclear envelope. Also contained in the envelope is the nuclear pore complex (NPC). The NPC is 50 to 150 nm in size, with about 5,000 found on the membrane per nucleus. The inner and outer membranes of the nuclear envelope come together at the NPC, and the NPC is involved in communication between chromatin and events outside the nucleus.

The nuclear pore complex can pass 500 macromolecules per second. Molecules smaller than 5,000 Da pass freely, while macromolecules larger than 60 kDa barely enter. The channel that allows passage of these molecules is only approximately 10 nm wide. The complex itself is made of 1000 proteins called nucleoporins, forming an octameric structure. It functions to import nuclear proteins via nuclear localization sequences (NLSs), which are protein sequences that signal nuclear import. A protein with more NLSs is imported more frequently, though an NLS does not mediate retention of the protein. The protein importin, made of α and β subunits, assists in protein import by binding the NLS. The importin receptor recognizes the NLS and then migrates on FG repeats of the nucleoporins to dock and translocate through the pore. The cargo (with its NLS) is then released into the nucleus.

Export of mRNA and proteins from the nucleus is also important to the cell. It has been discovered that the sequence of an RNA does not affect its ability to be exported from the nucleus where it originated. However, it has also been discovered that different RNAs are exported via different pathways, which may be facilitated by RNA-binding proteins that contain a nuclear export sequence (NES). Exportins are the receptors for these NESs and facilitate protein movement out of the nucleus. Proteins with both an NES and an NLS are considered shuttle proteins.

Nuclear import and export are heavily regulated in the cell. Phosphorylation can affect nuclear import: direct phosphorylation of the NLS can inhibit transport. Ran-GTP is another important factor that affects transport. Ran-GTP in the nucleus binds to empty receptors and transports them to the cytoplasm, where they can reload with a piece of cargo. Ran-GAPs in the cytoplasm facilitate the hydrolysis of GTP to GDP, which promotes translocation of Ran-GDP to the nucleus. Once in the nucleus, Ran-GEFs promote the exchange of GDP for GTP. The interconversion of the different forms of Ran allows for the recycling of shuttling proteins, allowing them to move proteins in and out of the nucleus.
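The Ran cycle is easier to follow when written out as a loop of states. The sketch below is a simplification that tracks only where Ran is and which nucleotide it carries; the dictionary and function names are hypothetical, and the step descriptions just restate the paragraph above.

```python
# (location, nucleotide bound to Ran) -> (next state, what drives the step)
RAN_TRANSITIONS = {
    ("nucleus", "GTP"): (("cytoplasm", "GTP"),
                         "Ran-GTP binds an empty receptor and exits with it"),
    ("cytoplasm", "GTP"): (("cytoplasm", "GDP"),
                           "Ran-GAP stimulates GTP hydrolysis; receptor can reload cargo"),
    ("cytoplasm", "GDP"): (("nucleus", "GDP"),
                           "Ran-GDP translocates back into the nucleus"),
    ("nucleus", "GDP"): (("nucleus", "GTP"),
                         "Ran-GEF exchanges GDP for GTP"),
}


def run_cycle(start, steps):
    """Follow the transitions from `start`, returning the visited states."""
    state = start
    visited = [state]
    for _ in range(steps):
        state, reason = RAN_TRANSITIONS[state]
        print(f"{visited[-1]} -> {state}: {reason}")
        visited.append(state)
    return visited


path = run_cycle(("nucleus", "GTP"), steps=4)
```

After four steps the state returns to nuclear Ran-GTP, which is the point of the cycle: the GAP sits on one side of the envelope and the GEF on the other, so Ran keeps getting reset.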

The Nuclear Lamina

As mentioned, the nuclear envelope surrounds the nucleus and connects to the lamina on its inner face. The nuclear lamina itself is about 75 nm thick and is composed of proteins similar to intermediate filaments, called lamins. Lamins come in three forms: A, B, and C. Lamins A and C bind heterochromatin. Lamin B binds the lamin B receptor (LBR), which is connected to the nuclear envelope as an integral membrane protein. The lamin complexes also peripherally bind to the NPC. Lamins maintain the nucleus in its spherical shape, and phosphorylation of A and C subunits solubilizes them during prophase, allowing dissolution of the nuclear lamina. Lamin B remains attached to its receptor during prophase, however. Importantly, mutations in these proteins can cause laminopathies because they are involved in nuclear organization, and mutations can result in inhibited DNA synthesis.

The Inner Nuclear Matrix

Study of the inner nuclear matrix has shown that the protein composition is cell-type specific. Additionally, chromosomes are not positioned randomly in the nucleus. This has been explored by FISH using whole-chromosome probes. It appears that chromosomes occupy specific territories and, at least in yeast, they take the Rabl conformation, with the telomeres and centromeres directly interacting with the nuclear lamina. Matrix proteins function in both replication and transcription, and mRNAs from active genes can be found in the matrix, as can newly replicated DNA. Additionally, the matrix proteins of cancerous cells differ from those of normal cells.

Chromatin Review Articles:

Campos, E. I. and D. Reinberg. 2009. Histones: Annotating chromatin. Annu. Rev. Genet. 43:559-599.

Clapier, C. R. and B. R. Cairns. 2009. The biology of chromatin remodeling complexes. Annu. Rev. Biochem. 78:273-304.

Kouzarides, T. 2007. Chromatin modifications and their function. Cell. 128:693-705.

Margueron, R. and D. Reinberg. 2010. Chromatin structure and the inheritance of epigenetic information. Nat. Rev. Genet. 11:285-296.

Rando, O. J. and H. Y. Chang. 2009. Genome-wide views of chromatin structure. Annu. Rev. Biochem. 78:245-271.

Sunday, April 11, 2010

Oncogenesis Part 2: Genetic Instability, Colon Cancer, Transformation

Genetic Instability in Cancer Cells

A gene that receives a great deal of attention in cancer research is p53. The protein it encodes acts as a tumor suppressor, and the gene is mutated in about 50% of all cancers. Further, p53 is involved in a number of pathways, including apoptosis and genetic stability, so misregulation of the protein is a common factor in tumor cells. Normally, very little p53 is present in cells, but it is induced during cellular stress. When the cell experiences stress, p53 can induce apoptosis or cell cycle arrest by binding DNA and increasing transcription of p21, which acts as a CKI (see mitosis posts). If p53 is lost, as it is in many cancers, the cell will replicate when it is not supposed to, and the DNA accumulates a number of mutations (genetic instability). Additionally, most cells stop dividing when the telomeres shorten to a critical length, an arrest that is facilitated by p53. When p53 is lost, even shortened telomeres don’t stop the cell from dividing, and genetic instability, again, is increased. Some of these genetically unstable cells will upregulate telomerase, which allows for continued proliferation.

Colon Cancer Example

In normal colon cells, the APC protein inhibits cell cycle progression by preventing Wnt from activating c-myc, which is required for progression from G₁ to S. If APC becomes mutated, the cell can progress through the cell cycle unchecked. Because APC acts as a tumor suppressor, both alleles must be mutated before its function is lost. Individuals with a germline mutation in APC already carry one mutated allele and so have an increased risk of colon cancer. Further, if Ras becomes unregulated, it can stimulate MAPK signaling, leading to uncontrolled proliferation.

Colon cancer progresses through a number of stages:

Normal epithelium
Hyperplastic epithelium (via loss of APC)
Early adenoma
Intermediate adenoma (via activation of K-Ras)
Late adenoma (via loss of Smad4 and other tumor suppressors)
Carcinoma (via loss of p53)
Metastasis

There are a number of pathways by which colon cells can move through the above stages and become cancerous. The exact number of steps involved in malignant tumor progression is unknown, and the steps also vary with the type of tumor, though the general mechanism is similar.

Cell Senescence and Telomerase

In a study performed by Hayflick, cells that were explanted from tissue were shown to double roughly 60 times before entering senescence, a period when the telomeres are short, and the cells no longer proliferate. Some cells are able to pass through the senescence stage and enter crisis, which lasts roughly 10-20 generations. If a cell is able to pass through crisis and still undergo mitosis, it is considered immortal.

Telomerase is the enzyme that can prevent cells from entering senescence. When the catalytic subunit of telomerase (hTERT) is expressed, the telomeres are no longer degraded with each division. With telomeres that are no longer shortened, there is no signal to p16^INK4A through pRb and p53 to enter senescence, and the cell continues to divide. The expression of hTERT in HEK (human embryonic kidney) cells prevents the entry of the cells into senescence.

Growth Signaling and Transformation

In order for a cell to divide, it must receive a number of signals that indicate that the environment is appropriate for it to divide. In addition to dividing, tumor cells must be able to grow in size. The growth signaling pathway that has received the most attention has been that involving Ras. Nonetheless, there are a number of pathways that feed into cell proliferation signals, and these are often the genes that are altered in cancerous cells.

In culture, cells that have been transformed exhibit the ability to form foci. Cells in tissue culture usually stop dividing once a confluent monolayer has been established (contact inhibition). Those cells that are able to grow on top of each other in an unregulated fashion are considered transformed. In 3T3 cells, a common cell line used for understanding oncogenes, the cells that form a focus and are transformed have a loss-of-function mutation in p16 (p16 is a CKI).

The 3T3 Transformation Assay

Transfect 3T3 cells with DNA from cancer cells
Allow the cells to form foci
Isolate DNA from foci and transfect it into new 3T3 cells
Isolate DNA from new foci and generate a phage library
Screen the phage library (with Alu probe) to identify human sequences

It is important to note that 3T3 cells are mouse cells, which facilitates the identification of human sequences using Alu elements (the mouse cells will not have these sequences).

Using the 3T3 transformation assay, the Ras oncogene was found. The assay has also allowed for the identification of several other proto-oncogenes (genes that have the ability to become oncogenic). Proto-oncogenes are usually activated via a gain-of-function mechanism, such as constitutive activity. Approximately 100 oncogenes have been identified through the 3T3 assay and other methods. Oncogenic collaboration is the cooperation between oncogenes that speeds the formation of tumors.

Proto-oncogenes can become oncogenes in several ways:

Point mutations conferring constitutive activity
Gene amplification leading to overexpression
Chromosomal translocations putting the proto-oncogene under the control of a different promoter
Chromosomal translocations that fuse two genes to make a chimeric protein with constitutive activity

Thursday, April 8, 2010

Oncogenesis Part 1: General Introduction

Oncogenesis

Benign tumors consist of cells that closely resemble normal cells and may function like them. They remain localized and stay small, with a fibrous capsule bounding the tumor. However, if a benign tumor begins to interfere with the normal function of other cells, for example by secreting substances such as hormones, it can become a problem.

In the case of malignant tumors, the cells express some characteristic proteins, but they grow out of control and more rapidly. Malignant tumors also invade other tissues and can grow in sites vastly different from where they originated, in a process termed metastasis. The most diagnostic characteristic of a malignant tumor is its ability to invade other tissues. These cells can break contacts with other cells and pass through the basal lamina to reach different areas of the body. To facilitate this, cancer cells can secrete proteases such as plasminogen activator. Plasminogen activator converts plasminogen into active plasmin, which helps to break down the basal lamina. When the cell has passed the basal lamina, it can enter the bloodstream and move to nearly any site in the body. Roughly one in a million of these cells will be able to colonize another tissue, a hallmark of malignant tumors.

Malignant tumor cells also have a high nucleus-to-cytoplasm ratio, with prominent nucleoli and many mitochondria, indicating high metabolic rates and significant growth. Additionally, these cells appear to be de-differentiated, with few specialized structures that we would normally see in a cell. Based on the cell’s gene expression and morphology, one usually is able to identify the source of a malignant tumor because it does retain some of the characteristics of its cellular origin.

Carcinomas are cancers derived from the endoderm or ectoderm, while sarcomas are derived from the mesoderm. Approximately 200 different cancer types have been identified, and there are approximately 300 different cell types in the body, meaning that cancer can arise from nearly any cell.

When a benign tumor grows, it is largely limited in size by the inability of the tumor to acquire nutrients. These tumors rely on diffusion to fuel the tumor cells. In contrast, malignant tumors grow large, and in order to provide the nutrients necessary for growth, they must recruit the formation of blood vessels in a process termed angiogenesis. When a malignant tumor reaches a size of over one million cells (or about 2 mm in diameter), it will induce the formation of blood vessels to provide nutrients to more cells. Factors such as bFGF, TGFα, and VEGF are secreted by many tumors to facilitate angiogenesis. Malignant tumor cells can also secrete factors that affect nearby cells to promote angiogenesis. The larger the primary tumor, the higher the risk of metastasis.
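A quick back-of-the-envelope check on the "one million cells, about 2 mm" figure. This sketch assumes idealized growth from a single cell with no cell death, and treats the tumor as a solid ball of equally sized cells with no gaps; both are simplifications, and the function names are hypothetical.

```python
import math


def doublings_to_reach(cell_count):
    """Number of population doublings needed to go from 1 cell to cell_count."""
    return math.ceil(math.log2(cell_count))


def implied_cell_diameter_um(tumor_diameter_mm, cell_count):
    """Cell diameter implied by packing cell_count cells into the tumor.

    Volume scales with diameter cubed, so the diameter ratio is the cube
    root of the cell count. Packing gaps and stroma are ignored.
    """
    return tumor_diameter_mm * 1000 / cell_count ** (1 / 3)


print(doublings_to_reach(1_000_000))                       # 20
print(round(implied_cell_diameter_um(2, 1_000_000), 1))    # 20.0
```

So about 20 doublings get a single cell to the angiogenesis threshold, and the two numbers in the paragraph are consistent with cells roughly 20 µm across.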

Steps in metastasis of epithelial cells

Upregulated cell growth in epithelium
Invasion of the basal lamina via an invadopodium
Entry into blood vessels; traveling through the bloodstream
Adherence to blood vessel wall
Escape from blood vessel
Colonizing of foreign tissue

The invadopodium consists of all the factors one would expect in an appendage to the cell: actin regulators such as WASP and Arp2/3; signaling molecules such as Cdc42; adhesion molecules such as integrins; and membrane remodeling complexes.

Tumors can be viewed as complex tissues in which the cell types have mutated to make a new phenotype (neoplastic phenotype). It is important to note that these tumors are not growing in isolation, and they require interactions with non-cancer cells to grow as well. The newest cancer therapies have begun to target these interactions in hopes of stemming tumor growth.

Genetically, cancer cells are messed up. A single mutation in a cell does not cause cancer, however: many different genes and factors must become differently active for cancer to evolve. Many types of cancer evolve over time, meaning that the longer one lives, the higher the chance of cancer developing. The multi-hit model holds that successive mutations in a cell, each of which confers a growth advantage, promote cancer development.

For cancer cells to survive and proliferate unchecked, there are a number of functional capabilities that they must alter:

Self-sufficiency in growth signals
Insensitivity to anti-growth signals
Tissue invasion (metastasis)
Limitless replication
Angiogenesis
Evasion of apoptosis

There are a number of ways for a cancer cell to alter these pathways, and the order in which these capabilities emerge differs with cell type and cancer type. Additionally, a single mutation can alter more than one of these capabilities. For example, mutation of pRb can allow the cell to become immortalized and resist growth inhibition. Mutations in p53 confer apoptosis evasion, resistance to growth inhibition, and immortalization. Mutations affecting hTERT, the catalytic subunit of telomerase, will allow unlimited replication. Mutations in Ras will allow for apoptotic evasion, growth in the absence of signals, angiogenesis, and metastasis. Finally, mutations in PP2A (protein phosphatase 2A) affect signaling pathways that feed into a number of these alterations in cellular function.

Sunday, April 4, 2010

Chromatin Chapter 3: Higher Orders of Chromatin Structure

I don't have an illustration today because (a) I am too lazy to make one, (b) no topic in this post really requires an illustration, and/or (c) I have other things to do. :)

Higher Orders of Chromatin Structure

The two prior posts, concerning nucleosomes and histones, covered the first order of chromatin structure, but this packaging alone does not explain how so much DNA is packaged into cells. Four additional levels have been hypothesized, with varying degrees of evidence for each.

The second order of chromatin structure is the 30-nm fiber, each turn of which consists of six nucleosomes and binds about 1200 bp. EM images have shown a twisting of the nucleosomes around each other to form a structure that is about 30 nm in diameter when the nucleosomes are in high salt. There are a few differing models for how the nucleosomes wind around each other. The first model is the solenoid, in which the nucleosomes wind around each other in a helix, with six nucleosomes in each turn of the helix. Histone H1 is also involved in this model, which is the most widely accepted. The second model is the double-solenoid structure, which consists of two parallel rows of nucleosomes that wind around each other. Recent work with nucleosome arrays has lent more support to this model, though this area of research is still active.

The compaction introduced by the 30-nm fiber can condense the DNA such that it is not available for the proper factors to bind. Heterochromatin consists of nucleosomes that condense so highly that the genes contained in the heterochromatin are repressed. This condensation is facilitated by heterochromatin protein 1 (HP1), which is a non-histone protein that binds methylated lysine 9 of histone H3 (Me-H3K9). In contrast, euchromatin consists of DNA that is not repressed and is accessible to factors required for transcription. This access is facilitated by H2A.Z, which prevents full condensation of the nucleosomes into a 30-nm fiber.

The third order of chromatin structure is the chromatin loop, which holds fifty 1200-bp fibers, packaging a total of about 60,000 bp of DNA. These loops are attached to the nuclear matrix at matrix attachment regions (MARs), which promotes supercoiling. The factors involved in transcription are also attached to the nuclear matrix, so those genes in the loops that are transcriptionally active tend to interact with the nuclear matrix as well. Supercoiling of prokaryotic DNA is performed by DNA gyrases, but eukaryotic DNA gyrases have not been discovered. Nonetheless, supercoiling does facilitate eukaryotic gene transcription as well. Further, loops of the chromatin can interact to affect gene regulation. Studying the interactions of a gene with the nuclear matrix indicates which parts of the DNA are transcriptionally active. By isolating the nuclear matrix and detecting the DNA sequences associated with it (via probing with radio-labeled ³²P dATP), one is able to detect which regions of the DNA are directly associated with the nuclear matrix via MARs.

At first glance, it doesn’t seem logical that active DNA is supercoiled. However, transcription factors bind preferentially to supercoiled DNA, while repression factors tend not to bind as well. The job of removing supercoiling is left to topoisomerases, which cut and re-seal the DNA, unwinding it and releasing tension. Topoisomerase I cuts and re-seals one strand, while topoisomerase II cuts and re-seals both strands. Additional details about the mechanism of topo II will be discussed in the meiosis posts.

Real-time fluorescence microscopy has led to a better understanding of the dynamics of chromatin. While electron microscopy gives a static picture of chromatin, adding GFP labels to the chromatin and measuring its movement in relation to the nuclear pores (the reference points) has revealed movement of the chromatin that relates to metabolic activity. During both transcription and chromatin remodeling, the chromatin has been shown to be highly dynamic.

The fourth order of chromatin structure is called the miniband, which consists of 18 loops, making up a total of about one million bp of DNA. The miniband looks like a helix of loops, with the nuclear matrix inside the helix. Little else is known about it, other than that it is highly condensed.
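The numbers quoted for each level multiply together, which is easy to check. The sketch below uses only the counts given in this post (six nucleosomes per 1200 bp, fifty fiber units per loop, 18 loops per miniband); the 200 bp per nucleosome is derived from those figures, and the variable names are just for illustration.

```python
BP_PER_NUCLEOSOME = 1200 // 6  # 200 bp, implied by "six nucleosomes, 1200 bp"

# (level name, how many units of the previous level it contains)
LEVELS = [
    ("30-nm fiber turn", 6),   # nucleosomes per turn
    ("chromatin loop", 50),    # 1200-bp fiber units per loop
    ("miniband", 18),          # loops per miniband
]


def cumulative_bp():
    """Return a dict mapping each level to the bp of DNA it packages."""
    bp = BP_PER_NUCLEOSOME
    totals = {}
    for name, count in LEVELS:
        bp *= count
        totals[name] = bp
    return totals


for level, bp in cumulative_bp().items():
    print(f"{level}: {bp:,} bp")
# 30-nm fiber turn: 1,200 bp
# chromatin loop: 60,000 bp
# miniband: 1,080,000 bp
```

Eighteen loops come to 1.08 million bp, which is where the "one million bp" figure for the miniband comes from.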

Finally, the fifth order of chromatin structure is the chromosome, which we know consists of roughly 75 million bp of DNA, depending on which chromosome you consider. The structure of the chromosome is interesting in its own right, and there are a number of features to be discussed. The telomeres of the chromosomes consist of tandem repeats of the sequence TTAGGG, totaling up to 15 kbp. The telomeres cap the ends of the chromosomes and indicate the replicative capacity of the chromosome. The telomere hypothesis posits that cells become senescent at a threshold telomere length, meaning that cells have a finite number of divisions. Telomeres shorten with each chromosome replication (and cell division) due to the inability of the cell to replicate the linear ends of the chromosomes. Telomeres in sex cells are long because they replicate frequently, while somatic cells that do not actively replicate have short telomeres. Therefore, telomere length and its regulation make up the major aging mechanism in the cell. Those cells that will divide many, many times express the enzyme telomerase, which lengthens the telomeres at the ends of the chromosomes by providing an RNA template and polymerase function.
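Here is a rough sketch of the telomere hypothesis in code. Only the TTAGGG repeat and the 15 kbp upper bound come from this post; the loss per division (100 bp) and the senescence threshold (5,000 bp) are made-up illustrative values, and the function names are hypothetical.

```python
TELOMERE_REPEAT = "TTAGGG"


def count_terminal_repeats(seq):
    """Count TTAGGG copies in the uninterrupted run at the 3' end of seq."""
    count = 0
    end = len(seq)
    while end >= 6 and seq[end - 6:end] == TELOMERE_REPEAT:
        count += 1
        end -= 6
    return count


def divisions_until_threshold(length_bp, loss_per_division_bp, threshold_bp,
                              telomerase=False):
    """Divisions possible before the telomere would drop below threshold.

    Returns None when telomerase is active, since in this toy model the
    telomere then never shortens.
    """
    if telomerase:
        return None
    divisions = 0
    while length_bp - loss_per_division_bp >= threshold_bp:
        length_bp -= loss_per_division_bp
        divisions += 1
    return divisions


print(15_000 // len(TELOMERE_REPEAT))                  # 2500 repeats in 15 kbp
print(count_terminal_repeats("ACGT" + "TTAGGG" * 3))   # 3
# Loss per division and threshold below are illustrative assumptions.
print(divisions_until_threshold(15_000, 100, 5_000))   # 100
```

The point is only the shape of the model: a fixed loss per division plus a length threshold gives a finite division count, and switching on telomerase removes the limit.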

One can stain different regions of chromosomes to get an idea of how actively transcribed they are. Giemsa staining followed by visualization with light microscopy is the most frequently used method. The chromosomes are trypsin digested and stained with Giemsa. G-light areas, which are areas that the Giemsa does not stain well, are considered unfolded and relaxed. The genes in these G-light bands are susceptible to radiation and are often oncogenes. Additionally, these genes are usually constitutively active (housekeeping genes). In contrast, G-dark bands are not sensitive to trypsin digestion and appear dark when stained with Giemsa. The genes in these bands are considered tissue-specific and are replicated later during S-phase. Finally, C-bands (for constitutive heterochromatin) remain condensed and are also dark when stained with Giemsa. The areas of the chromosome found in C-bands replicate late in S-phase and may consist of telomeres, the centromere, and satellite DNA.