Richard Crooks's Website

Gene Structure

Genes are sections of the chromosomes, the long sequences of DNA, and provide the instructions an organism needs to produce a gene product, such as a protein.

Genes are not just sequences of nucleotides; the sequences also have a regulated structure (Figure 1). Rather than being a stretch of DNA that is always transcribed, a gene can be regulated in how and when it is transcribed. This allows the DNA to be transcribed in response to the organism's environment, or to other factors, such as growth signals that stimulate the growth of particular types of cell. It is also what allows a complex multicellular organism, such as a human, to have the same genome in all of its vastly different tissues.


Figure 1: The structure of a gene. At the 5’ end of the gene is the promoter region, where transcription factors bind and RNA polymerase begins transcription. The promoter allows specific genes to be transcribed in response to specific stimuli, as different transcription factors are activated by different stimuli. Following the promoter is the 5’ untranslated region (5’-UTR), which is transcribed but not translated, as it lies before the methionine start codon (ATG). In eukaryotes, such as plants, animals and fungi, the coding sequence is punctuated by introns, which are sections of the gene that are removed after transcription, leaving only the translated exons. After the stop codon (TAA, TAG or TGA) is the 3’ untranslated region (3’-UTR), which like the 5’-UTR is not part of the translated protein. The untranslated regions do play an important role in gene regulation. For example, a mutation which causes early termination of the protein can trigger nonsense mediated decay, whereby the cell detects that the gene has suffered a mutation from the excessive length of the 3’-UTR.

At the 5' end of a gene is the promoter region. The promoter is a region of DNA that is bound by particular proteins known as transcription factors (Figure 2). Transcription factors are proteins that have DNA binding regions, as well as regions that interact with other proteins. Transcription factors are a crucial link in the cell, being the downstream endpoints of the signalling pathways that the cell uses to respond to its environment. The signalling pathways activate the transcription factors, and these in turn bring the enzymes responsible for RNA transcription into contact with the DNA. Expression of a gene in response to the environment therefore relies on the promoter and the transcription factors which bind to it.


Figure 2: Transcription factors, like the cJun-cFos Activator Protein 1 (AP-1) transcription factor, bind to specific sequences of DNA in the promoter regions of genes. They bring other proteins, including RNA polymerase, into close proximity to the genes and thus allow gene transcription to take place. This creates a physical link between genes and the environmental stimuli which encourage their expression, as the transcription factors are a downstream endpoint of many signalling pathways.
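
The idea that a transcription factor recognises a specific sequence in a promoter can be sketched in a few lines of Python. This is only an illustration: the pattern used is the commonly quoted AP-1 consensus site TGA(C/G)TCA, which is an assumption added here and does not appear in the figure, and the promoter sequence and the function name `find_binding_sites` are made up for the example. Real binding depends on much more than an exact sequence match.

```python
import re

# Assumed AP-1 consensus site: TGA, then C or G, then TCA.
AP1_SITE = re.compile(r"TGA[CG]TCA")


def find_binding_sites(promoter):
    """Return (position, matched_sequence) for each AP-1-like site found.

    Positions are 0-based offsets from the start of the promoter string.
    """
    return [(m.start(), m.group()) for m in AP1_SITE.finditer(promoter.upper())]


# Hypothetical promoter fragment containing two AP-1-like sites.
promoter = "CCGTTGACTCAGGATCCATGAGTCATTT"
sites = find_binding_sites(promoter)
for position, sequence in sites:
    print(position, sequence)
```

A promoter with no matching site returns an empty list, which in this toy model corresponds to a gene that AP-1 would not switch on.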

The region of a gene that encodes a protein begins with a start codon, ATG, which encodes a methionine residue, and ends with a stop codon that is either TAA, TAG, or TGA. Because there are nucleotide sequences between the promoter and the start codon, and between the stop codon and the end of the gene, there are both 5' and 3' untranslated regions (UTRs). These regions are transcribed but not translated, and have some role in regulating the gene. A mutation in the 5'UTR that introduces a new start codon would cause a different protein to be produced, while a mutation that causes early termination of a protein would lead to a lengthened 3'UTR, which would be detected, and the resulting mRNA would be removed by the process of nonsense mediated decay.
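
The layout described above can be sketched in Python as a function that splits a sequence into 5'UTR, coding sequence and 3'UTR. This is a minimal sketch under simplifying assumptions: the sequence is already spliced, the first ATG is taken as the start codon, and the first in-frame stop codon ends the coding sequence. The function name `split_transcript` and both sequences are hypothetical and made up for the example.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def split_transcript(seq):
    """Return (five_prime_utr, coding_sequence, three_prime_utr).

    Uses the first ATG as the start codon and the first in-frame stop
    codon after it. Returns None if either one is missing.
    """
    seq = seq.upper()
    start = seq.find("ATG")
    if start == -1:
        return None
    # Step through the reading frame one codon (three bases) at a time.
    for i in range(start + 3, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            end = i + 3
            return seq[:start], seq[start:end], seq[end:]
    return None


# Made-up sequence: 5'UTR + coding sequence (ATG GCT TGG AAA TAA) + 3'UTR.
normal = "GGCACC" + "ATGGCTTGGAAATAA" + "GCTTCC"
# The same sequence with one base changed, turning TGG into the stop codon TGA.
mutant = "GGCACC" + "ATGGCTTGAAAATAA" + "GCTTCC"

normal_parts = split_transcript(normal)
mutant_parts = split_transcript(mutant)
print(normal_parts)
print(mutant_parts)
```

In the mutant the coding sequence ends early, so everything after the new stop codon counts as 3'UTR. That lengthened 3'UTR is the feature the paragraph above describes as the trigger for nonsense mediated decay.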

Introns are regions within the coding regions of genes which are not translated into protein, and are instead removed after being transcribed. Introns are contrasted with exons, which are the regions of genes which are translated. Introns give genes flexibility in which proteins are translated from them, because the exons can be joined together in different combinations once the introns are removed. Thus the presence of introns changes the model of one gene to one protein into a one gene to many proteins model. Introns of this kind are characteristic of eukaryotic organisms, which include all plants and animals as well as many single celled organisms, and they are largely absent from bacteria.
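
The removal of introns, and the way it lets one gene give more than one product, can be sketched in Python. This is a toy model under stated assumptions: the gene is given as an ordered list of labelled segments, and leaving out a whole exon stands in for the different ways exons can be combined. The function name `splice`, the `skipped_exons` parameter and the sequences are all hypothetical.

```python
def splice(segments, skipped_exons=()):
    """Join exon sequences in order, dropping introns and any skipped exons.

    segments: list of (kind, sequence) pairs, where kind is "exon" or "intron".
    skipped_exons: 1-based exon numbers to leave out of the final sequence.
    """
    mature = []
    exon_number = 0
    for kind, sequence in segments:
        if kind != "exon":
            continue  # introns are removed
        exon_number += 1
        if exon_number not in skipped_exons:
            mature.append(sequence)
    return "".join(mature)


# Made-up gene with three exons separated by two introns.
gene = [
    ("exon", "ATGGCT"),
    ("intron", "GTAAGTCCTTAG"),
    ("exon", "TGGAAA"),
    ("intron", "GTATGTTTTCAG"),
    ("exon", "CCTTAA"),
]

full_length = splice(gene)
exon2_skipped = splice(gene, skipped_exons={2})
print(full_length)
print(exon2_skipped)
```

Both results start with ATG and end with the stop codon TAA, but they differ in the middle, so the same gene would give two different proteins.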

The complexity of gene structure allows an organism to vary how and when it responds to the environment. However, it also presents challenges for clinical geneticists and biotechnologists who seek to use the information in the genome to diagnose disease and develop products.
