
The Genetic Code and its Nomenclature

The genetic code is a set of rules mapping codons to amino acids. This is the alphabet used to encode genetic information for the synthesis of proteins. 

•    There are 64 codons. Each is a triplet of nucleotides.

•    Only twenty (20) amino acids are used, called standard amino acids.

•    XYU and XYC always code the same amino acid.

•    XYA and XYG often code the same amino acid.

•    In 8 out of 16 possible cases, XY• encodes a single amino acid, where • represents any
      of the four bases.

•    The code is nearly universal. The vast majority of living organisms on Earth
      appear to use this code, which is known as the canonical genetic code.

•    There appears to be an underlying order. For example, all codons with a U in the
      second place code for hydrophobic amino acids. 
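The regularities listed above can be checked mechanically. The following is a minimal Python sketch, not part of the original text: it builds the standard codon table from a compact 64-letter string (first position varying slowest, bases in the order U, C, A, G, with `*` marking a stop) and counts the patterns the bullets describe. The names `CODON_TABLE` and `check_code_properties` are illustrative assumptions.

```python
from itertools import product

BASES = "UCAG"
# One-letter amino acid per codon, in the order UUU, UUC, UUA, UUG, UCU, ...
# "*" marks a stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def check_code_properties(table):
    """Count the regularities of the codon table described in the bullet list."""
    prefixes = ["".join(p) for p in product(BASES, repeat=2)]  # the 16 "XY" pairs
    return {
        "codons": len(table),
        "amino_acids": len(set(table.values()) - {"*"}),
        # XYU and XYC always give the same amino acid.
        "u_c_always_same": all(table[p + "U"] == table[p + "C"] for p in prefixes),
        # XYA and XYG often give the same result (two stops count as "same").
        "a_g_same_count": sum(table[p + "A"] == table[p + "G"] for p in prefixes),
        # XY followed by any base encodes a single amino acid.
        "fourfold_boxes": sum(
            len({table[p + b] for b in BASES}) == 1 for p in prefixes
        ),
        # Amino acids encoded by codons with U in the second position.
        "second_u_amino_acids": sorted(
            {aa for codon, aa in table.items() if codon[1] == "U"}
        ),
    }


if __name__ == "__main__":
    for key, value in check_code_properties(CODON_TABLE).items():
        print(key, value)
```

Under these assumptions the sketch reports 64 codons, 20 amino acids, 8 of 16 XY pairs that fix a single amino acid, and only Phe, Ile, Leu, Met and Val (all hydrophobic) for codons with U in the second position.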

The information for protein synthesis is stored in genomic deoxyribonucleic acid (DNA), while ribonucleic acid (RNA) carries out the instructions encoded in DNA. Proteins carry out most biological activities, so cells function well only if proteins are synthesized accurately. The linear order of amino acids in each protein determines its function; mechanisms that maintain this order during protein synthesis are therefore critical. Many molecular cell biology textbooks describe the central dogma in detail. Numbering conventions are used for the chemical description of nucleotides, the building blocks of oligonucleotides and nucleic acids.

In cells, the synthesis of DNA, RNA and protein is circular. The flow from nucleic acids to proteins has been called the central dogma of molecular biology. This information flow can be written as follows:

1.    DNA directs the synthesis of RNA,

2.    RNA directs the synthesis of proteins,

3.    Proteins catalyze the synthesis of both RNA and DNA.


The premises are: DNA encodes mRNA, and mRNA encodes protein.

The conclusion is that “genes are the blueprint for life.”
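The first step of this flow, DNA directing the synthesis of RNA, can be illustrated at the sequence level. The sketch below is a deliberate simplification written for this note: it ignores promoters, terminators and RNA processing, and the function names `transcribe_template` and `transcribe_coding` are hypothetical. It assumes both strands are written 5’ to 3’.

```python
# Base pairing used when an RNA strand is built against a DNA template.
DNA_TO_RNA_COMPLEMENT = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe_template(template_5to3):
    """Return the mRNA (5'->3') made from a DNA template strand given 5'->3'.

    The template is read 3'->5', so the sequence is reversed before each
    base is replaced by its RNA complement.
    """
    return "".join(DNA_TO_RNA_COMPLEMENT[b] for b in reversed(template_5to3.upper()))


def transcribe_coding(coding_5to3):
    """Return the mRNA from the coding (sense) strand: same sequence, T -> U."""
    return coding_5to3.upper().replace("T", "U")


if __name__ == "__main__":
    coding = "ATGGCC"      # coding strand, 5'->3'
    template = "GGCCAT"    # its reverse complement, also written 5'->3'
    print(transcribe_coding(coding))      # AUGGCC
    print(transcribe_template(template))  # AUGGCC
```

Both routes give the same mRNA, which reflects the convention that gene sequences are usually reported as the coding strand.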

In past years, the central dogma has guided research into the causes of diseases and phenotypes, and it has also guided the development of the tools that made these scientific studies possible. The final relay and time-dependent expression of genetic information depend on molecular nano-machines present in cells. Many of these function as nucleic acid translocases, and many molecular machines have now been studied and described in detail. The majority appear to function as enzymes that couple a thermodynamically spontaneous chemical reaction, such as nucleotide hydrolysis, to a mechanical task.


Figure 1: The central dogma of molecular biology with expanded functions.


Four general rules have emerged from the review of experimental data:

1.  Proteins and nucleic acids are made up of a limited number of different subunits.

2.  Subunits are added one at a time.

3.  Each chain has a specific starting point. Growth proceeds in one direction to a
     fixed terminus.

4.  The primary synthetic product is usually modified.

 

The canonical genetic code


(Ref.: Harvey Lodish, Arnold Berk, S. Lawrence Zipursky, Paul Matsudaira, David Baltimore, and James Darnell. Molecular Cell Biology, 4th edition. New York: W. H. Freeman; 2000. ISBN-10: 0-7167-3136-3.)

 

RNA to Amino Acids

First Position    Second Position                                         Third Position
(5’ end)          U              C          A             G               (3’ end)

U                 Phe (F)        Ser (S)    Tyr (Y)       Cys (C)         U
                  Phe (F)        Ser (S)    Tyr (Y)       Cys (C)         C
                  Leu (L)        Ser (S)    Stop (och)    Stop            A
                  Leu (L)        Ser (S)    Stop (amb)    Trp (W)         G

C                 Leu (L)        Pro (P)    His (H)       Arg (R)         U
                  Leu (L)        Pro (P)    His (H)       Arg (R)         C
                  Leu (L)        Pro (P)    Gln (Q)       Arg (R)         A
                  Leu (L)        Pro (P)    Gln (Q)       Arg (R)         G

A                 Ile (I)        Thr (T)    Asn (N)       Ser (S)         U
                  Ile (I)        Thr (T)    Asn (N)       Ser (S)         C
                  Ile (I)        Thr (T)    Lys (K)       Arg (R)         A
                  Met (Start)    Thr (T)    Lys (K)       Arg (R)         G

G                 Val (V)        Ala (A)    Asp (D)       Gly (G)         U
                  Val (V)        Ala (A)    Asp (D)       Gly (G)         C
                  Val (V)        Ala (A)    Glu (E)       Gly (G)         A
                  Val (V) (Met)  Ala (A)    Glu (E)       Gly (G)         G

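Note: “Stop (och)” stands for the ochre termination triplet and “Stop (amb)” for the amber termination triplet, named after the bacterial strains in which they were identified. AUG is the most common initiator codon; GUG usually codes for valine but can, rarely, code for methionine to initiate a protein chain.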
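Reading the table can be expressed as a simple lookup. The sketch below is an illustration under stated assumptions rather than a complete model of translation: it starts at the first AUG it finds, reads non-overlapping triplets, and stops at the first termination codon. It ignores alternative initiators such as GUG, and the names `CODON_TABLE` and `translate` are chosen for this example only.

```python
from itertools import product

BASES = "UCAG"
# One-letter amino acid per codon in the order UUU, UUC, UUA, UUG, UCU, ...
# "*" marks a stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(mrna):
    """Translate an mRNA (5'->3') from the first AUG to the first stop codon.

    Returns the peptide as one-letter codes, or "" if no AUG is present.
    Trailing bases that do not fill a complete codon are ignored.
    """
    mrna = mrna.upper()
    start = mrna.find("AUG")
    if start == -1:
        return ""
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":  # ochre, amber or opal: end of the chain
            break
        peptide.append(amino_acid)
    return "".join(peptide)


if __name__ == "__main__":
    # AUG UUU GGC UAA -> Met, Phe, Gly, then the ochre stop codon
    print(translate("GGAUGUUUGGCUAAGG"))  # MFG
```

Choosing the first AUG as the start is a simplification; real initiation depends on sequence context around the start codon.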

There are three different stop codons in the standard genetic code:

 

Name                   DNA    RNA

amber                  TAG    UAG
ochre                  TAA    UAA
opal (or umber)        TGA    UGA
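Because the DNA and RNA spellings differ only by T versus U, one lookup can serve both. The following sketch is an assumption-laden illustration (the helper names `stop_codon_name` and `first_in_frame_stop` are hypothetical): it names a stop codon and finds the first stop codon in reading frame 0 of a sequence.

```python
# Stop codons of the standard code, keyed by their RNA spelling.
STOP_NAMES = {"UAG": "amber", "UAA": "ochre", "UGA": "opal"}


def stop_codon_name(codon):
    """Return the traditional name of a stop codon, or None if it is not one.

    Accepts either the DNA (T) or RNA (U) spelling.
    """
    return STOP_NAMES.get(codon.upper().replace("T", "U"))


def first_in_frame_stop(sequence):
    """Return (index, name) of the first stop codon in frame 0, or None."""
    for i in range(0, len(sequence) - 2, 3):
        name = stop_codon_name(sequence[i:i + 3])
        if name is not None:
            return i, name
    return None


if __name__ == "__main__":
    print(stop_codon_name("TAG"))               # amber
    print(first_in_frame_stop("ATGAAATAGCCC"))  # (6, 'amber')
```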


[For more info on codons and their variation see: https://en.wikipedia.org/wiki/Genetic_code, and
https://en.wikipedia.org/wiki/Genetic_code#Variations_to_the_standard_genetic_code]

IUPAC Codes Used for DNA (International Union of Pure and Applied Chemistry)


Recommendations on Organic & Biochemical Nomenclature, Symbols & Terminology etc.

Nucleic Acids, Polynucleotides and their Constituents

Amino Acids and Peptides

Nucleotide Code    Base             Mnemonic

A                  Adenine
C                  Cytosine
G                  Guanine
T                  Thymine
U                  Uracil
R                  A or G           puRine
Y                  C or T           pYrimidine
S                  G or C           Strong interaction
W                  A or T           Weak interaction
K                  G or T           Keto group
M                  A or C           aMino group
B                  C or G or T      Not A
D                  A or G or T      Not C
H                  A or C or T      Not G
V                  A or C or G      Not T/U
N                  any base         aNy
. or -             gap
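A common use of these ambiguity codes is to describe degenerate oligonucleotides. The sketch below, written for illustration only, expands a degenerate DNA sequence into every concrete sequence it represents and converts it into a regular expression. It covers the DNA letters from the table and leaves out U and the gap symbols; `IUPAC_DNA`, `expand_iupac` and `iupac_to_regex` are assumed names, not part of any standard library.

```python
import re
from itertools import product

# IUPAC nucleotide codes mapped to the DNA bases they stand for.
IUPAC_DNA = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "GC", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}


def expand_iupac(seq):
    """List every unambiguous DNA sequence matched by a degenerate sequence."""
    choices = (IUPAC_DNA[base] for base in seq.upper())
    return ["".join(combo) for combo in product(*choices)]


def iupac_to_regex(seq):
    """Build a regular expression pattern equivalent to a degenerate sequence."""
    parts = []
    for base in seq.upper():
        bases = IUPAC_DNA[base]
        parts.append(bases if len(bases) == 1 else f"[{bases}]")
    return "".join(parts)


if __name__ == "__main__":
    print(expand_iupac("ARY"))    # ['AAC', 'AAT', 'AGC', 'AGT']
    print(iupac_to_regex("ARY"))  # A[AG][CT]
    print(bool(re.fullmatch(iupac_to_regex("ARY"), "AGT")))  # True
```

The number of expanded sequences grows multiplicatively with each ambiguous position, so the regular expression form is usually preferable for long sequences.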

 

 

IUPAC amino acid code    Three letter code    Amino acid

A                        Ala                  Alanine
C                        Cys                  Cysteine
D                        Asp                  Aspartic Acid
E                        Glu                  Glutamic Acid
F                        Phe                  Phenylalanine
G                        Gly                  Glycine
H                        His                  Histidine
I                        Ile                  Isoleucine
K                        Lys                  Lysine
L                        Leu                  Leucine
M                        Met                  Methionine
N                        Asn                  Asparagine
P                        Pro                  Proline
Q                        Gln                  Glutamine
R                        Arg                  Arginine
S                        Ser                  Serine
T                        Thr                  Threonine
V                        Val                  Valine
W                        Trp                  Tryptophan
Y                        Tyr                  Tyrosine
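Converting between the one-letter and three-letter notations is a direct table lookup. The sketch below is a minimal illustration covering only the twenty standard amino acids in the table; the names `ONE_TO_THREE`, `to_three_letter` and `to_one_letter`, and the hyphen separator, are assumptions made for this example.

```python
# One-letter to three-letter codes for the twenty standard amino acids.
ONE_TO_THREE = {
    "A": "Ala", "C": "Cys", "D": "Asp", "E": "Glu", "F": "Phe",
    "G": "Gly", "H": "His", "I": "Ile", "K": "Lys", "L": "Leu",
    "M": "Met", "N": "Asn", "P": "Pro", "Q": "Gln", "R": "Arg",
    "S": "Ser", "T": "Thr", "V": "Val", "W": "Trp", "Y": "Tyr",
}
THREE_TO_ONE = {three: one for one, three in ONE_TO_THREE.items()}


def to_three_letter(peptide, sep="-"):
    """Convert a one-letter peptide string, e.g. 'MFG', to 'Met-Phe-Gly'."""
    return sep.join(ONE_TO_THREE[residue] for residue in peptide.upper())


def to_one_letter(peptide, sep="-"):
    """Convert a separated three-letter peptide, e.g. 'Met-Phe-Gly', to 'MFG'."""
    return "".join(THREE_TO_ONE[residue.capitalize()] for residue in peptide.split(sep))


if __name__ == "__main__":
    print(to_three_letter("MFG"))        # Met-Phe-Gly
    print(to_one_letter("Met-Phe-Gly"))  # MFG
```

Any letter outside the twenty standard codes raises a `KeyError`, which makes unexpected symbols visible instead of silently passing them through.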

 

However, as our understanding of the molecular processes taking place in a cell has grown, the definition of the central dogma has been expanding, and a new synthesis is emerging. In this new view, genetic information moves without strict directionality: it flows within and between the genomic, transcriptomic, metabolomic, and proteomic networks in the cell.

