Deoxyribonucleic Acid
Introduction:
DNA molecules are strings of deoxyribonucleotides joined by phosphodiester bonds. Like RNA molecules, they are synthesized on polynucleotide templates. The quantity of DNA per cell is constant for a given organism and proportional to its ploidy, i.e. the number of chromosome sets, while the amount per chromosome set varies from one organism to another. Bacteria and viruses do not have chromosomes enclosed in a nucleus; their genomic DNA lies free in the cell or, in viruses, is enclosed in a capsid. Genome size in plants and animals varies by several orders of magnitude without any obvious relation to the complexity of the organism, which is why the variation is called the C-value (haploid DNA content) paradox. For example, the human genome is about 3.2 x 10^9 bp, whereas that of Paris japonica is about 149 x 10^9 bp (152.3 pg), roughly 47 times larger. 1 pg = 978 x 10^6 bp = 0.978 x 10^9 bp.
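The proportionality between DNA content and ploidy can be illustrated with a minimal sketch. It assumes the human C-value of 3.2 x 10^9 bp quoted above; the function name `dna_per_cell_bp` is a hypothetical helper chosen for illustration, not an established API.

```python
# Haploid genome size (C-value) of humans in base pairs, as quoted in the text.
HUMAN_C_VALUE_BP = 3.2e9


def dna_per_cell_bp(c_value_bp: float, ploidy: int) -> float:
    """DNA content of one nucleus: number of chromosome sets times the C-value."""
    if ploidy < 1:
        raise ValueError("ploidy must be at least 1")
    return c_value_bp * ploidy


haploid = dna_per_cell_bp(HUMAN_C_VALUE_BP, 1)  # a gamete
diploid = dna_per_cell_bp(HUMAN_C_VALUE_BP, 2)  # a somatic cell
print(f"haploid: {haploid:.2e} bp, diploid: {diploid:.2e} bp")
```

The sketch ignores cell-cycle stage: a cell that has replicated its DNA but not yet divided carries twice the amount computed here.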
Genome sizes:
Prokaryotic Organisms (few):
Name of the species | Genome size | Number of genes | Average gene size
Mycoplasma genitalium G37 | 0.582 x 10^6 bp | 521 (475 protein-coding) | 1040 bp
Mycoplasma pneumoniae | 1 x 10^6 bp | |
Dictyostelium discoideum (a slime mould, i.e. a eukaryote) | 5.4 x 10^7 bp | |
Rickettsia prowazekii | | |
Haemophilus influenzae | 1.83 x 10^6 bp | 1,738 |
E. coli K12 | 4,641,652 bp | 4141 + 176 | 1072 bp
Eukaryotic organisms: Chromosome Numbers:
Plants, fungi and algae (few haploid numbers):
Species | Haploid number (n)
Saccharomyces | 16
Neurospora crassa | 7
Aspergillus nidulans | 8
Penicillium species | 4
Dictyostelium discoideum | 7
Chlamydomonas reinhardtii | 16
Acetabularia | 10
Antirrhinum majus (snapdragon) | 8
Lycopersicon esculentum | 12
Nicotiana tabacum | 24
Pinus sylvestris | 12
Phaseolus vulgaris (French bean) | 11
Pisum sativum | 7
Vicia faba | 6
Allium cepa | 8
Allium sativum | ?
Hordeum vulgare (barley) | 7
Triticum monococcum (wheat) | 7
Zea mays | 10
Animals (few haploid numbers):
Species | Haploid number (n)
Hydra | 16
Planaria torva | 8
Caenorhabditis elegans | 2n=11 (male), 2n=12 (hermaphrodite)
Musca domestica | 6?
Drosophila melanogaster | 4
Bombyx mori | 28?
Rana pipiens | 13
Xenopus laevis | 18
Gallus domesticus (chick) | 39
Felis domesticus (cat) | 19
Canis familiaris (dog) | 39
Bos taurus (cattle) | 30
Equus caballus (horse) | 32
Mus musculus (mouse) | 20
Rattus norvegicus | 21
Mesocricetus auratus (golden hamster) | 22
Cavia cobaya (guinea pig) | 32
Oryctolagus cuniculus (rabbit) | 22
Macaca mulatta (Rhesus monkey) | 21
Orangutan | 24
Gorilla gorilla | 24
Homo sapiens | 23
Genome size:
Plants, fungi and algae (few):
Species | Genome size | Gene number
Pyrenomonas salina (an alga) | 6.6 x 10^5 bp |
Neurospora crassa | 2.7 x 10^7 bp |
Saccharomyces cerevisiae, n=16 | 1.35 x 10^7 bp | ~6034
Chlorarachniophyte (an alga), nucleomorph (the vestigial nucleus of an endosymbiont) | 3.8 x 10^5 bp |
Cryptophyte, nucleomorph, n=3 | 6 x 10^5 bp |
Nicotiana tabacum | 4.8 x 10^9 bp |
Prunus persica | 2.62 x 10^8 bp |
Ricinus communis | 3.23 x 10^8 bp |
Citrus sinensis | 3.67 x 10^8 bp? |
Petunia parodii | 1.221 x 10^9 bp |
Pisum sativum, n=7 | 1.1315 x 10^9 bp |
Arabidopsis thaliana | 1.5 x 10^8 bp | 10000 (proteins=10^5)
Zea mays | 5 x 10^9 bp | 20000 (proteins=4x10^6). 250000
Oryza sativa | 4.66 x 10^8 bp | 40,000
Avena sativa | 1.315 x 10^10 bp |
Tulipa | 2.47 x 10^10 bp |
Lilium formosanum (lily) | 3.6 x 10^10 bp | 15000
Ophioglossum petiolatum | 1.6 x 10^12 bp | 20000
Allium cepa | 1.8 x 10^9 bp | 20000
Psilotum nudum | |
Fritillaria assyriaca | 130 x 10^9 bp = 132 pg |
Paris japonica | 150 x 10^9 bp = 152 pg |
Animals (few):
Species | Genome size
Plasmodium falciparum | 2.7 x 10^7 bp
C. elegans (a nematode), n=6 | 8 x 10^7 bp
D. melanogaster, n=4 | 1.4-1.65 x 10^8 bp
House fly, n=6 | 8.6 x 10^8 bp
Pond frog | 2.3 x 10^10 bp
X. laevis, n=18 | 3.1 x 10^9 bp
Sea urchin | 8.0 x 10^8 bp
Zebrafish (Danio rerio) | 1.7 x 10^9 bp
Lungfish, n=19 | 1.02 x 10^11 bp
Gallus domesticus (chick), n=39 | 1.2 x 10^9 bp
Mus musculus (mouse) | 2.6 x 10^9 bp
Rattus norvegicus (rat) | 3 x 10^9 bp
Homo sapiens, n=23 | 2.9 x 10^9 bp in males? 3.3 x 10^9 bp in females?
Polychaos dubium | 670 x 10^9 bp (670 Gb)
Protopterus aethiopicus | 130 x 10^9 bp (130 Gb)
Organelle DNA (few):
Chloroplasts: Plastome:
Species | Genome size | Genome copies per organelle | Organelles per cell
Chlamydomonas | 180 kbp, circular, codes for 120 to 150 proteins | 80 | One
Maize (Zea mays); nucleoids | 120-200 kbp, circular, codes for 120 to 150 proteins | 20-40 | 20-40
Liverwort (bryophyte) | 121 kbp, circular | |
Nicotiana tabacum | 155,939 bp, circular | |
Arabidopsis thaliana | 154,478 bp | |
Cocos nucifera | 154,731 bp | |
Vigna radiata | 151,271 bp | |
Welwitschia | 120,000-160,000 bp | |
Phyllostachys edulis (woody bamboo) | 139,679 bp | |
Mitochondria:
Species | Genome size, number of proteins encoded | Genome copies per organelle | Organelles per cell
Homo sapiens | 16,569 bp, circular, codes for 13 proteins, 5-7 URFs, no introns | 5-10 | 200-1000 or more
S. cerevisiae | 84 kbp, contains introns, codes for 13 proteins, few URFs | 20-50 | 1-200
Frog oocyte | | 5-10 | 10^6 to 10^7
Neurospora | 19-108 kbp, contains introns | 10-20 | 50-100
Chlamydomonas | 16,000 bp, linear | 5-10 | 1?
Species | Master circle size (kb) | Sub-genomic circle sizes (kb) | Repeat size (kb)
Turnip | 218 | 135 + 83 | 2
Cauliflower | 217 | 172 + 45 | ?
Black mustard | 231 | 135 + 96 | 7
White mustard | 208 | none | none
Radish | 242 | 139 + 103 | 10
Species | Mitochondrial genome size
Rice | 490,520 bp
Arabidopsis thaliana | 366,974 bp (57 genes)
Wheat | 105-704 kbp
Cucumber | ~200 kbp
Human | 16,569 bp
Melon | 2.4 Mbp
Maize | 569,630 bp
C. elegans | 13,794 bp
S. cerevisiae (yeast) | 85.779 kbp
Corn | 570 kbp
Turnip | 218 kbp
Oenothera | 195 kbp
Muskmelon | 2400 kbp
Some of the genome projects completed:
E. coli: 1991-1997; n = 1; genome = 4.6 x 10^6 bp; number of genes = 4405 (some genes are similar to human genes, e.g. ChaC of E. coli is a homolog of human ChaC2).
Yeast: 1989-1996; n = 16; genome = 1.2 x 10^7 bp; number of genes = 6241; about 120 of 289 human genes in which mutations cause disease have counterparts in yeast. The sequence indicates that its mechanisms are as complex as those of any other organism.
C. elegans: 1990-98; n = 6; genome = 1 x 10^8 bp; number of genes = 19000; approximately 9500 genes are similar to human genes. This is an excellent system for understanding cell lineage and tissue construction.
Fruit fly: 1999-2000; n = 4; genome = 1.8 x 10^8 bp; number of genes = 13000; among them 177 genes are similar to human genes.
Arabidopsis thaliana: 1996-2000; n = 5; genome = 1.18 x 10^8 bp; number of genes = 25500; about 100 genes are similar to human genes.
Paris japonica: 152 pg (1 pg = 978 x 10^6 bp).
Fritillaria assyriaca: 132.5 pg.
Polychaos dubium: 670 x 10^9 bp.
Mouse (Mus musculus): 1999-2005; n = 20 (2n = 40); number of genes = 40000; a majority of them are related or similar to human genes.
Human: 1990-2003; n = 23; 3 pg = 3.3 x 10^9 bp; the estimated number of genes varies, theoretically 150,000, but the actual number of protein-coding genes is about 19,000-20,000 (ENCODE; CRISPR-Cas9). Functional non-coding RNA genes (snRNA, miRNA and others): number? Cellular barcode regulatory sequences?
Interesting genome sizes:
Encephalitozoon intestinalis: 0.0023 pg; a parasite of the human intestine (one of the smallest known eukaryotic genomes).
Candidatus Carsonella ruddii: 160 kb.
Fritillaria assyriaca (fritillary) and Trillium × hagae: 132.50 pg.
Protopterus aethiopicus: marbled lungfish, 132.83 pg.
Paris japonica: 152.3 pg = 152.3 x 978 x 10^6 bp ≈ 148.9 x 10^9 bp (1 pg = 0.978 x 10^9 bp); number of genes?
Polychaos dubium: 670 x 10^9 bp; an amoeba, reported as the largest genome.
Organisms used and time taken for size/sequence analysis:
Organism | Genome size (bp) | Time taken (hrs/days/yrs) | Number of genes | bp per gene
PhiX174 | 5,386 | 1.5 hrs | 9 | 598
E. coli | 4,639,221 | 54 days | 4288 | 1082
S. cerevisiae | 12,057,849 | 140 | 6269 | 1923
C. elegans | ~97,000,000 | 3.1 yrs | 19099 | 5079
A. thaliana | ~125,000,000 | 4 yrs | 25498 | 4902
D. melanogaster | ~180,000,000 | 5.7 yrs | 13600 | 13235
H. sapiens | ~3.2 x 10^9 | 108 yrs | 22000 | 145,455
Paris japonica | ~1.5 x 10^11 | | | about 47 times the human genome
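The last column of the table above is simply genome size divided by gene count. A minimal sketch of that calculation follows, so that the figures can be recomputed. The dictionary `GENOMES` and the function `bp_per_gene` are hypothetical names introduced here, and the D. melanogaster entry is assumed to mean 180,000,000 bp.

```python
# Genome size (bp) and gene count per organism, copied from the table above.
# Assumption: the D. melanogaster entry is read as 180,000,000 bp.
GENOMES = {
    "PhiX174": (5_386, 9),
    "E. coli": (4_639_221, 4_288),
    "S. cerevisiae": (12_057_849, 6_269),
    "C. elegans": (97_000_000, 19_099),
    "A. thaliana": (125_000_000, 25_498),
    "D. melanogaster": (180_000_000, 13_600),
    "H. sapiens": (3_200_000_000, 22_000),
}


def bp_per_gene(genome_bp: int, gene_count: int) -> float:
    """Average number of base pairs of genome per gene."""
    if gene_count <= 0:
        raise ValueError("gene_count must be positive")
    return genome_bp / gene_count


for name, (genome_bp, gene_count) in GENOMES.items():
    print(f"{name:16s} {bp_per_gene(genome_bp, gene_count):>10,.0f} bp/gene")
```

The ratio is only a crude measure of gene density: it counts intergenic and repetitive DNA together with the genes themselves, which is why it rises so steeply from the phage to the human genome.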
Genome Sizes:
The genome of an organism is the complete set of genes specifying how its phenotype will develop (under a certain set of environmental conditions). In this sense, diploid organisms (like us) contain two genomes, one inherited from the mother, the other from the father. In flowering plants, the genome can be bisexually or unisexually inherited.
The majority of modern genome size estimates are based on either Feulgen densitometry (more recently using computerized image analysis) or flow cytometry, although DNA reassociation kinetics, bulk fluorometry, static fluorometry, DAPI and propidium iodide staining, electrophoretic methods, quantitative real-time PCR and complete genome sequencing have also been used.
Genome sizes are typically given as gametic nuclear DNA contents ("C-values"), either in units of mass (picograms, where 1 pg = 10^-12 g) or in number of base pairs (in eukaryotes, most often in megabases, where 1 Mb = 10^6 bases). These are directly interconvertible: 1 pg = 978 Mb (or 1 Mb = 1.022 x 10^-3 pg), i.e. 1 pg of double-stranded DNA = 0.978 x 10^9 bp. "C-value" (Swift 1950) refers to haploid nuclear DNA content. The term "C-value enigma" is an update of the more common but outdated term "C-value paradox"; it was coined by Canadian biologist Dr. T. Ryan Gregory of the University of Guelph in 2000/2001.
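The interconversion between picograms and base pairs can be written out as a short sketch. It uses only the factor given above (1 pg = 978 Mb); the names `BP_PER_PG`, `pg_to_bp` and `bp_to_pg` are hypothetical and chosen for illustration.

```python
# Conversion factor stated in the text: 1 pg of double-stranded DNA = 978 Mb.
BP_PER_PG = 978e6


def pg_to_bp(picograms: float) -> float:
    """Convert a DNA mass in picograms to a number of base pairs."""
    return picograms * BP_PER_PG


def bp_to_pg(base_pairs: float) -> float:
    """Convert a number of base pairs to a DNA mass in picograms."""
    return base_pairs / BP_PER_PG


# Paris japonica, 152.3 pg, expressed in base pairs.
print(f"{pg_to_bp(152.3):.4g} bp")
# One megabase expressed in picograms.
print(f"{bp_to_pg(1e6):.4g} pg")
```

Applied to the 152.3 pg figure for Paris japonica, the conversion gives roughly 148.9 x 10^9 bp, matching the value quoted for that species.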
Tree of life with genome sizes as outer bars; http://www.en.wikipedia.org
The table below presents a selection of representative genome sizes from the rapidly-growing list of organisms whose genomes have been sequenced.
Table of Genome Sizes (haploid)
Organism | Base pairs | Genes | Notes
φX174 | 5,386 | 11 | virus of E. coli
Human mitochondrion | 16,569 | 37 |
Epstein-Barr virus | 172,282 | 80 | causes mononucleosis
Nanoarchaeum equitans | 490,885 | 552 | This parasitic member of the Archaea has the smallest genome of a true organism yet found.
Nucleomorph of Guillardia theta | 551,264 | 511 | all that remains of the nuclear genome of a red alga (a eukaryote) engulfed long ago by another eukaryote
Mycoplasma genitalium | 580,073 | 485 | two of the smallest true organisms (with M. pneumoniae)
Mycoplasma pneumoniae | 816,394 | 680 |
Chlamydia trachomatis | 1,042,519 | 936 | this bacterium causes the most common sexually-transmitted disease (STD) in the U.S.
Rickettsia prowazekii | 1,111,523 | 834 | bacterium that causes epidemic typhus
Treponema pallidum | 1,138,011 | 1,039 | bacterium that causes syphilis
Mimivirus | 1,181,404 | 1,262 | A virus (of an amoeba) with a genome larger than the six cellular organisms above
Pelagibacter ubique | 1,308,759 | 1,354 | smallest genome yet found in a free-living organism (marine α-proteobacterium)
Borrelia burgdorferi | 1.44 x 10^6 | 1,738 | bacterium that causes Lyme disease
Campylobacter jejuni | 1,641,481 | 1,708 | frequent cause of food poisoning
Helicobacter pylori | 1,667,867 | 1,589 | chief cause of stomach ulcers (not stress and diet)
Thermoplasma acidophilum | 1,564,905 | 1,509 | These unicellular microbes look like typical bacteria but their genes
Methanocaldococcus jannaschii | 1,664,970 | 1,783 |
Aeropyrum pernix | 1,669,695 | 1,885 |
Methanobacterium thermoautotrophicum | 1,751,377 | 2,008 |
Haemophilus influenzae | 1,830,138 | 1,738 | bacterium that causes middle ear infections
Streptococcus pneumoniae | 2,160,837 | 2,236 | the pneumococcus
Neisseria meningitidis | 2,184,406 | 2,185 | Group A; causes occasional epidemics of meningitis in less developed countries.
Neisseria meningitidis | 2,272,351 | 2,221 | Group B; the most frequent cause of meningitis in the U.S.
Encephalitozoon cuniculi | 2,507,519 | 1,997 | (plus 69 RNA genes); a parasitic eukaryote.
Propionibacterium acnes | 2,560,265 | 2,333 | causes acne
Listeria monocytogenes | 2,944,528 | 2,926 | 2,853 of these encode proteins; the rest RNAs
Deinococcus radiodurans | 3,284,156 | 3,187 | on 2 chromosomes and 2 plasmids; bacterium noted for its resistance to radiation damage
Synechocystis | 3,573,470 | 4,003 | a marine cyanobacterium ("blue-green alga")
Vibrio cholerae | 4,033,460 | 3,890 | in 2 chromosomes; causes cholera
Mycobacterium tuberculosis | 4,411,532 | 3,959 | causes tuberculosis
Mycobacterium leprae | 3,268,203 | 1,604 | causes leprosy
Bacillus subtilis | 4,214,814 | 4,779 | another bacterium
E. coli K-12 | 4,639,221 | 4,377 | 4,290 of these genes encode proteins; the rest RNAs
E. coli O157:H7 | 5.44 x 10^6 | 5,416 | strain that is pathogenic for humans; has 1,346 genes not found in E. coli K-12
Agrobacterium tumefaciens | 4,674,062 | 5,419 | Useful vector for making transgenic plants; shares many genes with Sinorhizobium meliloti
Salmonella enterica serovar Typhi | 4,809,037 | 4,395 | + 2 plasmids with 372 active genes; causes typhoid fever
Salmonella enterica serovar Typhimurium | 4,857,432 | 4,450 | + 1 plasmid with 102 active genes
Yersinia pestis | 4,826,100 | 4,052 | on 1 chromosome + 3 plasmids; causes plague
Schizosaccharomyces pombe | 12,462,637 | 4,929 | Fission yeast. A eukaryote with fewer genes than the four bacteria below.
Ralstonia solanacearum | 5,810,922 | 5,129 | soil bacterium pathogenic for many plants; 1681 of its genes on a huge plasmid
Pseudomonas aeruginosa | 6.3 x 10^6 | 5,570 | Increasingly common cause of opportunistic infections in humans.
Streptomyces coelicolor | 6,667,507 | 7,842 | An actinomycete whose relatives provide us with many antibiotics
Sinorhizobium meliloti | 6,691,694 | 6,204 | The rhizobial symbiont of alfalfa. Genome consists of one chromosome and 2 large plasmids.
Saccharomyces cerevisiae | 12,495,682 | 5,770 | Budding yeast. A eukaryote.
Cyanidioschyzon merolae | 16,520,305 | 5,331 | A unicellular red alga.
Plasmodium falciparum | 22,853,764 | 5,268 | Plus 53 RNA genes. Causes the most dangerous form of malaria.
Thalassiosira pseudonana | 34.5 x 10^6 | 11,242 | A diatom. Plus 144 chloroplast and 40 mitochondrial genes encoding proteins
Neurospora crassa | 38,639,769 | 10,082 | Plus 498 RNA genes.
Naegleria gruberi | 41 x 10^6 | 15,727 | This free-living unicellular organism lives as both an amoeboid and a flagellated form.
Caenorhabditis elegans | 100,258,171 | 21,733 | The first metazoan to be sequenced.
Arabidopsis thaliana | 115,409,949 | ~28,000 | a flowering plant (angiosperm)
Drosophila melanogaster | 122,653,977 | ~17,000 | the "fruit fly"
Anopheles gambiae | 278,244,063 | 13,683 |
Tetraodon nigroviridis (a pufferfish) | 3.42 x 10^8 | 27,918 | Although Tetraodon seems to have more protein-encoding genes than we do, it has much less "junk" DNA so its total genome is about a tenth the size of ours.
Rice | 3.9 x 10^8 | 28,236 |
Sea urchin | 8.14 x 10^8 | ~23,300 |
 | 1.2 x 10^9 | 15,761 |
Dogs | 2.4 x 10^9 | 19,300 |
Humans | 3.3 x 10^9 | ~21,000 |
Mouse | 3.4 x 10^9 | ~23,000 |
Amphibians | 10^9–10^11 | ? |
Psilotum nudum | 2.5 x 10^11 | ? | Note, 3000 times the genome of A. thaliana
Porcine circovirus type 1 | 1,759 | |
Note: The DNA from a single (diploid) human cell, if the 46 chromosomal DNAs were connected end-to-end and straightened, would have a length of ~2 m and a width of ~2.4 nanometers
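The ~2 m figure can be checked with a short calculation. This is a sketch under two assumptions: a haploid genome of 3.2 x 10^9 bp, as quoted elsewhere in this document, and a rise of 0.34 nm per base pair for B-form DNA, which is a standard textbook value that this document does not state. The function name `dna_length_m` is hypothetical.

```python
# Assumption (not from this document): B-form DNA rises 0.34 nm per base pair.
RISE_PER_BP_NM = 0.34
# Haploid human genome size quoted elsewhere in the text.
HUMAN_HAPLOID_BP = 3.2e9


def dna_length_m(base_pairs: float, rise_nm: float = RISE_PER_BP_NM) -> float:
    """Contour length in metres of a DNA molecule with the given number of base pairs."""
    return base_pairs * rise_nm * 1e-9  # nanometres to metres


diploid_bp = 2 * HUMAN_HAPLOID_BP  # 46 chromosomes = two haploid sets
print(f"{dna_length_m(diploid_bp):.2f} m")
```

With these inputs the total comes to a little over 2 m, consistent with the note above.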
Origin of the term Genome:
The term was coined in 1920 by Hans Winkler, Professor of Botany at the University of Hamburg (Germany). The Oxford English Dictionary suggests that the name is a blend of the words gene and chromosome; the Greek verb γίνομαι means "I become, I am born, I come into being". A few related -ome words, such as proteome, transcriptome, biome and rhizome, form a vocabulary into which genome fits systematically. The sum total of the genes in a cell's genetic material is termed the genome. Does it include organelle genes? Similarly, viral DNA or viral genetic RNA is called the viral genome, and the bacterial DNA in the cell is called the bacterial genome. The full complement of a cell's proteins is called the "proteome".
GENOME: refer to ENCODE (ENCyclopedia Of DNA Elements).
An analogy to the human genome stored in DNA is that of written material stored in a book:
· The book (the genome) would contain 23 chapters (chromosomes);
· Each chapter contains 48 to 250 million letters (A, C, G, T) without spaces;
· Hence, the book contains over 3.2 billion letters in total;
· The book, with all its information in the form of letters, fits into a cell nucleus the size of a pinpoint;
· At least one copy of the book (all 23 chapters) is contained in most cells of our body. The only exception in humans is found in mature red blood cells, which become enucleated during development and therefore lack a genome. (Source: www.wikipedia.org)