Deoxyribonucleic Acid

 

Introduction:

DNA molecules are made up of a string of deoxyribonucleotides held together by phosphodiester bonds. Like RNA molecules, they are synthesized on polynucleotide templates. The quantity of DNA per cell is constant for a given organism and proportional to its ploidy, i.e. to the number of chromosome sets; the amount per set varies from one organism to another. Bacteria and viruses do not have chromosomes enclosed in a nucleus: bacterial genomic DNA lies free in the cell, and viral genomes are enclosed in a capsid. Genome size in plants and animals varies by several orders of magnitude, with no obvious relation to the complexity of the organism, which is why this is called the C-value (haploid DNA content) paradox. Ex. Humans contain a genome of about 3.2 x 10^9 bp, but Paris japonica contains about 149 x 10^9 bp, roughly 47 times the human genome. 1 pg = 978 x 10^6 bp = 0.978 x 10^9 bp.
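The conversion factor quoted here (1 pg = 978 x 10^6 bp) can be applied mechanically. The following is a minimal Python sketch, not a standard tool: the function names are illustrative, and the inputs (152.3 pg for Paris japonica, 3.2 x 10^9 bp for humans) are the figures quoted in these notes.

```python
# Conversion between DNA mass (picograms) and base pairs,
# using the factor quoted in the text: 1 pg = 978 x 10^6 bp.
BP_PER_PG = 978e6

def pg_to_bp(picograms):
    """Convert a C-value in picograms to base pairs."""
    return picograms * BP_PER_PG

def bp_to_pg(base_pairs):
    """Convert a genome size in base pairs to picograms."""
    return base_pairs / BP_PER_PG

human_bp = 3.2e9              # human haploid genome (value from the text)
paris_bp = pg_to_bp(152.3)    # Paris japonica C-value (value from the text)

print(f"Paris japonica: {paris_bp:.3e} bp")
print(f"Ratio to human: {paris_bp / human_bp:.1f}")
print(f"Human genome:   {bp_to_pg(human_bp):.2f} pg")
```

With these inputs the Paris japonica genome comes out near 1.49 x 10^11 bp, a little under 50 human genomes.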

 

Genome sizes:

 

Prokaryotic Organisms (some):

 

Name of the species | Genome size | Number of genes | Size of genes
Mycoplasma genitalium G37 | 0.582 x 10^6 bp | 521 (475 protein-coding genes) | 1040 bp
Mycoplasma pneumoniae | 1 x 10^6 bp | |
Slime mould (a eukaryote): Dictyostelium discoideum | 5.4 x 10^7 bp | |

 

 

Bacteria:

 

Name | Size of the genome | Number of genes | Size of genes
Synechocystis (cyanobacteria) | 3.7 x 10^6 bp | |
Rickettsia prowazekii | 1.1 x 10^6 bp | 834 |
Haemophilus influenzae | 1.83 x 10^6 bp | 1,738 |
E. coli K12 | 4,641,652 bp | 4141 + 176 | 1072 bp

 

 

Eukaryotic organisms: Chromosome Numbers:

 

Some Plants (haploid numbers):

 

Species | n
Saccharomyces | 16
Neurospora crassa | 7
Aspergillus nidulans | 8
Penicillium species | 4
Dictyostelium discoideum | 7
Chlamydomonas reinhardtii | 16
Acetabularia | 10
Antirrhinum majus (snapdragon) | 8
Lycopersicon esculentum | 12
Nicotiana tabacum | 24
Pinus sylvestris | 12
Phaseolus vulgaris (French bean) | 11
Pisum sativum | 7
Vicia faba | 6
Allium cepa | 8
Allium sativum | ?
Hordeum vulgare (barley) | 7
Triticum monococcum (wheat) | 7
Zea mays | 10

 

Some Animals (haploid numbers):

 

Species | Chromosome number, n
Hydra | 16
Planaria torva | 8
Caenorhabditis elegans | 2n = 11 (male), 2n = 12 (hermaphrodite)
Musca domestica | 6?
Drosophila melanogaster | 4
Bombyx mori | 28?
Rana pipiens | 13
Xenopus laevis | 18
Gallus domesticus (chick) | 39
Felis domesticus (cat) | 19
Canis familiaris (dog) | 39
Bos taurus (cattle) | 30
Equus caballus (horse) | 32
Mus musculus (mouse) | 20
Rattus norvegicus | 21
Mesocricetus auratus (golden hamster) | 22
Cavia cobaya (guinea pig) | 32
Oryctolagus cuniculus (rabbit) | 22
Macaca mulatta (Rhesus monkey) | 21
Orangutan | 24
Gorilla gorilla | 24
Homo sapiens | 23

 

 

 

 

 

 

Genome size:

 

Few Plants:

 


 

Species | Genome size | Gene numbers
Pyrenomonas salina (an alga) | 6.6 x 10^5 bp |
Neurospora crassa | 2.7 x 10^7 bp |
Saccharomyces cerevisiae, n=16 | 1.35 x 10^7 bp | ~6034
Chlorarachniophyte (an alga), nucleomorph (the vestigial nucleus of an engulfed alga) | 3.8 x 10^5 bp |
Cryptophyte, nucleomorph, n=3 | 6 x 10^5 bp |
Nicotiana tabacum | 4.8 x 10^9 bp |
Prunus persica | 2.62 x 10^8 bp |
Ricinus communis | 3.23 x 10^8 bp |
Citrus sinensis | 3.67 x 10^8 bp? |
Petunia parodii | 1.221 x 10^9 bp |
Pisum sativum, n=7 | 1.1315 x 10^9 bp |
Arabidopsis thaliana | 1.5 x 10^8 bp | 10000 (proteins = 10^5)
Zea mays | 5 x 10^9 bp | 20000 (proteins = 4 x 10^6). 250000
Oryza sativa | 4.66 x 10^8 bp | 40000
Avena sativa | 1.315 x 10^10 bp |
Tulipa | 2.47 x 10^10 bp |
Lilium formosanum (lily) | 3.6 x 10^10 bp | 15000
Ophioglossum petiolatum | 1.6 x 10^12 bp | 20000
Allium cepa | 1.8 x 10^9 bp | 20000
Psilotum nudum | |
Fritillaria assyriaca | 130 x 10^9 bp = 132 pg |
Paris japonica | 150 x 10^9 bp = 152 pg |

 

 

 

Few Animals:

 

Species | Genome size
Plasmodium falciparum | 2.7 x 10^7 bp
C. elegans (a nematode), n=6 | 8 x 10^7 bp
D. melanogaster, n=4 | 1.4-1.65 x 10^8 bp
House fly, n=6 | 8.6 x 10^8 bp
Pond frog | 2.3 x 10^10 bp
X. laevis, n=18 | 3.1 x 10^9 bp
Sea urchin | 8.0 x 10^8 bp
Zebrafish (Danio rerio) | 1.7 x 10^9 bp
Lungfish, n=19 | 1.02 x 10^11 bp
Gallus domesticus (chick), n=39 | 1.2 x 10^9 bp
Mus musculus (mouse) | 2.6 x 10^9 bp
Rattus norvegicus (rat) | 3 x 10^9 bp
Homo sapiens, n=23 | 2.9 x 10^9 bp in males? 3.3 x 10^9 bp in females?
Polychaos dubium | 670 x 10^9 bp (670 Gb)
Protopterus aethiopicus | 130 x 10^9 bp (130 Gb)

 

 

Organelle DNA:

 

Chloroplasts: Plastome:

 

Species | Genome size | Genomes per organelle | Organelles per cell
Chlamydomonas | 180 kbp, circular, codes for 120 to 150 proteins | 80 | One
Maize (Zea mays); nucleoids | 120-200 kbp, circular, codes for 120 to 150 proteins | 20-40 | 20-40
Liverwort (bryophyte) | 121 kbp, circular | |
Nicotiana tabacum | 155,939 bp, circular | |
Arabidopsis thaliana | 154,478 bp | |
Cocos nucifera | 154,731 bp | |
Vigna radiata | 151,271 bp | |
Welwitschia | 120,000-160,000 bp | |
Phyllostachys edulis (woody bamboo) | 139,679 bp | |

 

 

 

 

Mitochondria:

 

Species | Genome size, number of proteins produced | Genomes per organelle | Organelles per cell
Homo sapiens | 16,569 bp, circular, codes for 13 proteins, 5-7 URFs, no introns | 5-10 | 200-1000 or more
S. cerevisiae | 84 kbp, contains introns, codes for 13 proteins, few URFs | 20-50 | 1-200
Frog's oocyte | | 5-10 | 10^6 to 10^7
Neurospora | 19-108 kbp, contains introns | 10-20 | 50-100
Chlamydomonas | 16,000 bp, linear | 5-10 | 1?
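The two right-hand columns of the mitochondrial table multiply to give a rough range of mitochondrial genome copies per cell. The short Python sketch below illustrates that arithmetic; the helper name `copies_per_cell` is hypothetical, and the ranges are simply those tabulated here (taking "200-1000 or more" as 200 to 1000).

```python
# Rough range of mitochondrial genome copies per cell:
# (genomes per organelle) x (organelles per cell), using the tabulated ranges.
def copies_per_cell(genomes_per_organelle, organelles_per_cell):
    low_g, high_g = genomes_per_organelle
    low_o, high_o = organelles_per_cell
    return low_g * low_o, high_g * high_o

human = copies_per_cell((5, 10), (200, 1000))
yeast = copies_per_cell((20, 50), (1, 200))

print("Human mtDNA copies per cell:", human)
print("Yeast mtDNA copies per cell:", yeast)
```

These are order-of-magnitude bounds only, since both tabulated ranges vary with cell type and physiological state.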

Plant mitochondrial genomes (master and sub-genomic circles):

Species | Master circle size (kb) | Sub-genomic circle sizes (kb) | Repeat size (kb)
Turnip | 218 | 135 + 83 | 2
Cauliflower | 217 | 172 + 45 | ?
Black mustard | 231 | 135 + 96 | 7
White mustard | 208 | none | none
Radish | 242 | 139 + 103 | 10

 

 

 

 

 

Rice: 490,520 bp
Arabidopsis thaliana: 366,974 bp (57 genes)
Wheat: 105-704 kbp
Cucumber: ~200 kbp
Human: 16,569 bp
Melon (muskmelon): ~2.4 Mbp (2400 kbp)
Maize (corn): 569,630 bp (~570 kbp)
C. elegans: 13,794 bp
S. cerevisiae (yeast): 85,779 bp
Turnip: 218 kbp
Oenothera: 195 kbp

 

 

 

 

 

 

 

 

 

 

Some of the Genomic work completed:

 

E. coli: 1991-1997; n = 1; 4.6 x 10^6 bp; no. of genes = 4405; genes similar to those of humans are not known.

Yeast: 1989-1996; n = 16; 1.2 x 10^7 bp; number of genes = 6241; about 120 of 289 human genes in which mutations cause disease have counterparts in yeast. This indicates that its mechanisms are as complex as those of any other organism.

C. elegans: 1990-98; n = 6; 1 x 10^8 bp; no. of genes = 19000; approximately 9500 genes are similar to human genes. This is an excellent system for understanding cell lineage and tissue construction.

Fruit fly: 1999-2000; n = 4; 1.8 x 10^8 bp; no. of genes = 13000; among them 177 genes are similar to those of humans.

Arabidopsis thaliana: 1996-2000; n = 5; 1.18 x 10^8 bp; no. of genes = 25500; about 100 genes are similar to those of humans.

Paris japonica: 152 pg = about 149 x 10^9 bp (1 pg = 978 x 10^6 bp).

Fritillaria assyriaca: 132.5 pg.

Polychaos dubium: 670 x 10^9 bp.

Mouse (Mus musculus): 1999-2005; 2n = 40; number of genes = 40000; a majority of them are related or similar to human genes.

H: 1990-2003; n = 23; 3.3 x 10^9 bp (about 3.4 pg); the estimated number of genes has varied: theoretical estimates were as high as 150,000, but the actual number is about 21,000-22,000.

 

Interesting genomic sizes:

Candidatus Carsonella ruddii: 160 kb.

Encephalitozoon intestinalis: 0.0023 pg; a human intestinal parasite (one of the smallest eukaryotic genomes).

Fritillaria assyriaca (a fritillary) and Trillium × hagae: 132.50 pg.

Protopterus aethiopicus: marbled lungfish, 132.83 pg.

Paris japonica: 152.3 pg = 152.3 x 978 x 10^6 bp = ~148.9 x 10^9 bp (1 pg = 0.978 x 10^9 bp); number of genes unknown.

Polychaos dubium: 670 x 10^9 base pairs; an amoeba with the largest reported genome.

 

 

 

Organisms used and time taken for size/sequence analysis:

 

Organism | #bp | Time | #genes | #bp/gene
PhiX174 | 5,386 bp | 1.5 hrs | 9 | 598
E. coli | 4,639,221 bp | 54 days | 4288 | 1082
S. cerevisiae | 12,057,849 bp | 140 days | 6269 | 1923
C. elegans | ~97,000,000 bp | 3.1 yrs | 19099 | 5079
A. thaliana | ~125,000,000 bp | 4 yrs | 25498 | 4902
D. melanogaster | ~180,000,000 bp | 5.7 yrs | 13600 | 13235
H. sapiens | ~3.2 x 10^9 bp | 108 yrs | 22000 | 145,455
Paris japonica | ~149 x 10^9 bp (about 47 times the human genome) | | |
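The #bp/gene column is simply genome size divided by gene count, so it can be recomputed as a quick check. The sketch below is illustrative Python using the sizes and gene counts tabulated here; the D. melanogaster size is read as 1.8 x 10^8 bp, which is an assumption about the intended value.

```python
# Average number of base pairs per gene: genome size divided by gene count.
# Sizes and gene counts are taken from the table; the D. melanogaster size
# is assumed to be 1.8 x 10^8 bp.
genomes = {
    "PhiX174":         (5_386,         9),
    "E. coli":         (4_639_221,     4_288),
    "S. cerevisiae":   (12_057_849,    6_269),
    "C. elegans":      (97_000_000,    19_099),
    "A. thaliana":     (125_000_000,   25_498),
    "D. melanogaster": (180_000_000,   13_600),
    "H. sapiens":      (3_200_000_000, 22_000),
}

def bp_per_gene(base_pairs, genes):
    """Return the mean genome length per gene, rounded to the nearest bp."""
    return round(base_pairs / genes)

for name, (bp, genes) in genomes.items():
    print(f"{name:16s} {bp_per_gene(bp, genes):>8,d} bp/gene")
```

The figure is an average spacing, not a gene length: it includes intergenic and other non-coding DNA, which is why it rises so steeply from bacteria to humans.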

 

 

 

 

 

Genome Sizes:

The genome of an organism is the complete set of genes specifying how its phenotype will develop (under a certain set of environmental conditions). In this sense, diploid organisms (like us) contain two genomes, one inherited from the mother and the other from the father. In flowering plants, inheritance can be biparental or uniparental.

The majority of modern genome size estimates are based on either Feulgen densitometry (more recently using computerized image analysis) or flow cytometry, although DNA reassociation kinetics, bulk fluorometry, static fluorometry, DAPI and propidium iodide staining, electrophoretic methods, quantitative real-time PCR and complete genome sequencing have also been used.

Genome sizes are typically given as gametic nuclear DNA contents ('C-values'), either in units of mass (picograms, where 1 pg = 10^-12 g) or in number of base pairs (in eukaryotes, most often in megabases, where 1 Mb = 10^6 bases). These are directly interconvertible: 1 pg = 978 Mb (or 1 Mb = 1.022 x 10^-3 pg), i.e. 1 pg of double-stranded DNA = 0.978 x 10^9 bp. "C-value" (Swift 1950) refers to haploid nuclear DNA content. The term "C-value enigma" is an update of the more common but outdated term "C-value paradox"; it was coined by the Canadian biologist Dr. T. Ryan Gregory of the University of Guelph in 2000/2001.
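The 978 Mb per picogram factor follows from Avogadro's number and the mean molar mass of a nucleotide pair. The Python sketch below reproduces it under one stated assumption: a mean base-pair mass of about 615.88 g/mol (an average over A-T and G-C pairs), which is not given in these notes.

```python
# Derive the pg <-> bp conversion factor from first principles.
AVOGADRO = 6.02214076e23   # molecules per mole
MEAN_BP_MASS = 615.88      # g/mol per nucleotide pair (assumed A-T / G-C average)

grams_per_bp = MEAN_BP_MASS / AVOGADRO

bp_per_pg = 1e-12 / grams_per_bp         # 1 pg = 10^-12 g
pg_per_mb = 1e6 * grams_per_bp / 1e-12   # mass of 10^6 bp, expressed in pg

print(f"bp per pg: {bp_per_pg:.3e}")
print(f"pg per Mb: {pg_per_mb:.3e}")
```

The result is close to 9.78 x 10^8 bp per pg, matching the 978 Mb figure; because the mean base-pair mass depends slightly on GC content, the factor is an approximation rather than an exact constant.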

Figure: Tree of life with genome sizes as outer bars (source: http://www.en.wikipedia.org)

The table below presents a selection of representative genome sizes from the rapidly-growing list of organisms whose genomes have been sequenced.

Figure: Comparison of different genome sizes (source: Wikipedia, http://en.wikipedia.org/)

Table of Genome Sizes (haploid)

 

Organism | Base pairs | Genes | Notes
φX174 | 5,386 | 11 | virus of E. coli
Human mitochondrion | 16,569 | 37 |
Epstein-Barr virus (EBV) | 172,282 | 80 | causes mononucleosis
Nanoarchaeum equitans | 490,885 | 552 | This parasitic member of the Archaea has the smallest genome of a true organism yet found.
Nucleomorph of Guillardia theta | 551,264 | 511 | all that remains of the nuclear genome of a red alga (a eukaryote) engulfed long ago by another eukaryote
Mycoplasma genitalium | 580,073 | 485 | one of the smallest true organisms
Mycoplasma pneumoniae | 816,394 | 680 | one of the smallest true organisms
Chlamydia trachomatis | 1,042,519 | 936 | this bacterium causes the most common sexually-transmitted disease (STD) in the U.S.
Rickettsia prowazekii | 1,111,523 | 834 | bacterium that causes epidemic typhus
Treponema pallidum | 1,138,011 | 1,039 | bacterium that causes syphilis
Mimivirus | 1,181,404 | 1,262 | A virus (of an amoeba) with a genome larger than the six cellular organisms above
Pelagibacter ubique | 1,308,759 | 1,354 | smallest genome yet found in a free-living organism (marine α-proteobacterium)
Borrelia burgdorferi | 1.44 x 10^6 | 1,738 | bacterium that causes Lyme disease
Campylobacter jejuni | 1,641,481 | 1,708 | frequent cause of food poisoning
Helicobacter pylori | 1,667,867 | 1,589 | chief cause of stomach ulcers (not stress and diet)
Thermoplasma acidophilum | 1,564,905 | 1,509 | Archaea: these unicellular microbes look like typical bacteria, but their genes are so different from those of either bacteria or eukaryotes that they are classified in a third domain, Archaea.
Methanococcus jannaschii | 1,664,970 | 1,783 | Archaea
Aeropyrum pernix | 1,669,695 | 1,885 | Archaea
Methanobacterium thermoautotrophicum | 1,751,377 | 2,008 | Archaea
Haemophilus influenzae | 1,830,138 | 1,738 | bacterium that causes middle ear infections
Streptococcus pneumoniae | 2,160,837 | 2,236 | the pneumococcus
Neisseria meningitidis | 2,184,406 | 2,185 | Group A; causes occasional epidemics of meningitis in less developed countries.
Neisseria meningitidis | 2,272,351 | 2,221 | Group B; the most frequent cause of meningitis in the U.S.
Encephalitozoon cuniculi | 2,507,519 | 1,997 | (plus 69 RNA genes); a parasitic eukaryote.
Propionibacterium acnes | 2,560,265 | 2,333 | causes acne
Listeria monocytogenes | 2,944,528 | 2,926 | 2,853 of these encode proteins; the rest RNAs
Deinococcus radiodurans | 3,284,156 | 3,187 | on 2 chromosomes and 2 plasmids; bacterium noted for its resistance to radiation damage
Synechocystis | 3,573,470 | 4,003 | a marine cyanobacterium ("blue-green alga")
Vibrio cholerae | 4,033,460 | 3,890 | in 2 chromosomes; causes cholera
Mycobacterium tuberculosis | 4,411,532 | 3,959 | causes tuberculosis
Mycobacterium leprae | 3,268,203 | 1,604 | causes leprosy
Bacillus subtilis | 4,214,814 | 4,779 | another bacterium
E. coli K-12 | 4,639,221 | 4,377 | 4,290 of these genes encode proteins; the rest RNAs
E. coli O157:H7 | 5.44 x 10^6 | 5,416 | strain that is pathogenic for humans; has 1,346 genes not found in E. coli K-12
Agrobacterium tumefaciens | 4,674,062 | 5,419 | Useful vector for making transgenic plants; shares many genes with Sinorhizobium meliloti
Salmonella enterica var Typhi | 4,809,037 | 4,395 | + 2 plasmids with 372 active genes; causes typhoid fever
Salmonella enterica var Typhimurium | 4,857,432 | 4,450 | + 1 plasmid with 102 active genes
Yersinia pestis | 4,826,100 | 4,052 | on 1 chromosome + 3 plasmids; causes plague
Schizosaccharomyces pombe | 12,462,637 | 4,929 | Fission yeast. A eukaryote with fewer genes than the four bacteria below.
Ralstonia solanacearum | 5,810,922 | 5,129 | soil bacterium pathogenic for many plants; 1,681 of its genes are on a huge plasmid
Pseudomonas aeruginosa | 6.3 x 10^6 | 5,570 | Increasingly common cause of opportunistic infections in humans.
Streptomyces coelicolor | 6,667,507 | 7,842 | An actinomycete whose relatives provide us with many antibiotics
Sinorhizobium meliloti | 6,691,694 | 6,204 | The rhizobial symbiont of alfalfa. Genome consists of one chromosome and 2 large plasmids.
Saccharomyces cerevisiae | 12,495,682 | 5,770 | Budding yeast. A eukaryote.
Cyanidioschyzon merolae | 16,520,305 | 5,331 | A unicellular red alga.
Plasmodium falciparum | 22,853,764 | 5,268 | Plus 53 RNA genes. Causes the most dangerous form of malaria.
Thalassiosira pseudonana | 34.5 x 10^6 | 11,242 | A diatom. Plus 144 chloroplast and 40 mitochondrial genes encoding proteins
Neurospora crassa | 38,639,769 | 10,082 | Plus 498 RNA genes.
Naegleria gruberi | 41 x 10^6 | 15,727 | This free-living unicellular organism lives as both an amoeboid and a flagellated form. 4,133 of its genes are also found in other eukaryotes, suggesting that they were present in the common ancestor of all eukaryotes. The great variety of functions encoded by these genes also suggests that the common ancestor of all eukaryotes was itself as complex as many of the present-day unicellular members.
Caenorhabditis elegans | 100,258,171 | 21,733 | The first metazoan to be sequenced.
Arabidopsis thaliana | 115,409,949 | ~28,000 | a flowering plant (angiosperm)
Drosophila melanogaster | 122,653,977 | ~17,000 | the "fruit fly"
Anopheles gambiae | 278,244,063 | 13,683 | Mosquito vector of malaria.
Tetraodon nigroviridis (a pufferfish) | 3.42 x 10^8 | 27,918 | Although Tetraodon seems to have more protein-encoding genes than we do, it has much less "junk" DNA, so its total genome is about a tenth the size of ours.
Rice | 3.9 x 10^8 | 28,236 |
Sea urchin | 8.14 x 10^8 | ~23,300 |
Zebrafish | 1.2 x 10^9 | 15,761 |
Dogs | 2.4 x 10^9 | 19,300 |
Humans | 3.3 x 10^9 | ~21,000 |
Mouse | 3.4 x 10^9 | ~23,000 |
Amphibians | 10^9–10^11 | ? |
Psilotum nudum | 2.5 x 10^11 | ? | roughly 2,000 times the genome of A. thaliana
Porcine circovirus type 1 | 1,759 | |

 

 

 

 

 

 

 

 

 

 

 

 

 

                            

Note: If the DNA of the 46 chromosomes from a single (diploid) human cell were connected end-to-end and straightened, it would have a length of ~2 m and a width of ~2.4 nanometers.
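The ~2 m figure can be checked with simple arithmetic. The sketch below is illustrative Python, assuming the standard B-form rise of 0.34 nm per base pair (a value not stated in these notes) and the 3.2 x 10^9 bp haploid genome size used elsewhere in the text.

```python
# Contour length of the DNA in one diploid human cell.
RISE_PER_BP_NM = 0.34   # nm per base pair in B-form DNA (assumed)
HAPLOID_BP = 3.2e9      # human haploid genome size (value from the text)

diploid_bp = 2 * HAPLOID_BP                     # two genome copies per diploid cell
length_m = diploid_bp * RISE_PER_BP_NM * 1e-9   # convert nm to m

print(f"Base pairs per diploid cell: {diploid_bp:.1e}")
print(f"Total DNA length: {length_m:.2f} m")
```

Under these assumptions the total comes to a little over 2 m, in line with the note.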

 

Origin of the term Genome:

The term was coined in 1920 by Hans Winkler, Professor of Botany at the University of Hamburg (Germany). The Greek verb γίνομαι means "I become, I am born, I come into being". The Oxford English Dictionary suggests the name to be a blend of the words gene and chromosome. A few related -ome words already existed, such as biome and rhizome, forming a vocabulary into which genome fits systematically; later coinages such as proteome and transcriptome follow the same pattern.

 

GENOME: Refer to ENCODE, the Encyclopedia of DNA Elements.

An analogy to the human genome stored on DNA is that of instructions stored in a book:

· The book (genome) contains 23 chapters (chromosomes);

· Each chapter contains 48 to 250 million letters (A, C, G, T) without spaces;

· Hence, the book contains over 3.2 billion letters in total;

· The book fits into a cell nucleus the size of a pinpoint;

· At least one copy of the book (all 23 chapters) is found in most of the cells in our body. The only exception in humans is mature red blood cells, which become enucleated during development and therefore lack a genome. (Source: Wikipedia)