Bio Review Notes #32
GENOME MAPPING AND SEQUENCING
Performance Objectives:
The complete sequence of hereditary information of an organism makes up its genome. Several methods can now be used to locate particular genes within this genome. Sequencing allows scientists to determine the nucleotide sequence of an entire genome. Complete genome sequences have now been determined for many species, including humans.

Gene mapping:
  • Linkage mapping, using crossing-over frequencies, was developed in 1912.
    It determines relative locations of genes, but not their physical locations on chromosomes.
  • Genetic markers are variations (also called polymorphisms) whose physical locations among the chromosomes can be determined.
    Examples:
    RFLPs (Restriction Fragment Length Polymorphisms or "riflips") are variations in the lengths of the fragments produced with the aid of restriction enzymes;
    SNPs (Single Nucleotide Polymorphisms, or "snips") are variations in a single nucleotide at a given location (for example, an A in some individuals where others have a G).
  • After a large number of markers have been located across the genome, scientists examine many pedigrees to look for a marker and a trait that are inherited together within many families. If a marker and a trait are always inherited together as a unit, then a gene associated with that trait is located close to the marker. Beginning in 1980, the availability of RFLPs and other markers greatly increased our ability to pinpoint the location of many genes.
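
As a rough illustration of how crossing-over frequencies give relative gene locations, the sketch below computes a recombination frequency from offspring counts and converts it to map units (1% recombinant offspring = 1 map unit, or centimorgan). The function names and the offspring counts are hypothetical, made up for illustration; they are not from a real cross.

```python
def recombination_frequency(parental, recombinant):
    """Fraction of offspring that show a crossover between two loci."""
    total = parental + recombinant
    if total == 0:
        raise ValueError("need at least one offspring")
    return recombinant / total


def map_distance_cm(parental, recombinant):
    """Approximate map distance in centimorgans (1 cM = 1% recombinants).

    Reasonable only for closely linked loci; multiple crossovers make
    the simple percentage underestimate larger distances.
    """
    return 100.0 * recombination_frequency(parental, recombinant)


# Hypothetical counts: 830 parental-type and 170 recombinant offspring.
distance = map_distance_cm(830, 170)
print(f"{distance:.1f} map units")
```

With these made-up counts, 170 of 1,000 offspring are recombinant, so the two loci would be placed about 17 map units apart. The same reasoning applies to a marker and a trait: the less often they are separated by crossing-over, the closer together they lie.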

Gene sequencing:   Restriction enzymes (endonucleases) are first used to cut the genome into fragments. The fragments are then sequenced by replicating them in the presence of chain-terminating nucleotides. Each of these nucleotides binds only opposite to a matching (complementary) base (such as A opposite T), but then terminates the chain. Doing this numerous times results in partial fragments whose ending (chain-terminating) nucleotides are known. These partial fragments are then sorted by length. Thus, if the shortest partial sequence ends in G, the second shortest in A, the third in G, and the fourth in T, then the sequence begins with GAGT. Partial sequences are determined in this way for fragments made by different endonucleases. Computer programs then search for overlaps among partial sequences that can be used to patch them together into longer sequences.
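
The two steps described above (reading bases off fragments sorted by length, then patching partial sequences together at their overlaps) can be sketched in a few lines. This is a simplified illustration under stated assumptions, not how real sequencing software works: the function names are hypothetical, the reads are made up, and real assemblers must also handle sequencing errors, repeats, and both DNA strands.

```python
def read_ladder(fragments):
    """Read a sequence from chain-terminated fragments.

    `fragments` is a list of (length, terminal_base) pairs, one per
    partial fragment. Sorting by length and reading the terminal bases
    in order gives the sequence.
    """
    return "".join(base for _, base in sorted(fragments))


def overlap_length(left, right, min_overlap=3):
    """Length of the longest suffix of `left` that is a prefix of `right`."""
    for size in range(min(len(left), len(right)), min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def merge_reads(left, right, min_overlap=3):
    """Join two reads at their overlap, or return None if they do not overlap."""
    size = overlap_length(left, right, min_overlap)
    if size == 0:
        return None
    return left + right[size:]


# The example from the text: shortest fragment ends in G, then A, G, T.
ladder = [(3, "G"), (1, "G"), (4, "T"), (2, "A")]
print(read_ladder(ladder))  # GAGT

# Two hypothetical reads from fragments made by different endonucleases.
print(merge_reads("GAGTCCAT", "CCATGGA"))  # GAGTCCATGGA
```

The `min_overlap` parameter is an assumption of this sketch: requiring a minimum overlap keeps two reads from being joined on the strength of a single chance matching base.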

Human genome project: In 1986, scientists first proposed using gene sequencing techniques to sequence the entire human genome. This human genome project was funded by the U.S. Congress in 1989. By 2001, scientists working in many countries had completed a "draft sequence," and in 2003 they announced the sequence for the entire genome. Some key findings are these:
  • The human genome includes about 3.2 billion nucleotides.
  • Only about 5% of the genome consists of genes that are transcribed into mRNA or translated into protein. Much of the rest consists of repetitive sequences with no identified function that were once considered "junk DNA" but are now thought to play an important role in development and evolution.
  • Humans have between 30,000 and 35,000 genes, only one-third of some previous estimates. This number is approximately twice the number for such animals as the fruit fly Drosophila melanogaster or the nematode worm Caenorhabditis elegans.
  • Most human genes are shared with many other species. For example, about 46% of human genes are shared with yeast. Only about 1% of human genes, at most, are thought to be unique to humans.
  • Each gene exists in many possible variations (alleles), differing among human beings in about 15 nucleotide locations, on average.
  • However, about 99.95% of the genome is the same among all humans; only about 1/20 of 1% varies among different people.
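
To put the percentages above into absolute numbers, here is a small arithmetic sketch. It uses only the figures quoted in these notes (3.2 billion nucleotides, about 5% genes, about 1/20 of 1% variable); the variable names are just for illustration.

```python
GENOME_SIZE = 3_200_000_000  # nucleotides, as stated above

# About 5% of the genome consists of genes.
gene_nucleotides = GENOME_SIZE * 5 // 100

# About 1/20 of 1% of the genome varies among people (1 part in 2,000).
variable_nucleotides = GENOME_SIZE // (20 * 100)

print(f"{gene_nucleotides:,} nucleotides in genes")
print(f"{variable_nucleotides:,} nucleotides that vary among people")
```

By these figures, roughly 160 million nucleotides lie within genes, and roughly 1.6 million positions differ from one person to another.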

Comparative genomics:   Complete genome sequences have already been determined for hundreds of bacteria and many viruses, archaea, fungi, plants, and animals. Vertebrates whose sequences are known include puffer fish, mice, rats, dogs, cows, rhesus monkeys, and chimpanzees. Other organisms include yeast, roundworms, fruit flies, rice, and the small flowering plant Arabidopsis. Comparisons show that many genes belong to gene families, originating as multiple copies that have evolved in different directions for different functions.

rev. Aug. 2011