
DNA sequencing

8/1/03. By Richard Twyman

How to determine the sequence of bases in a DNA molecule.

DNA sequencing is the process of determining the exact order of the bases A, T, C and G in a piece of DNA. In essence, the DNA is used as a template to generate a set of fragments that differ in length from each other by a single base. The fragments are then separated by size, and the base at the end of each fragment is identified, recreating the original sequence of the DNA.

The most commonly used method of sequencing DNA, the dideoxy or chain termination method, was developed by Fred Sanger in 1977 (work for which he won his second Nobel Prize). The key to the method is the use of modified nucleotides called dideoxynucleotides; when a piece of DNA is being replicated and a dideoxynucleotide is incorporated into the new chain, it stops the replication reaction.

Key principles

  • A DNA molecule carries information in the form of four chemical groups or bases, represented by the letters A, C, G and T. The order of bases on a DNA strand is the DNA sequence.
  • Most DNA sequencing is carried out using the chain termination method. This involves the synthesis of new DNA strands on a single stranded template and the random incorporation of chain-terminating nucleotide analogues.
  • The chain termination method produces a set of DNA molecules differing in length by one nucleotide. The last base in each molecule can be identified by way of a unique label. Separation of these DNA molecules according to size places them in the correct order to read off the sequence.

How does it work?

The DNA to be sequenced is provided in single-stranded form. This acts as a template upon which a new DNA strand is synthesised. DNA synthesis requires a supply of the four nucleotides (the building blocks of DNA), the enzyme DNA polymerase and a primer (a short sequence annealed to the template which initiates the new DNA strand). The nucleotides added to the growing DNA strand are complementary to those in the template strand.
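The base-pairing rule behind this step can be shown in a few lines of Python. This is only an illustrative sketch: the function name `synthesise_strand` and the toy template are assumptions made for the example, and the primer is not modelled separately (it is treated as simply the first bases of the new strand).

```python
# Base-pairing rules: A pairs with T, and C pairs with G.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def synthesise_strand(template: str) -> str:
    """Return the strand complementary to a template written 5'->3'.

    DNA polymerase reads the template starting from its 3' end, so the
    new strand, also written 5'->3', is the reverse complement of the
    template.
    """
    return "".join(COMPLEMENT[base] for base in reversed(template))


template = "ATGGCA"  # toy template, not real sequence data
print(synthesise_strand(template))  # TGCCAT
```

Copying the new strand again gives back the original template, which is why sequencing the new strand reveals the sequence of the template.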

Sequencing is achieved by including in each reaction a nucleotide analogue that cannot be extended and thus acts as a chain terminator. Four reactions are set up, each containing the same template and primer but a chain terminator specific for A, C, G or T. Because only a small amount of the chain terminator is included, incorporation into the new DNA strand is a random event. Each reaction therefore generates a collection of fragments of different lengths, and within a given reaction every DNA strand ends at the same type of base (A, C, G or T).
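The random nature of chain termination can be mimicked with a small simulation. The sketch below is a simplified model and not a description of real reaction chemistry: the function names, the number of molecules and the termination probability `p_terminate` are all assumptions chosen for illustration.

```python
import random


def chain_termination_reaction(new_strand, terminator, n_molecules=1000,
                               p_terminate=0.2, rng=None):
    """Simulate one reaction and return the set of fragment lengths made.

    Each molecule is extended base by base. Whenever the next base is of
    the terminator's type, there is a small chance (p_terminate, an
    assumed value) that the chain-terminating analogue is used instead of
    the normal nucleotide, which stops that molecule at that length.
    """
    if rng is None:
        rng = random.Random()
    fragment_lengths = set()
    for _ in range(n_molecules):
        for length, base in enumerate(new_strand, start=1):
            if base == terminator and rng.random() < p_terminate:
                fragment_lengths.add(length)
                break  # this molecule cannot be extended any further
    return fragment_lengths


def run_four_reactions(new_strand, seed=0):
    """Run one reaction per chain terminator (A, C, G and T)."""
    rng = random.Random(seed)
    return {base: chain_termination_reaction(new_strand, base, rng=rng)
            for base in "ACGT"}


reactions = run_four_reactions("TGCCATGA")  # toy strand
for base, lengths in reactions.items():
    print(base, sorted(lengths))
```

Whatever the random outcome, every fragment length reported for a given terminator corresponds to a position in the new strand that carries that base.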

The primers or nucleotides included in each of the four reactions contain different fluorescent labels, allowing DNA strands terminating at each of the four bases to be identified. The reaction products are then mixed and separated by gel electrophoresis, which separates DNA molecules according to size even if they differ in length by only a single nucleotide. As the DNA strands pass a specific point, the fluorescent signal is detected and the base identified. The whole process can be extensively automated.
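Reading the sequence from the separated fragments amounts to sorting by length and noting each label in turn. The following Python sketch assumes each fragment is described by a hypothetical pair of values, its length in nucleotides added after the primer and the base indicated by its fluorescent label; it does not model the detector or the gel itself.

```python
def read_sequence(fragments):
    """Read the new strand from (length, labelled_base) pairs in any order.

    The shortest fragments travel fastest and pass the detector first, so
    sorting by length reproduces the order in which signals are seen.
    """
    ordered = sorted(set(fragments))  # duplicates carry no extra information
    lengths = [length for length, _ in ordered]
    if lengths != list(range(1, len(ordered) + 1)):
        raise ValueError("fragment ladder has a gap or an ambiguous position")
    return "".join(label for _, label in ordered)


# Toy data: the mixed products of the four reactions, in no particular order.
fragments = [(3, "C"), (1, "T"), (4, "C"), (2, "G"), (1, "T")]
print(read_sequence(fragments))  # TGCC
```

The check on the lengths reflects the requirement that the fragments differ by exactly one nucleotide; a missing length would leave a base unread.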

How is it used?

The most obvious application of DNA sequencing technology is the accurate sequencing of genes and genomes. Only about 500-800 bases can be sequenced in one experiment, so larger DNA molecules, including whole genomes, must be broken into smaller fragments before sequencing and then reassembled by searching for overlaps. Accuracy is achieved by sequencing each template several times.
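Reassembly by searching for overlaps can be illustrated with a simple greedy approach: repeatedly join the two pieces that share the longest overlap. This is a toy sketch under strong assumptions (error-free reads, all from the same strand, very short sequences) and is not how production assembly software works; the function names and the `min_len` threshold are made up for the example.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest end of `a` that matches the start of `b`."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads, min_len=3):
    """Merge reads pairwise by their largest overlap until none remain."""
    reads = list(reads)
    while len(reads) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # no overlaps left; the pieces stay separate
        merged = reads[best_i] + reads[best_j][best_k:]
        reads = [r for idx, r in enumerate(reads)
                 if idx not in (best_i, best_j)] + [merged]
    return reads


# Toy reads; real reads would be several hundred bases long.
reads = ["CGGATACC", "ATGGCATTC", "CATTCGGAT"]
print(greedy_assemble(reads))  # ['ATGGCATTCGGATACC']
```

If some reads share no overlap, the function returns more than one piece, which corresponds to a gap that further sequencing would need to close.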

Lower-fidelity single-pass sequencing is useful for the rapid accumulation of sequence data at the expense of some accuracy. Another application of DNA sequencing technology is resequencing the same DNA molecule over and over. This is necessary, for example, in the typing of single nucleotide polymorphisms.
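Once a region has been resequenced, typing a single nucleotide polymorphism comes down to comparing the new sequence with a reference, position by position. The sketch below assumes the two sequences are already aligned and of equal length; the function name and the example sequences are hypothetical.

```python
def find_single_base_differences(reference: str, sample: str):
    """List (position, reference_base, sample_base) where the two differ.

    Positions are counted from 1. The sequences are assumed to be aligned
    and of equal length, which real data would need to be checked for.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be the same length")
    return [(pos, ref_base, sample_base)
            for pos, (ref_base, sample_base)
            in enumerate(zip(reference, sample), start=1)
            if ref_base != sample_base]


reference = "ATGGCATTC"  # toy sequences, not real data
sample = "ATGGTATTC"
print(find_single_base_differences(reference, sample))  # [(5, 'C', 'T')]
```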

Wellcome Trust, Gibbs Building, 215 Euston Road, London NW1 2BE, UK T:+44 (0)20 7611 8888