Short Tandem Repeats

Only about five percent of human DNA is thought to code for traits. Most of the rest is made of long stretches of nucleotide base pairs whose function is not known. Within these stretches are short, moderately repetitive base pair sequences called short tandem repeats (STRs). The number of repeats in each stretch is inherited and is easily detected. This makes them ideal identifying markers within a person's genome. The technology that quantifies these STR markers is routinely used to identify human remains, to establish or exclude paternity, or to match a suspect to a crime scene sample. The ability to quantify and compare the numbers of repeats of STR markers on the Y chromosome is valuable to genealogists for defining and confirming lines of descent in the male line and to anthropologists for learning about patterns of ancient human migrations.

Counting STRs

Once a segment of DNA has been singled out by its specific primers and amplified by PCR, the number of motifs (repeats) in the target STR can be determined by gel electrophoresis. The mix of billions of short fragments from the PCR is loaded into either a shallow tray or a series of glass capillary tubes containing a gel that serves as a sieving matrix. This sieve allows smaller molecules to pass through more readily than larger ones. During electrophoresis, a voltage is applied across the gel so that one end is positive and the other negative. Since DNA carries a negative charge, its fragments move toward the positive end of the gel, and the speed at which a fragment travels is determined by its size. In Figure 1, the "lane" on the left holds a reference mixture containing fragments for the full known range of repeats, and the test sample is next to it on the right. The fragments from the reference mixture have spread out according to length, forming what is known as an allelic ladder. The length of the fragments in the unknown sample can be determined by comparison with the ladder. The number of base pairs for each fragment (known from the reference mixture) is shown on the right.
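
The comparison against an allelic ladder can be sketched in a few lines of Python. This is only an illustration: the ladder values, the function name call_allele, and the half-base-pair tolerance are all assumptions made for the example, not values from any real marker or lab protocol.

```python
# Hypothetical allelic ladder for one marker: repeat count -> fragment length (bp).
# The numbers are invented for illustration only.
LADDER = {9: 136, 10: 140, 11: 144, 12: 148, 13: 152}


def call_allele(fragment_bp, ladder=LADDER, tolerance_bp=0.5):
    """Return the repeat count whose ladder fragment is closest in length,
    or None if no ladder fragment lies within tolerance_bp."""
    repeats, ladder_bp = min(
        ladder.items(), key=lambda item: abs(item[1] - fragment_bp)
    )
    if abs(ladder_bp - fragment_bp) <= tolerance_bp:
        return repeats
    return None


print(call_allele(144.2))  # 11
print(call_allele(146.0))  # None
```

A fragment measured at 144.2 base pairs sits next to the 11-repeat rung of this made-up ladder, while one at 146.0 falls between rungs and is left uncalled.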

Gel electrophoresis slab showing separation of STR fragments

Figure 1

By Shinryuu (Own work) [Public domain], via Wikimedia Commons

Multiplex Reactions

To save time, money, and materials, much work has gone into developing procedures whereby test DNA can be "incubated" with a combination of PCR primers for several different markers at one time. Primers are designed to bind at only one spot in the genome so that during the PCR each primer can be busily dissecting out and amplifying its own STR marker exclusively.

The task of sorting out the results at the end is facilitated by adding various colored fluorescent tags to the fragments. Fluorescent labeling of DNA fragments may be performed in several ways. The most common method is to incorporate a fluorescent dye on the 5'-end of a PCR primer so that during PCR amplification either the forward or the reverse strand of DNA will be labeled.

At the end of the PCR, the mixture is run through a gel electrophoresis slab or an array of capillary tubes, each filled with a gel. Since each amplified fragment has a discrete length that depends on the number of repeat motifs, fragments of different lengths move through the gel at different rates. The colors of the fluorescent labels are coordinated with the size ranges of the markers so that several markers in the same size range can be run together, each labeled with a different color.
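
As a rough sketch of how color and size range together identify a marker, the Python below looks up a peak in a small panel. The marker names, dye colors, and size windows are hypothetical placeholders chosen for the example. They do not describe any real multiplex kit.

```python
# Hypothetical multiplex panel: each marker has a dye color and a size window (bp).
# Names, colors, and windows are placeholders, not a real kit design.
PANEL = [
    {"marker": "MARKER_A", "dye": "blue", "min_bp": 90, "max_bp": 130},
    {"marker": "MARKER_B", "dye": "green", "min_bp": 140, "max_bp": 180},
    {"marker": "MARKER_C", "dye": "red", "min_bp": 140, "max_bp": 180},
]


def identify_marker(dye, size_bp, panel=PANEL):
    """Return the one marker matching both dye color and size window,
    or None if no marker (or more than one) matches."""
    matches = [
        entry["marker"]
        for entry in panel
        if entry["dye"] == dye and entry["min_bp"] <= size_bp <= entry["max_bp"]
    ]
    return matches[0] if len(matches) == 1 else None


print(identify_marker("green", 150))  # MARKER_B
print(identify_marker("red", 150))    # MARKER_C
print(identify_marker("blue", 150))   # None
```

MARKER_B and MARKER_C share the same size window, so a 150 bp peak can only be assigned once its color is known.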

Y-STR electropherogram
Figure 2

The number above each peak in Figure 2 is the DYS marker number for the fragment. The X axis represents the PCR product size, i.e., the number of base pairs in the fragment. The smallest fragments are DYS426 in the upper trace and DYS391 in the lower trace.

By comparing with the mobility rates of fragments of known lengths, it is possible to determine the length in base pairs for each marker and from that calculate the number of repeats, which is the number that you will receive in your Y-DNA test report.

In the tracing above, four different fluorescent labels have been used: green, blue, red, and black (the black trace was probably actually yellow, which does not show up very well). Two different sets of markers are shown. The upper tracing represents the results of multiplexing 19 different markers, each labeled by its DYS identification number. The bottom tracing shows the results for nine of the markers in the upper trace. Notice how the markers DYS389I and DYS388 are distinguishable as two separate markers only because of their different colored fluorescence.
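
The sizing step mentioned above, comparing a peak with fragments of known length, might look something like the following Python sketch. It assumes a capillary run in which each peak is recorded by its migration time, and it uses simple straight-line interpolation between the two nearest known fragments. Real analysis software may use a different sizing method. The size standard values and the function name estimate_size_bp are invented for the example.

```python
from bisect import bisect_left

# Hypothetical size standard: (migration time in seconds, known length in bp),
# sorted by time. The values are invented for illustration.
SIZE_STANDARD = [(1000.0, 100), (1200.0, 150), (1400.0, 200), (1600.0, 250)]


def estimate_size_bp(migration_time, standard=SIZE_STANDARD):
    """Estimate fragment length by linear interpolation between the two
    size-standard fragments that bracket the peak's migration time."""
    times = [t for t, _ in standard]
    if not times[0] <= migration_time <= times[-1]:
        raise ValueError("peak lies outside the range of the size standard")
    i = bisect_left(times, migration_time)
    if times[i] == migration_time:
        return float(standard[i][1])
    (t0, bp0), (t1, bp1) = standard[i - 1], standard[i]
    return bp0 + (bp1 - bp0) * (migration_time - t0) / (t1 - t0)


print(estimate_size_bp(1300.0))  # 175.0
```

A peak arriving halfway between the 150 bp and 200 bp standards is assigned a length of 175 base pairs.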

Also understand that the height of the fluorescence peak for each marker is not an important value, since only the primers are labeled: one label per fragment no matter how long or short the fragment. The location of the peak on the X axis represents the number of base pairs in the fragment, which is a function of the number of repeats. Basically, once the number of base pairs in the fragment is determined, it is a matter of subtracting the number of base pairs in the primers from the total length and then dividing the result by the number of base pairs that make up one motif of the marker. This is an oversimplification because some markers may be made of more than one repeating pattern and some primers may actually overlap the STR region itself. Nevertheless, the concept is the same. All measurements and calculations are done by computer.
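
The simplified arithmetic described above can be written out as a short Python function. The parameter flanking_bp is an assumed name standing for the primers plus any other non-repeat sequence in the fragment, and the numbers in the example are invented. As noted above, markers with more than one repeating pattern would need a more detailed calculation.

```python
def repeat_count(fragment_bp, flanking_bp, motif_bp):
    """Simplified repeat count:
    (fragment length - non-repeat length) / motif length."""
    repeat_region_bp = fragment_bp - flanking_bp
    if repeat_region_bp < 0 or repeat_region_bp % motif_bp != 0:
        raise ValueError("repeat region is not a whole number of motifs")
    return repeat_region_bp // motif_bp


# Invented example: a 148 bp fragment with 100 bp of non-repeat sequence
# and a 4 bp motif.
print(repeat_count(fragment_bp=148, flanking_bp=100, motif_bp=4))  # 12
```

Here 148 minus 100 leaves 48 base pairs of repeat region, and 48 divided by a 4 bp motif gives 12 repeats.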

DNA analyzer from 2001

Figure 3

Recently released DNA analyzer

Figure 4

In most large labs, an automated DNA analyzer is used to run gels and record the different colors. A laser built into the machine shines through the gel near the bottom and scans from side to side, detecting bands of fluorescent color as they pass through its beam. It is possible to run as many as 96 samples through the gel at one time! The ABI PRISM® 3700 DNA Analyzer shown in Figure 3 was, according to the Applied Biosystems website in 2001, "a fully automated, multi-capillary electrophoresis instrument designed for use in production-scale DNA analysis. It can automatically analyze multiple runs of 96 samples, which enables 24-hour unattended operation." Figure 4 shows a benchtop model, the SeqStudio Genetic Analyzer, released in May 2017 by the same company, Applied Biosystems. According to its specifications, it can also run 96 samples simultaneously. Cartridges preloaded with reaction mixtures and automated procedures require less time and attention from laboratory researchers. According to GenomeWeb, the new instrument has a list price of $57,000!

Molecular biology instrumentation has come a long way in 16 years!

Where Can I Go From Here?

©️2002 - 2017 is a not for profit, educational website.
