Details of the primers used for genotyping are given in Table S2 in the supplementary material, and the genotyping scheme is shown in the accompanying figure. Polymerase-chain-reaction (PCR) assays were performed as described previously, and amplicons were treated with exonuclease I and shrimp alkaline phosphatase (USB) before sequencing (BigDye Terminator). Sequence data were compared as described previously, and minimum-spanning-tree analysis based on SNPs and VNTRs was performed (BioNumerics software, version 6.1 [Applied Maths]).

Genetic variation arises from single-nucleotide polymorphisms (SNPs), which can occur in both the coding and noncoding regions of the genome. SNPs are believed to occur at 1.6 million to 3.2 million sites in the human genome and may affect gene function, depending on the exact base change and where it occurs.

RefSeq is a well-verified database of mRNA and protein sequences for human, mouse, and rat. RefSeq data have been used for purposes such as designing gene chips and describing the sequence features of the human genome.

Minimum-spanning-tree analysis was performed with the use of combined VNTR and SNP data from human and armadillo strains. Each circle represents a genotype (human unless marked as armadillo) based on the combined data, with the circle size directly proportional to the number of strains with the corresponding genotype. Numbers along the links between circles indicate the number of loci that differ between the genotypes on either side of the link. Three fully sequenced reference strains (TN, Thai53, and Br4923) are labeled, as are two other reference strains (LWM26 and 43926) of foreign origin. Samples from patients with a history of foreign residence are indicated with an asterisk (with three asterisks indicating three patients). The 114 polymorphisms investigated include 84 SNPs described previously and 30 identified during our study; 10 VNTRs were also analyzed. The large circle illustrates the predominance of the 3I-2-v1 genotype in our study, with 25 patients and 28 armadillos having this identical genotype.
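
The quantities shown in such a tree (strains per genotype and the number of differing loci on each link) can be illustrated with a minimal sketch. It uses plain Prim's algorithm on a toy data set and is not the procedure implemented in BioNumerics, which applies its own rules; the strain profiles and function names below are hypothetical.

```python
from collections import Counter

def locus_differences(a, b):
    """Count the loci (SNP bases or VNTR copy numbers) at which two profiles differ."""
    return sum(1 for x, y in zip(a, b) if x != y)

def minimum_spanning_tree(genotypes):
    """Plain Prim's algorithm; returns links as (from_index, to_index, differing_loci)."""
    n = len(genotypes)
    if n == 0:
        return []
    in_tree = {0}
    links = []
    while len(in_tree) < n:
        # Pick the cheapest link from the tree to a genotype not yet in it.
        dist, i, j = min(
            (locus_differences(genotypes[i], genotypes[j]), i, j)
            for i in in_tree
            for j in range(n)
            if j not in in_tree
        )
        links.append((i, j, dist))
        in_tree.add(j)
    return links

# Hypothetical toy profiles: three SNP bases followed by one VNTR copy number.
strains = [
    ("A", "G", "T", 7),
    ("A", "G", "T", 7),
    ("A", "G", "T", 7),
    ("A", "G", "C", 7),
    ("C", "G", "C", 9),
]

counts = Counter(strains)   # strains per genotype (the circle size)
genotypes = list(counts)    # distinct genotypes, in first-seen order
links = minimum_spanning_tree(genotypes)

for i, j, dist in links:
    print(genotypes[i], "--", dist, "--", genotypes[j])
```

In this toy example the first genotype is shared by three strains, so it would be drawn as the largest circle, and each printed number corresponds to a link label.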

To improve the resolution of our data for 3I strains, we also surveyed for 30 of the 52 newly discovered markers. In addition to the 11-bp indel, 24 of 30 markers were restricted to SNPs of type 3I, irrespective of the source of the strain. However, in four 3I strains identified in patients, five SNPs contained ancestral bases and may represent intermediate sequences arising during the evolutionary divergence of 3I strains from their common ancestor. The strains with ancestral bases were classified as having SNPs of the subtype 3I-1 to differentiate them from the more divergent strains classified as 3I-2 strains found in all armadillos and most indigenous U.S. patients (Table S5 in the supplementary material).
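
The subtype rule described here can be sketched as a small function. This is an illustration under stated assumptions, not the study's actual procedure: the strain is assumed to be already typed as 3I, one ancestral base at any discriminating marker is assumed to be enough for 3I-1, and the marker names and bases are hypothetical placeholders.

```python
def classify_3i_subtype(calls, ancestral_bases):
    """Return '3I-1' if any discriminating marker carries its ancestral base, else '3I-2'.

    calls: marker name -> base observed in the strain.
    ancestral_bases: marker name -> ancestral base at that marker.
    """
    has_ancestral = any(
        calls.get(marker) == base for marker, base in ancestral_bases.items()
    )
    return "3I-1" if has_ancestral else "3I-2"

# Hypothetical markers and bases, for illustration only.
ancestral_bases = {"snp_a": "C", "snp_b": "G"}

intermediate_strain = {"snp_a": "C", "snp_b": "A"}  # keeps one ancestral base
divergent_strain = {"snp_a": "T", "snp_b": "A"}     # derived base at both markers

print(classify_3i_subtype(intermediate_strain, ancestral_bases))
print(classify_3i_subtype(divergent_strain, ancestral_bases))
```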