<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
   <ui>gb-2007-8-7-r151</ui>
   <ji>GBJ</ji>
   <fm>
      <dochead>Research</dochead>
      <bibl>
         <title>
            <p>Human subtelomeric duplicon structure and organization</p>
         </title>
         <aug>
            <au id="A1">
               <snm>Ambrosini</snm>
               <fnm>Anthony</fnm>
               <insr iid="I1"/>
               <insr iid="I2"/>
               <email>aambrosi@princeton.edu</email>
            </au>
            <au id="A2">
               <snm>Paul</snm>
               <fnm>Sheila</fnm>
               <insr iid="I1"/>
               <email>paul@wistar.org</email>
            </au>
            <au id="A3">
               <snm>Hu</snm>
               <fnm>Sufen</fnm>
               <insr iid="I1"/>
               <email>shu@wistar.org</email>
            </au>
            <au id="A4" ca="yes">
               <snm>Riethman</snm>
               <fnm>Harold</fnm>
               <insr iid="I1"/>
               <email>Riethman@wistar.org</email>
            </au>
         </aug>
         <insg>
            <ins id="I1">
               <p>The Wistar Institute, Spruce St, Philadelphia, PA 19104, USA</p>
            </ins>
            <ins id="I2">
               <p>Department of Molecular Biology, Princeton University, Princeton, NJ 08544, USA</p>
            </ins>
         </insg>
         <source>Genome Biology</source>
         <issn>1465-6906</issn>
         <pubdate>2007</pubdate>
         <volume>8</volume>
         <issue>7</issue>
         <fpage>R151</fpage>
         <url>http://genomebiology.com/2007/8/7/R151</url>
         <xrefbib>
            <pubidlist>
               <pubid idtype="pmpid">17663781</pubid>
               <pubid idtype="doi">10.1186/gb-2007-8-7-r151</pubid>
            </pubidlist>
         </xrefbib>
      </bibl>
      <history>
         <rec>
            <date>
               <day>29</day>
               <month>3</month>
               <year>2007</year>
            </date>
         </rec>
         <revrec>
            <date>
               <day>25</day>
               <month>6</month>
               <year>2007</year>
            </date>
         </revrec>
         <acc>
            <date>
               <day>30</day>
               <month>7</month>
               <year>2007</year>
            </date>
         </acc>
         <pub>
            <date>
               <day>30</day>
               <month>7</month>
               <year>2007</year>
            </date>
         </pub>
      </history>
      <cpyrt>
         <year>2007</year>
         <collab>Ambrosini et al.; licensee BioMed Central Ltd.</collab>
         <note>This is an open access article distributed under the terms of the Creative Commons Attribution License (<url>http://creativecommons.org/licenses/by/2.0</url>), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</note>
      </cpyrt>
      <shorttitle>
         <p>Subtelomere structure</p>
      </shorttitle>
      <shortabs>
         <p>The sequence divergence within subtelomeric duplicon families varies considerably, as does the organization of duplicon blocks at subtelomere alleles; a class of subtelomere-specific duplicon blocks was identified.</p>
      </shortabs>
      <abs>
         <sec>
            <st>
               <p>Abstract</p>
            </st>
            <sec>
               <st>
                  <p>Background</p>
               </st>
               <p>Human subtelomeric segmental duplications ('subtelomeric repeats') comprise about 25% of the most distal 500 kb and 80% of the most distal 100 kb in human DNA. A systematic analysis of the duplication substructure of human subtelomeric regions was performed to develop a detailed understanding of subtelomeric sequence organization and a nucleotide sequence-level characterization of subtelomeric duplicon families.</p>
            </sec>
            <sec>
               <st>
                  <p>Results</p>
               </st>
               <p>The extent of nucleotide sequence divergence within subtelomeric duplicon families varies considerably, as does the organization of duplicon blocks at subtelomere alleles. Subtelomeric internal (TTAGGG)n-like tracts occur at duplicon boundaries, suggesting their involvement in the generation of the complex sequence organization. Most duplicons have copies at both subtelomere and non-subtelomere locations, but a class of subtelomere-specific duplicon blocks is identified. In addition, a group of six subterminal duplicon families is identified that, together with six single-copy telomere-adjacent segments, include all of the (TTAGGG)n-adjacent sequence identified so far in the human genome.</p>
            </sec>
            <sec>
               <st>
                  <p>Conclusion</p>
               </st>
               <p>Identification of a class of duplicon blocks that is subtelomere-specific will facilitate high-resolution analysis of subtelomere repeat copy number variation as well as studies involving somatic subtelomere rearrangements. The significant levels of nucleotide sequence divergence within many duplicon families as well as the differential organization of duplicon blocks on subtelomere alleles may provide opportunities for allele-specific subtelomere marker development; this is especially true for subterminal regions, where divergence and organizational differences are the greatest. These subterminal sequence families comprise the immediate cis-elements for (TTAGGG)n tracts, and are prime candidates for subtelomeric sequences regulating telomere-specific (TTAGGG)n tract length in humans.</p>
            </sec>
         </sec>
      </abs>
   </fm>
   <meta>
      <classifications>
         <classification type="BMC" subtype="man_spc_id" id="30010016">Molecular biology</classification>
         <classification type="BMC" subtype="man_spc_id" id="30010010">Genome studies</classification>
      </classifications>
   </meta>
   <bdy>
      <sec>
         <st>
            <p>Background</p>
         </st>
         <p>Segmental duplications, defined operationally as duplicated stretches of genomic DNA at least 1 kb in length with >90% nucleotide sequence identity, comprise roughly 5% of euchromatin in the human genome <abbrgrp><abbr bid="B1">1</abbr></abbrgrp>. They are preferential sites of genomic instability, associated with recurrent pathology-associated chromosome breakpoints <abbrgrp><abbr bid="B2">2</abbr></abbrgrp>, large-scale copy number polymorphisms <abbrgrp><abbr bid="B3">3</abbr><abbr bid="B4">4</abbr></abbrgrp>, and evolutionary chromosome breakpoint regions <abbrgrp><abbr bid="B5">5</abbr></abbrgrp>. While they are distributed throughout the human genome, they tend to cluster near centromeres and telomeres <abbrgrp><abbr bid="B1">1</abbr></abbrgrp>.</p>
         <p>Human subtelomeric segmental duplications ('subtelomeric repeats') comprise about 25% of the most distal 500 kb and 80% of the most distal 100 kb in human DNA <abbrgrp><abbr bid="B1">1</abbr><abbr bid="B6">6</abbr></abbrgrp>. From extensive early work on these complex regions it was recognized that telomere-adjacent sequence stretches contained low copy subtelomeric repeat segments of varying sizes and degrees of divergence <abbrgrp><abbr bid="B7">7</abbr><abbr bid="B8">8</abbr></abbrgrp>. The first completed sequences of human subtelomere regions revealed at least two general classes of duplicons, sometimes separated by internal (TTAGGG)n-like islands: large and highly similar centromerically positioned subtelomere duplications and more abundant, dissimilar distal duplicons <abbrgrp><abbr bid="B9">9</abbr></abbrgrp>. While it is now well established that subtelomeric repeat (Srpt) regions are composed of mosaic patchworks of duplicons <abbrgrp><abbr bid="B10">10</abbr><abbr bid="B11">11</abbr></abbrgrp>, genome-wide analyses of these regions are revealing new details. The patchworks of subtelomeric duplicons appear to arise from translocations involving the tips of chromosomes, followed by transmission of unbalanced chromosomal complements to offspring <abbrgrp><abbr bid="B12">12</abbr></abbrgrp>. The overall size, sequence content, and organization of subtelomeric segmental duplications relative to the terminal (TTAGGG)n repeat tracts and to subtelomeric single-copy DNA are different for each subtelomere <abbrgrp><abbr bid="B6">6</abbr></abbrgrp>, and the large-scale polymorphisms (50 kb to 500 kb) found near many human telomeres seem to be due primarily to variant combinations of subtelomeric segmental duplications <abbrgrp><abbr bid="B10">10</abbr><abbr bid="B11">11</abbr><abbr bid="B13">13</abbr></abbrgrp>. Thus, the architecture of each human subtelomere region is determined largely by its specific subtelomeric segmental duplication content and organization, which vary from telomere to telomere and are often allele-specific.</p>
         <p>Terminal (TTAGGG)n tracts lie immediately distal to subtelomeric segmental duplication regions and form the ends of chromosomes. The lengths of (TTAGGG)n tracts have been shown to vary from telomere to telomere within individual cells <abbrgrp><abbr bid="B14">14</abbr><abbr bid="B15">15</abbr><abbr bid="B16">16</abbr></abbrgrp> and between alleles at the same telomere <abbrgrp><abbr bid="B17">17</abbr><abbr bid="B18">18</abbr><abbr bid="B19">19</abbr></abbrgrp>. Individual-specific patterns of relative telomere-specific (TTAGGG)n tract lengths have a significant heritable component closely associated with the telomeres themselves <abbrgrp><abbr bid="B19">19</abbr><abbr bid="B20">20</abbr></abbrgrp>, and these patterns appear to be defined in the zygote and maintained throughout life <abbrgrp><abbr bid="B16">16</abbr></abbrgrp>. Since the immediate effects of (TTAGGG)n tract loss on cell viability and chromosome stability may be attributable to the shortest telomere(s) in a cell, rather than to average telomere length <abbrgrp><abbr bid="B18">18</abbr><abbr bid="B21">21</abbr></abbrgrp>, individual-specific patterns of allele-specific (TTAGGG)n tract lengths may be crucial for the biological functions of telomeres and the effects of telomere attrition and dysfunction associated with aging, cancer, stress and coronary artery disease <abbrgrp><abbr bid="B22">22</abbr><abbr bid="B23">23</abbr><abbr bid="B24">24</abbr></abbrgrp>.</p>
         <p>The overall picture of duplicated subtelomeric DNA that has emerged is one of a very plastic and rapidly evolving genome compartment. Some of the DNA segments within this subtelomeric compartment can exchange sequences with each other inter-chromosomally <abbrgrp><abbr bid="B12">12</abbr></abbrgrp>; these genomic fragments behave essentially as a multi-allelic subtelomeric gene family, with paralogs on separate subtelomeres sometimes sharing higher sequence similarity than alleles on homologous chromosomes. Thus, in order to track individual subtelomere alleles in these regions, it will be essential to define markers that can distinguish the allele not just from its homolog, but from each of its paralogs. This is a fundamental challenge in developing subtelomeric markers, and one that requires a detailed understanding of both subtelomeric sequence organization and the nucleotide sequence-level characterization of duplicon families. We therefore set out to characterize these features systematically based upon the available human DNA sequence.</p>
      </sec>
      <sec>
         <st>
            <p>Results</p>
         </st>
         <sec>
            <st>
               <p>Subtelomeric duplicon definition</p>
            </st>
            <p>Subtelomeric regions of human chromosomes are known to be composed, in part, of mosaic patchworks of duplicons <abbrgrp><abbr bid="B10">10</abbr><abbr bid="B11">11</abbr><abbr bid="B12">12</abbr><abbr bid="B25">25</abbr></abbrgrp>. In order to analyze their sequence organization in a systematic manner, we developed a set of rules to identify modules of DNA defined by sequence similarity between segments of subtelomeric DNA from single telomeres and the assembled human genome. A hybrid reference genome composed of 500 kb subtelomere assemblies <abbrgrp><abbr bid="B6">6</abbr></abbrgrp> incorporated into human genome build 35 at the appropriate subtelomere coordinates (Additional data file 1) was used for this purpose. The hybrid build used in the current analysis essentially replaces some of the build 35 subtelomeres with more complete and rigorously validated subtelomere assemblies <abbrgrp><abbr bid="B6">6</abbr></abbrgrp>, but is otherwise identical to the build 35 public reference sequence.</p>
            <p>The sequence of the most distal 500 kb of each human subtelomere region from this reference hybrid build was used to query the complete hybrid reference genome sequence as described in Materials and methods and in Additional data file 2. Adjacent and properly oriented BLAST matches with &#8805;90% nucleotide sequence identity and &#8805;1 kb in size were assembled into chains; the query sequence and each aligned region identified in this manner were termed 'duplicons' defined by that query, and this set of homologous sequences was termed a single 'module'. Each module was thus defined by a set of pairwise alignments with the query subtelomere sequence, and a percent nucleotide sequence identity for the non-masked parts of each chained pairwise alignment was derived from the BLAST alignments. In cases where more than one duplicon was defined by matches to a segment of subtelomere query sequence, the average percent identity of all pairwise alignments in the module was also calculated (the %ID<sub>avg</sub>). Interestingly, in most cases the best nucleotide sequence identity between the query subtelomere sequence and the duplicons was very similar to the average pairwise nucleotide sequence identity, suggesting either that subtelomeric duplications within a group of this class occurred in a relatively narrow evolutionary time window, or that gene conversion of duplicated sequences within the group has occurred at a relatively constant rate. The full set of modules, including the coordinates of their genomic alignments, is presented in Additional data file 3.</p>
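            <p>As an illustration only, and not the pipeline used in this study, the chaining rule and the module-level average identity described above could be sketched in Python as follows. The record layout (Hit), the max_gap parameter and its default value, the choice to apply the identity and length thresholds to each match before chaining, and the length-weighted averaging within a chain are all assumptions made for this sketch; the actual procedure is the one given in Materials and methods and Additional data file 2.</p>
            <p>
```python
from dataclasses import dataclass
from typing import List


@dataclass
class Hit:
    """One pairwise BLAST match (hypothetical record layout)."""
    q_start: int    # start on the subtelomere query (bp)
    q_end: int      # end on the subtelomere query (bp)
    s_chrom: str    # chromosome of the matched (subject) region
    s_start: int    # subject start, always the smaller coordinate
    s_end: int      # subject end, always the larger coordinate
    strand: str     # '+' or '-' relative to the query
    pct_id: float   # percent identity over non-masked bases


MIN_ID = 90.0     # identity threshold stated in the text
MIN_LEN = 1000    # size threshold stated in the text (1 kb)
MAX_GAP = 5000    # ASSUMED adjacency limit; not given in the text


def chain_hits(hits: List[Hit], max_gap: int = MAX_GAP) -> List[List[Hit]]:
    """Group adjacent, consistently oriented hits into chains."""
    kept = [h for h in hits
            if h.pct_id >= MIN_ID and h.q_end - h.q_start >= MIN_LEN]
    kept.sort(key=lambda h: (h.s_chrom, h.strand, h.q_start))
    chains: List[List[Hit]] = []
    for h in kept:
        if chains:
            last = chains[-1][-1]
            same_target = (last.s_chrom == h.s_chrom
                           and last.strand == h.strand)
            q_gap = h.q_start - last.q_end
            # On '+' the subject advances with the query; on '-' it recedes.
            if h.strand == '+':
                s_gap = h.s_start - last.s_end
            else:
                s_gap = last.s_start - h.s_end
            if (same_target and max_gap >= q_gap >= 0
                    and max_gap >= s_gap >= 0):
                chains[-1].append(h)
                continue
        chains.append([h])
    return chains


def chain_identity(chain: List[Hit]) -> float:
    """Length-weighted percent identity of one chained alignment."""
    total = sum(h.q_end - h.q_start for h in chain)
    return sum(h.pct_id * (h.q_end - h.q_start) for h in chain) / total


def module_avg_identity(chains: List[List[Hit]]) -> float:
    """Mean identity over all chained alignments in a module."""
    return sum(chain_identity(c) for c in chains) / len(chains)
```
            </p>
            <p>In this sketch each returned chain corresponds to one duplicon aligned to the query, and the set of chains for a query segment corresponds to one module.</p>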
            <p>Figure <figr fid="F1">1</figr> illustrates this analysis graphically for the 7p subtelomere region. Each rectangle in Figure <figr fid="F1">1</figr> represents a separate duplicon; for example, the chromosome 7 intrachromosomal duplicons (pink, above the coordinate line) include two large blocks and many smaller ones, with each duplicon corresponding to distinct, internal chromosome 7 coordinates. The large (90 kb) duplicon at the bottom of the figure matches a subtelomeric segment of chromosome 11 (bounded light green rectangle) whereas chromosome 1 is the site of 25 distinct 7ptel duplicons of various sizes, 9 of which are subtelomeric (bounded brown rectangles) and 16 non-subtelomeric (unbounded brown rectangles). The remaining duplicons defined by pairwise alignment with the 7ptel query sequence are designated in a similar fashion.</p>
            <fig id="F1">
               <title>
                  <p>Figure 1</p>
               </title>
               <caption>
                  <p>Duplicon substructure of the 7p subtelomere region</p>
               </caption>
               <text>
                  <p>Duplicon substructure of the 7p subtelomere region. The most distal 140 kb of the chromosome 7p reference sequence is shown oriented with the telomeric end on the left (34 kb of unsequenced 7p DNA lie beyond the sequenced region shown, and the remaining 350 kb of the 7p subtelomere region centromeric to that shown does not contain duplicated DNA). The distance from the end of the sequence to the start of the terminal repeat array is indicated by the vertical arrow at the telomeric end of the sequence. The position and 5'-3' G-strand orientation of (TTAGGG)n elements are shown as black arrows. Duplicated genomic segments are identified by chromosome (color) and whether they are subtelomeric (bounded rectangles), non-subtelomeric (unbounded rectangles), or intra-chromosomal (located above the subtelomere coordinates).</p>
               </text>
               <graphic file="gb-2007-8-7-r151-1"/>
            </fig>
            <p>This systematic analysis resulted in the definition of 1,151 subtelomeric modules whose coordinates define duplicon families; 461 modules define duplicon families located exclusively in subtelomere regions, whereas the remainder have copies in both subtelomeric and non-subtelomeric DNA. The duplication module numbers are broken down by subtelomere in Additional data file 4. The abundance and genomic distribution for the subtelomere modules and each of their duplicons are summarized in Figure <figr fid="F2">2</figr>. In addition to showing the expected subtelomeric enrichment, duplicons are also localized at many pericentromeric loci and at a relatively small number of internal chromosome sites. Internal loci particularly enriched for subtelomeric duplicons include 2q13-q14 (at the site where ancestral primate telomeres fused to form modern human chromosome 2), 1q42.11-1q42.12, 1q42.13, 1q43-q44, 3p12.3, 3q29, 4q26, 7p13, 9q12-q13, and Yq11.23. These sites have been documented previously in genome-wide analyses of segmental duplications <abbrgrp><abbr bid="B26">26</abbr></abbrgrp> and represent sites that were apparently susceptible to either donation or acceptance of these duplicated chromosome segments in recent evolutionary time.</p>
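            <p>The distinction drawn above between modules confined to subtelomeres and modules with copies elsewhere can be illustrated with a minimal Python sketch. It assumes, for illustration only, that a duplicon counts as subtelomeric when it lies entirely within the most distal 500 kb of either chromosome end, and that duplicons are given as (chromosome, start, end) tuples; the chromosome names and lengths in the usage note are hypothetical placeholders, and the function names are not taken from the study.</p>
            <p>
```python
SUBTEL_WINDOW = 500_000  # distal window size used for the queries (bp)


def is_subtelomeric(start, end, chrom_len, window=SUBTEL_WINDOW):
    """True if the interval lies wholly within either distal window.

    ASSUMPTION: containment in the distal 500 kb is the criterion.
    """
    return window >= end or start >= chrom_len - window


def classify_module(duplicons, chrom_lengths):
    """Label a module by where its duplicon copies are located.

    duplicons: iterable of (chrom, start, end) tuples (hypothetical layout)
    chrom_lengths: dict mapping chromosome name to its length in bp
    """
    flags = [is_subtelomeric(s, e, chrom_lengths[c])
             for c, s, e in duplicons]
    return "subtelomere-only" if all(flags) else "mixed"
```
            </p>
            <p>For example, with placeholder lengths {"chrA": 10000000, "chrB": 5000000}, a module whose copies are ("chrA", 10000, 30000) and ("chrB", 4800000, 4900000) is labeled subtelomere-only under this rule, whereas adding a copy at ("chrA", 2000000, 2010000) makes it mixed.</p>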
            <fig id="F2">
               <title>
                  <p>Figure 2</p>
               </title>
               <caption>
                  <p>Genomic distribution of subtelomeric duplicons</p>
               </caption>
               <text>
                  <p>Genomic distribution of subtelomeric duplicons. The total number of duplicon bases for each 1 Mb interval in the human genome is indicated by the following color designations: red, greater than 500 kb; purple, 100-500 kb; aqua, 50-100 kb; green, 5-50 kb; and blue, 1-5 kb. The positions of the centromeric gaps in build 35 are indicated as black cylinders.</p>
               </text>
               <graphic file="gb-2007-8-7-r151-2"/>
            </fig>
         </sec>
         <sec>
            <st>
               <p>Subtelomeric duplicon characterization</p>
            </st>
            <p>The defined subtelomere modules and their duplicons were characterized according to size and nucleotide sequence similarity. Duplicons that occupy subtelomeric sequences were generally both larger and more abundant than those occurring elsewhere in the genome (Additional data file 5), consistent with the notion that subtelomeric location in humans is permissive for and/or somehow promotes large duplication events. Although smaller and fewer, non-subtelomeric copies of duplicons tended to cluster at the relatively few pericentric and interstitial loci described above (Figure <figr fid="F2">2</figr>).</p>
            <p>Figure <figr fid="F3">3</figr> shows the results of an analysis of duplicon number as a function of percent nucleotide identity. There is a bimodal distribution of duplicon number versus percent nucleotide sequence identities, with peaks at 98% and 91% (Figure <figr fid="F3">3</figr>, left panels). The 98% peak was highly enriched in subtelomeric duplicons. The combined large size and high sequence similarity of a subset of subtelomeric duplicons is highlighted in the right panels of Figure <figr fid="F3">3</figr>, which plots the total bases covered by the duplicons as a function of the nucleotide sequence identity. The bimodal distribution of duplicon peaks might suggest two evolutionary waves of duplications, with the more recent one accounting for most of the large subtelomeric duplicons; this sort of punctuated duplication pattern is reminiscent of that observed by Eichler and co-workers <abbrgrp><abbr bid="B27">27</abbr></abbrgrp> for segmentally duplicated DNA in a pericentromeric chromosome region. Alternatively, the 98% peak may be due to maintenance of sequence similarity by ongoing interchromosomal gene conversion between the large subtelomeric duplicons.</p>
            <fig id="F3">
               <title>
                  <p>Figure 3</p>
               </title>
               <caption>
                  <p>Duplicon number, size, and total bases covered as a function of percent nucleotide sequence identity</p>
               </caption>
               <text>
                  <p>Duplicon number, size, and total bases covered as a function of percent nucleotide sequence identity. Duplicon number (left panels) and the total bases in duplicons (right panels) are shown on the Y-axis, and percent nucleotide sequence identity for the non-RepeatMasked bases is shown on the X-axis. The size ranges (kb) of duplicons in each category are indicated by the colors shown in the key at the bottom of the figure.</p>
               </text>
               <graphic file="gb-2007-8-7-r151-3"/>
            </fig>
         </sec>
         <sec>
            <st>
               <p>Subtelomeric duplicon organization and divergence</p>
            </st>
            <p>Visual inspection of the duplicon organization for the subtelomeres revealed several key features (Figure <figr fid="F4">4</figr>, Additional data files 6-47). The internal (TTAGGG)n sequences are usually oriented towards the telomere and almost always co-localize to duplicon boundaries. The orientations of the duplicons in the segmentally duplicated regions are similarly maintained, consistent with a recent model for their generation that features subtelomeric translocation of chromosome tips followed by transmission of unbalanced subtelomeric chromosome complements <abbrgrp><abbr bid="B12">12</abbr></abbrgrp>. In an unusual case where the orientations are opposite to the telomere (Figure <figr fid="F4">4</figr>, 5p telomere), the (TTAGGG)n occurs head-to-head with one in the normal orientation, perhaps indicating the relic of a head-to-head telomere fusion event transmitted in the germline. Subtelomeric internal (TTAGGG)n-like sequences at duplicon boundaries suggest the possibility of internal binding/interaction sites for some (TTAGGG)n-binding protein components found primarily at terminal (TTAGGG)n tracts; published data showing TRF2 and TIN2 localization at internal (TTAGGG)n tracts resulting from a fused human chromosome pair support this idea <abbrgrp><abbr bid="B28">28</abbr></abbrgrp>. The subtelomeric internal (TTAGGG)n-like islands range in size up to 823 base-pairs (bp), with most in the 150-200 bp range; they vary considerably in similarity to canonical (TTAGGG)n repeats <abbrgrp><abbr bid="B6">6</abbr></abbrgrp> as well as in the relative abundance of (TTAGGG)n-related motifs. Several of the (TTAGGG)n-related motifs found in these islands were detected previously in proximal regions of telomeres (for example, TGAGGG, TCAGGG, TTGGGG <abbrgrp><abbr bid="B29">29</abbr><abbr bid="B30">30</abbr></abbrgrp> (H Riethman, unpublished)). 
A more detailed analysis of these interesting sequence islands and their comparison with a more comprehensive set of telomere-proximal sequences than is currently available might shed light on their origins and the relative timing of their internalization.</p>
            <fig id="F4">
               <title>
                  <p>Figure 4</p>
               </title>
               <caption>
                  <p>Duplicon organization of selected telomeres</p>
               </caption>
               <text>
                  <p>Duplicon organization of selected telomeres. The sequences are oriented with the telomeres at the left, with the distance from the end of the sequence to the start of the terminal repeat array indicated by the vertical arrow at the telomeric end of the sequence. The position and 5'-3' G-strand orientation of (TTAGGG)n elements are shown as black arrows. Note the co-localization of nearly all of the internal (TTAGGG)n islands with duplicon boundaries. The duplicon substructure for each of the 43 non-satellited telomeres is shown in Additional data files 6-47.</p>
               </text>
               <graphic file="gb-2007-8-7-r151-4"/>
            </fig>
            <p>For any given segment of a subtelomere, the level of nucleotide sequence similarity with duplicated DNA depends entirely on the specific duplicon content and organization and does not necessarily correlate with its distance from the telomere terminus (Additional data files 6-47, bottom panels). Large duplicons with relatively high sequence similarity amongst family members cover a large proportion of the duplicated sequence space, but occupy only a subset of subtelomere regions and exist at variable distances from the terminal (TTAGGG)n tract. Since many of the currently incomplete assemblies terminate within these large duplicons, the actual sequence organization is still unknown for these chromosome ends (1p, 3q, 6p, 7p, 8p, 9q, 11p, 19p). For assemblies completed or very nearly completed that contain the large duplicons, there is a consistent pattern of higher divergence in (TTAGGG)n-adjacent subterminal sequence than in adjacent large duplicon regions (4q, 5q, 6q, 10q, 15q, 16q, and 17q, bottom panels). For subtelomeres that lack the large duplicons, there is typically a much lower degree of sequence similarity throughout these subtelomeric duplication regions (often 90-96% nucleotide sequence identity; 1q, 2p, 4p, 5p, 10p, 13q, 14q, 18p, 19q, 21q, 22q). The 3p, 14q, and 20p subtelomeres have unsequenced gaps adjacent to their terminal (TTAGGG)n tracts; hybridization experiments showed that 3p and 14q have small Srpt regions, whereas that for 20p is more extensive and contains large duplicons (H Riethman, data not shown).</p>
            <p>The duplicon sequence similarity characteristics of a small group of telomeres fall outside of the general patterns mentioned above. The 16p reference allele subtelomere and the Xq/Yq subtelomere have small, highly similar subterminal duplicons and more divergent adjacent subtelomeric ones, whereas the 2q, 12p, 17p, and 20q subtelomeres have moderately sized duplicons with &lt;96% to 98.5% similarity throughout the duplicated regions. The 9p subtelomere has subterminal duplicons with high sequence similarity (98.5-99%) and several large blocks of sequence that correspond to the 2qfus internal site and several internal loci on chromosome 9 (Additional data file 22) <abbrgrp><abbr bid="B31">31</abbr></abbrgrp>.</p>
            <p>The telomere assemblies analyzed here represent only a single reference sequence, and there is extensive evidence for large copy number polymorphism at many of these chromosome ends <abbrgrp><abbr bid="B32">32</abbr><abbr bid="B33">33</abbr><abbr bid="B34">34</abbr><abbr bid="B35">35</abbr></abbrgrp>. Known major variant alleles differ quite dramatically in sequence organization from the shown reference alleles. For example, the 16p allele shown is one of at least three large variants of this subtelomere <abbrgrp><abbr bid="B32">32</abbr></abbrgrp>; finished sequence data from part of a second allele show the presence of additional duplicated DNA sequences, including several large duplicons bearing very high sequence similarity (97-98.5%) with those characterized in this study (data not shown). Similarly, the 11p reference allele assembly shown here is part of a long segmental variant of this subtelomere; the short version (whose existence has been validated by cloning and mapping (H Riethman, data not shown)) ends at an internal (TTAGGG)n sequence present within the long allele (coordinate 115 kb), and has a structure similar to the 17p subtelomere (compare Additional data files 26 and 37). As additional variant subtelomeres are cloned and characterized, it is likely that further combinations of duplicons will be discovered on alleles that may, in many instances, be more similar to their paralogs than their homologs.</p>
         </sec>
         <sec>
            <st>
               <p>Subtelomere-only sequence blocks</p>
            </st>
            <p>Systematic analysis of each subtelomere revealed a limited set of subtelomeric segments whose sequence aligned exclusively with other subtelomeric DNA sequences (detailed in Additional data file 48). The 11 largest stretches of these subtelomere-only duplicon blocks, each greater than 10 kb in length, are summarized in Table <tblr tid="T1">1</tblr>. The size and subtelomere origin of the largest homology block for each of these duplicon families are indicated, along with the number of copies and the range of pairwise nucleotide sequence identities of the subtelomere alignments to the query DNA segments. It should be noted that some of the duplicons are smaller than the largest query block, either because they are missing some of the sequences or because they are from the edge of an incomplete subtelomeric sequence assembly. The subtel-only blocks include portions of the largest duplicated regions with highest sequence similarity among copies (blocks 1, 2, 3, 6, 6', and 12) in addition to several blocks with somewhat lower sequence similarity among copies. Because they are restricted exclusively to subtelomeres and are of sufficient size and sequence similarity to be detected by FISH-based approaches, this class of duplicon blocks is an attractive starting point for developing subtelomere paint probes for tracking somatic changes to subtelomeres <it>in situ</it>. Their delineation here will permit the development of sequence-based copy number quantification assays to assist in the analysis of subtelomere allele dosage changes in both germline DNA and the somatic evolution of genomes in cancer.</p>
            <tbl id="T1">
               <title>
                  <p>Table 1</p>
               </title>
               <caption>
                  <p>Large subtelomere-specific duplicons</p>
               </caption>
               <tblbdy cols="6">
                  <r>
                     <c ca="left">
                        <p>Subtel block</p>
                     </c>
                     <c ca="left">
                        <p>Telomere</p>
                     </c>
                     <c ca="center">
                        <p>Size (kb)</p>
                     </c>
                     <c ca="center">
                        <p>Duplicated blocks</p>
                     </c>
                     <c ca="center">
                        <p>Percent identity</p>
                     </c>
                     <c ca="left">
                        <p>Named transcripts</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="6">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>1</p>
                     </c>
                     <c ca="left">
                        <p>1p</p>
                     </c>
                     <c ca="center">
                        <p>25</p>
                     </c>
                     <c ca="center">
                        <p>4</p>
                     </c>
                     <c ca="center">
                        <p>97.15-98.42</p>
                     </c>
                     <c ca="left">
                        <p>Sim to protein phosphatase 1 inhibitor subunit 2</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>2</p>
                     </c>
                     <c ca="left">
                        <p>15q</p>
                     </c>
                     <c ca="center">
                        <p>88</p>
                     </c>
                     <c ca="center">
                        <p>7</p>
                     </c>
                     <c ca="center">
                        <p>97.84-98.33</p>
                     </c>
                     <c ca="left">
                        <p>OR4F3, OR4F4, OR4F5, OR4F29, OR4F21, OR4F16, OR4F17, C6orf88</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>3</p>
                     </c>
                     <c ca="left">
                        <p>1p</p>
                     </c>
                     <c ca="center">
                        <p>35</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>97</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>5</p>
                     </c>
                     <c ca="left">
                        <p>2p</p>
                     </c>
                     <c ca="center">
                        <p>17</p>
                     </c>
                     <c ca="center">
                        <p>5</p>
                     </c>
                     <c ca="center">
                        <p>90.49-92.00</p>
                     </c>
                     <c ca="left">
                        <p>Sim to RPL23AP7</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>6</p>
                     </c>
                     <c ca="left">
                        <p>3q</p>
                     </c>
                     <c ca="center">
                        <p>38</p>
                     </c>
                     <c ca="center">
                        <p>5</p>
                     </c>
                     <c ca="center">
                        <p>97.15-97.89</p>
                     </c>
                     <c ca="left">
                        <p>Sim to RPL23AP7</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>6'</p>
                     </c>
                     <c ca="left">
                        <p>11p</p>
                     </c>
                     <c ca="center">
                        <p>11</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>97.98</p>
                     </c>
                     <c ca="left">
                        <p>RYD5</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>7</p>
                     </c>
                     <c ca="left">
                        <p>4q</p>
                     </c>
                     <c ca="center">
                        <p>28</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>*</p>
                     </c>
                     <c ca="left">
                        <p>DUX4</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>8</p>
                     </c>
                     <c ca="left">
                        <p>4q</p>
                     </c>
                     <c ca="center">
                        <p>14</p>
                     </c>
                     <c ca="center">
                        <p>6</p>
                     </c>
                     <c ca="center">
                        <p>91.33-94.60</p>
                     </c>
                     <c ca="left">
                        <p>TUBB4q</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>10</p>
                     </c>
                     <c ca="left">
                        <p>2q</p>
                     </c>
                     <c ca="center">
                        <p>49</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>96.47</p>
                     </c>
                     <c ca="left">
                        <p>FBXO25</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>11</p>
                     </c>
                     <c ca="left">
                        <p>9q</p>
                     </c>
                     <c ca="center">
                        <p>36</p>
                     </c>
                     <c ca="center">
                        <p>6</p>
                     </c>
                     <c ca="center">
                        <p>91.26-95.68</p>
                     </c>
                     <c ca="left">
                        <p>IL9R</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>12</p>
                     </c>
                     <c ca="left">
                        <p>12p</p>
                     </c>
                     <c ca="center">
                        <p>15</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>97.89</p>
                     </c>
                     <c ca="left">
                        <p>IQSEC3</p>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p>* Block 7 corresponds to the D4Z4 tandem repeat on the 4q and 10q subtelomeres, for which no percent identity is calculated because of the very large number and diverse percent identities of the BLAST alignments among tandem D4Z4 repeats.</p>
               </tblfn>
            </tbl>
         </sec>
         <sec>
            <st>
               <p>Subterminal sequence blocks</p>
            </st>
            <p>Adjacent to some of the terminal (TTAGGG)n sequences and to many internal (TTAGGG)n sequences are stacks of small duplicons (for example, 7p in Figure <figr fid="F1">1</figr>, 19p, 10q, 16q, 9q, 6p in Figure <figr fid="F4">4</figr>, and telomeres 2p, 3q, 4p, 4q, 5q, 6q, 8p, 11p, 17q, 18p, 19q, 21q, 22q in Additional data files 6-47). This subterminal duplicon class has sequence similarity to DNA positioned adjacent to the terminal (TTAGGG)n tract of at least one chromosome end. To more formally define these sequences, we examined the duplicon structure of each of the finished and near-finished (within 5 kb of the terminal (TTAGGG)n) subtelomere assemblies <abbrgrp><abbr bid="B6">6</abbr><abbr bid="B36">36</abbr></abbrgrp> and identified subterminal sequence segments that are flanked by terminal (TTAGGG)n and by a position &lt;25 kb from the terminal (TTAGGG)n that corresponds to a boundary of multiple duplicons. These sequences were termed subterminal modules and were used as query sequences to define subterminal duplicons, that is, duplicons containing sequence that aligned to them under the criteria outlined in Additional data file 2. Six subterminal duplicon families were defined in this manner (Additional data file 49). Together with six one-copy DNA (TTAGGG)n-adjacent regions (7q, 8q, 11q, 12q, 18q, and Xp/Yp), these duplicon families represent the global set of sequences occupying the DNA space immediately cis to terminal (TTAGGG)n tracts. As such, they are among the sequences most likely to directly impact terminal (TTAGGG)n tract regulation <abbrgrp><abbr bid="B19">19</abbr></abbrgrp>.</p>
            <p>Table <tblr tid="T2">2</tblr> shows the telomere and the defining subterminal segment sizes for these six duplicon families, as well as the copy number for each family. The copies are categorized according to those that occur in other subterminal regions (&lt;25 kb from any known terminal (TTAGGG)n tract; subterm), those that occur in subtelomeric repeat regions but are not subterminal (subtel), and those that occur in non-subtelomeric regions (non_subtel). Subterminal duplicons that occur at internal subtelomeric sites are often adjacent to internal (TTAGGG)n tracts and are evident graphically as stacks of duplicated DNA segments (for example, 7p in Figure <figr fid="F1">1</figr>, and 19p, 10q, 16q, 9q, and 6p in Figure <figr fid="F4">4</figr>). However, some duplicons in such stacks are bounded by an internal (TTAGGG)n and some are not. The same situation can be seen at several subtelomeric sites that are defined by stacks of subterminal duplicons but lack internal (TTAGGG)n (for example, telomere 5p in Figure <figr fid="F4">4</figr>, and telomeres 1p, 1q, 5p, 6q, 16q from Additional data files 6-47). The simplest explanation for these observations is that these duplicon edges correspond to the positions of terminal translocations where (TTAGGG)n sequences on the recipient telomeres were lost or where the (TTAGGG)n motif was originally present but has decayed beyond recognition.</p>
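            <p>The three location categories can be expressed as a simple decision rule. The following Python fragment is a minimal sketch and not the procedure used to build Table <tblr tid="T2">2</tblr>: the function names and the two inputs (whether a copy lies in a subtelomeric repeat region, and its distance in base pairs to the nearest known terminal (TTAGGG)n tract) are assumptions introduced for illustration, and only the 25 kb threshold and the category labels are taken from the text.</p>
            <p><![CDATA[
```python
from collections import Counter
from typing import Iterable, Optional, Tuple

# Threshold quoted in the text: subterminal copies lie <25 kb from a
# known terminal (TTAGGG)n tract.
SUBTERMINAL_LIMIT_BP = 25_000


def classify_copy(in_subtelomere: bool,
                  distance_to_terminal_tract_bp: Optional[int]) -> str:
    """Assign one duplicon copy to subterm, subtel, or non_subtel.

    Both arguments are hypothetical inputs; pass None for the distance
    when no terminal tract is known near the copy.
    """
    if (distance_to_terminal_tract_bp is not None
            and distance_to_terminal_tract_bp < SUBTERMINAL_LIMIT_BP):
        return "subterm"
    if in_subtelomere:
        return "subtel"
    return "non_subtel"


def tally_copies(copies: Iterable[Tuple[bool, Optional[int]]]) -> Counter:
    """Count copies per category, mirroring a Location column."""
    return Counter(classify_copy(sub, dist) for sub, dist in copies)


# Invented example copies, not data from this study.
example = [(True, 3_000), (True, 18_500), (True, 140_000), (False, None)]
print(tally_copies(example))
```
]]></p>
            <p>In this sketch the distance test is applied first, so any copy within 25 kb of a terminal tract is labeled subterm; the subtel label is reserved for copies in subtelomeric repeat regions that fail that test.</p>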
            <tbl id="T2">
               <title>
                  <p>Table 2</p>
               </title>
               <caption>
                  <p>Subterminal duplicons</p>
               </caption>
               <tblbdy cols="7">
                  <r>
                     <c ca="left">
                        <p>Subterm block</p>
                     </c>
                     <c ca="left">
                        <p>Telomere</p>
                     </c>
                     <c ca="center">
                        <p>Size (kb)</p>
                     </c>
                     <c ca="center">
                        <p>Duplicated blocks</p>
                     </c>
                     <c ca="left">
                        <p>Location</p>
                     </c>
                     <c ca="center">
                        <p>Percent identity</p>
                     </c>
                     <c ca="left">
                        <p>Named transcripts</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="7">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>A</p>
                     </c>
                     <c ca="left">
                        <p>2p</p>
                     </c>
                     <c ca="center">
                        <p>7</p>
                     </c>
                     <c ca="center">
                        <p>6</p>
                     </c>
                     <c ca="left">
                        <p>Subterm</p>
                     </c>
                     <c ca="center">
                        <p>91.74-92.46</p>
                     </c>
                     <c ca="left">
                        <p>Sim to RPL23AP7, FAM41C</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>A</p>
                     </c>
                     <c ca="left">
                        <p>2p</p>
                     </c>
                     <c ca="center">
                        <p>7</p>
                     </c>
                     <c ca="center">
                        <p>12</p>
                     </c>
                     <c ca="left">
                        <p>Subtel</p>
                     </c>
                     <c ca="center">
                        <p>91.24-92.65</p>
                     </c>
                     <c ca="left">
                        <p>Sim to RPL23AP7, FAM41C</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>A</p>
                     </c>
                     <c ca="left">
                        <p>2p</p>
                     </c>
                     <c ca="center">
                        <p>7</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="left">
                        <p>Non_subtel</p>
                     </c>
                     <c ca="center">
                        <p>91.8</p>
                     </c>
                     <c ca="left">
                        <p>Sim to RPL23AP7, FAM41C</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>B</p>
                     </c>
                     <c ca="left">
                        <p>4p</p>
                     </c>
                     <c ca="center">
                        <p>17</p>
                     </c>
                     <c ca="center">
                        <p>10</p>
                     </c>
                     <c ca="left">
                        <p>Subterm</p>
                     </c>
                     <c ca="center">
                        <p>90.67-98.39</p>
                     </c>
                     <c ca="left">
                        <p>Sim to RPL23AP7, FAM41C</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>B</p>
                     </c>
                     <c ca="left">
                        <p>4p</p>
                     </c>
                     <c ca="center">
                        <p>17</p>
                     </c>
                     <c ca="center">
                        <p>16</p>
                     </c>
                     <c ca="left">
                        <p>Subtel</p>
                     </c>
                     <c ca="center">
                        <p>90.57-93.66</p>
                     </c>
                     <c ca="left">
                        <p>Sim to RPL23AP7, FAM41C</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>B</p>
                     </c>
                     <c ca="left">
                        <p>4p</p>
                     </c>
                     <c ca="center">
                        <p>17</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="left">
                        <p>Non_subtel</p>
                     </c>
                     <c ca="center">
                        <p>91.9</p>
                     </c>
                     <c ca="left">
                        <p>Sim to RPL23AP7, FAM41C</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>C</p>
                     </c>
                     <c ca="left">
                        <p>9p</p>
                     </c>
                     <c ca="center">
                        <p>10</p>
                     </c>
                     <c ca="center">
                        <p>6</p>
                     </c>
                     <c ca="left">
                        <p>Subterm</p>
                     </c>
                     <c ca="center">
                        <p>98.29-99.00</p>
                     </c>
                     <c ca="left">
                        <p>Sim to MGC13005, sim to DDX11, CXYorf1-related</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>C</p>
                     </c>
                     <c ca="left">
                        <p>9p</p>
                     </c>
                     <c ca="center">
                        <p>10</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="left">
                        <p>Non_subtel</p>
                     </c>
                     <c ca="center">
                        <p>98.27</p>
                     </c>
                     <c ca="left">
                        <p>Sim to MGC13005, sim to DDX11, CXYorf1-related</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>D</p>
                     </c>
                     <c ca="left">
                        <p>10q</p>
                     </c>
                     <c ca="center">
                        <p>22</p>
                     </c>
                     <c ca="center">
                        <p>10</p>
                     </c>
                     <c ca="left">
                        <p>Subterm</p>
                     </c>
                     <c ca="center">
                        <p>90.7-96.65</p>
                     </c>
                     <c ca="left">
                        <p>Sim to RPL23AP7, FAM41C</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>D</p>
                     </c>
                     <c ca="left">
                        <p>10q</p>
                     </c>
                     <c ca="center">
                        <p>22</p>
                     </c>
                     <c ca="center">
                        <p>15</p>
                     </c>
                     <c ca="left">
                        <p>Subtel</p>
                     </c>
                     <c ca="center">
                        <p>91.68-96.09</p>
                     </c>
                     <c ca="left">
                        <p>Sim to RPL23AP7, FAM41C</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>D</p>
                     </c>
                     <c ca="left">
                        <p>10q</p>
                     </c>
                     <c ca="center">
                        <p>22</p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                     <c ca="left">
                        <p>Non_subtel</p>
                     </c>
                     <c ca="center">
                        <p>93.69-95.80</p>
                     </c>
                     <c ca="left">
                        <p>Sim to RPL23AP7, FAM41C</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>E</p>
                     </c>
                     <c ca="left">
                        <p>17p</p>
                     </c>
                     <c ca="center">
                        <p>21</p>
                     </c>
                     <c ca="center">
                        <p>5</p>
                     </c>
                     <c ca="left">
                        <p>Subterm</p>
                     </c>
                     <c ca="center">
                        <p>95.97-97.16</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>F</p>
                     </c>
                     <c ca="left">
                        <p>18p</p>
                     </c>
                     <c ca="center">
                        <p>15</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="left">
                        <p>Subterm</p>
                     </c>
                     <c ca="center">
                        <p>99.00</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>F</p>
                     </c>
                     <c ca="left">
                        <p>18p</p>
                     </c>
                     <c ca="center">
                        <p>15</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="left">
                        <p>Subtel</p>
                     </c>
                     <c ca="center">
                        <p>93.58</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>F</p>
                     </c>
                     <c ca="left">
                        <p>18p</p>
                     </c>
                     <c ca="center">
                        <p>15</p>
                     </c>
                     <c ca="center">
                        <p>8</p>
                     </c>
                     <c ca="left">
                        <p>Non_subtel</p>
                     </c>
                     <c ca="center">
                        <p>91.19-94.27</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
               </tblbdy>
            </tbl>
            <p>A limited set of non-subtelomeric copies of subterminal duplicons also exists (Table <tblr tid="T2">2</tblr>, Additional data file 49). Their genomic locations suggest sites of ancestral telomere-associated chromosome rearrangements, including a well-documented telomere fusion at 2q13-q14 <abbrgrp><abbr bid="B37">37</abbr></abbrgrp> and ancestral inversion of a chromosome arm followed by duplication of pericentromeric sequences (see legend to Additional data file 49).</p>
            <p>The relationship between subterminal duplicon copies within a family and between several related subterminal families (also detailed in the legend to Additional data file 49) is complex and broadly consistent with an earlier model of subtelomere structure (based upon the first completely sequenced subtelomeres) featuring a subterminal 'compartment' with more active recombinational features than the larger and less abundant centromerically positioned subtelomere duplications <abbrgrp><abbr bid="B9">9</abbr></abbrgrp>. In particular, many of the subterminal intra-family and cross-family homology regions are relatively short, their positions within the subterminal blocks vary, and they are located at different distances from the terminal (TTAGGG)n tract. In addition, there are several alternative organizations of high-copy repetitive elements (masked and not examined in detail in this study) within these subterminal blocks. Further refinement of the classification of these subterminal families appears feasible and will benefit from more extensive sampling of (TTAGGG)n-adjacent sequences from additional alleles.</p>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Discussion</p>
         </st>
         <p>Tracking subtelomere alleles using conventional DNA markers is currently very difficult. All but six of the most distal 30 kb euchromatic subtelomere segments are composed exclusively of segmental duplications, and for a significant number of subtelomeres the duplication regions can be far more extensive (hundreds of kilobases) as well as highly variable in size and duplication content among alleles. Most of this subtelomeric DNA lies outside of the 'Hapmappable' genome; using single nucleotide polymorphisms to follow haplotypes in these regions is virtually impossible using current high-throughput technologies because of subtelomeric duplication content. Our high-resolution analysis of subtelomeric duplication sequence content and organization demonstrates significant differences in the levels of sequence similarity between distinct subtelomere duplicon families as well as large variations in the types and sequence organization of duplicons present at particular subtelomeres. These differences may offer opportunities for distinguishing individual subtelomere alleles in the context of genomic DNA samples, ultimately permitting large-scale studies associating subtelomere haplotypes or haplotype combinations with particular phenotypes.</p>
         <p>Our analysis of subtelomeric duplicon substructure and nucleotide sequence similarity provides a different and more detailed perspective on subtelomere sequence organization than the subtelomere paralogy analysis included as part of the Linardopoulou <it>et al</it>. <abbrgrp><abbr bid="B12">12</abbr></abbrgrp> study. The starting point for our analysis was a comprehensive set of manually curated and physically mapped subtelomere sequence assemblies <abbrgrp><abbr bid="B6">6</abbr></abbrgrp>, and we incorporated all segmental duplications of the subtelomeric sequences (both non-subtelomeric and subtelomeric) into our duplicon definition and analysis strategy; this led to the systematic and comprehensive definition and sequence characterization of duplicons anchored to each subtelomere (Additional data files 6-47). The paralogy map derived from the Linardopoulou <it>et al</it>. <abbrgrp><abbr bid="B12">12</abbr></abbrgrp> analysis does not incorporate non-subtelomeric homology blocks or the newer subtelomeric sequence included in our assemblies. Because of these differences, the paralogy blocks they define overlap with, but do not correspond to, any of the subtel-only blocks or subterminal blocks defined in this study (Additional data file 50). In addition, we determined raw percent nucleotide sequence similarity numbers directly from the pairwise blastn alignments of RepeatMasked sequence, rather than calculating this parameter from alignments of non-RepeatMasked DNA post-processed to exclude gaps and small insertions/deletions from alignment percent identity scoring <abbrgrp><abbr bid="B12">12</abbr></abbrgrp>. This accounts for the generally higher divergence in our duplicon sequence alignments compared with those of Linardopoulou <it>et al</it>. <abbrgrp><abbr bid="B12">12</abbr></abbrgrp>, and helps to focus attention on sequence differences most likely to be useful for allelic and paralog discrimination.</p>
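         <p>The effect of gap handling on percent identity can be illustrated with a toy calculation. The Python fragment below is a sketch under stated assumptions, not the pipeline used in either study: the function name, the <it>count_gaps</it> flag, and the two short aligned sequences are invented for illustration, and the effect of repeat masking is not modeled. It only shows that scoring every alignment column, gaps included, gives a lower value than scoring ungapped columns alone.</p>
         <p><![CDATA[
```python
def percent_identity(query: str, subject: str, count_gaps: bool = True) -> float:
    """Percent identity of two aligned sequences of equal length.

    With count_gaps=True every alignment column is scored, so gap
    columns count against identity. With count_gaps=False, columns
    holding a gap ('-') in either sequence are skipped.
    """
    if len(query) != len(subject):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    columns = 0
    for q, s in zip(query, subject):
        has_gap = q == "-" or s == "-"
        if has_gap and not count_gaps:
            continue  # gap column excluded from scoring
        columns += 1
        if not has_gap and q == s:
            matches += 1
    if columns == 0:
        raise ValueError("no scorable alignment columns")
    return 100.0 * matches / columns


# Invented 10-column alignment: 7 matches, 1 mismatch, 2 gap columns.
aligned_query = "ACGTAC-GTA"
aligned_subject = "ACGTTCAG-A"

print(percent_identity(aligned_query, aligned_subject, count_gaps=True))   # 70.0
print(percent_identity(aligned_query, aligned_subject, count_gaps=False))  # 87.5
```
]]></p>
         <p>In this toy alignment the same seven matching positions yield 70.0% when all ten columns are scored and 87.5% when the two gap columns are dropped.</p>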
         <p>Duplicons and sets of adjacent duplicon blocks that comprise segmentally duplicated subtelomeric DNA were classified into several practically useful and perhaps biologically significant groups. Duplicon blocks that occur only in subtelomeric regions (Table <tblr tid="T1">1</tblr>) can be used to develop sequence-based approaches to the analysis of subtelomere variation and subtelomeric somatic evolution of individual genomes, without interfering background signals from non-subtelomeric sites. Subterminal duplicon blocks of sequence (Table <tblr tid="T2">2</tblr>) were defined that, together with six one-copy subterminal regions, comprise all of the cis-elements adjacent to terminal (TTAGGG)n tracts. These sequences are believed to be involved in telomere-specific and allele-specific (TTAGGG)n tract regulation <abbrgrp><abbr bid="B19">19</abbr></abbrgrp>, and are among the first non-(TTAGGG)n sequences expected to be affected by telomere dysfunction, aberrant telomere replication, and telomere instability. Their delineation and the analysis of their variation are crucial for understanding the role of human subtelomeres in telomere length regulation and telomere biology.</p>
         <p>Subtelomeric duplicons are known to harbor protein-encoding genes and predicted protein-encoding genes as well as pseudogenes and many transcripts of unknown function <abbrgrp><abbr bid="B6">6</abbr><abbr bid="B12">12</abbr><abbr bid="B35">35</abbr></abbrgrp> (H Riethman, unpublished). Known genes embedded in the subtelomere-specific duplicons and in the subterminal duplicons are listed in Tables <tblr tid="T1">1</tblr> and <tblr tid="T2">2</tblr>, respectively; a comprehensive listing of RefSeq matches with these duplicons is given in Additional data files 51 and 52. For several subtelomeric transcript families (IL9R, DUX4, FBXO25) functional evidence for protein expression from at least one transcript locus is available <abbrgrp><abbr bid="B38">38</abbr><abbr bid="B39">39</abbr><abbr bid="B40">40</abbr></abbrgrp>. However, for most transcript families the evidence for encoded protein function relies upon the existence of one or more actively transcribed loci with open reading frames predicted to encode evolutionarily conserved proteins <abbrgrp><abbr bid="B41">41</abbr><abbr bid="B42">42</abbr><abbr bid="B43">43</abbr><abbr bid="B44">44</abbr></abbrgrp>. While these data strongly suggest that one or more members of each of these gene families encode functional protein, in most cases pseudogene copies of the respective gene family co-exist amongst the duplicons and a great deal of work lies ahead in terms of deciphering the functions of individual members of subtelomeric gene families as well as their evolution. In this light, it is important to note that only a single reference sequence has been sampled in this analysis, and given the abundant large-scale variation in these regions, there are certain to be many additional members of most of these gene families yet to be discovered in the human population.</p>
         <p>One of the most intriguing transcript families embedded in the subtelomere repeat region is one predicted to encode odorant receptors <abbrgrp><abbr bid="B35">35</abbr><abbr bid="B41">41</abbr></abbrgrp>, in subtelomere-specific duplicon block 2 (Table <tblr tid="T1">1</tblr>). The highly variable dosage and polymorphic distribution of these genes in humans reflect a recent and evolutionarily rapid expansion of this gene family. Subtelomeric duplicon regions of yeast, Plasmodium, and trypanosomes are each associated with rapid duplication and generation of functional diversity in their embedded genes (discussed in <abbrgrp><abbr bid="B10">10</abbr></abbrgrp>), and it is intriguing to speculate that similar mechanisms are active in human evolution. A very interesting transcript family of unknown function (CXYorf1-related) is embedded in subterminal duplicon block C (Table <tblr tid="T2">2</tblr>); many of these transcripts are predicted to encode variants of an evolutionarily conserved open reading frame with one copy in the mouse genome <abbrgrp><abbr bid="B44">44</abbr></abbrgrp>. This transcript family varies widely in both dosage and telomere distribution in individual genomes, and usually terminates less than 5 kb from the start of the terminal (TTAGGG)n tract; thus, individual telomeric transcription sites for this family might be differentially susceptible to position effects depending on local telomeric chromatin/heterochromatin status and on chromosome-specific telomere lengths.</p>
         <p>From our analysis, it is clear that most subterminal duplicon sequences are more divergent than the large duplicons that exist more centromerically, both in nucleotide sequence similarity and in sequence organization. This divergence might be exploited to develop subterminal allele-specific PCR assays to track some of these sequences genetically in the context of total genomic DNA. For both the highly similar and the more divergent duplicon families, coupling quantitative PCR assays designed to amplify sequences across these regions with new bead-based single molecule characterization and sequencing methods <abbrgrp><abbr bid="B45">45</abbr><abbr bid="B46">46</abbr></abbrgrp> might provide an extremely powerful means for determining both the copy number and a global set of short-range subtelomere haplotypes within an individual genome. Thus, subtelomere variation might be linked with phenotypes at this level. Extending these global short-range sequence haplotypes into longer-range subtelomere allele haplotypes will be more challenging, and may require the isolation, detailed characterization, and perhaps complete sequencing of many additional variant subtelomere alleles.</p>
      </sec>
      <sec>
         <st>
            <p>Conclusion</p>
         </st>
         <p>This comprehensive analysis of the segmental duplication substructure in human subtelomere regions yielded a number of insights with important biological implications. The localization of interstitial subtelomeric (TTAGGG)n-like sequences at duplicon boundaries suggests their involvement in the generation of the complex sequence organization. Their existence at subtelomeres suggests the possibility of internal binding/interaction sites for some (TTAGGG)n-binding protein components found primarily at terminal (TTAGGG)n tracts. Identification of a class of duplicon blocks that are subtelomere-specific will facilitate high-resolution analysis of subtelomere repeat copy number variation as well as studies involving somatic subtelomere rearrangements. Finally, the significant levels of nucleotide sequence divergence within many duplicon families as well as the differential organization of duplicon blocks on subtelomere alleles may provide opportunities for allele-specific subtelomere marker development; this is especially true for subterminal regions, where divergence and organizational differences are the greatest. These subterminal sequence families comprise the immediate cis-elements for (TTAGGG)n tracts, and are prime candidates for subtelomeric sequences regulating telomere-specific (TTAGGG)n tract length in humans. Their delineation and analysis of their variation will be crucial for understanding the role of human subtelomeres in telomere length regulation and telomere biology.</p>
      </sec>
      <sec>
         <st>
            <p>Materials and methods</p>
         </st>
         <sec>
            <st>
               <p>'Hybrid' genome build</p>
            </st>
            <p>Both build 35 subtelomeres and the Riethman <it>et al</it>. <abbrgrp><abbr bid="B6">6</abbr></abbrgrp> subtelomere sequences are based upon the same mapping data <abbrgrp><abbr bid="B6">6</abbr><abbr bid="B36">36</abbr></abbrgrp>, but the manually curated subtelomere assemblies <abbrgrp><abbr bid="B6">6</abbr></abbrgrp> are more complete, containing some subtelomere sequences that are missing from, or misincorporated in, the public builds. A single hybrid reference genome was therefore created and used in the current analysis, so that duplicons could be identified and consistently defined in the context of the highest quality sequence available. The centromeric single-copy regions of our assemblies matched build 35 perfectly, so the 500 kb subtelomeric assemblies <abbrgrp><abbr bid="B6">6</abbr></abbrgrp> (see also Riethman Lab Website <abbrgrp><abbr bid="B47">47</abbr></abbrgrp>) were substituted for build 35 sequence at the appropriate sequence coordinates (given in Additional data file 1). For each of the non-acrocentric chromosome ends, the appropriate p-arm sequence was attached at the p-arm coordinate, and the reverse complement of the q-arm sequence was attached at the indicated q-arm coordinate.</p>
         </sec>
         <sec>
            <st>
               <p>Rules for modules of BLAST hits</p>
            </st>
            <p>Duplicon modules were defined by processing the results of BLAST <abbrgrp><abbr bid="B48">48</abbr></abbrgrp> searches of in-house curated subtelomere sequence, with repeats masked by RepeatMasker <abbrgrp><abbr bid="B49">49</abbr></abbrgrp> and Tandem Repeats Finder <abbrgrp><abbr bid="B50">50</abbr></abbrgrp>, against the hybrid build 35 reference genome described above. BLAST hits (&#8805;90% identity and &#8805;100 bp length) were segregated according to chromosomal location and orientation. Any BLAST hits that were colinear, within 25 kb of each other in both loci, and uninterrupted by other hits from the same group were combined. Our methods were tolerant of large insertions and deletions (for example, of retrotransposons) but not of rearrangements. Groups of combined BLAST hits &#8805;1 kb were defined as duplicons, and smaller groups were discarded. The percent identity of each pairwise alignment was derived directly from the blastn output; no post-processing of alignments to remove small insertions and deletions, as described by Linardopoulou <it>et al</it>. <abbrgrp><abbr bid="B12">12</abbr></abbrgrp>, was done.</p>
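            <p>The chaining rules above can be illustrated with a minimal Python sketch. It is not the pipeline used in this study: the <it>Hit</it> record, its field names, the functions <it>chain_hits</it> and <it>subject_span</it>, and the 1-based inclusive coordinate convention are hypothetical and introduced only for illustration. The sketch also assumes that 'uninterrupted' can be approximated by chaining only hits that are consecutive in query order within one chromosome/orientation group, and it measures the 1 kb minimum on the subject sequence, as stated in Additional data file 2.</p>
            <p>
```python
from dataclasses import dataclass
from itertools import groupby

# Thresholds taken from the text of this section.
MIN_IDENTITY = 90.0    # percent identity of a single BLAST hit
MIN_HIT_LEN = 100      # bp, minimum length of a single BLAST hit
MAX_GAP = 25_000       # bp, maximum separation allowed in either locus
MIN_DUPLICON = 1_000   # bp, minimum span of a chained group


@dataclass(frozen=True)
class Hit:
    """One BLAST hit (hypothetical layout; start is never greater than end)."""
    q_start: int
    q_end: int
    s_chrom: str
    s_start: int
    s_end: int
    strand: int        # +1 for same orientation, -1 for reverse complement
    identity: float
    length: int


def _gaps(prev, nxt):
    """Return (query_gap, subject_gap), or None if the two hits are not colinear."""
    q_gap = nxt.q_start - prev.q_end
    if prev.strand == 1:
        ordered = nxt.s_start > prev.s_start
        s_gap = nxt.s_start - prev.s_end
    else:
        # On the minus strand, subject coordinates fall as query coordinates rise.
        ordered = prev.s_end > nxt.s_end
        s_gap = prev.s_start - nxt.s_end
    if not ordered:
        return None
    return max(q_gap, 0), max(s_gap, 0)


def subject_span(chain):
    """Length of subject sequence covered from the first to the last chained hit."""
    return max(h.s_end for h in chain) - min(h.s_start for h in chain) + 1


def chain_hits(hits):
    """Filter hits, group by chromosome and orientation, chain, and size-filter."""
    kept = [h for h in hits
            if h.identity >= MIN_IDENTITY and h.length >= MIN_HIT_LEN]

    def group_key(h):
        return (h.s_chrom, h.strand)

    chains = []
    for _, group in groupby(sorted(kept, key=group_key), key=group_key):
        current = []
        for hit in sorted(group, key=lambda h: h.q_start):
            gaps = _gaps(current[-1], hit) if current else None
            if gaps is not None and MAX_GAP >= max(gaps):
                current.append(hit)
            else:
                if current:
                    chains.append(current)
                current = [hit]
        if current:
            chains.append(current)
    return [c for c in chains if subject_span(c) >= MIN_DUPLICON]
```
            </p>
            <p>In this sketch, two hits on the same chromosome and strand that lie 4.4 kb apart in both loci would be joined into one candidate duplicon, whereas a third hit more than 25 kb away in the subject would start a new group and be discarded if that group spans less than 1 kb.</p>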
         </sec>
         <sec>
            <st>
               <p>Subtel-only block definition and characterization</p>
            </st>
            <p>The master module list (Additional data file 3) was scanned for regions in which the query sequences shared homology with other subtelomeres but not with any non-subtelomeric regions. A representative was taken from the longest stretch of query associated with each of these regions. This subsequence was passed through the module definition pipeline described above (Additional data file 2) to give sets of duplicons whose boundaries correspond precisely with the delineated subsequence.</p>
         </sec>
         <sec>
            <st>
               <p>Subterminal block definition and characterization</p>
            </st>
            <p>We examined the duplicon structure (Figures <figr fid="F1">1</figr> and <figr fid="F4">4</figr>, Additional data files 6-47) of each of the finished and near-finished subtelomere assemblies (finished to within 5 kb of the terminal (TTAGGG)n) <abbrgrp><abbr bid="B6">6</abbr></abbrgrp> and identified subterminal sequence segments that are flanked at one end by a terminal (TTAGGG)n and at the other by a position within 25 kb of the terminal (TTAGGG)n that corresponds to the boundary of multiple duplicons. These sequence blocks were used as query sequence to define subterminal duplicons that contained sequence aligned to the query subterminal block using the criteria outlined in Additional data file 2. The six subterminal families represent a minimally redundant set of such subterminal blocks.</p>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Additional data files</p>
         </st>
         <p>The following additional data are available with the online version of this paper. Additional data file <supplr sid="S1">1</supplr> provides coordinates of build 35 to which the 500 kb subtelomeric assemblies <abbrgrp><abbr bid="B6">6</abbr></abbrgrp> were added prior to the subtelomeric duplicon analysis. Additional data file <supplr sid="S2">2</supplr> is a definition of subtelomeric duplicons. Additional data file <supplr sid="S3">3</supplr> is a table giving duplicon definition and characterization. Additional data file <supplr sid="S4">4</supplr> is a summary of modules defined by similarity to human subtelomeric DNA. Additional data file <supplr sid="S5">5</supplr> gives the number and size range of duplicons found in non-subtelomeric genome regions and in subtelomeric genome regions. Additional data files <supplr sid="S6">6</supplr>, <supplr sid="S7">7</supplr>, <supplr sid="S8">8</supplr>, <supplr sid="S9">9</supplr>, <supplr sid="S10">10</supplr>, <supplr sid="S11">11</supplr>, <supplr sid="S12">12</supplr>, <supplr sid="S13">13</supplr>, <supplr sid="S14">14</supplr>, <supplr sid="S15">15</supplr>, <supplr sid="S16">16</supplr>, <supplr sid="S17">17</supplr>, <supplr sid="S18">18</supplr>, <supplr sid="S19">19</supplr>, <supplr sid="S20">20</supplr>, <supplr sid="S21">21</supplr>, <supplr sid="S22">22</supplr>, <supplr sid="S23">23</supplr>, <supplr sid="S24">24</supplr>, <supplr sid="S25">25</supplr>, <supplr sid="S26">26</supplr>, <supplr sid="S27">27</supplr>, <supplr sid="S28">28</supplr>, <supplr sid="S29">29</supplr>, <supplr sid="S30">30</supplr>, <supplr sid="S31">31</supplr>, <supplr sid="S32">32</supplr>, <supplr sid="S33">33</supplr>, <supplr sid="S34">34</supplr>, <supplr sid="S35">35</supplr>, <supplr sid="S36">36</supplr>, <supplr sid="S37">37</supplr>, <supplr sid="S38">38</supplr>, <supplr sid="S39">39</supplr>, <supplr sid="S40">40</supplr>, <supplr sid="S41">41</supplr>, <supplr sid="S42">42</supplr>, <supplr sid="S43">43</supplr>, <supplr sid="S44">44</supplr>, <supplr sid="S45">45</supplr>, <supplr sid="S46">46</supplr>, and <supplr sid="S47">47</supplr> show the duplicons defined in the terminal 500 kb of all non-satellited telomeres (1p-Yq); each has a top panel and a bottom panel, with the top panel showing duplicon origin and organization and the bottom panel showing the % nucleotide sequence similarity for each of these duplicons. Additional data file <supplr sid="S48">48</supplr> is a table listing duplicon blocks that are specific for subtelomeric regions of the human genome. Additional data file <supplr sid="S49">49</supplr> is a table listing duplicon blocks that are adjacent to terminal (TTAGGG)n repeats. Additional data file <supplr sid="S50">50</supplr> is a comparison of subtel-only and subterminal duplicon blocks defined in this work with the subtelomeric homology blocks reported in Linardopoulou <it>et al</it>. <abbrgrp><abbr bid="B12">12</abbr></abbrgrp>. Additional data file <supplr sid="S51">51</supplr> is a table listing subtel-only block transcript matches. Additional data file <supplr sid="S52">52</supplr> is a table listing subterminal block transcript matches.</p>
         <suppl id="S1">
            <title>
               <p>Additional data file 1</p>
            </title>
            <caption>
               <p>Coordinates of build 35 to which the 500 kb subtelomeric assemblies <abbrgrp><abbr bid="B6">6</abbr></abbrgrp> were added prior to the subtelomeric duplicon analysis</p>
            </caption>
            <text>
               <p>The p-arm sequence as given was attached at the p-arm coordinate, and the reverse complements of the q-arm sequences were attached at the indicated q-arm coordinates.</p>
            </text>
            <file name="gb-2007-8-7-r151-S1.pdf">
               <p>Click here for file</p>
            </file>
         </suppl>
         <suppl id="S2">
            <title>
               <p>Additional data file 2</p>
            </title>
            <caption>
               <p>Definition of subtelomeric duplicons</p>
            </caption>
            <text>
               <p>Duplicon modules were defined by processing the results of BLAST searches of in-house curated subtelomere query sequences (see text and Materials and methods). Colinear and properly oriented pairs of BLAST matches to the query sequence were joined into a chain if they were separated by no more than 25 kb and not interrupted by other hits from the same query sequence. Groups of chained BLAST hits spanning &#8805;1 kb of the subject sequence were defined as duplicons. These methods were tolerant of insertions and deletions &lt;25 kb in size (for example, of retrotransposons) but not tolerant of rearrangements.</p>
            </text>
            <file name="gb-2007-8-7-r151-S2.pdf">
               <p>Click here for file</p>
            </file>
         </suppl>
         <suppl id="S3">
            <title>
               <p>Additional data file 3</p>
            </title>
            <caption>
               <p>Duplicon definition and characterization</p>
            </caption>
            <text>
               <p>Each module is defined by a set of pairwise alignments, and each reference sequence in these sets is represented as a single row in this table. The first column (module) contains an identifier for the particular copy of the module (duplicon) indicated in the next three columns. These columns (query sequence) list the subtelomeric location of the query sequence defining the module (see Materials and methods). The 'aligned sequences' column shows the locations of other duplicons in this module, matched by the query. The coordinates in this column refer either to our published subtelomeric assemblies (designated by chromosome and arm p or q) or the human genome build 35 (all other designations). The %ID<sub>each</sub> is percent nucleotide sequence identity across the chained pairwise alignment, excluding masked sequence. The %ID<sub>avg</sub> is the average percent identity of all pairwise alignments in the module. This was the number used for %ID in charts and analyses in this paper. The final column shows a 1 if the module contains intrachromosomal non-subtelomeric sequence matches, and 0 if it does not.</p>
            </text>
            <file name="gb-2007-8-7-r151-S3.pdf">
               <p>Click here for file</p>
            </file>
         </suppl>
         <suppl id="S4">
            <title>
               <p>Additional data file 4</p>
            </title>
            <caption>
               <p>Summary of modules defined by similarity to human subtelomeric DNA</p>
            </caption>
            <text>
               <p>This table shows the numbers of duplicon modules defined per subtelomere. The complete list of these modules is included in Additional data file 3. The 'subtelomeric' column shows the total number of modules for each subtelomere region (since each module is defined by a set of subtelomeric coordinates). The 'non-subtelomeric' column lists the subset of these modules with homology to duplicated regions that lie outside the subtelomeres. A comparison of these non-subtelomeric duplicons to the subtelomeric copies is included in Figure <figr fid="F3">3</figr> and in Additional data file 5. The 'intra-chromosomal' column indicates the subset of modules with homology to a different region on the same chromosome.</p>
            </text>
            <file name="gb-2007-8-7-r151-S4.pdf">
               <p>Click here for file</p>
            </file>
         </suppl>
         <suppl id="S5">
            <title>
               <p>Additional data file 5</p>
            </title>
            <caption>
               <p>Number and size range of duplicons found in non-subtelomeric genome regions and in subtelomeric genome regions</p>
            </caption>
            <text>
               <p>Subtelomeric regions correspond to the set of query sequences enumerated in Additional data file 1 and the average percent identity across the sequences to which each is aligned. The non-subtelomeric regions correspond to the aligned sequences that fall outside the subtelomere regions (the subset listed in Additional data file 2).</p>
            </text>
            <file name="gb-2007-8-7-r151-S5.pdf">
               <p>Click here for file</p>
            </file>
         </suppl>
         <suppl id="S6">
            <title>
               <p>Additional data file 6</p>
            </title>
            <caption>
               <p>Duplicons defined in the terminal 500 kb of all non-satellited telomeres: 1p</p>
            </caption>
            <text>
               <p>The subtelomere sequences shown are the assemblies published previously <abbrgrp><abbr bid="B6">6</abbr></abbrgrp> and are available at the Riethman Lab website <abbrgrp><abbr bid="B47">47</abbr></abbrgrp>. The telomeric end of each sequence assembly is located at the left. The distance from the end of the sequence to the start of the terminal repeat array is indicated by the vertical arrow at the telomeric end of the sequence. The position and orientation of (TTAGGG)n tracts are shown as black arrows. Top panels: duplicated genomic segments are identified by chromosome (color) and whether they are subtelomeric (bounded rectangles), non-subtelomeric (unbounded rectangles), or intra-chromosomal (located above the subtelomere coordinates). Each rectangle represents a separate duplicon. Bottom panels: duplicated genomic segments are the same as in the top panels, but identified by nucleotide sequence similarity with the query subtelomere sequence (color scheme as indicated in the key).</p>
            </text>
            <file name="gb-2007-8-7-r151-S6.pdf">
               <p>Click here for file</p>
            </file>
         </suppl>
         <suppl id="S7">
            <title>
               <p>Additional data file 7</p>
            </title>
            <caption>
               <p>Duplicons defined in the terminal 500 kb of all non-satellited telomeres: 1q</p>
            </caption>
            <text>
               <p>The subtelomere sequences shown are the assemblies published previously <abbrgrp><abbr bid="B6">6</abbr></abbrgrp> and are available at the Riethman Lab website <abbrgrp><abbr bid="B47">47</abbr></abbrgrp>. The telomeric end of each sequence assembly is located at the left. The distance from the end of the sequence to the start of the terminal repeat array is indicated by the vertical arrow at the telomeric end of the sequence. The position and orientation of (TTAGGG)n tracts are shown as black arrows. Top panels: duplicated genomic segments are identified by chromosome (color) and whether they are subtelomeric (bounded rectangles), non-subtelomeric (unbounded rectangles), or intra-chromosomal (located above the subtelomere coordinates). Each rectangle represents a separate duplicon. Bottom panels: duplicated genomic segments are the same as in the top panels, but identified by nucleotide sequence similarity with the query subtelomere sequence (color scheme as indicated in the key).</p>
            </text>
            <file name="gb-2007-8-7-r151-S7.pdf">
               <p>Click here for file</p>
            </file>
         </suppl>
         <suppl id="S8">
            <title>
               <p>Additional data file 8</p>
            </title>
            <caption>
               <p>Duplicons defined in the terminal 500 kb of all non-satellited telomeres: 2p</p>
            </caption>
            <text>
               <p>The subtelomere sequences shown are the assemblies published previously <abbrgrp><abbr bid="B6">6</abbr></abbrgrp> and are available at the Riethman Lab website <abbrgrp><abbr bid="B47">47</abbr></abbrgrp>. The telomeric end of each sequence assembly is located at the left. The distance from the end of the sequence to the start of the terminal repeat array is indicated by the vertical arrow at the telomeric end of the sequence. The position and orientation of (TTAGGG)n tracts are shown as black arrows. Top panels: duplicated genomic segments are identified by chromosome (color) and whether they are subtelomeric (bounded rectangles), non-subtelomeric (unbounded rectangles), or intra-chromosomal (located above the subtelomere coordinates). Each rectangle represents a separate duplicon. Bottom panels: duplicated genomic segments are the same as in the top panels, but identified by nucleotide sequence similarity with the query subtelomere sequence (color scheme as indicated in the key).</p>
            </text>
            <file name="gb-2007-8-7-r151-S8.pdf">
               <p>Click here for file</p>
            </file>
         </suppl>
         <suppl id="S9">
            <title>
               <p>Additional data file 9</p>
            </title>
            <caption>
               <p>Duplicons defined in the terminal 500 kb of all non-satellited telomeres: 2q</p>
            </caption>
            <text>
               <p>The subtelomere sequences shown are the assemblies published previously <abbrgrp><abbr bid="B6">6</abbr></abbrgrp> and are available at the Riethman Lab website <abbrgrp><abbr bid="B47">47</abbr></abbrgrp>. The telomeric end of each sequence assembly is located at the left. The distance from the end of the sequence to the start of the terminal repeat array is indicated by the vertical arrow at the telomeric end of the sequence. The position and orientation of (TTAGGG)n tracts are shown as black arrows. Top panels: duplicated genomic segments are identified by chromosome (color) and whether they are subtelomeric (bounded rectangles), non-subtelomeric (unbounded rectangles), or intra-chromosomal (located above the subtelomere coordinates). Each rectangle represents a separate duplicon. Bottom panels: duplicated genomic segments are the same as in the top panels, but identified by nucleotide sequence similarity with the query subtelomere sequence (color scheme as indicated in the key).</p>
            </text>
            <file name="gb-2007-8-7-r151-S9.pdf">
               <p>Click here for file</p>
            </file>
         </suppl>
         <suppl id="S10">
            <title>
               <p>Additional data file 10</p>
            </title>
            <caption>
               <p>Duplicons defined in the terminal 500 kb of all non-satellited telomeres: 3p</p>
            </caption>
            <text>
               <p>The subtelomere sequences shown are the assemblies published previously <abbrgrp><abbr bid="B6">6</abbr></abbrgrp> and are available at the Riethman Lab website <abbrgrp><abbr bid="B47">47</abbr></abbrgrp>. The telomeric end of each sequence assembly is located at the left. The distance from the end of the sequence to the start of the terminal repeat array is indicated by the vertical arrow at the telomeric end of the sequence. The position and orientation of (TTAGGG)n tracts are shown as black arrows. Top panels: duplicated genomic segments are identified by chromosome (color) and whether they are subtelomeric (bounded rectangles), non-subtelomeric (unbounded rectangles), or intra-chromosomal (located above the subtelomere coordinates). Each rectangle represents a separate duplicon. Bottom panels: duplicated genomic segments are the same as in the top panels, but identified by nucleotide sequence similarity with the query subtelomere sequence (color scheme as indicated in the key).</p>
            </text>
            <file name="gb-2007-8-7-r151-S10.pdf">
               <p>Click here for file</p>
            </file>
         </suppl>
         <suppl id="S11">
            <title>
               <p>Additional data file 11</p>
            </title>
            <caption>
               <p>Duplicons defined in the terminal 500 kb of all non-satellited telomeres: 3q</p>
            </caption>
            <text>
               <p>The subtelomere sequences shown are the assemblies published previously <abbrgrp><abbr bid="B6">6</abbr></abbrgrp> and are available at the Riethman Lab website <abbrgrp><abbr bid="B47">47</abbr></abbrgrp>. The telomeric end of each sequence assembly is located at the left. The distance from the end of the sequence to the start of the terminal repeat array is indicated by the vertical arrow at the telomeric end of the sequence. The position and orientation of (TTAGGG)n tracts are shown as black arrows. Top panels: duplicated genomic segments are identified by chromosome (color) and whether they are subtelomeric (bounded rectangles), non-subtelomeric (unbounded rectangles), or intra-chromosomal (located above the subtelomere coordinates). Each rectangle represents a separate duplicon. Bottom panels: duplicated genomic segments are the same as in the top panels, but identified by nucleotide sequence similarity with the query subtelomere sequence (color scheme as indicated in the key).</p>
            </text>
            <file name="gb-2007-8-7-r151-S11.pdf">
               <p>Click here for file</p>
            </file>
         </suppl>
         <suppl id="S12">
            <title>
               <p>Additional data file 12</p>
            </title>
            <caption>
               <p>Duplicons defined in the terminal 500 kb of all non-satellited telomeres: 4p</p>
            </caption>
            <text>
               <p>The subtelomere sequences shown are the assemblies published previously <abbrgrp><abbr bid="B6">6</abbr></abbrgrp> and are available at the Riethman Lab website <abbrgrp><abbr bid="B47">47</abbr></abbrgrp>. The telomeric end of each sequence assembly is located at the left. The distance from the end of the sequence to the start of the terminal repeat array is indicated by the vertical arrow at the telomeric end of the sequence. The position and orientation of (TTAGGG)n tracts are shown as black arrows. Top panels: duplicated genomic segments are identified by chromosome (color) and whether they are subtelomeric (bounded rectangles), non-subtelomeric (unbounded rectangles), or intra-chromosomal (located above the subtelomere coordinates). Each rectangle represents a separate duplicon. Bottom panels: duplicated genomic segments are the same as in the top panels, but identified by nucleotide sequence similarity with the query subtelomere sequence (color scheme as indicated in the key).</p>
            </text>
            <file name="gb-2007-8-7-r151-S12.pdf">
               <p>Click here for file</p>
            </file>
         </suppl>
         <suppl id="S13">
            <title>
               <p>Additional data file 13</p>
            </title>
            <caption>
               <p>Duplicons defined in the terminal 500 kb of all non-satellited telomeres: 4q</p>
            </caption>
            <text>
               <p>The subtelomere sequences shown are the assemblies published previously <abbrgrp><abbr bid="B6">6</abbr></abbrgrp> and are available at the Riethman Lab website <abbrgrp><abbr bid="B47">47</abbr></abbrgrp>. The telomeric end of each sequence assembly is located at the left. The distance from the end of the sequence to the start of the terminal repeat array is indicated by the vertical arrow at the telomeric end of the sequence. The position and orientation of (TTAGGG)n tracts are shown as black arrows. Top panels: duplicated genomic segments are identified by chromosome (color) and whether they are subtelomeric (bounded rectangles), non-subtelomeric (unbounded rectangles), or intra-chromosomal (located above the subtelomere coordinates). Each rectangle represents a separate duplicon. Bottom panels: duplicated genomic segments are the same as in the top panels, but identified by nucleotide sequence similarity with the query subtelomere sequence (color scheme as indicated in the key).</p>
            </text>
            <file name="gb-2007-8-7-r151-S13.pdf">
               <p>Click here for file</p>
            </file>
         </suppl>
         <suppl id="S14">
            <title>
               <p>Additional data file 14</p>
            </title>
            <caption>
               <p>Duplicons defined in the terminal 500 kb of all non-satellited telomeres: 5p</p>
            </caption>
            <text>
               <p>The subtelomere sequences shown are the assemblies published previously <abbrgrp><abbr bid="B6">6</abbr></abbrgrp> and are available at the Riethman Lab website <abbrgrp><abbr bid="B47">47</abbr></abbrgrp>. The telomeric end of each sequence assembly is located at the left. The distance from the end of the sequence to the start of the terminal repeat array is indicated by the vertical arrow at the telomeric end of the sequence. The position and orientation of (TTAGGG)n tracts are shown as black arrows. Top panels: duplicated genomic segments are identified by chromosome (color) and whether they are subtelomeric (bounded rectangles), non-subtelomeric (unbounded rectangles), or intra-chromosomal (located above the subtelomere coordinates). Each rectangle represents a separate duplicon. Bottom panels: duplicated genomic segments are the same as in the top panels, but identified by nucleotide sequence similarity with the query subtelomere sequence (color scheme as indicated in the key).</p>
            </text>
            <file name="gb-2007-8-7-r151-S14.pdf">
               <p>Click here for file</p>
            </file>
         </suppl>
         <suppl id="S15">
            <title>
               <p>Additional data file 15</p>
            </title>
            <caption>
               <p>Duplicons defined in the terminal 500 kb of all non-satellited telomeres: 5q</p>
            </caption>
            <text>
               <p>The subtelomere sequences shown are the assemblies published previously <abbrgrp><abbr bid="B6">6</abbr></abbrgrp> and are available at the Riethman Lab website <abbrgrp><abbr bid="B47">47</abbr></abbrgrp>. The telomeric end of each sequence assembly is located at the left. The distance from the end of the sequence to the start of the terminal repeat array is indicated by the vertical arrow at the telomeric end of the sequence. The position and orientation of (TTAGGG)n tracts are shown as black arrows. Top panels: duplicated genomic segments are identified by chromosome (color) and whether they are subtelomeric (bounded rectangles), non-subtelomeric (unbounded rectangles), or intra-chromosomal (located above the subtelomere coordinates). Each rectangle represents a separate duplicon. Bottom panels: duplicated genomic segments are the same as in the top panels, but identified by nucleotide sequence similarity with the query subtelomere sequence (color scheme as indicated in the key).</p>
            </text>
            <file name="gb-2007-8-7-r151-S15.pdf">
               <p>Click here for file</p>
            </file>
         </suppl>
         <suppl id="S16">
            <title>
               <p>Additional data file 16</p>
            </title>
            <caption>
               <p>Duplicons defined in the terminal 500 kb of all non-satellited telomeres: 6p</p>
            </caption>
            <text>
               <p>The subtelomere sequences shown are the assemblies published previously <abbrgrp><abbr bid="B6">6</abbr></abbrgrp> and are available at the Riethman Lab website <abbrgrp><abbr bid="B47">47</abbr></abbrgrp>. The telomeric end of each sequence assembly is located at the left. The distance from the end of the sequence to the start of the terminal repeat array is indicated by the vertical arrow at the telomeric end of the sequence. The position and orientation of (TTAGGG)n tracts are shown as black arrows. Top panels: duplicated genomic segments are identified by chromosome (color) and whether they are subtelomeric (bounded rectangles), non-subtelomeric (unbounded rectangles), or intra-chromosomal (located above the subtelomere coordinates). Each rectangle represents a separate duplicon. Bottom panels: duplicated genomic segments are the same as in the top panels, but identified by nucleotide sequence similarity with the query subtelomere sequence (color scheme as indicated in the key).</p>
            </text>
            <file name="gb-2007-8-7-r151-S16.pdf">
               <p>Click here for file</p>
            </file>
         </suppl>
         <suppl id="S17">
            <title>
               <p>Additional data file 17</p>
            </title>
            <caption>
               <p>Duplicons defined in the terminal 500 kb of all non-satellited telomeres: 6q</p>
            </caption>
            <text>
               <p>The subtelomere sequences shown are the assemblies published previously <abbrgrp><abbr bid="B6">6</abbr></abbrgrp> and are available at the Riethman Lab website <abbrgrp><abbr bid="B47">47</abbr></abbrgrp>. The telomeric end of each sequence assembly is located at the left. The distance from the end of the sequence to the start of the terminal repeat array is indicated by the vertical arrow at the telomeric end of the sequence. The position and orientation of (TTAGGG)n tracts are shown as black arrows. Top panels: duplicated genomic segments are identified by chromosome (color) and whether they are subtelomeric (bounded rectangles), non-subtelomeric (unbounded rectangles), or intra-chromosomal (located above the subtelomere coordinates). Each rectangle represents a separate duplicon. Bottom panels: duplicated genomic segments are the same as in the top panels, but identified by nucleotide sequence similarity with the query subtelomere sequence (color scheme as indicated in the key).</p>
            </text>
            <file name="gb-2007-8-7-r151-S17.pdf">
               <p>Click here for file</p>
            </file>
         </suppl>
         <suppl id="S18">
            <title>
               <p>Additional data file 18</p>
            </title>
            <caption>
               <p>Duplicons defined in the terminal 500 kb of all non-satellited telomeres: 7p</p>
            </caption>
            <text>
               <p>The subtelomere sequences shown are the assemblies published previously <abbrgrp><abbr bid="B6">6</abbr></abbrgrp> and are available at the Riethman Lab website <abbrgrp><abbr bid="B47">47</abbr></abbrgrp>. The telomeric end of each sequence assembly is located at the left. The distance from the end of the sequence to the start of the terminal repeat array is indicated by the vertical arrow at the telomeric end of the sequence. The position and orientation of (TTAGGG)n tracts are shown as black arrows. Top panels: duplicated genomic segments are identified by chromosome (color) and whether they are subtelomeric (bounded rectangles), non-subtelomeric (unbounded rectangles), or intra-chromosomal (located above the subtelomere coordinates). Each rectangle represents a separate duplicon. Bottom panels: duplicated genomic segments are the same as in the top panels, but identified by nucleotide sequence similarity with the query subtelomere sequence (color scheme as indicated in the key).</p>
            </text>
            <file name="gb-2007-8-7-r151-S18.pdf">
               <p>Click here for file</p>
            </file>
         </suppl>
         <suppl id="S19">
            <title>
               <p>Additional data file 19</p>
            </title>
            <caption>
               <p>Duplicons defined in the terminal 500 kb of all non-satellited telomeres: 7q</p>
            </caption>
            <text>
               <p>The subtelomere sequences shown are the assemblies published previously <abbrgrp><abbr bid="B6">6</abbr></abbrgrp> and are available at the Riethman Lab website <abbrgrp><abbr bid="B47">47</abbr></abbrgrp>. The telomeric end of each sequence assembly is located at the left. The distance from the end of the sequence to the start of the terminal repeat array is indicated by the vertical arrow at the telomeric end of the sequence. The position and orientation of (TTAGGG)n tracts are shown as black arrows. Top panels: duplicated genomic segments are identified by chromosome (color) and whether they are subtelomeric (bounded rectangles), non-subtelomeric (unbounded rectangles), or intra-chromosomal (located above the subtelomere coordinates). Each rectangle represents a separate duplicon. Bottom panels: duplicated genomic segments are the same as in the top panels, but identified by nucleotide sequence similarity with the query subtelomere sequence (color scheme as indicated in the key).</p>
            </text>
            <file name="gb-2007-8-7-r151-S19.pdf">
               <p>Click here for file</p>
            </file>
         </suppl>
         <suppl id="S20">
            <title>
               <p>Additional data file 20</p>
            </title>
            <caption>
               <p>Duplicons defined in the terminal 500 kb of all non-satellited telomeres: 8p</p>
            </caption>
            <text>
               <p>The subtelomere sequences shown are the assemblies published previously <abbrgrp><abbr bid="B6">6</abbr></abbrgrp> and are available at the Riethman Lab website <abbrgrp><abbr bid="B47">47</abbr></abbrgrp>. The telomeric end of each sequence assembly is located at the left. The distance from the end of the sequence to the start of the terminal repeat array is indicated by the vertical arrow at the telomeric end of the sequence. The position and orientation of (TTAGGG)n tracts are shown as black arrows. Top panels: duplicated genomic segments are identified by chromosome (color) and whether they are subtelomeric (bounded rectangles), non-subtelomeric (unbounded rectangles), or intra-chromosomal (located above the subtelomere coordinates). Each rectangle represents a separate duplicon. Bottom panels: duplicated genomic segments are the same as in the top panels, but identified by nucleotide sequence similarity with the query subtelomere sequence (color scheme as indicated in the key).</p>
            </text>
            <file name="gb-2007-8-7-r151-S20.pdf">
               <p>Click here for file</p>
            </file>
         </suppl>
         <suppl id="S21">
            <title>
               <p>Additional data file 21</p>
            </title>
            <caption>
               <p>Duplicons defined in the terminal 500 kb of all non-satellited telomeres: 8q</p>
            </caption>
            <text>
               <p>The subtelomere sequences shown are the assemblies published previously <abbrgrp><abbr bid="B6">6</abbr></abbrgrp> and are available at the Riethman Lab website <abbrgrp><abbr bid="B47">47</abbr></abbrgrp>. The telomeric end of each sequence assembly is located at the left. The distance from the end of the sequence to the start of the terminal repeat array is indicated by the vertical arrow at the telomeric end of the sequence. The position and orientation of (TTAGGG)n tracts are shown as black arrows. Top panels: duplicated genomic segments are identified by chromosome (color) and whether they are subtelomeric (bounded rectangles), non-subtelomeric (unbounded rectangles), or intra-chromosomal (located above the subtelomere coordinates). Each rectangle represents a separate duplicon. Bottom panels: duplicated genomic segments are the same as in the top panels, but identified by nucleotide sequence similarity with the query subtelomere sequence (color scheme as indicated in the key).</p>
            </text>
            <file name="gb-2007-8-7-r151-S21.pdf">
               <p>Click here for file</p>
            </file>
         </suppl>
         <suppl id="S22">
            <title>
               <p>Additional data file 22</p>
            </title>
            <caption>
               <p>Duplicons defined in the terminal 500 kb of all non-satellited telomeres: 9p</p>
            </caption>
            <text>
               <p>The subtelomere sequences shown are the assemblies published previously <abbrgrp><abbr bid="B6">6</abbr></abbrgrp> and are available at the Riethman Lab website <abbrgrp><abbr bid="B47">47</abbr></abbrgrp>. The telomeric end of each sequence assembly is located at the left. The distance from the end of the sequence to the start of the terminal repeat array is indicated by the vertical arrow at the telomeric end of the sequence. The position and orientation of (TTAGGG)n tracts are shown as black arrows. Top panels: duplicated genomic segments are identified by chromosome (color) and whether they are subtelomeric (bounded rectangles), non-subtelomeric (unbounded rectangles), or intra-chromosomal (located above the subtelomere coordinates). Each rectangle represents a separate duplicon. Bottom panels: duplicated genomic segments are the same as in the top panels, but identified by nucleotide sequence similarity with the query subtelomere sequence (color scheme as indicated in the key).</p>
            </text>
            <file name="gb-2007-8-7-r151-S22.pdf">
               <p>Click here for file</p>
            </file>
         </suppl>
         <suppl id="S23">
            <title>
               <p>Additional data file 23</p>
            </title>
            <caption>
               <p>Duplicons defined in the terminal 500 kb of all non-satellited telomeres: 9q</p>
            </caption>
            <text>
               <p>The subtelomere sequences shown are the assemblies published previously <abbrgrp><abbr bid="B6">6</abbr></abbrgrp> and are available at the Riethman Lab website <abbrgrp><abbr bid="B47">47</abbr></abbrgrp>. The telomeric end of each sequence assembly is located at the left. The distance from the end of the sequence to the start of the terminal repeat array is indicated by the vertical arrow at the telomeric end of the sequence. The position and orientation of (TTAGGG)n tracts are shown as black arrows. Top panels: duplicated genomic segments are identified by chromosome (color) and whether they are subtelomeric (bounded rectangles), non-subtelomeric (unbounded rectangles), or intra-chromosomal (located above the subtelomere coordinates). Each rectangle represents a separate duplicon. Bottom panels: duplicated genomic segments are the same as in the top panels, but identified by nucleotide sequence similarity with the query subtelomere sequence (color scheme as indicated in the key).</p>
            </text>
            <file name="gb-2007-8-7-r151-S23.pdf">
               <p>Click here for file</p>
            </file>
         </suppl>
         <suppl id="S24">
            <title>
               <p>Additional data file 24</p>
            </title>
            <caption>
               <p>Duplicons defined in the terminal 500 kb of all non-satellited telomeres: 10p</p>
            </caption>
            <text>
               <p>The subtelomere sequences shown are the assemblies published previously <abbrgrp><abbr bid="B6">6</abbr></abbrgrp> and are available at the Riethman Lab website <abbrgrp><abbr bid="B47">47</abbr></abbrgrp>. The telomeric end of each sequence assembly is located at the left. The distance from the end of the sequence to the start of the terminal repeat array is indicated by the vertical arrow at the telomeric end of the sequence. The position and orientation of (TTAGGG)n tracts are shown as black arrows. Top panels: duplicated genomic segments are identified by chromosome (color) and whether they are subtelomeric (bounded rectangles), non-subtelomeric (unbounded rectangles), or intra-chromosomal (located above the subtelomere coordinates). Each rectangle represents a separate duplicon. Bottom panels: duplicated genomic segments are the same as in the top panels, but identified by nucleotide sequence similarity with the query subtelomere sequence (color scheme as indicated in the key).</p>
            </text>
            <file name="gb-2007-8-7-r151-S24.pdf">
               <p>Click here for file</p>
            </file>
         </suppl>
         <suppl id="S25">
            <title>
               <p>Additional data file 25</p>
            </title>
            <caption>
               <p>Duplicons defined in the terminal 500 kb of all non-satellited telomeres: 10q</p>
            </caption>
            <text>
               <p>The subtelomere sequences shown are the assemblies published previously <abbrgrp><abbr bid="B6">6</abbr></abbrgrp> and are available at the Riethman Lab website <abbrgrp><abbr bid="B47">47</abbr></abbrgrp>. The telomeric end of each sequence assembly is located at the left. The distance from the end of the sequence to the start of the terminal repeat array is indicated by the vertical arrow at the telomeric end of the sequence. The position and orientation of (TTAGGG)n tracts are shown as black arrows. Top panels: duplicated genomic segments are identified by chromosome (color) and whether they are subtelomeric (bounded rectangles), non-subtelomeric (unbounded rectangles), or intra-chromosomal (located above the subtelomere coordinates). Each rectangle represents a separate duplicon. Bottom panels: duplicated genomic segments are the same as in the top panels, but identified by nucleotide sequence similarity with the query subtelomere sequence (color scheme as indicated in the key).</p>
            </text>
            <file name="gb-2007-8-7-r151-S25.pdf">
               <p>Click here for file</p>
            </file>
         </suppl>
         <suppl id="S26">
            <title>
               <p>Additional data file 26</p>
            </title>
            <caption>
               <p>Duplicons defined in the terminal 500 kb of all non-satellited telomeres: 11p</p>
            </caption>
            <text>
               <p>The subtelomere sequences shown are the assemblies published previously <abbrgrp><abbr bid="B6">6</abbr></abbrgrp> and are available at the Riethman Lab website <abbrgrp><abbr bid="B47">47</abbr></abbrgrp>. The telomeric end of each sequence assembly is located at the left. The distance from the end of the sequence to the start of the terminal repeat array is indicated by the vertical arrow at the telomeric end of the sequence. The position and orientation of (TTAGGG)n tracts are shown as black arrows. Top panels: duplicated genomic segments are identified by chromosome (color) and whether they are subtelomeric (bounded rectangles), non-subtelomeric (unbounded rectangles), or intra-chromosomal (located above the subtelomere coordinates). Each rectangle represents a separate duplicon. Bottom panels: duplicated genomic segments are the same as in the top panels, but identified by nucleotide sequence similarity with the query subtelomere sequence (color scheme as indicated in the key).</p>
            </text>
            <file name="gb-2007-8-7-r151-S26.pdf">
               <p>Click here for file</p>
            </file>
         </suppl>
         <suppl id="S27">
            <title>
               <p>Additional data file 27</p>
            </title>
            <caption>
               <p>Duplicons defined in the terminal 500 kb of all non-satellited telomeres: 11q</p>
            </caption>
            <text>
               <p>The subtelomere sequences shown are the assemblies published previously <abbrgrp><abbr bid="B6">6</abbr></abbrgrp> and are available at the Riethman Lab website <abbrgrp><abbr bid="B47">47</abbr></abbrgrp>. The telomeric end of each sequence assembly is located at the left. The distance from the end of the sequence to the start of the terminal repeat array is indicated by the vertical arrow at the telomeric end of the sequence. The position and orientation of (TTAGGG)n tracts are shown as black arrows. Top panels: duplicated genomic segments are identified by chromosome (color) and whether they are subtelomeric (bounded rectangles), non-subtelomeric (unbounded rectangles), or intra-chromosomal (located above the subtelomere coordinates). Each rectangle represents a separate duplicon. Bottom panels: duplicated genomic segments are the same as in the top panels, but identified by nucleotide sequence similarity with the query subtelomere sequence (color scheme as indicated in the key).</p>
            </text>
            <file name="gb-2007-8-7-r151-S27.pdf">
               <p>Click here for file</p>
            </file>
         </suppl>
         <suppl id="S28">
            <title>
               <p>Additional data file 28</p>
            </title>
            <caption>
               <p>Duplicons defined in the terminal 500 kb of all non-satellited telomeres: 12p</p>
            </caption>
            <text>
               <p>The subtelomere sequences shown are the assemblies published previously <abbrgrp><abbr bid="B6">6</abbr></abbrgrp> and are available at the Riethman Lab website <abbrgrp><abbr bid="B47">47</abbr></abbrgrp>. The telomeric end of each sequence assembly is located at the left. The distance from the end of the sequence to the start of the terminal repeat array is indicated by the vertical arrow at the telomeric end of the sequence. The position and orientation of (TTAGGG)n tracts are shown as black arrows. Top panels: duplicated genomic segments are identified by chromosome (color) and whether they are subtelomeric (bounded rectangles), non-subtelomeric (unbounded rectangles), or intra-chromosomal (located above the subtelomere coordinates). Each rectangle represents a separate duplicon. Bottom panels: duplicated genomic segments are the same as in the top panels, but identified by nucleotide sequence similarity with the query subtelomere sequence (color scheme as indicated in the key).</p>
            </text>
            <file name="gb-2007-8-7-r151-S28.pdf">
               <p>Click here for file</p>
            </file>
         </suppl>
         <suppl id="S29">
            <title>
               <p>Additional data file 29</p>
            </title>
            <caption>
               <p>Duplicons defined in the terminal 500 kb of all non-satellited telomeres: 12q</p>
            </caption>
            <text>
               <p>The subtelomere sequences shown are the assemblies published previously <abbrgrp><abbr bid="B6">6</abbr></abbrgrp> and are available at the Riethman Lab website <abbrgrp><abbr bid="B47">47</abbr></abbrgrp>. The telomeric end of each sequence assembly is located at the left. The distance from the end of the sequence to the start of the terminal repeat array is indicated by the vertical arrow at the telomeric end of the sequence. The position and orientation of (TTAGGG)n tracts are shown as black arrows. Top panels: duplicated genomic segments are identified by chromosome (color) and whether they are subtelomeric (bounded rectangles), non-subtelomeric (unbounded rectangles), or intra-chromosomal (located above the subtelomere coordinates). Each rectangle represents a separate duplicon. Bottom panels: duplicated genomic segments are the same as in the top panels, but identified by nucleotide sequence similarity with the query subtelomere sequence (color scheme as indicated in the key).</p>
            </text>
            <file name="gb-2007-8-7-r151-S29.pdf">
               <p>Click here for file</p>
            </file>
         </suppl>
         <suppl id="S30">
            <title>
               <p>Additional data file 30</p>
            </title>
            <caption>
               <p>Duplicons defined in the terminal 500 kb of all non-satellited telomeres: 13q</p>
            </caption>
            <text>
               <p>The subtelomere sequences shown are the assemblies published previously <abbrgrp><abbr bid="B6">6</abbr></abbrgrp> and are available at the Riethman Lab website <abbrgrp><abbr bid="B47">47</abbr></abbrgrp>. The telomeric end of each sequence assembly is located at the left. The distance from the end of the sequence to the start of the terminal repeat array is indicated by the vertical arrow at the telomeric end of the sequence. The position and orientation of (TTAGGG)n tracts are shown as black arrows. Top panels: duplicated genomic segments are identified by chromosome (color) and whether they are subtelomeric (bounded rectangles), non-subtelomeric (unbounded rectangles), or intra-chromosomal (located above the subtelomere coordinates). Each rectangle represents a separate duplicon. Bottom panels: duplicated genomic segments are the same as in the top panels, but identified by nucleotide sequence similarity with the query subtelomere sequence (color scheme as indicated in the key).</p>
            </text>
            <file name="gb-2007-8-7-r151-S30.pdf">
               <p>Click here for file</p>
            </file>
         </suppl>
         <suppl id="S31">
            <title>
               <p>Additional data file 31</p>
            </title>
            <caption>
               <p>Duplicons defined in the terminal 500 kb of all non-satellited telomeres: 14q</p>
            </caption>
            <text>
               <p>The subtelomere sequences shown are the assemblies published previously <abbrgrp><abbr bid="B6">6</abbr></abbrgrp> and are available at the Riethman Lab website <abbrgrp><abbr bid="B47">47</abbr></abbrgrp>. The telomeric end of each sequence assembly is located at the left. The distance from the end of the sequence to the start of the terminal repeat array is indicated by the vertical arrow at the telomeric end of the sequence. The position and orientation of (TTAGGG)n tracts are shown as black arrows. Top panels: duplicated genomic segments are identified by chromosome (color) and whether they are subtelomeric (bounded rectangles), non-subtelomeric (unbounded rectangles), or intra-chromosomal (located above the subtelomere coordinates). Each rectangle represents a separate duplicon. Bottom panels: duplicated genomic segments are the same as in the top panels, but identified by nucleotide sequence similarity with the query subtelomere sequence (color scheme as indicated in the key).</p>
            </text>
            <file name="gb-2007-8-7-r151-S31.pdf">
               <p>Click here for file</p>
            </file>
         </suppl>
         <suppl id="S32">
            <title>
               <p>Additional data file 32</p>
            </title>
            <caption>
               <p>Duplicons defined in the terminal 500 kb of all non-satellited telomeres: 15q</p>
            </caption>
            <text>
               <p>The subtelomere sequences shown are the assemblies published previously <abbrgrp><abbr bid="B6">6</abbr></abbrgrp> and are available at the Riethman Lab website <abbrgrp><abbr bid="B47">47</abbr></abbrgrp>. The telomeric end of each sequence assembly is located at the left. The distance from the end of the sequence to the start of the terminal repeat array is indicated by the vertical arrow at the telomeric end of the sequence. The position and orientation of (TTAGGG)n tracts are shown as black arrows. Top panels: duplicated genomic segments are identified by chromosome (color) and whether they are subtelomeric (bounded rectangles), non-subtelomeric (unbounded rectangles), or intra-chromosomal (located above the subtelomere coordinates). Each rectangle represents a separate duplicon. Bottom panels: duplicated genomic segments are the same as in the top panels, but identified by nucleotide sequence similarity with the query subtelomere sequence (color scheme as indicated in the key).</p>
            </text>
            <file name="gb-2007-8-7-r151-S32.pdf">
               <p>Click here for file</p>
            </file>
         </suppl>
         <suppl id="S33">
            <title>
               <p>Additional data file 33</p>
            </title>
            <caption>
               <p>Duplicons defined in the terminal 500 kb of all non-satellited telomeres: 16p</p>
            </caption>
            <text>
               <p>The subtelomere sequences shown are the assemblies published previously <abbrgrp><abbr bid="B6">6</abbr></abbrgrp> and are available at the Riethman Lab website <abbrgrp><abbr bid="B47">47</abbr></abbrgrp>. The telomeric end of each sequence assembly is located at the left. The distance from the end of the sequence to the start of the terminal repeat array is indicated by the vertical arrow at the telomeric end of the sequence. The position and orientation of (TTAGGG)n tracts are shown as black arrows. Top panels: duplicated genomic segments are identified by chromosome (color) and whether they are subtelomeric (bounded rectangles), non-subtelomeric (unbounded rectangles), or intra-chromosomal (located above the subtelomere coordinates). Each rectangle represents a separate duplicon. Bottom panels: duplicated genomic segments are the same as in the top panels, but identified by nucleotide sequence similarity with the query subtelomere sequence (color scheme as indicated in the key).</p>
            </text>
            <file name="gb-2007-8-7-r151-S33.pdf">
               <p>Click here for file</p>
            </file>
         </suppl>
         <suppl id="S34">
            <title>
               <p>Additional data file 34</p>
            </title>
            <caption>
               <p>Duplicons defined in the terminal 500 kb of all non-satellited telomeres: 16q</p>
            </caption>
            <text>
               <p>The subtelomere sequences shown are the assemblies published previously <abbrgrp><abbr bid="B6">6</abbr></abbrgrp> and are available at the Riethman Lab website <abbrgrp><abbr bid="B47">47</abbr></abbrgrp>. The telomeric end of each sequence assembly is located at the left. The distance from the end of the sequence to the start of the terminal repeat array is indicated by the vertical arrow at the telomeric end of the sequence. The position and orientation of (TTAGGG)n tracts are shown as black arrows. Top panels: duplicated genomic segments are identified by chromosome (color) and whether they are subtelomeric (bounded rectangles), non-subtelomeric (unbounded rectangles), or intra-chromosomal (located above the subtelomere coordinates). Each rectangle represents a separate duplicon. Bottom panels: duplicated genomic segments are the same as in the top panels, but identified by nucleotide sequence similarity with the query subtelomere sequence (color scheme as indicated in the key).</p>
            </text>
            <file name="gb-2007-8-7-r151-S34.pdf">
               <p>Click here for file</p>
            </file>
         </suppl>
         <suppl id="S35">
            <title>
               <p>Additional data file 35</p>
            </title>
            <caption>
               <p>Duplicons defined in the terminal 500 kb of all non-satellited telomeres: 17p</p>
            </caption>
            <text>
               <p>The subtelomere sequences shown are the assemblies published previously <abbrgrp><abbr bid="B6">6</abbr></abbrgrp> and are available at the Riethman Lab website <abbrgrp><abbr bid="B47">47</abbr></abbrgrp>. The telomeric end of each sequence assembly is located at the left. The distance from the end of the sequence to the start of the terminal repeat array is indicated by the vertical arrow at the telomeric end of the sequence. The position and orientation of (TTAGGG)n tracts are shown as black arrows. Top panels: duplicated genomic segments are identified by chromosome (color) and whether they are subtelomeric (bounded rectangles), non-subtelomeric (unbounded rectangles), or intra-chromosomal (located above the subtelomere coordinates). Each rectangle represents a separate duplicon. Bottom panels: duplicated genomic segments are the same as in the top panels, but identified by nucleotide sequence similarity with the query subtelomere sequence (color scheme as indicated in the key).</p>
            </text>
            <file name="gb-2007-8-7-r151-S35.pdf">
               <p>Click here for file</p>
            </file>
         </suppl>
         <suppl id="S36">
            <title>
               <p>Additional data file 36</p>
            </title>
            <caption>
               <p>Duplicons defined in the terminal 500 kb of all non-satellited telomeres: 17q</p>
            </caption>
            <text>
               <p>The subtelomere sequences shown are the assemblies published previously <abbrgrp><abbr bid="B6">6</abbr></abbrgrp> and are available at the Riethman Lab website <abbrgrp><abbr bid="B47">47</abbr></abbrgrp>. The telomeric end of each sequence assembly is located at the left. The distance from the end of the sequence to the start of the terminal repeat array is indicated by the vertical arrow at the telomeric end of the sequence. The position and orientation of (TTAGGG)n tracts are shown as black arrows. Top panels: duplicated genomic segments are identified by chromosome (color) and whether they are subtelomeric (bounded rectangles), non-subtelomeric (unbounded rectangles), or intra-chromosomal (located above the subtelomere coordinates). Each rectangle represents a separate duplicon. Bottom panels: duplicated genomic segments are the same as in the top panels, but identified by nucleotide sequence similarity with the query subtelomere sequence (color scheme as indicated in the key).</p>
            </text>
            <file name="gb-2007-8-7-r151-S36.pdf">
               <p>Click here for file</p>
            </file>
         </suppl>
         <suppl id="S37">
            <title>
               <p>Additional data file 37</p>
            </title>
            <caption>
               <p>Duplicons defined in the terminal 500 kb of all non-satellited telomeres: 18p</p>
            </caption>
            <text>
               <p>The subtelomere sequences shown are the assemblies published previously <abbrgrp><abbr bid="B6">6</abbr></abbrgrp> and are available at the Riethman Lab website <abbrgrp><abbr bid="B47">47</abbr></abbrgrp>. The telomeric end of each sequence assembly is located at the left. The distance from the end of the sequence to the start of the terminal repeat array is indicated by the vertical arrow at the telomeric end of the sequence. The position and orientation of (TTAGGG)n tracts are shown as black arrows. Top panels: duplicated genomic segments are identified by chromosome (color) and whether they are subtelomeric (bounded rectangles), non-subtelomeric (unbounded rectangles), or intra-chromosomal (located above the subtelomere coordinates). Each rectangle represents a separate duplicon. Bottom panels: duplicated genomic segments are the same as in the top panels, but identified by nucleotide sequence similarity with the query subtelomere sequence (color scheme as indicated in the key).</p>
            </text>
            <file name="gb-2007-8-7-r151-S37.pdf">
               <p>Click here for file</p>
            </file>
         </suppl>
         <suppl id="S38">
            <title>
               <p>Additional data file 38</p>
            </title>
            <caption>
               <p>Duplicons defined in the terminal 500 kb of all non-satellited telomeres: 18q</p>
            </caption>
            <text>
               <p>The subtelomere sequences shown are the assemblies published previously <abbrgrp><abbr bid="B6">6</abbr></abbrgrp> and are available at the Riethman Lab website <abbrgrp><abbr bid="B47">47</abbr></abbrgrp>. The telomeric end of each sequence assembly is located at the left. The distance from the end of the sequence to the start of the terminal repeat array is indicated by the vertical arrow at the telomeric end of the sequence. The position and orientation of (TTAGGG)n tracts are shown as black arrows. Top panels: duplicated genomic segments are identified by chromosome (color) and whether they are subtelomeric (bounded rectangles), non-subtelomeric (unbounded rectangles), or intra-chromosomal (located above the subtelomere coordinates). Each rectangle represents a separate duplicon. Bottom panels: duplicated genomic segments are the same as in the top panels, but identified by nucleotide sequence similarity with the query subtelomere sequence (color scheme as indicated in the key).</p>
            </text>
            <file name="gb-2007-8-7-r151-S38.pdf">
               <p>Click here for file</p>
            </file>
         </suppl>
         <suppl id="S39">
            <title>
               <p>Additional data file 39</p>
            </title>
            <caption>
               <p>Duplicons defined in the terminal 500 kb of all non-satellited telomeres: 19p</p>
            </caption>
            <text>
               <p>The subtelomere sequences shown are the assemblies published previously <abbrgrp><abbr bid="B6">6</abbr></abbrgrp> and are available at the Riethman Lab website <abbrgrp><abbr bid="B47">47</abbr></abbrgrp>. The telomeric end of each sequence assembly is located at the left. The distance from the end of the sequence to the start of the terminal repeat array is indicated by the vertical arrow at the telomeric end of the sequence. The position and orientation of (TTAGGG)n tracts are shown as black arrows. Top panels: duplicated genomic segments are identified by chromosome (color) and whether they are subtelomeric (bounded rectangles), non-subtelomeric (unbounded rectangles), or intra-chromosomal (located above the subtelomere coordinates). Each rectangle represents a separate duplicon. Bottom panels: duplicated genomic segments are the same as in the top panels, but identified by nucleotide sequence similarity with the query subtelomere sequence (color scheme as indicated in the key).</p>
            </text>
            <file name="gb-2007-8-7-r151-S39.pdf">
               <p>Click here for file</p>
            </file>
         </suppl>
         <suppl id="S40">
            <title>
               <p>Additional data file 40</p>
            </title>
            <caption>
               <p>Duplicons defined in the terminal 500 kb of all non-satellited telomeres: 19q</p>
            </caption>
            <text>
               <p>The subtelomere sequences shown are the assemblies published previously <abbrgrp><abbr bid="B6">6</abbr></abbrgrp> and are available at the Riethman Lab website <abbrgrp><abbr bid="B47">47</abbr></abbrgrp>. The telomeric end of each sequence assembly is located at the left. The distance from the end of the sequence to the start of the terminal repeat array is indicated by the vertical arrow at the telomeric end of the sequence. The position and orientation of (TTAGGG)n tracts are shown as black arrows. Top panels: duplicated genomic segments are identified by chromosome (color) and whether they are subtelomeric (bounded rectangles), non-subtelomeric (unbounded rectangles), or intra-chromosomal (located above the subtelomere coordinates). Each rectangle represents a separate duplicon. Bottom panels: duplicated genomic segments are the same as in the top panels, but identified by nucleotide sequence similarity with the query subtelomere sequence (color scheme as indicated in the key).</p>
            </text>
            <file name="gb-2007-8-7-r151-S40.pdf">
               <p>Click here for file</p>
            </file>
         </suppl>
         <suppl id="S41">
            <title>
               <p>Additional data file 41</p>
            </title>
            <caption>
               <p>Duplicons defined in the terminal 500 kb of all non-satellited telomeres: 20p</p>
            </caption>
            <text>
               <p>The subtelomere sequences shown are the assemblies published previously <abbrgrp><abbr bid="B6">6</abbr></abbrgrp> and are available at the Riethman Lab website <abbrgrp><abbr bid="B47">47</abbr></abbrgrp>. The telomeric end of each sequence assembly is located at the left. The distance from the end of the sequence to the start of the terminal repeat array is indicated by the vertical arrow at the telomeric end of the sequence. The position and orientation of (TTAGGG)n tracts are shown as black arrows. Top panels: duplicated genomic segments are identified by chromosome (color) and whether they are subtelomeric (bounded rectangles), non-subtelomeric (unbounded rectangles), or intra-chromosomal (located above the subtelomere coordinates). Each rectangle represents a separate duplicon. Bottom panels: duplicated genomic segments are the same as in the top panels, but identified by nucleotide sequence similarity with the query subtelomere sequence (color scheme as indicated in the key).</p>
            </text>
            <file name="gb-2007-8-7-r151-S41.pdf">
               <p>Click here for file</p>
            </file>
         </suppl>
         <suppl id="S42">
            <title>
               <p>Additional data file 42</p>
            </title>
            <caption>
               <p>Duplicons defined in the terminal 500 kb of all non-satellited telomeres: 20q</p>
            </caption>
            <text>
               <p>The subtelomere sequences shown are the assemblies published previously <abbrgrp><abbr bid="B6">6</abbr></abbrgrp> and are available at the Riethman Lab website <abbrgrp><abbr bid="B47">47</abbr></abbrgrp>. The telomeric end of each sequence assembly is located at the left. The distance from the end of the sequence to the start of the terminal repeat array is indicated by the vertical arrow at the telomeric end of the sequence. The position and orientation of (TTAGGG)n tracts are shown as black arrows. Top panels: duplicated genomic segments are identified by chromosome (color) and whether they are subtelomeric (bounded rectangles), non-subtelomeric (unbounded rectangles), or intra-chromosomal (located above the subtelomere coordinates). Each rectangle represents a separate duplicon. Bottom panels: duplicated genomic segments are the same as in the top panels, but identified by nucleotide sequence similarity with the query subtelomere sequence (color scheme as indicated in the key).</p>
            </text>
            <file name="gb-2007-8-7-r151-S42.pdf">
               <p>Click here for file</p>
            </file>
         </suppl>
         <suppl id="S43">
            <title>
               <p>Additional data file 43</p>
            </title>
            <caption>
               <p>Duplicons defined in the terminal 500 kb of all non-satellited telomeres: 21q</p>
            </caption>
            <text>
               <p>The subtelomere sequences shown are the assemblies published previously <abbrgrp><abbr bid="B6">6</abbr></abbrgrp> and are available at the Riethman Lab website <abbrgrp><abbr bid="B47">47</abbr></abbrgrp>. The telomeric end of each sequence assembly is located at the left. The distance from the end of the sequence to the start of the terminal repeat array is indicated by the vertical arrow at the telomeric end of the sequence. The position and orientation of (TTAGGG)n tracts are shown as black arrows. Top panels: duplicated genomic segments are identified by chromosome (color) and whether they are subtelomeric (bounded rectangles), non-subtelomeric (unbounded rectangles), or intra-chromosomal (located above the subtelomere coordinates). Each rectangle represents a separate duplicon. Bottom panels: duplicated genomic segments are the same as in the top panels, but identified by nucleotide sequence similarity with the query subtelomere sequence (color scheme as indicated in the key).</p>
            </text>
            <file name="gb-2007-8-7-r151-S43.pdf">
               <p>Click here for file</p>
            </file>
         </suppl>
         <suppl id="S44">
            <title>
               <p>Additional data file 44</p>
            </title>
            <caption>
               <p>Duplicons defined in the terminal 500 kb of all non-satellited telomeres: 22q</p>
            </caption>
            <text>
               <p>The subtelomere sequences shown are the assemblies published previously <abbrgrp><abbr bid="B6">6</abbr></abbrgrp> and are available at the Riethman Lab website <abbrgrp><abbr bid="B47">47</abbr></abbrgrp>. The telomeric end of each sequence assembly is located at the left. The distance from the end of the sequence to the start of the terminal repeat array is indicated by the vertical arrow at the telomeric end of the sequence. The position and orientation of (TTAGGG)n tracts are shown as black arrows. Top panels: duplicated genomic segments are identified by chromosome (color) and whether they are subtelomeric (bounded rectangles), non-subtelomeric (unbounded rectangles), or intra-chromosomal (located above the subtelomere coordinates). Each rectangle represents a separate duplicon. Bottom panels: duplicated genomic segments are the same as in the top panels, but identified by nucleotide sequence similarity with the query subtelomere sequence (color scheme as indicated in the key).</p>
            </text>
            <file name="gb-2007-8-7-r151-S44.pdf">
               <p>Click here for file</p>
            </file>
         </suppl>
         <suppl id="S45">
            <title>
               <p>Additional data file 45</p>
            </title>
            <caption>
               <p>Duplicons defined in the terminal 500 kb of all non-satellited telomeres: Xp, Yp</p>
            </caption>
            <text>
               <p>The subtelomere sequences shown are the assemblies published previously <abbrgrp><abbr bid="B6">6</abbr></abbrgrp> and are available at the Riethman Lab website <abbrgrp><abbr bid="B47">47</abbr></abbrgrp>. The telomeric end of each sequence assembly is located at the left. The distance from the end of the sequence to the start of the terminal repeat array is indicated by the vertical arrow at the telomeric end of the sequence. The position and orientation of (TTAGGG)n tracts are shown as black arrows. Top panels: duplicated genomic segments are identified by chromosome (color) and whether they are subtelomeric (bounded rectangles), non-subtelomeric (unbounded rectangles), or intra-chromosomal (located above the subtelomere coordinates). Each rectangle represents a separate duplicon. Bottom panels: duplicated genomic segments are the same as in the top panels, but identified by nucleotide sequence similarity with the query subtelomere sequence (color scheme as indicated in the key).</p>
            </text>
            <file name="gb-2007-8-7-r151-S45.pdf">
               <p>Click here for file</p>
            </file>
         </suppl>
         <suppl id="S46">
            <title>
               <p>Additional data file 46</p>
            </title>
            <caption>
               <p>Duplicons defined in the terminal 500 kb of all non-satellited telomeres: Xq</p>
            </caption>
            <text>
               <p>The subtelomere sequences shown are the assemblies published previously <abbrgrp><abbr bid="B6">6</abbr></abbrgrp> and are available at the Riethman Lab website <abbrgrp><abbr bid="B47">47</abbr></abbrgrp>. The telomeric end of each sequence assembly is located at the left. The distance from the end of the sequence to the start of the terminal repeat array is indicated by the vertical arrow at the telomeric end of the sequence. The position and orientation of (TTAGGG)n tracts are shown as black arrows. Top panels: duplicated genomic segments are identified by chromosome (color) and whether they are subtelomeric (bounded rectangles), non-subtelomeric (unbounded rectangles), or intra-chromosomal (located above the subtelomere coordinates). Each rectangle represents a separate duplicon. Bottom panels: duplicated genomic segments are the same as in the top panels, but identified by nucleotide sequence similarity with the query subtelomere sequence (color scheme as indicated in the key).</p>
            </text>
            <file name="gb-2007-8-7-r151-S46.pdf">
               <p>Click here for file</p>
            </file>
         </suppl>
         <suppl id="S47">
            <title>
               <p>Additional data file 47</p>
            </title>
            <caption>
               <p>Duplicons defined in the terminal 500 kb of all non-satellited telomeres: Yq</p>
            </caption>
            <text>
               <p>The subtelomere sequences shown are the assemblies published previously <abbrgrp><abbr bid="B6">6</abbr></abbrgrp> and are available at the Riethman Lab website <abbrgrp><abbr bid="B47">47</abbr></abbrgrp>. The telomeric end of each sequence assembly is located at the left. The distance from the end of the sequence to the start of the terminal repeat array is indicated by the vertical arrow at the telomeric end of the sequence. The position and orientation of (TTAGGG)n tracts are shown as black arrows. Top panels: duplicated genomic segments are identified by chromosome (color) and whether they are subtelomeric (bounded rectangles), non-subtelomeric (unbounded rectangles), or intra-chromosomal (located above the subtelomere coordinates). Each rectangle represents a separate duplicon. Bottom panels: duplicated genomic segments are the same as in the top panels, but identified by nucleotide sequence similarity with the query subtelomere sequence (color scheme as indicated in the key).</p>
            </text>
            <file name="gb-2007-8-7-r151-S47.pdf">
               <p>Click here for file</p>
            </file>
         </suppl>
         <suppl id="S48">
            <title>
               <p>Additional data file 48</p>
            </title>
            <caption>
               <p>Duplicon blocks that are specific for subtelomeric regions of the human genome</p>
            </caption>
            <text>
               <p>This table shows blocks of modules that occur exclusively in subtelomere regions. The first column gives an identifier for each block. The next three columns (query sequence) give the subtelomeric location that defines the block (which will consist of one or more adjacent modules). For completeness, in some cases aligned sequences have been included in these blocks even though they fell below thresholds for module definition. The percent identity of the chained alignments between the sequences is indicated (excluding masked sequence). Named genes/gene families that have transcripts matching part or all of the respective duplicon blocks are listed in the last column. Block 7 is the D4Z4 tandem repeat on the 4q and 10q subtelomeres, for which no percent identity is calculated because of the very large number and diverse percent identities of the BLAST alignments among tandem D4Z4 repeats.</p>
            </text>
            <file name="gb-2007-8-7-r151-S48.pdf">
               <p>Click here for file</p>
            </file>
         </suppl>
         <suppl id="S49">
            <title>
               <p>Additional data file 49</p>
            </title>
            <caption>
               <p>Duplicon blocks that are adjacent to terminal (TTAGGG)n repeats</p>
            </caption>
            <text>
               <p>This table shows blocks of modules that are adjacent to the ends of finished telomeres (see Materials and methods). The columns describe the same categories of information as indicated in Additional data file 48. A limited set of non-subtelomeric copies of subterminal duplicons exists (Additional data file 49). Their genomic locations suggest sites of ancestral telomere-associated chromosome rearrangements, including a well-documented telomere fusion at 2q13-q14 <abbrgrp><abbr bid="B37">37</abbr></abbrgrp> that contains representatives of subterminal duplicon families A, B, C, and D (Additional data file 49). The non-subtelomeric site of a duplicon from family D at 3p12.3 is the tip of an extended duplication region; the DNA on the centromeric flank of this site contains 4q and 10q subtelomere homology, including beta satellite repeat structure resembling part of the D4Z4 repeat. Subterminal family F contains several non-subtelomeric sites of duplicons; those on chromosomes 22q, 14q, and 12p are very close to the respective centromeres (Additional data file 49), indicating potential ancestral inversion of a chromosome arm followed by duplication of pericentromeric sequences as a mechanism for the genesis of the non-subterminal copies of this subterminal sequence family. The sequence similarity between subterminal duplicon copies within a family is mainly in the 90-96% range for subterminal blocks A, B, and D (Table <tblr tid="T2">2</tblr>; see Additional data file 49 for the rare exceptions). As with the subtel-only blocks, some of these duplicons correspond to only part of the subterminal block sequence. There is also some overlap in sequences occupied by subterminal duplicon blocks A, B, and D; this is reflected in their occupancy of parts of the same transcript families RPL23A7 and FAM41C (Table <tblr tid="T2">2</tblr>). 
The cross-family homologies between subterminal blocks A, B, and D are also in the 90-96% identity range, but the duplicons occupy varying positions within the blocks and lie at different distances from the (TTAGGG)n tract; also, there are several alternative organizations of high-copy repetitive elements (masked and not examined in detail in this study) within these subterminal blocks. Thus, there might be more frequent shuffling of subterminal sequences than of sequences located more centromerically, at least within a subset of subtelomere alleles; this idea is broadly consistent with an earlier model of subtelomere structure featuring compartments with distinct functional properties <abbrgrp><abbr bid="B9">9</abbr></abbrgrp>. Further refinement of the classification of these subterminal families appears feasible and will benefit from more extensive sampling of (TTAGGG)n-adjacent sequences from additional alleles. Subterminal block F contains one duplicon on 10p with very high similarity to the 18p query sequence, suggesting a very recent duplication event; the remaining duplicons are all in the 91-94% identity range. Block C has the highest sequence similarity among all subterminal duplicon sequence families, and has a copy at the 2q fusion locus. Block E (96-97%) is unusual in that it corresponds to a portion of subtelomere-only duplicon family 6 (Table <tblr tid="T1">1</tblr>), and is the only subterminal duplicon sequence family with subtel-only properties. This particular sequenced allele of 17p might have formed by the truncation of a chromosome end within this large subtelomere-only duplicon, as there is mapping evidence for several longer alleles of the 17p telomere (H Riethman, unpublished). Notably, (TTAGGG)n tracts at 17p and, indeed, on this particular allele of 17p, tend to be consistently among the shortest in the human genome <abbrgrp><abbr bid="B19">19</abbr><abbr bid="B51">51</abbr></abbrgrp>.</p>
            </text>
            <file name="gb-2007-8-7-r151-S49.pdf">
               <p>Click here for file</p>
            </file>
         </suppl>
         <suppl id="S50">
            <title>
               <p>Additional data file 50</p>
            </title>
            <caption>
               <p>Comparison of subtel-only and subterminal duplicon blocks defined in this work with the subtelomeric homology blocks reported in Linardopoulou <it>et al</it>. <abbrgrp><abbr bid="B12">12</abbr></abbrgrp></p>
            </caption>
            <text>
               <p>Comparison of subtel-only and subterminal duplicon blocks defined in this work with the subtelomeric homology blocks reported in Linardopoulou <it>et al</it>. <abbrgrp><abbr bid="B12">12</abbr></abbrgrp></p>
            </text>
            <file name="gb-2007-8-7-r151-S50.pdf">
               <p>Click here for file</p>
            </file>
         </suppl>
         <suppl id="S51">
            <title>
               <p>Additional data file 51</p>
            </title>
            <caption>
               <p>Subtel-only blocks transcript matches</p>
            </caption>
            <text>
               <p>Candidate transcripts were identified by BLAST searches of the representative subtelomere-only query sequences (Additional data file 48) against the NCBI RefSeq mRNA database (downloaded 24 July 2006) <abbrgrp><abbr bid="B52">52</abbr></abbrgrp>. Human mRNAs with 90% or greater homology were run through Spidey <abbrgrp><abbr bid="B53">53</abbr></abbrgrp> against the set of subtelomere-only duplicon block representatives. This table has been filtered to those hits above 95% identity according to the Spidey predictions. The first and second columns indicate the subtelomere-only block and the RefSeq accession that align to each other. The third column is the description line from the RefSeq database. The fourth and fifth columns are the percent identity and percent coverage of the aligned mRNA as reported by Spidey.</p>
            </text>
            <file name="gb-2007-8-7-r151-S51.pdf">
               <p>Click here for file</p>
            </file>
         </suppl>
         <suppl id="S52">
            <title>
               <p>Additional data file 52</p>
            </title>
            <caption>
               <p>Subterminal blocks transcript matches</p>
            </caption>
            <text>
               <p>Candidate transcripts were identified by BLAST searches of the representative subterminal query sequences (Additional data file 49) against the NCBI RefSeq mRNA database (downloaded 24 July 2006) <abbrgrp><abbr bid="B52">52</abbr></abbrgrp>. Human mRNAs with 90% or greater homology were run through Spidey <abbrgrp><abbr bid="B53">53</abbr></abbrgrp> against the set of subterminal duplicon block representatives. The first and second columns indicate the subterminal block and the RefSeq accession that align to each other. The third column is the description line from the RefSeq database. The fourth and fifth columns are the percent identity and percent coverage of the aligned mRNA as reported by Spidey.</p>
            </text>
            <file name="gb-2007-8-7-r151-S52.pdf">
               <p>Click here for file</p>
            </file>
         </suppl>
      </sec>
   </bdy>
   <bm>
      <ack>
         <sec>
            <st>
               <p>Acknowledgements</p>
            </st>
            <p>John Rux and the Wistar Bioinformatics Facility provided programming and computational support. Financial support was provided by NIH HG00567 and CA 25874, and by the Commonwealth Universal Research Enhancement Program, PA Dept of Health.</p>
         </sec>
      </ack>
      <refgrp>
         <bibl id="B1">
            <title>
               <p>Finishing the euchromatic sequence of the human genome.</p>
            </title>
            <source>Nature</source>
            <pubdate>2004</pubdate>
            <volume>431</volume>
            <fpage>931</fpage>
            <lpage>945</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/nature03001</pubid>
                  <pubid idtype="pmpid" link="fulltext">15496913</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B2">
            <title>
               <p>Genome architecture, rearrangements and genomic disorders.</p>
            </title>
            <aug>
               <au>
                  <snm>Stankiewicz</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Lupski</snm>
                  <fnm>JR</fnm>
               </au>
            </aug>
            <source>Trends Genet</source>
            <pubdate>2002</pubdate>
            <volume>18</volume>
            <fpage>74</fpage>
            <lpage>82</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0168-9525(02)02592-1</pubid>
                  <pubid idtype="pmpid" link="fulltext">11818139</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B3">
            <title>
               <p>Large-scale copy number polymorphism in the human genome.</p>
            </title>
            <aug>
               <au>
                  <snm>Sebat</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Lakshmi</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Troge</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Alexander</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Young</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Lundin</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Maner</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Massa</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Walker</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Chi</snm>
                  <fnm>M</fnm>
               </au>
               <etal/>
            </aug>
            <source>Science</source>
            <pubdate>2004</pubdate>
            <volume>305</volume>
            <fpage>525</fpage>
            <lpage>528</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.1098918</pubid>
                  <pubid idtype="pmpid" link="fulltext">15273396</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B4">
            <title>
               <p>Detection of large-scale variation in the human genome.</p>
            </title>
            <aug>
               <au>
                  <snm>Iafrate</snm>
                  <fnm>AJ</fnm>
               </au>
               <au>
                  <snm>Feuk</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Rivera</snm>
                  <fnm>MN</fnm>
               </au>
               <au>
                  <snm>Listewnik</snm>
                  <fnm>ML</fnm>
               </au>
               <au>
                  <snm>Donahoe</snm>
                  <fnm>PK</fnm>
               </au>
               <au>
                  <snm>Qi</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Scherer</snm>
                  <fnm>SW</fnm>
               </au>
               <au>
                  <snm>Lee</snm>
                  <fnm>C</fnm>
               </au>
            </aug>
            <source>Nat Genet</source>
            <pubdate>2004</pubdate>
            <volume>36</volume>
            <fpage>949</fpage>
            <lpage>951</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/ng1416</pubid>
                  <pubid idtype="pmpid" link="fulltext">15286789</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B5">
            <title>
               <p>Dynamics of mammalian chromosome evolution inferred from multispecies comparative maps.</p>
            </title>
            <aug>
               <au>
                  <snm>Murphy</snm>
                  <fnm>WJ</fnm>
               </au>
               <au>
                  <snm>Larkin</snm>
                  <fnm>DM</fnm>
               </au>
               <au>
                  <snm>Everts-van der Wind</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Bourque</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Tesler</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Auvil</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Beever</snm>
                  <fnm>JE</fnm>
               </au>
               <au>
                  <snm>Chowdhary</snm>
                  <fnm>BP</fnm>
               </au>
               <au>
                  <snm>Galibert</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Gatzke</snm>
                  <fnm>L</fnm>
               </au>
               <etal/>
            </aug>
            <source>Science</source>
            <pubdate>2005</pubdate>
            <volume>309</volume>
            <fpage>613</fpage>
            <lpage>617</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.1111387</pubid>
                  <pubid idtype="pmpid" link="fulltext">16040707</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B6">
            <title>
               <p>Mapping and initial analysis of human subtelomeric sequence assemblies.</p>
            </title>
            <aug>
               <au>
                  <snm>Riethman</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Ambrosini</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Castaneda</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Finklestein</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Hu</snm>
                  <fnm>XL</fnm>
               </au>
               <au>
                  <snm>Mudunuri</snm>
                  <fnm>U</fnm>
               </au>
               <au>
                  <snm>Paul</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Wei</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2004</pubdate>
            <volume>14</volume>
            <fpage>18</fpage>
            <lpage>28</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">314271</pubid>
                  <pubid idtype="pmpid" link="fulltext">14707167</pubid>
                  <pubid idtype="doi">10.1101/gr.1245004</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B7">
            <title>
               <p>Structure and polymorphism of human telomere-associated DNA.</p>
            </title>
            <aug>
               <au>
                  <snm>Brown</snm>
                  <fnm>WR</fnm>
               </au>
               <au>
                  <snm>MacKinnon</snm>
                  <fnm>PJ</fnm>
               </au>
               <au>
                  <snm>Villasante</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Spurr</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Buckle</snm>
                  <fnm>VJ</fnm>
               </au>
               <au>
                  <snm>Dobson</snm>
                  <fnm>MJ</fnm>
               </au>
            </aug>
            <source>Cell</source>
            <pubdate>1990</pubdate>
            <volume>63</volume>
            <fpage>119</fpage>
            <lpage>132</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/0092-8674(90)90293-N</pubid>
                  <pubid idtype="pmpid" link="fulltext">2208276</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B8">
            <title>
               <p>Isolation of telomere junction fragments by anchored polymerase chain reaction.</p>
            </title>
            <aug>
               <au>
                  <snm>Royle</snm>
                  <fnm>NJ</fnm>
               </au>
               <au>
                  <snm>Hill</snm>
                  <fnm>MC</fnm>
               </au>
               <au>
                  <snm>Jeffreys</snm>
                  <fnm>AJ</fnm>
               </au>
            </aug>
            <source>Proc Biol Sci</source>
            <pubdate>1992</pubdate>
            <volume>247</volume>
            <fpage>57</fpage>
            <lpage>67</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1098/rspb.1992.0009</pubid>
                  <pubid idtype="pmpid" link="fulltext">1348122</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B9">
            <title>
               <p>Sequence comparison of human and yeast telomeres identifies structurally distinct subtelomeric domains.</p>
            </title>
            <aug>
               <au>
                  <snm>Flint</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Bates</snm>
                  <fnm>GP</fnm>
               </au>
               <au>
                  <snm>Clark</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Dorman</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Willingham</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Roe</snm>
                  <fnm>BA</fnm>
               </au>
               <au>
                  <snm>Micklem</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Higgs</snm>
                  <fnm>DR</fnm>
               </au>
               <au>
                  <snm>Louis</snm>
                  <fnm>EJ</fnm>
               </au>
            </aug>
            <source>Hum Mol Genet</source>
            <pubdate>1997</pubdate>
            <volume>6</volume>
            <fpage>1305</fpage>
            <lpage>1313</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/hmg/6.8.1305</pubid>
                  <pubid idtype="pmpid" link="fulltext">9259277</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B10">
            <title>
               <p>The complex structure and dynamic evolution of human subtelomeres.</p>
            </title>
            <aug>
               <au>
                  <snm>Mefford</snm>
                  <fnm>HC</fnm>
               </au>
               <au>
                  <snm>Trask</snm>
                  <fnm>BJ</fnm>
               </au>
            </aug>
            <source>Nat Rev Genet</source>
            <pubdate>2002</pubdate>
            <volume>3</volume>
            <fpage>91</fpage>
            <lpage>102</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/nrg727</pubid>
                  <pubid idtype="pmpid" link="fulltext">11836503</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B11">
            <title>
               <p>Segmental polymorphisms in the proterminal regions of a subset of human chromosomes.</p>
            </title>
            <aug>
               <au>
                  <snm>Der-Sarkissian</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Vergnaud</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Borde</snm>
                  <fnm>YM</fnm>
               </au>
               <au>
                  <snm>Thomas</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Londono-Vallejo</snm>
                  <fnm>JA</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2002</pubdate>
            <volume>12</volume>
            <fpage>1673</fpage>
            <lpage>1678</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">187550</pubid>
                  <pubid idtype="pmpid" link="fulltext">12421753</pubid>
                  <pubid idtype="doi">10.1101/gr.322802</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B12">
            <title>
               <p>Human subtelomeres are hot spots of interchromosomal recombination and segmental duplication.</p>
            </title>
            <aug>
               <au>
                  <snm>Linardopoulou</snm>
                  <fnm>EV</fnm>
               </au>
               <au>
                  <snm>Williams</snm>
                  <fnm>EM</fnm>
               </au>
               <au>
                  <snm>Fan</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Friedman</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Young</snm>
                  <fnm>JM</fnm>
               </au>
               <au>
                  <snm>Trask</snm>
                  <fnm>BJ</fnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>2005</pubdate>
            <volume>437</volume>
            <fpage>94</fpage>
            <lpage>100</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1368961</pubid>
                  <pubid idtype="pmpid" link="fulltext">16136133</pubid>
                  <pubid idtype="doi">10.1038/nature04029</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B13">
            <title>
               <p>Human subtelomeric DNA.</p>
            </title>
            <aug>
               <au>
                  <snm>Riethman</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Ambrosini</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Castaneda</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Finklestein</snm>
                  <fnm>JM</fnm>
               </au>
               <au>
                  <snm>Hu</snm>
                  <fnm>XL</fnm>
               </au>
               <au>
                  <snm>Paul</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Wei</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>Cold Spring Harb Symp Quant Biol</source>
            <pubdate>2003</pubdate>
            <volume>68</volume>
            <fpage>39</fpage>
            <lpage>47</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1101/sqb.2003.68.39</pubid>
                  <pubid idtype="pmpid">15338601</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B14">
            <title>
               <p>Heterogeneity in telomere length of human chromosomes.</p>
            </title>
            <aug>
               <au>
                  <snm>Lansdorp</snm>
                  <fnm>PM</fnm>
               </au>
               <au>
                  <snm>Verwoerd</snm>
                  <fnm>NP</fnm>
               </au>
               <au>
                  <snm>van de Rijke</snm>
                  <fnm>FM</fnm>
               </au>
               <au>
                  <snm>Dragowska</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Little</snm>
                  <fnm>MT</fnm>
               </au>
               <au>
                  <snm>Dirks</snm>
                  <fnm>RW</fnm>
               </au>
               <au>
                  <snm>Raap</snm>
                  <fnm>AK</fnm>
               </au>
               <au>
                  <snm>Tanke</snm>
                  <fnm>HJ</fnm>
               </au>
            </aug>
            <source>Hum Mol Genet</source>
            <pubdate>1996</pubdate>
            <volume>5</volume>
            <fpage>685</fpage>
            <lpage>691</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/hmg/5.5.685</pubid>
                  <pubid idtype="pmpid" link="fulltext">8733138</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B15">
            <title>
               <p>Telomeres in the mouse have large inter-chromosomal variations in the number of T2AG3 repeats.</p>
            </title>
            <aug>
               <au>
                  <snm>Zijlmans</snm>
                  <fnm>JM</fnm>
               </au>
               <au>
                  <snm>Martens</snm>
                  <fnm>UM</fnm>
               </au>
               <au>
                  <snm>Poon</snm>
                  <fnm>SS</fnm>
               </au>
               <au>
                  <snm>Raap</snm>
                  <fnm>AK</fnm>
               </au>
               <au>
                  <snm>Tanke</snm>
                  <fnm>HJ</fnm>
               </au>
               <au>
                  <snm>Ward</snm>
                  <fnm>RK</fnm>
               </au>
               <au>
                  <snm>Lansdorp</snm>
                  <fnm>PM</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>1997</pubdate>
            <volume>94</volume>
            <fpage>7423</fpage>
            <lpage>7428</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">23837</pubid>
                  <pubid idtype="pmpid" link="fulltext">9207107</pubid>
                  <pubid idtype="doi">10.1073/pnas.94.14.7423</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B16">
            <title>
               <p>The relative lengths of individual telomeres are defined in the zygote and strictly maintained during life.</p>
            </title>
            <aug>
               <au>
                  <snm>Graakjaer</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Pascoe</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Der-Sarkissian</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Thomas</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Kolvraa</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Christensen</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Londono-Vallejo</snm>
                  <fnm>JA</fnm>
               </au>
            </aug>
            <source>Aging Cell</source>
            <pubdate>2004</pubdate>
            <volume>3</volume>
            <fpage>97</fpage>
            <lpage>102</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1111/j.1474-9728.2004.00093.x</pubid>
                  <pubid idtype="pmpid" link="fulltext">15153177</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B17">
            <title>
               <p>Extensive allelic variation and ultrashort telomeres in senescent human cells.</p>
            </title>
            <aug>
               <au>
                  <snm>Baird</snm>
                  <fnm>DM</fnm>
               </au>
               <au>
                  <snm>Rowson</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Wynford-Thomas</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Kipling</snm>
                  <fnm>D</fnm>
               </au>
            </aug>
            <source>Nat Genet</source>
            <pubdate>2003</pubdate>
            <volume>33</volume>
            <fpage>203</fpage>
            <lpage>207</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/ng1084</pubid>
                  <pubid idtype="pmpid" link="fulltext">12539050</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B18">
            <title>
               <p>The shortest telomeres drive karyotype evolution in transformed cells.</p>
            </title>
            <aug>
               <au>
                  <snm>Der-Sarkissian</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Bacchetti</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Cazes</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Londono-Vallejo</snm>
                  <fnm>JA</fnm>
               </au>
            </aug>
            <source>Oncogene</source>
            <pubdate>2004</pubdate>
            <volume>23</volume>
            <fpage>1221</fpage>
            <lpage>1228</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/sj.onc.1207152</pubid>
                  <pubid idtype="pmpid" link="fulltext">14716292</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B19">
            <title>
               <p>Structural stability and chromosome-specific telomere length is governed by cis-acting determinants in humans.</p>
            </title>
            <aug>
               <au>
                  <snm>Britt-Compton</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Rowson</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Locke</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Mackenzie</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Kipling</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Baird</snm>
                  <fnm>DM</fnm>
               </au>
            </aug>
            <source>Hum Mol Genet</source>
            <pubdate>2006</pubdate>
            <volume>15</volume>
            <fpage>725</fpage>
            <lpage>733</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/hmg/ddi486</pubid>
                  <pubid idtype="pmpid" link="fulltext">16421168</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B20">
            <title>
               <p>The pattern of chromosome-specific variations in telomere length in humans is determined by inherited, telomere-near factors and is maintained throughout life.</p>
            </title>
            <aug>
               <au>
                  <snm>Graakjaer</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Bischoff</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Korsholm</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Holstebroe</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Vach</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Bohr</snm>
                  <fnm>VA</fnm>
               </au>
               <au>
                  <snm>Christensen</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Kolvraa</snm>
                  <fnm>S</fnm>
               </au>
            </aug>
            <source>Mech Ageing Dev</source>
            <pubdate>2003</pubdate>
            <volume>124</volume>
            <fpage>629</fpage>
            <lpage>640</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0047-6374(03)00081-2</pubid>
                  <pubid idtype="pmpid" link="fulltext">12735903</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B21">
            <title>
               <p>The shortest telomere, not average telomere length, is critical for cell viability and chromosome stability.</p>
            </title>
            <aug>
               <au>
                  <snm>Hemann</snm>
                  <fnm>MT</fnm>
               </au>
               <au>
                  <snm>Strong</snm>
                  <fnm>MA</fnm>
               </au>
               <au>
                  <snm>Hao</snm>
                  <fnm>LY</fnm>
               </au>
               <au>
                  <snm>Greider</snm>
                  <fnm>CW</fnm>
               </au>
            </aug>
            <source>Cell</source>
            <pubdate>2001</pubdate>
            <volume>107</volume>
            <fpage>67</fpage>
            <lpage>77</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0092-8674(01)00504-9</pubid>
                  <pubid idtype="pmpid" link="fulltext">11595186</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B22">
            <title>
               <p>Historical claims and current interpretations of replicative aging.</p>
            </title>
            <aug>
               <au>
                  <snm>Wright</snm>
                  <fnm>WE</fnm>
               </au>
               <au>
                  <snm>Shay</snm>
                  <fnm>JW</fnm>
               </au>
            </aug>
            <source>Nat Biotechnol</source>
            <pubdate>2002</pubdate>
            <volume>20</volume>
            <fpage>682</fpage>
            <lpage>688</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/nbt0702-682</pubid>
                  <pubid idtype="pmpid" link="fulltext">12089552</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B23">
            <title>
               <p>Growth, telomere dynamics and successful and unsuccessful human aging.</p>
            </title>
            <aug>
               <au>
                  <snm>Aviv</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Levy</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Mangel</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Mech Ageing Dev</source>
            <pubdate>2003</pubdate>
            <volume>124</volume>
            <fpage>829</fpage>
            <lpage>837</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0047-6374(03)00143-X</pubid>
                  <pubid idtype="pmpid" link="fulltext">12875746</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B24">
            <title>
               <p>Accelerated telomere shortening in response to life stress.</p>
            </title>
            <aug>
               <au>
                  <snm>Epel</snm>
                  <fnm>ES</fnm>
               </au>
               <au>
                  <snm>Blackburn</snm>
                  <fnm>EH</fnm>
               </au>
               <au>
                  <snm>Lin</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Dhabhar</snm>
                  <fnm>FS</fnm>
               </au>
               <au>
                  <snm>Adler</snm>
                  <fnm>NE</fnm>
               </au>
               <au>
                  <snm>Morrow</snm>
                  <fnm>JD</fnm>
               </au>
               <au>
                  <snm>Cawthon</snm>
                  <fnm>RM</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>2004</pubdate>
            <volume>101</volume>
            <fpage>17312</fpage>
            <lpage>17315</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">534658</pubid>
                  <pubid idtype="pmpid" link="fulltext">15574496</pubid>
                  <pubid idtype="doi">10.1073/pnas.0407162101</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B25">
            <title>
               <p>Human subtelomere structure and variation.</p>
            </title>
            <aug>
               <au>
                  <snm>Riethman</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Ambrosini</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Paul</snm>
                  <fnm>S</fnm>
               </au>
            </aug>
            <source>Chromosome Res</source>
            <pubdate>2005</pubdate>
            <volume>13</volume>
            <fpage>505</fpage>
            <lpage>515</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1007/s10577-005-0998-1</pubid>
                  <pubid idtype="pmpid" link="fulltext">16132815</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B26">
            <title>
               <p>Recent segmental duplications in the human genome.</p>
            </title>
            <aug>
               <au>
                  <snm>Bailey</snm>
                  <fnm>JA</fnm>
               </au>
               <au>
                  <snm>Gu</snm>
                  <fnm>Z</fnm>
               </au>
               <au>
                  <snm>Clark</snm>
                  <fnm>RA</fnm>
               </au>
               <au>
                  <snm>Reinert</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Samonte</snm>
                  <fnm>RV</fnm>
               </au>
               <au>
                  <snm>Schwartz</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Adams</snm>
                  <fnm>MD</fnm>
               </au>
               <au>
                  <snm>Myers</snm>
                  <fnm>EW</fnm>
               </au>
               <au>
                  <snm>Li</snm>
                  <fnm>PW</fnm>
               </au>
               <au>
                  <snm>Eichler</snm>
                  <fnm>EE</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>2002</pubdate>
            <volume>297</volume>
            <fpage>1003</fpage>
            <lpage>1007</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.1072047</pubid>
                  <pubid idtype="pmpid" link="fulltext">12169732</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B27">
            <title>
               <p>Punctuated duplication seeding events during the evolution of human chromosome 2p11.</p>
            </title>
            <aug>
               <au>
                  <snm>Horvath</snm>
                  <fnm>JE</fnm>
               </au>
               <au>
                  <snm>Gulden</snm>
                  <fnm>CL</fnm>
               </au>
               <au>
                  <snm>Vallente</snm>
                  <fnm>RU</fnm>
               </au>
               <au>
                  <snm>Eichler</snm>
                  <fnm>MY</fnm>
               </au>
               <au>
                  <snm>Ventura</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>McPherson</snm>
                  <fnm>JD</fnm>
               </au>
               <au>
                  <snm>Graves</snm>
                  <fnm>TA</fnm>
               </au>
               <au>
                  <snm>Wilson</snm>
                  <fnm>RK</fnm>
               </au>
               <au>
                  <snm>Schwartz</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Rocchi</snm>
                  <fnm>M</fnm>
               </au>
               <etal/>
            </aug>
            <source>Genome Res</source>
            <pubdate>2005</pubdate>
            <volume>15</volume>
            <fpage>914</fpage>
            <lpage>927</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1172035</pubid>
                  <pubid idtype="pmpid" link="fulltext">15965031</pubid>
                  <pubid idtype="doi">10.1101/gr.3916405</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B28">
            <title>
               <p>A human interstitial telomere associates <it>in vivo</it> with specific TRF2 and TIN2 proteins.</p>
            </title>
            <aug>
               <au>
                  <snm>Mignon-Ravix</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Depetris</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Delobel</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Croquette</snm>
                  <fnm>MF</fnm>
               </au>
               <au>
                  <snm>Mattei</snm>
                  <fnm>MG</fnm>
               </au>
            </aug>
            <source>Eur J Hum Genet</source>
            <pubdate>2002</pubdate>
            <volume>10</volume>
            <fpage>107</fpage>
            <lpage>112</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/sj.ejhg.5200775</pubid>
                  <pubid idtype="pmpid" link="fulltext">11938440</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B29">
            <title>
               <p>Mechanisms underlying telomere repeat turnover, revealed by hypervariable variant repeat distribution patterns in the human Xp/Yp telomere.</p>
            </title>
            <aug>
               <au>
                  <snm>Baird</snm>
                  <fnm>DM</fnm>
               </au>
               <au>
                  <snm>Jeffreys</snm>
                  <fnm>AJ</fnm>
               </au>
               <au>
                  <snm>Royle</snm>
                  <fnm>NJ</fnm>
               </au>
            </aug>
            <source>EMBO J</source>
            <pubdate>1995</pubdate>
            <volume>14</volume>
            <fpage>5433</fpage>
            <lpage>5443</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">394652</pubid>
                  <pubid idtype="pmpid">7489732</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B30">
            <title>
               <p>High levels of sequence polymorphism and linkage disequilibrium at the telomere of 12q: implications for telomere biology and human evolution.</p>
            </title>
            <aug>
               <au>
                  <snm>Baird</snm>
                  <fnm>DM</fnm>
               </au>
               <au>
                  <snm>Coleman</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Rosser</snm>
                  <fnm>ZH</fnm>
               </au>
               <au>
                  <snm>Royle</snm>
                  <fnm>NJ</fnm>
               </au>
            </aug>
            <source>Am J Hum Genet</source>
            <pubdate>2000</pubdate>
            <volume>66</volume>
            <fpage>235</fpage>
            <lpage>250</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1288329</pubid>
                  <pubid idtype="pmpid" link="fulltext">10631154</pubid>
                  <pubid idtype="doi">10.1086/302721</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B31">
            <title>
               <p>Gene content and function of the ancestral chromosome fusion site in human chromosome 2q13-2q14.1 and paralogous regions.</p>
            </title>
            <aug>
               <au>
                  <snm>Fan</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Newman</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Linardopoulou</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Trask</snm>
                  <fnm>BJ</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2002</pubdate>
            <volume>12</volume>
            <fpage>1663</fpage>
            <lpage>1672</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">187549</pubid>
                  <pubid idtype="pmpid" link="fulltext">12421752</pubid>
                  <pubid idtype="doi">10.1101/gr.338402</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B32">
            <title>
               <p>Stable length polymorphism of up to 260 kb at the tip of the short arm of human chromosome 16.</p>
            </title>
            <aug>
               <au>
                  <snm>Wilkie</snm>
                  <fnm>AO</fnm>
               </au>
               <au>
                  <snm>Higgs</snm>
                  <fnm>DR</fnm>
               </au>
               <au>
                  <snm>Rack</snm>
                  <fnm>KA</fnm>
               </au>
               <au>
                  <snm>Buckle</snm>
                  <fnm>VJ</fnm>
               </au>
               <au>
                  <snm>Spurr</snm>
                  <fnm>NK</fnm>
               </au>
               <au>
                  <snm>Fischel-Ghodsian</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Ceccherini</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Brown</snm>
                  <fnm>WR</fnm>
               </au>
               <au>
                  <snm>Harris</snm>
                  <fnm>PC</fnm>
               </au>
            </aug>
            <source>Cell</source>
            <pubdate>1991</pubdate>
            <volume>64</volume>
            <fpage>595</fpage>
            <lpage>606</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/0092-8674(91)90243-R</pubid>
                  <pubid idtype="pmpid" link="fulltext">1991321</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B33">
            <title>
               <p>Sequence organization of the human chromosome 2q telomere.</p>
            </title>
            <aug>
               <au>
                  <snm>Macina</snm>
                  <fnm>RA</fnm>
               </au>
               <au>
                  <snm>Negorev</snm>
                  <fnm>DG</fnm>
               </au>
               <au>
                  <snm>Spais</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Ruthig</snm>
                  <fnm>LA</fnm>
               </au>
               <au>
                  <snm>Hu</snm>
                  <fnm>XL</fnm>
               </au>
               <au>
                  <snm>Riethman</snm>
                  <fnm>HC</fnm>
               </au>
            </aug>
            <source>Hum Mol Genet</source>
            <pubdate>1994</pubdate>
            <volume>3</volume>
            <fpage>1847</fpage>
            <lpage>1853</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/hmg/3.10.1847</pubid>
                  <pubid idtype="pmpid" link="fulltext">7545974</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B34">
            <title>
               <p>Molecular cloning and RARE cleavage mapping of human 2p, 6q, 8q, 12q, and 18q telomeres.</p>
            </title>
            <aug>
               <au>
                  <snm>Macina</snm>
                  <fnm>RA</fnm>
               </au>
               <au>
                  <snm>Morii</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Hu</snm>
                  <fnm>XL</fnm>
               </au>
               <au>
                  <snm>Negorev</snm>
                  <fnm>DG</fnm>
               </au>
               <au>
                  <snm>Spais</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Ruthig</snm>
                  <fnm>LA</fnm>
               </au>
               <au>
                  <snm>Riethman</snm>
                  <fnm>HC</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>1995</pubdate>
            <volume>5</volume>
            <fpage>225</fpage>
            <lpage>232</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1101/gr.5.3.225</pubid>
                  <pubid idtype="pmpid" link="fulltext">8593610</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B35">
            <title>
               <p>Members of the olfactory receptor gene family are contained in large blocks of DNA duplicated polymorphically near the ends of human chromosomes.</p>
            </title>
            <aug>
               <au>
                  <snm>Trask</snm>
                  <fnm>BJ</fnm>
               </au>
               <au>
                  <snm>Friedman</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Martin-Gallardo</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Rowen</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Akinbami</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Blankenship</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Collins</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Giorgi</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Iadonato</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Johnson</snm>
                  <fnm>F</fnm>
               </au>
               <etal/>
            </aug>
            <source>Hum Mol Genet</source>
            <pubdate>1998</pubdate>
            <volume>7</volume>
            <fpage>13</fpage>
            <lpage>26</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/hmg/7.1.13</pubid>
                  <pubid idtype="pmpid" link="fulltext">9384599</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B36">
            <title>
               <p>Integration of telomere sequences with the draft human genome sequence.</p>
            </title>
            <aug>
               <au>
                  <snm>Riethman</snm>
                  <fnm>HC</fnm>
               </au>
               <au>
                  <snm>Xiang</snm>
                  <fnm>Z</fnm>
               </au>
               <au>
                  <snm>Paul</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Morse</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Hu</snm>
                  <fnm>XL</fnm>
               </au>
               <au>
                  <snm>Flint</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Chi</snm>
                  <fnm>HC</fnm>
               </au>
               <au>
                  <snm>Grady</snm>
                  <fnm>DL</fnm>
               </au>
               <au>
                  <snm>Moyzis</snm>
                  <fnm>RK</fnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>2001</pubdate>
            <volume>409</volume>
            <fpage>948</fpage>
            <lpage>951</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/35057180</pubid>
                  <pubid idtype="pmpid" link="fulltext">11237019</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B37">
            <title>
               <p>Multiple variants in subtelomeric regions of normal karyotypes.</p>
            </title>
            <aug>
               <au>
                  <snm>Ijdo</snm>
                  <fnm>JW</fnm>
               </au>
               <au>
                  <snm>Lindsay</snm>
                  <fnm>EA</fnm>
               </au>
               <au>
                  <snm>Wells</snm>
                  <fnm>RA</fnm>
               </au>
               <au>
                  <snm>Baldini</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>Genomics</source>
            <pubdate>1992</pubdate>
            <volume>14</volume>
            <fpage>1019</fpage>
            <lpage>1025</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0888-7543(05)80125-9</pubid>
                  <pubid idtype="pmpid">1478643</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B38">
            <title>
               <p>The IL-9 receptor gene, located in the Xq/Yq pseudoautosomal region, has an autosomal origin, escapes X inactivation and is expressed from the Y.</p>
            </title>
            <aug>
               <au>
                  <snm>Vermeesch</snm>
                  <fnm>JR</fnm>
               </au>
               <au>
                  <snm>Petit</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Kermouni</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Renauld</snm>
                  <fnm>JC</fnm>
               </au>
               <au>
                  <snm>Van Den Berghe</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Marynen</snm>
                  <fnm>P</fnm>
               </au>
            </aug>
            <source>Hum Mol Genet</source>
            <pubdate>1997</pubdate>
            <volume>6</volume>
            <fpage>1</fpage>
            <lpage>8</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/hmg/6.1.1</pubid>
                  <pubid idtype="pmpid" link="fulltext">9002663</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B39">
            <title>
               <p>Intracellular trafficking and dynamics of double homeodomain proteins.</p>
            </title>
            <aug>
               <au>
                  <snm>Ostlund</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Garcia-Carrasquillo</snm>
                  <fnm>RM</fnm>
               </au>
               <au>
                  <snm>Belayew</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Worman</snm>
                  <fnm>HJ</fnm>
               </au>
            </aug>
            <source>Biochemistry</source>
            <pubdate>2005</pubdate>
            <volume>44</volume>
            <fpage>2378</fpage>
            <lpage>2384</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1021/bi047992w</pubid>
                  <pubid idtype="pmpid" link="fulltext">15709750</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B40">
            <title>
               <p>Characterization of FBX25, encoding a novel brain-expressed F-box protein.</p>
            </title>
            <aug>
               <au>
                  <snm>Hagens</snm>
                  <fnm>O</fnm>
               </au>
               <au>
                  <snm>Minina</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Schweiger</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Ropers</snm>
                  <fnm>HH</fnm>
               </au>
               <au>
                  <snm>Kalscheuer</snm>
                  <fnm>V</fnm>
               </au>
            </aug>
            <source>Biochim Biophys Acta</source>
            <pubdate>2006</pubdate>
            <volume>1760</volume>
            <fpage>110</fpage>
            <lpage>118</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">16278047</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B41">
            <title>
               <p>Transcriptional activity of multiple copies of a subtelomerically located olfactory receptor gene that is polymorphic in number and location.</p>
            </title>
            <aug>
               <au>
                  <snm>Linardopoulou</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Mefford</snm>
                  <fnm>HC</fnm>
               </au>
               <au>
                  <snm>Nguyen</snm>
                  <fnm>O</fnm>
               </au>
               <au>
                  <snm>Friedman</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>van den Engh</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Farwell</snm>
                  <fnm>DG</fnm>
               </au>
               <au>
                  <snm>Coltrera</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Trask</snm>
                  <fnm>BJ</fnm>
               </au>
            </aug>
            <source>Hum Mol Genet</source>
            <pubdate>2001</pubdate>
            <volume>10</volume>
            <fpage>2373</fpage>
            <lpage>2383</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/hmg/10.21.2373</pubid>
                  <pubid idtype="pmpid" link="fulltext">11689484</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B42">
            <title>
               <p>A cascade of complex subtelomeric duplications during the evolution of the hominoid and Old World monkey genomes.</p>
            </title>
            <aug>
               <au>
                  <snm>van Geel</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Eichler</snm>
                  <fnm>EE</fnm>
               </au>
               <au>
                  <snm>Beck</snm>
                  <fnm>AF</fnm>
               </au>
               <au>
                  <snm>Shan</snm>
                  <fnm>Z</fnm>
               </au>
               <au>
                  <snm>Haaf</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>van der Maarel</snm>
                  <fnm>SM</fnm>
               </au>
               <au>
                  <snm>Frants</snm>
                  <fnm>RR</fnm>
               </au>
               <au>
                  <snm>de Jong</snm>
                  <fnm>PJ</fnm>
               </au>
            </aug>
            <source>Am J Hum Genet</source>
            <pubdate>2002</pubdate>
            <volume>70</volume>
            <fpage>269</fpage>
            <lpage>278</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">419983</pubid>
                  <pubid idtype="pmpid" link="fulltext">11731935</pubid>
                  <pubid idtype="doi">10.1086/338307</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B43">
            <title>
               <p>Identification of a novel retina-specific gene located in a subtelomeric region with polymorphic distribution among multiple human chromosomes.</p>
            </title>
            <aug>
               <au>
                  <snm>Mah</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Stoehr</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Schulz</snm>
                  <fnm>HL</fnm>
               </au>
               <au>
                  <snm>White</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Weber</snm>
                  <fnm>BH</fnm>
               </au>
            </aug>
            <source>Biochim Biophys Acta</source>
            <pubdate>2001</pubdate>
            <volume>1522</volume>
            <fpage>167</fpage>
            <lpage>174</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">11779631</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B44">
            <title>
               <p>Characterization of the murine orthologue of a novel human subtelomeric multigene family.</p>
            </title>
            <aug>
               <au>
                  <snm>Gianfrancesco</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Falco</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Esposito</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Rocchi</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>D'Urso</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Cytogenet Cell Genet</source>
            <pubdate>2001</pubdate>
            <volume>94</volume>
            <fpage>98</fpage>
            <lpage>100</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1159/000048796</pubid>
                  <pubid idtype="pmpid" link="fulltext">11701968</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B45">
            <title>
               <p>Genome sequencing in microfabricated high-density picolitre reactors.</p>
            </title>
            <aug>
               <au>
                  <snm>Margulies</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Egholm</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Altman</snm>
                  <fnm>WE</fnm>
               </au>
               <au>
                  <snm>Attiya</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Bader</snm>
                  <fnm>JS</fnm>
               </au>
               <au>
                  <snm>Bemben</snm>
                  <fnm>LA</fnm>
               </au>
               <au>
                  <snm>Berka</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Braverman</snm>
                  <fnm>MS</fnm>
               </au>
               <au>
                  <snm>Chen</snm>
                  <fnm>YJ</fnm>
               </au>
               <au>
                  <snm>Chen</snm>
                  <fnm>Z</fnm>
               </au>
               <etal/>
            </aug>
            <source>Nature</source>
            <pubdate>2005</pubdate>
            <volume>437</volume>
            <fpage>376</fpage>
            <lpage>380</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1464427</pubid>
                  <pubid idtype="pmpid" link="fulltext">16056220</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B46">
            <title>
               <p>BEAMing: single-molecule PCR on microparticles in water-in-oil emulsions.</p>
            </title>
            <aug>
               <au>
                  <snm>Diehl</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Li</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>He</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Kinzler</snm>
                  <fnm>KW</fnm>
               </au>
               <au>
                  <snm>Vogelstein</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Dressman</snm>
                  <fnm>D</fnm>
               </au>
            </aug>
            <source>Nat Methods</source>
            <pubdate>2006</pubdate>
            <volume>3</volume>
            <fpage>551</fpage>
            <lpage>559</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/nmeth898</pubid>
                  <pubid idtype="pmpid" link="fulltext">16791214</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B47">
            <title>
               <p>The Riethman Lab Website</p>
            </title>
            <url>http://www.wistar.upenn.edu/riethman/</url>
         </bibl>
         <bibl id="B48">
            <title>
               <p>Gapped BLAST and PSI-BLAST: a new generation of protein database search programs.</p>
            </title>
            <aug>
               <au>
                  <snm>Altschul</snm>
                  <fnm>SF</fnm>
               </au>
               <au>
                  <snm>Madden</snm>
                  <fnm>TL</fnm>
               </au>
               <au>
                  <snm>Schaffer</snm>
                  <fnm>AA</fnm>
               </au>
               <au>
                  <snm>Zhang</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Zhang</snm>
                  <fnm>Z</fnm>
               </au>
               <au>
                  <snm>Miller</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Lipman</snm>
                  <fnm>DJ</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>1997</pubdate>
            <volume>25</volume>
            <fpage>3389</fpage>
            <lpage>3402</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">146917</pubid>
                  <pubid idtype="pmpid" link="fulltext">9254694</pubid>
                  <pubid idtype="doi">10.1093/nar/25.17.3389</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B49">
            <title>
               <p>RepeatMasker</p>
            </title>
            <aug>
               <au>
                  <snm>Smit</snm>
                  <fnm>AFA</fnm>
               </au>
               <au>
                  <snm>Green</snm>
                  <fnm>P</fnm>
               </au>
            </aug>
            <url>http://www.repeatmasker.org</url>
         </bibl>
         <bibl id="B50">
            <title>
               <p>Tandem repeats finder: a program to analyze DNA sequences.</p>
            </title>
            <aug>
               <au>
                  <snm>Benson</snm>
                  <fnm>G</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>1999</pubdate>
            <volume>27</volume>
            <fpage>573</fpage>
            <lpage>580</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">148217</pubid>
                  <pubid idtype="pmpid" link="fulltext">9862982</pubid>
                  <pubid idtype="doi">10.1093/nar/27.2.573</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B51">
            <title>
               <p>Short telomeres on human chromosome 17p.</p>
            </title>
            <aug>
               <au>
                  <snm>Martens</snm>
                  <fnm>UM</fnm>
               </au>
               <au>
                  <snm>Zijlmans</snm>
                  <fnm>JM</fnm>
               </au>
               <au>
                  <snm>Poon</snm>
                  <fnm>SS</fnm>
               </au>
               <au>
                  <snm>Dragowska</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Yui</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Chavez</snm>
                  <fnm>EA</fnm>
               </au>
               <au>
                  <snm>Ward</snm>
                  <fnm>RK</fnm>
               </au>
               <au>
                  <snm>Lansdorp</snm>
                  <fnm>PM</fnm>
               </au>
            </aug>
            <source>Nat Genet</source>
            <pubdate>1998</pubdate>
            <volume>18</volume>
            <fpage>76</fpage>
            <lpage>80</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/ng0198-018</pubid>
                  <pubid idtype="pmpid" link="fulltext">9425906</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B52">
            <title>
               <p>The NCBI RefSeq mRNA Database</p>
            </title>
            <url>ftp://ftp.ncbi.nih.gov/blast/db/</url>
         </bibl>
         <bibl id="B53">
            <title>
               <p>Spidey: a tool for mRNA-to-genomic alignments.</p>
            </title>
            <aug>
               <au>
                  <snm>Wheelan</snm>
                  <fnm>SJ</fnm>
               </au>
               <au>
                  <snm>Church</snm>
                  <fnm>DM</fnm>
               </au>
               <au>
                  <snm>Ostell</snm>
                  <fnm>JM</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2001</pubdate>
            <volume>11</volume>
            <fpage>1952</fpage>
            <lpage>1957</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">311166</pubid>
                  <pubid idtype="pmpid" link="fulltext">11691860</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
      </refgrp>
   </bm>
</art>
