Increased gene sampling strengthens support for higher-level groups within leaf-mining moths and relatives (Lepidoptera: Gracillariidae)
1 Department of Entomology, University of Maryland, College Park, MD, USA
2 Division of Evolutionary Biology, National Institute for Basic Biology, Okazaki, Japan
3 Center for Ecological Research, Kyoto University, Kyoto, Japan
4 Institute for Bioscience and Biotechnology Research, University of Maryland, College Park, MD, USA
5 Laboratory of Molecular Evolution, Center for Bioinformatics and Computational Biology, University of Maryland, College Park, MD, USA
6 Department of Entomology, Smithsonian Institution, Washington, D.C., USA
7 Department of Ecology & Evolutionary Biology, University of Connecticut, Storrs, CT, USA
8 Royal Museum for Central Africa, Tervuren, Belgium
9 INRA, UR0633 Zoologie Forestière, F-45000, Orléans, France
BMC Evolutionary Biology 2011, 11:182 doi:10.1186/1471-2148-11-182Published: 24 June 2011
Researchers conducting molecular phylogenetic studies are frequently faced with the decision of what to do when weak branch support is obtained for key nodes of importance. As one solution, the researcher may choose to sequence additional orthologous genes of appropriate evolutionary rate for the taxa in the study. However, generating large, complete data matrices can become increasingly difficult as the number of characters increases. A few empirical studies have shown that augmenting genes even for a subset of taxa can improve branch support. However, because each study differs in the number of characters and taxa, there is still a need for additional studies that examine whether incomplete sampling designs are likely to aid at increasing deep node resolution. We target Gracillariidae, a Cretaceous-age (~100 Ma) group of leaf-mining moths to test whether the strategy of adding genes for a subset of taxa can improve branch support for deep nodes. We initially sequenced ten genes (8,418 bp) for 57 taxa that represent the major lineages of Gracillariidae plus outgroups. After finding that many deep divergences remained weakly supported, we sequenced eleven additional genes (6,375 bp) for a 27-taxon subset. We then compared results from different data sets to assess whether one sampling design can be favored over another. The concatenated data set comprising all genes and all taxa and three other data sets of different taxon and gene sub-sampling design were analyzed with maximum likelihood. Each data set was subject to five different models and partitioning schemes of non-synonymous and synonymous changes. Statistical significance of non-monophyly was examined with the Approximately Unbiased (AU) test.
Partial augmentation of genes led to high support for deep divergences, especially when non-synonymous changes were analyzed alone. Increasing the number of taxa without an increase in number of characters led to lower bootstrap support; increasing the number of characters without increasing the number of taxa generally increased bootstrap support. More than three-quarters of nodes were supported with bootstrap values greater than 80% when all taxa and genes were combined. Gracillariidae, Lithocolletinae + Leucanthiza, and Acrocercops and Parectopa groups were strongly supported in nearly every analysis. Gracillaria group was well supported in some analyses, but less so in others. We find strong evidence for the exclusion of Douglasiidae from Gracillarioidea sensu Davis and Robinson (1998). Our results strongly support the monophyly of a G.B.R.Y. clade, a group comprised of Gracillariidae + Bucculatricidae + Roeslerstammiidae + Yponomeutidae, when analyzed with non-synonymous changes only, but this group was frequently split when synonymous and non-synonymous substitutions were analyzed together.
1) Partially or fully augmenting a data set with more characters increased bootstrap support for particular deep nodes, and this increase was dramatic when non-synonymous changes were analyzed alone. Thus, the addition of sites that have low levels of saturation and compositional heterogeneity can greatly improve results. 2) Gracillarioidea, as defined by Davis and Robinson (1998), clearly do not include Douglasiidae, and changes to current classification will be required. 3) Gracillariidae were monophyletic in all analyses conducted, and nearly all species can be placed into one of six strongly supported clades though relationships among these remain unclear. 4) The difficulty in determining the phylogenetic placement of Bucculatricidae is probably attributable to compositional heterogeneity at the third codon position. From our tests for compositional heterogeneity and strong bootstrap values obtained when synonymous changes are excluded, we tentatively conclude that Bucculatricidae is closely related to Gracillariidae + Roeslerstammiidae + Yponomeutidae.