<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet href="/rss.css" type="text/css"?>
<rdf:RDF xmlns="http://purl.org/rss/1.0/"
    xmlns:cc="http://web.resource.org/cc/"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:extra="http://www.w3.org/1999/xhtml"
    xmlns:prism="http://prismstandard.org/namespaces/1.2/basic/"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
    <channel rdf:about="http://www.biomedcentral.com/feeds/latestarticles/journal?journal=bmcbioinformatics&amp;quantity=&amp;format=rss&amp;version=">
        <title>BMC Bioinformatics - Latest Articles</title>
        <link>http://www.biomedcentral.com/bmcbioinformatics/</link>
        <description>The latest research articles published by BMC Bioinformatics</description>
        <dc:date>2010-02-09T00:00:00Z</dc:date>
        <items>
            <rdf:Seq>
                <rdf:li rdf:resource="http://www.biomedcentral.com/1471-2105/11/83" />
                <rdf:li rdf:resource="http://www.biomedcentral.com/1471-2105/11/82" />
                <rdf:li rdf:resource="http://www.biomedcentral.com/1471-2105/11/81" />
                <rdf:li rdf:resource="http://www.biomedcentral.com/1471-2105/11/80" />
                <rdf:li rdf:resource="http://www.biomedcentral.com/1471-2105/11/79" />
                <rdf:li rdf:resource="http://www.biomedcentral.com/1471-2105/11/78" />
                <rdf:li rdf:resource="http://www.biomedcentral.com/1471-2105/11/77" />
                <rdf:li rdf:resource="http://www.biomedcentral.com/1471-2105/11/76" />
                <rdf:li rdf:resource="http://www.biomedcentral.com/1471-2105/11/75" />
                <rdf:li rdf:resource="http://www.biomedcentral.com/1471-2105/11/74" />
            </rdf:Seq>
        </items>
        <extra:info rdf:parseType="Literal">
            <html:div style="font:14px Verdana, Geneva, Arial, Helvetica, sans-serif" xmlns:html="http://www.w3.org/1999/xhtml">
                <html:span style="font-weight:bold">
                    This is an RSS newsfeed from BioMed Central
                </html:span>
                <html:br />
                <html:span style="font-size: 12px;">
                    It is intended to be used with an RSS reader. For more information about RSS newsfeeds from BioMed Central, visit
                    <html:br />
                    <html:a href="http://www.biomedcentral.com/info/about/rss/" style="color:#3333CC; font-size:12px;">
                        http://www.biomedcentral.com/info/about/rss/
                    </html:a>
                    <html:br />
                </html:span>
            </html:div>
        </extra:info>
        <cc:license rdf:resource="http://creativecommons.org/licenses/by/2.0/" />
    </channel>
        <item rdf:about="http://www.biomedcentral.com/1471-2105/11/83">
        <title>Simulation of a Petri net-based Model of the Terpenoid Biosynthetic Pathway</title>
        <description>Background:
The development and simulation of dynamic models of terpenoid biosynthesis have yielded a systems perspective that provides new insights into how the structure of this biochemical pathway affects compound synthesis. These insights may eventually help to identify reactions that could be experimentally manipulated to amplify terpenoid production. In this study, a dynamic model of the terpenoid biosynthetic pathway was constructed based on the Hybrid Functional Petri Net (HFPN) technique. This technique is a fusion of three other extended Petri net techniques, namely Hybrid Petri Net (HPN), Dynamic Petri Net (HDN) and Functional Petri Net (FPN).
Results:
The biological data needed to construct the terpenoid metabolic model were gathered from the literature and biological databases. These data were used as building blocks to create an HFPNe model and to generate parameters that govern the global behaviour of the model. The dynamic model was simulated and validated against known experimental data obtained from extensive literature searches. The model successfully simulated metabolite concentration changes over time (pt) and the observations correlated with known data. Interactions between the intermediates that affect the production of terpenes could be observed through the introduction of inhibitors that established feedback loops within and crosstalk between the pathways.
Conclusions:
Although this metabolic model is only preliminary, it will provide a platform for analysing various high-throughput data, and it should lead to a more holistic understanding of terpenoid biosynthesis.</description>
        <link>http://www.biomedcentral.com/1471-2105/11/83</link>
                <dc:creator>Aliah Hazmah Hawari</dc:creator>
                <dc:creator>Zeti-Azura Mohamed-Hussein</dc:creator>
                <dc:source>BMC Bioinformatics 2010, 11:83</dc:source>
        <dc:date>2010-02-09T00:00:00Z</dc:date>
        <dc:identifier>doi:10.1186/1471-2105-11-83</dc:identifier>
        <prism:publicationName>BMC Bioinformatics</prism:publicationName>
        <prism:issn>1471-2105</prism:issn>
        <prism:volume>11</prism:volume>
        <prism:startingPage>83</prism:startingPage>
        <prism:publicationDate>2010-02-09T00:00:00Z</prism:publicationDate>
                <prism:versionidentifier>PDF</prism:versionidentifier>
                <cc:license rdf:resource="http://creativecommons.org/licenses/by/2.0/" />
    </item>
        <item rdf:about="http://www.biomedcentral.com/1471-2105/11/82">
        <title>Dynamic probe selection for studying microbial transcriptome with high-density genomic tiling microarrays</title>
        <description>Background:
Current commercial high-density oligonucleotide microarrays can hold millions of probe spots on a single microscopic glass slide and are ideal for studying the transcriptome of microbial genomes using a tiling probe design. This paper describes a comprehensive computational pipeline implemented specifically for designing tiling probe sets to study microbial transcriptome profiles.
Results:
The pipeline identifies every possible probe sequence from both forward and reverse-complement strands of all DNA sequences in the target genome including circular or linear chromosomes and plasmids. Final probe sequence lengths are adjusted based on the maximal oligonucleotide synthesis cycles and best isothermality allowed. Optimal probes are then selected in two stages - sequential and gap-filling. In the sequential stage, probes are selected from sequence windows tiled alongside the genome. In the gap-filling stage, additional probes are selected from the largest gaps between adjacent probes that have already been selected, until a predefined number of probes is reached. Selection of the highest quality probe within each window and gap is based on five criteria: sequence uniqueness, probe self-annealing, melting temperature, oligonucleotide length, and probe position.
Conclusions:
The probe selection pipeline evaluates global and local probe sequence properties and selects a set of probes dynamically and evenly distributed along the target genome. Unlike other similar methods, an exact number of non-redundant probes can be designed to utilize all the available probe spots on any chosen microarray platform. The pipeline can be applied to microbial genomes when designing high-density tiling arrays for comparative genomics, ChIP-chip, gene expression and comprehensive transcriptome studies.</description>
        <link>http://www.biomedcentral.com/1471-2105/11/82</link>
                <dc:creator>Hedda Hovik</dc:creator>
                <dc:creator>Tsute Chen</dc:creator>
                <dc:source>BMC Bioinformatics 2010, 11:82</dc:source>
        <dc:date>2010-02-09T00:00:00Z</dc:date>
        <dc:identifier>doi:10.1186/1471-2105-11-82</dc:identifier>
        <prism:publicationName>BMC Bioinformatics</prism:publicationName>
        <prism:issn>1471-2105</prism:issn>
        <prism:volume>11</prism:volume>
        <prism:startingPage>82</prism:startingPage>
        <prism:publicationDate>2010-02-09T00:00:00Z</prism:publicationDate>
                <prism:versionidentifier>PDF</prism:versionidentifier>
                <cc:license rdf:resource="http://creativecommons.org/licenses/by/2.0/" />
    </item>
        <item rdf:about="http://www.biomedcentral.com/1471-2105/11/81">
        <title>An effective approach for identification of in vivo protein-DNA binding sites from paired-end ChIP-Seq data</title>
        <description>Background:
ChIP-Seq, which combines chromatin immunoprecipitation (ChIP) with high-throughput massively parallel sequencing, is increasingly being used for identification of protein-DNA interactions in vivo in the genome. However, maximizing the effectiveness of data analysis of such sequences requires the development of new algorithms that are able to accurately predict DNA-protein binding sites.
Results:
Here, we present SIPeS (Site Identification from Paired-end Sequencing), a novel algorithm for precise identification of binding sites from short reads generated by paired-end Solexa ChIP-Seq technology. Using ChIP-Seq data from the Arabidopsis basic helix-loop-helix transcription factor ABORTED MICROSPORES (AMS), which is expressed within the anther during pollen development, we show that SIPeS has better resolution for binding site identification than two existing ChIP-Seq peak detection algorithms, Cisgenome and MACS.
Conclusions:
When compared to Cisgenome and MACS, SIPeS shows better resolution for binding site discovery. Moreover, SIPeS is designed to calculate the mappable genome length accurately with the fragment length based on the paired-end reads. Dynamic baselines are also employed to effectively discriminate closely adjacent binding sites, which is of particular value when working with high-density genomes.</description>
        <link>http://www.biomedcentral.com/1471-2105/11/81</link>
                <dc:creator>Congmao Wang</dc:creator>
                <dc:creator>Jie Xu</dc:creator>
                <dc:creator>Dasheng Zhang</dc:creator>
                <dc:creator>Zoe Wilson</dc:creator>
                <dc:creator>Dabing Zhang</dc:creator>
                <dc:source>BMC Bioinformatics 2010, 11:81</dc:source>
        <dc:date>2010-02-09T00:00:00Z</dc:date>
        <dc:identifier>doi:10.1186/1471-2105-11-81</dc:identifier>
        <prism:publicationName>BMC Bioinformatics</prism:publicationName>
        <prism:issn>1471-2105</prism:issn>
        <prism:volume>11</prism:volume>
        <prism:startingPage>81</prism:startingPage>
        <prism:publicationDate>2010-02-09T00:00:00Z</prism:publicationDate>
                <prism:versionidentifier>PDF</prism:versionidentifier>
                <cc:license rdf:resource="http://creativecommons.org/licenses/by/2.0/" />
    </item>
        <item rdf:about="http://www.biomedcentral.com/1471-2105/11/80">
        <title>Parameters for accurate genome alignment</title>
        <description>Background:
Genome sequence alignments form the basis of much research.  Genome alignment depends on various mundane but critical choices, such as how to mask repeats and which score parameters to use.  Surprisingly, there has been no large-scale assessment of these choices using real genomic data.  Moreover, rigorous procedures to control the rate of spurious alignment have not been employed.
Results:
We have assessed 495 combinations of score parameters for alignment of animal, plant, and fungal genomes.  As our gold-standard of accuracy, we used genome alignments implied by multiple alignments of proteins and of structural RNAs. We found the HOXD scoring schemes underlying alignments in the UCSC genome database to be far from optimal, and suggest better parameters.  Higher values of the X-drop parameter are not always better.  E-values accurately indicate the rate of spurious alignment, but only if tandem repeats are masked in a non-standard way.  Finally, we show that gamma-centroid (probabilistic) alignment can find highly reliable subsets of aligned bases.
Conclusion:
These results enable more accurate genome alignment, with reliability measures for local alignments and for individual aligned bases.  This study was made possible by our new software, LAST, which can align vertebrate genomes in a few hours (http://last.cbrc.jp/).</description>
        <link>http://www.biomedcentral.com/1471-2105/11/80</link>
                <dc:creator>Martin Frith</dc:creator>
                <dc:creator>Michiaki Hamada</dc:creator>
                <dc:creator>Paul Horton</dc:creator>
                <dc:source>BMC Bioinformatics 2010, 11:80</dc:source>
        <dc:date>2010-02-09T00:00:00Z</dc:date>
        <dc:identifier>doi:10.1186/1471-2105-11-80</dc:identifier>
        <prism:publicationName>BMC Bioinformatics</prism:publicationName>
        <prism:issn>1471-2105</prism:issn>
        <prism:volume>11</prism:volume>
        <prism:startingPage>80</prism:startingPage>
        <prism:publicationDate>2010-02-09T00:00:00Z</prism:publicationDate>
                <prism:versionidentifier>PDF</prism:versionidentifier>
                <cc:license rdf:resource="http://creativecommons.org/licenses/by/2.0/" />
    </item>
        <item rdf:about="http://www.biomedcentral.com/1471-2105/11/79">
        <title>Using simple artificial intelligence methods for predicting amyloidogenesis in antibodies</title>
        <description>Background:
All polypeptide backbones have the potential to form amyloid fibrils, which are associated with a number of degenerative disorders. However, the likelihood that amyloidosis would actually occur under physiological conditions depends largely on the amino acid composition of a protein. We explore using a naive Bayesian classifier and a weighted decision tree for predicting the amyloidogenicity of immunoglobulin sequences.
Results:
The average accuracy based on leave-one-out (LOO) cross validation of a Bayesian classifier generated from 143 amyloidogenic sequences is 60.84%. This is consistent with the average accuracy of 61.15% for a holdout test set comprised of 103 AM and 28 non-amyloidogenic sequences. The LOO cross validation accuracy increases to 81.08% when the training set is augmented by the holdout test set. In comparison, the average classification accuracy for the holdout test set obtained using a decision tree is 78.64%. Non-amyloidogenic sequences are predicted with average LOO cross validation accuracies between 74.05% and 77.24% using the Bayesian classifier, depending on the training set size. The accuracy for the holdout test set was 89%. For the decision tree, the non-amyloidogenic prediction accuracy is 75.00%.
Conclusions:
This exploratory study indicates that both classification methods may be promising in providing straightforward predictions on the amyloidogenicity of a sequence. Nevertheless, the number of available sequences that satisfy the premises of this study is limited, and is consequently smaller than the ideal training set size. Increasing the size of the training set clearly increases the accuracy, and the expansion of the training set to include not only more derivatives, but more alignments, would make the method more sound. The accuracy of the classifiers may also be improved when additional factors, such as structural and physico-chemical data, are considered. The development of this type of classifier has significant applications in evaluating engineered antibodies, and may be adapted for evaluating engineered proteins in general.</description>
        <link>http://www.biomedcentral.com/1471-2105/11/79</link>
                <dc:creator>Maria Pamela David</dc:creator>
                <dc:creator>Gisela Concepcion</dc:creator>
                <dc:creator>Eduardo Padlan</dc:creator>
                <dc:source>BMC Bioinformatics 2010, 11:79</dc:source>
        <dc:date>2010-02-08T00:00:00Z</dc:date>
        <dc:identifier>doi:10.1186/1471-2105-11-79</dc:identifier>
        <prism:publicationName>BMC Bioinformatics</prism:publicationName>
        <prism:issn>1471-2105</prism:issn>
        <prism:volume>11</prism:volume>
        <prism:startingPage>79</prism:startingPage>
        <prism:publicationDate>2010-02-08T00:00:00Z</prism:publicationDate>
                <prism:versionidentifier>PDF</prism:versionidentifier>
                <cc:license rdf:resource="http://creativecommons.org/licenses/by/2.0/" />
    </item>
        <item rdf:about="http://www.biomedcentral.com/1471-2105/11/78">
        <title>Testing the additional predictive value of high-dimensional molecular data</title>
        <description>Background:
While high-dimensional molecular data such as microarray gene expression data have been used for disease outcome prediction or diagnosis purposes for about ten years in biomedical research, the question of the additional predictive value of such data given that classical predictors are already available has long been under-considered in the bioinformatics literature.
Results:
We suggest an intuitive permutation-based testing procedure for assessing the additional predictive value of high-dimensional molecular data. Our method combines two well-known statistical tools: logistic regression and boosting regression. We give clear advice for the choice of the only method parameter (the number of boosting iterations). In simulations, our novel approach is found to have very good power in different settings, e.g. few strong predictors or many weak predictors. For illustrative purposes, it is applied to two publicly available cancer data sets.
Conclusions:
Our simple and computationally efficient approach can be used to globally assess the additional predictive power of a large number of candidate predictors given that a few clinical covariates or a known prognostic index are already available. It is implemented in the R package &quot;globalboosttest&quot;, which is publicly available from R-forge and will be submitted to CRAN as soon as possible.</description>
        <link>http://www.biomedcentral.com/1471-2105/11/78</link>
                <dc:creator>Anne-Laure Boulesteix</dc:creator>
                <dc:creator>Torsten Hothorn</dc:creator>
                <dc:source>BMC Bioinformatics 2010, 11:78</dc:source>
        <dc:date>2010-02-08T00:00:00Z</dc:date>
        <dc:identifier>doi:10.1186/1471-2105-11-78</dc:identifier>
        <prism:publicationName>BMC Bioinformatics</prism:publicationName>
        <prism:issn>1471-2105</prism:issn>
        <prism:volume>11</prism:volume>
        <prism:startingPage>78</prism:startingPage>
        <prism:publicationDate>2010-02-08T00:00:00Z</prism:publicationDate>
                <prism:versionidentifier>PDF</prism:versionidentifier>
                <cc:license rdf:resource="http://creativecommons.org/licenses/by/2.0/" />
    </item>
        <item rdf:about="http://www.biomedcentral.com/1471-2105/11/77">
        <title>A random effect multiplicative heteroscedastic model for bacterial growth</title>
        <description>Background:
Predictive microbiology develops mathematical models that can predict the growth rate of a microorganism population under a set of environmental conditions. Many primary growth models have been proposed. However, when primary models are applied to bacterial growth curves, the biological variability is reduced to a single curve defined by some kinetic parameters (lag time and growth rate), and sometimes the models give poor fits in some regions of the curve. The development of a prediction band (from a set of bacterial growth curves) using non-parametric and bootstrap methods makes it possible to overcome that problem and to include the biological variability of the microorganism in the modelling process.
Results:
Absorbance data from Listeria monocytogenes cultured at 22, 26, 38, and 42 °C were selected under different environmental conditions of pH (4.5, 5.5, 6.5, and 7.4) and percentage of NaCl (2.5, 3.5, 4.5, and 5.5). Transformation of absorbance data to viable count data was carried out. A random effect multiplicative heteroscedastic model was considered to explain the dynamics of bacterial growth. The concept of a prediction band for microbial growth is proposed. The bootstrap method was used to obtain resamples from this model. An iterative procedure is proposed to overcome the computationally intensive task of calculating simultaneous prediction intervals, over time, for bacterial growth. The bands were narrower below the inflection point (0-8 h at 22 °C, and 0-5.5 h at 42 °C), and wider to the right of it (from 9 h onwards at 22 °C, and from 7 h onwards at 42 °C). A wider band was observed at 42 °C than at 22 °C when the curves reach their upper asymptote. Similar bands have been obtained for 26 and 38 °C.
Conclusions:
The combination of nonparametric models and bootstrap techniques results in a good procedure to obtain reliable prediction bands in this context. Moreover, the new iterative algorithm proposed in this paper allows one to achieve exactly the prefixed coverage probability for the prediction band. The microbial growth bands reflect the influence of the different environmental conditions on the microorganism behaviour, helping in the interpretation of the biological meaning of the growth curves obtained experimentally.</description>
        <link>http://www.biomedcentral.com/1471-2105/11/77</link>
                <dc:creator>Ricardo Cao</dc:creator>
                <dc:creator>Mario Francisco-Fernandez</dc:creator>
                <dc:creator>Emiliano Quinto</dc:creator>
                <dc:source>BMC Bioinformatics 2010, 11:77</dc:source>
        <dc:date>2010-02-08T00:00:00Z</dc:date>
        <dc:identifier>doi:10.1186/1471-2105-11-77</dc:identifier>
        <prism:publicationName>BMC Bioinformatics</prism:publicationName>
        <prism:issn>1471-2105</prism:issn>
        <prism:volume>11</prism:volume>
        <prism:startingPage>77</prism:startingPage>
        <prism:publicationDate>2010-02-08T00:00:00Z</prism:publicationDate>
                <prism:versionidentifier>PDF</prism:versionidentifier>
                <cc:license rdf:resource="http://creativecommons.org/licenses/by/2.0/" />
    </item>
        <item rdf:about="http://www.biomedcentral.com/1471-2105/11/76">
        <title>Extracting consistent knowledge from highly inconsistent cancer gene data sources</title>
        <description>Background:
Hundreds of genes that are causally implicated in oncogenesis have been found and collected in various databases. For efficient application of these abundant but diverse data sources, it is of fundamental importance to evaluate their consistency.
Results:
First, we showed that the lists of cancer genes from some major data sources were highly inconsistent in terms of overlapping genes. In particular, most cancer genes accumulated in previous small-scale studies could not be rediscovered in current high-throughput genome screening studies. Then, based on a metric proposed in this study, we showed that most cancer gene lists from different data sources were highly functionally consistent. Finally, we extracted functionally consistent cancer genes from various data sources and collected them in our database F-Census.
Conclusions:
Although they have very low gene overlap, most cancer gene data sources are highly consistent at the functional level, which indicates that they can each capture part of the genes in a few key pathways associated with cancer. Our results suggest that the sample sizes currently used for cancer studies might be inadequate for consistently capturing individual cancer genes, but could be sufficient for finding a number of cancer genes that could functionally represent most cancer genes. The F-Census database provides biologists with a useful tool for browsing and extracting functionally consistent cancer genes from various data sources.</description>
        <link>http://www.biomedcentral.com/1471-2105/11/76</link>
                <dc:creator>Xue Gong</dc:creator>
                <dc:creator>Ruihong Wu</dc:creator>
                <dc:creator>Yuannv Zhang</dc:creator>
                <dc:creator>Wenyuan Zhao</dc:creator>
                <dc:creator>Lixin Cheng</dc:creator>
                <dc:creator>Yunyan Gu</dc:creator>
                <dc:creator>Lin Zhang</dc:creator>
                <dc:creator>Jing Wang</dc:creator>
                <dc:creator>Jing Zhu</dc:creator>
                <dc:creator>Zheng Guo</dc:creator>
                <dc:source>BMC Bioinformatics 2010, 11:76</dc:source>
        <dc:date>2010-02-05T00:00:00Z</dc:date>
        <dc:identifier>doi:10.1186/1471-2105-11-76</dc:identifier>
        <prism:publicationName>BMC Bioinformatics</prism:publicationName>
        <prism:issn>1471-2105</prism:issn>
        <prism:volume>11</prism:volume>
        <prism:startingPage>76</prism:startingPage>
        <prism:publicationDate>2010-02-05T00:00:00Z</prism:publicationDate>
                <prism:versionidentifier>PDF</prism:versionidentifier>
                <cc:license rdf:resource="http://creativecommons.org/licenses/by/2.0/" />
    </item>
        <item rdf:about="http://www.biomedcentral.com/1471-2105/11/75">
        <title>Mining protein loops using a structural alphabet and statistical exceptionality</title>
        <description>Background:
Protein loops encompass 50% of protein residues in available three-dimensional structures. These regions are often involved in protein functions, e.g. binding sites and catalytic pockets. However, the description of protein loops with conventional tools is a difficult task. Regular secondary structures, helices and strands, have been widely studied whereas loops, because they are highly variable in terms of sequence and structure, are difficult to analyze. Due to data sparsity, long loops have rarely been systematically studied.
Results:
We developed a simple and accurate method that allows the description and analysis of the structures of short and long loops using structural motifs without restriction on loop length. This method is based on the structural alphabet HMM-SA. HMM-SA allows the simplification of a three-dimensional protein structure into a one-dimensional string of states, where each state is a four-residue prototype fragment, called a structural letter. The difficult task of the structural grouping of huge data sets is thus easily accomplished by handling structural letter strings as in conventional protein sequence analysis. We systematically extracted all seven-residue fragments in a bank of 93000 protein loops and grouped them according to the structural-letter sequence, named a structural word. This approach permits a systematic analysis of loops of all sizes since we consider the structural motifs of seven residues rather than complete loops. We focused the analysis on highly recurrent words of loops (observed more than 30 times). Our study reveals that 73% of loop-lengths are covered by only 3310 highly recurrent structural words (out of 28274 observed words). These structural words have low structural variability (mean RMSd of 0.85 angstrom). As expected, half of these motifs display a flanking-region preference but interestingly, two thirds are shared by short (less than 12 residues) and long loops. Moreover, half of recurrent motifs exhibit a significant level of amino-acid conservation with at least four significant positions and 87% of long loops contain at least one such word. We complement our analysis with the detection of statistically over-represented patterns of structural letters as in conventional DNA sequence analysis. About 30% (930) of structural words are over-represented, and cover about 40% of loop lengths. Interestingly, these words exhibit lower structural variability and higher sequential specificity, suggesting structural or functional constraints.
Conclusions:
We developed a method to systematically decompose and study protein loops using recurrent structural motifs. This method is based on the structural alphabet HMM-SA and not on structural alignment and geometrical parameters. We extracted meaningful structural motifs that are found in both short and long loops. To our knowledge, it is the first time that pattern mining helps to increase the signal-to-noise ratio in protein loops. This finding helps to better describe protein loops and might make it possible to decrease the complexity of long-loop analysis. Detailed results are available at http://www.mti.univ-paris-diderot.fr/publication/supplementary/2009/ACCLoop/.</description>
        <link>http://www.biomedcentral.com/1471-2105/11/75</link>
                <dc:creator>Leslie Regad</dc:creator>
                <dc:creator>Juliette Martin</dc:creator>
                <dc:creator>Gregory Nuel</dc:creator>
                <dc:creator>Anne-Claude Camproux</dc:creator>
                <dc:source>BMC Bioinformatics 2010, 11:75</dc:source>
        <dc:date>2010-02-04T00:00:00Z</dc:date>
        <dc:identifier>doi:10.1186/1471-2105-11-75</dc:identifier>
        <prism:publicationName>BMC Bioinformatics</prism:publicationName>
        <prism:issn>1471-2105</prism:issn>
        <prism:volume>11</prism:volume>
        <prism:startingPage>75</prism:startingPage>
        <prism:publicationDate>2010-02-04T00:00:00Z</prism:publicationDate>
                <prism:versionidentifier>PDF</prism:versionidentifier>
                <cc:license rdf:resource="http://creativecommons.org/licenses/by/2.0/" />
    </item>
        <item rdf:about="http://www.biomedcentral.com/1471-2105/11/74">
        <title>CNV Workshop: an integrated platform for high-throughput copy number variation discovery and clinical diagnostics</title>
        <description>Background:
Recent studies have shown that copy number variations (CNVs) are frequent in higher eukaryotes and associated with a substantial portion of inherited and acquired risk for various human diseases. The increasing availability of high-resolution genome surveillance platforms provides opportunity for rapidly assessing research and clinical samples for CNV content, as well as for determining the potential pathogenicity of identified variants. However, few informatics tools for accurate and efficient CNV detection and assessment currently exist.
Results:
We developed a suite of software tools and resources (CNV Workshop) for automated, genome-wide CNV detection from a variety of SNP array platforms. CNV Workshop includes three major components: detection, annotation, and presentation of structural variants from genome array data. CNV detection utilizes a robust and genotype-specific extension of the Circular Binary Segmentation algorithm, and the use of additional detection algorithms is supported. Predicted CNVs are captured in a MySQL database that supports cohort-based projects and incorporates a secure user authentication layer and user/admin roles. To assist with determination of pathogenicity, detected CNVs are also annotated automatically for gene content, known disease loci, and gene-based literature references. Results are easily queried, sorted, filtered, and visualized via a web-based presentation layer that includes a GBrowse-based graphical representation of CNV content and relevant public data, integration with the UCSC Genome Browser, and tabular displays of genomic attributes for each CNV.
Conclusions:
To our knowledge, CNV Workshop represents the first cohesive and convenient platform for detection, annotation, and assessment of the biological and clinical significance of structural variants. CNV Workshop has been successfully utilized for assessment of genomic variation in healthy individuals and disease cohorts and is an ideal platform for coordinating multiple associated projects.
Availability and Implementation:
Available on the web at http://sourceforge.net/projects/cnv</description>
        <link>http://www.biomedcentral.com/1471-2105/11/74</link>
                <dc:creator>Xiaowu Gai</dc:creator>
                <dc:creator>Juan Perin</dc:creator>
                <dc:creator>Kevin Murphy</dc:creator>
                <dc:creator>Ryan O'Hara</dc:creator>
                <dc:creator>Monica D'arcy</dc:creator>
                <dc:creator>Adam Wenocur</dc:creator>
                <dc:creator>Hongbo Xie</dc:creator>
                <dc:creator>Eric Rappaport</dc:creator>
                <dc:creator>Tamim Shaikh</dc:creator>
                <dc:creator>Peter White</dc:creator>
                <dc:source>BMC Bioinformatics 2010, 11:74</dc:source>
        <dc:date>2010-02-04T00:00:00Z</dc:date>
        <dc:identifier>doi:10.1186/1471-2105-11-74</dc:identifier>
        <prism:publicationName>BMC Bioinformatics</prism:publicationName>
        <prism:issn>1471-2105</prism:issn>
        <prism:volume>11</prism:volume>
        <prism:startingPage>74</prism:startingPage>
        <prism:publicationDate>2010-02-04T00:00:00Z</prism:publicationDate>
                <prism:versionidentifier>PDF</prism:versionidentifier>
                <cc:license rdf:resource="http://creativecommons.org/licenses/by/2.0/" />
    </item>
        <cc:License rdf:about="http://creativecommons.org/licenses/by/2.0/">
        <cc:permits rdf:resource="http://creativecommons.org/ns#Reproduction" />
        <cc:permits rdf:resource="http://creativecommons.org/ns#Distribution" />
        <cc:permits rdf:resource="http://creativecommons.org/ns#DerivativeWorks" />
    </cc:License>
</rdf:RDF>
