Improved algorithm of the BISMA software for the removal of clonal sequences. For illustration, a data set obtained in the DNA methylation analysis of the mouse Xist promoter of a female animal was used, where 50% fully methylated and 50% unmethylated clones are expected. A) Simplified example DNA sequence alignment of bisulfite sequencing data for demonstration of the filtering algorithm. Cytosines in the reference sequence on top of the alignment are indicated in bold green. In the aligned experimental sequences, methylated CpG sites are highlighted in bold orange, while unmethylated CpG sites are shown in bold purple. Converted cytosines at non-CpG positions are shown in bold black, while conversion artifacts are indicated in bold green. B) Relevant cytosine pattern derived from the multiple sequence alignment, which includes information about the methylation status of CpG sites and the conversion status of non-CpG cytosines. 1) With the strict option, BISMA keeps only one sequence out of several with identical patterns. 2) The filtering algorithm suggested by BISMA removes clones with identical patterns only if they have a conversion artifact at the same position. C) Final methylation pattern obtained after using 1) the strict filtering algorithm or 2) the BISMA suggested filtering algorithm. Each square indicates a CpG site; columns represent CpG sites, while rows represent single molecules that were subcloned and sequenced. The underlying full DNA sequence alignment of all bisulfite sequencing data is available in Additional file 1: Suppl. Text S9.
Rohde et al. BMC Bioinformatics 2010 11:230 doi:10.1186/1471-2105-11-230
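The two filtering modes described in panel B can be illustrated with a minimal Python sketch. This is not BISMA's implementation: the `Clone` type, the one-character pattern encoding, and both function names are hypothetical and introduced only for illustration. The sketch assumes that "identical patterns" means identical CpG methylation states together with identical non-CpG conversion states, and that the suggested mode treats such duplicates as clonal only when the shared pattern contains at least one conversion artifact.

```python
from typing import List, NamedTuple


class Clone(NamedTuple):
    """Hypothetical record for one sequenced clone (not a BISMA type)."""
    name: str
    cpg: str      # one character per CpG site: 'M' methylated, 'U' unmethylated
    non_cpg: str  # one character per non-CpG cytosine: 'c' converted, 'A' artifact


def strict_filter(clones: List[Clone]) -> List[Clone]:
    """Keep only the first clone of every group with an identical pattern."""
    seen = set()
    kept = []
    for clone in clones:
        key = (clone.cpg, clone.non_cpg)
        if key not in seen:
            seen.add(key)
            kept.append(clone)
    return kept


def suggested_filter(clones: List[Clone]) -> List[Clone]:
    """Drop a duplicate only if the identical pattern carries a conversion artifact.

    Assumption: identical patterns without any artifact cannot be shown to be
    clonal, so all of them are kept.
    """
    seen_with_artifact = set()
    kept = []
    for clone in clones:
        key = (clone.cpg, clone.non_cpg)
        if "A" in clone.non_cpg:
            if key in seen_with_artifact:
                continue  # same pattern, same artifact position: treated as clonal
            seen_with_artifact.add(key)
        kept.append(clone)
    return kept


# Made-up example data, not taken from the Xist data set.
clones = [
    Clone("s1", "MMMM", "cccc"),
    Clone("s2", "MMMM", "cccc"),  # identical to s1, no artifact
    Clone("s3", "UUUU", "cAcc"),
    Clone("s4", "UUUU", "cAcc"),  # identical to s3, shared artifact
    Clone("s5", "UUUU", "cccc"),
]

strict_names = [c.name for c in strict_filter(clones)]
suggested_names = [c.name for c in suggested_filter(clones)]
print(strict_names)
print(suggested_names)
```

In this made-up example, the strict mode collapses both duplicate pairs, whereas the suggested mode removes only `s4`, whose pattern matches `s3` including the artifact position. Keeping `s2` reflects the idea that fully converted molecules with the same methylation pattern may well be independent.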