Additional file 6.

Distribution of context bias differences between human lineage data sets. We found that total context bias differs between types of sequence, for example between transposons and non-repetitive sequences. One question is where this difference originates. It is not due to patterns that are unique to one data set or the other. A second question is whether the difference is due to a few shared patterns or to many. Here we compare context bias values for the patterns that are shared. For example, in (A) we consider 2 bp patterns. For each pattern we calculate the transposon value minus the non-repetitive value. We then sort these differences from largest to smallest and plot them by rank. The y value of the plot is the cumulative context bias difference. The horizontal line represents the total context bias value for all patterns. As can be seen, most of the final total is due to a few patterns that differ greatly between transposon and non-repetitive sequence. A-D represent transposon vs. non-repetitive for 2-5 bp, E-H represent near-far for transposon sequences, and I-L represent far-near for non-repetitive sequences.
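
The ranking procedure described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name `cumulative_bias_difference`, the dictionary layout (pattern mapped to context bias value), and the numbers in the example are all assumptions made for demonstration. The total computed here covers shared patterns only.

```python
from itertools import accumulate


def cumulative_bias_difference(bias_a, bias_b):
    """Rank shared patterns by (bias_a - bias_b) and accumulate the differences.

    bias_a, bias_b: dicts mapping a pattern string to its context bias value
    (hypothetical layout, chosen for this sketch).
    Returns (patterns in rank order, cumulative differences in the same order).
    """
    # Only patterns present in both data sets are compared.
    shared = set(bias_a) & set(bias_b)
    diffs = {p: bias_a[p] - bias_b[p] for p in shared}
    # Sort from largest to smallest difference; ties are broken by pattern
    # name so that the ordering is deterministic.
    ranked = sorted(diffs.items(), key=lambda kv: (-kv[1], kv[0]))
    patterns = [p for p, _ in ranked]
    cumulative = list(accumulate(d for _, d in ranked))
    return patterns, cumulative


# Made-up 2 bp values, for illustration only.
transposon = {"AA": 0.5, "AC": 0.25, "CG": 2.0, "TT": 0.5}
non_repetitive = {"AA": 0.25, "AC": 0.25, "CG": 0.5, "GG": 1.0}

patterns, cumulative = cumulative_bias_difference(transposon, non_repetitive)
total = cumulative[-1]  # summed difference over the shared patterns

for rank, (pattern, value) in enumerate(zip(patterns, cumulative), start=1):
    print(rank, pattern, value)
```

Plotting `cumulative` against rank gives a curve of the kind shown in each panel. In this toy example the first-ranked pattern alone accounts for most of the final value, which is the situation the caption describes.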

Nevarez et al. BMC Bioinformatics 2010 11:462   doi:10.1186/1471-2105-11-462