## Figure 1.
The iASeq model. (a) An example of the data structure. Each row represents a SNP and each column corresponds
to either the reference allele (R) or the non-reference allele (N) read counts from
a ChIP-seq sample in a dataset. A dataset could be a TF ChIP-seq experiment or a HM
ChIP-seq experiment, and can have multiple replicate samples (Rep). iASeq assumes
the following data generating process. (b) First, SNPs belong to K + 1 classes with different ASB patterns. For each SNP, a class label a_{i} is randomly assigned according to a class abundance probability vector Π. Given the class label, a configuration [b_{id}, c_{id}] is generated for each SNP in each dataset according to the probabilistic allele-specificity
patterns specified by two vectors V_{k} and W_{k}. In the figure, the darkness of each cell in V and W represents the probability for b_{id} or c_{id} to be 1. (c) Next, a skewing probability p_{idj} is generated for each SNP i, dataset d and replicate sample j based on [b_{id}, c_{id}]. The distribution of p_{idj} for NS SNPs in each sample follows a Beta distribution (blue lines). p_{idj}s for SR SNPs are uniformly distributed in the interval [p_{dj0}, 1], where p_{dj0} is the mean of the background Beta distribution (dark blue lines). p_{idj}s for SN SNPs are uniformly distributed in the interval [0, p_{dj0}] (light blue lines). (d) Finally, given the configuration [b_{id}, c_{id}], skewing probability p_{idj} and a total read count n_{idj} for SNP i, dataset d and sample j, the read count for each allele is generated according to a binomial distribution.
The length of the orange bar represents the non-reference allele read count, and the
length of the red bar represents the reference allele read count.
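
The data generating process in panels (b) to (d) can be illustrated with a minimal simulation sketch. This is not the iASeq implementation. The function name `simulate_snp`, the parameter layout, and all numeric values are hypothetical. The sketch also makes assumptions that the caption does not state: b_{id} = 1 is taken to mean that the SNP is skewed in dataset d, c_{id} = 1 that the skew is towards the reference allele, W_{k} is treated as the probability of c_{id} = 1 given b_{id} = 1, and p_{idj} is treated as the probability that a read carries the reference allele.

```python
import random


def simulate_snp(pi, V, W, beta_params, n_reads, rng):
    """Simulate one SNP (hypothetical helper, not part of iASeq).

    pi          : class abundance probabilities, length K + 1
    V[k][d]     : assumed P(b_id = 1 | class k), i.e. SNP is skewed in dataset d
    W[k][d]     : assumed P(c_id = 1 | class k, b_id = 1), i.e. skew is towards R
    beta_params : beta_params[d][j] = (alpha, beta) of the background Beta
    n_reads     : n_reads[d][j] = total read count n_idj
    Returns the class label and, per dataset and replicate, (R count, N count).
    """
    # Panel (b): draw the class label a_i from the class abundance vector.
    k = rng.choices(range(len(pi)), weights=pi)[0]
    counts = []
    for d, samples in enumerate(n_reads):
        # Panel (b): draw the configuration [b_id, c_id] for this dataset.
        b = rng.random() < V[k][d]
        c = b and rng.random() < W[k][d]
        row = []
        for j, n in enumerate(samples):
            alpha, beta = beta_params[d][j]
            p0 = alpha / (alpha + beta)  # mean of the background Beta, p_dj0
            # Panel (c): draw the skewing probability p_idj.
            if not b:
                p = rng.betavariate(alpha, beta)  # NS: background Beta
            elif c:
                p = rng.uniform(p0, 1.0)          # SR: uniform on [p_dj0, 1]
            else:
                p = rng.uniform(0.0, p0)          # SN: uniform on [0, p_dj0]
            # Panel (d): reference count ~ Binomial(n, p), drawn as n Bernoulli trials.
            ref = sum(rng.random() < p for _ in range(n))
            row.append((ref, n - ref))
        counts.append(row)
    return k, counts


# Illustrative inputs only: K = 1, two datasets with two and one replicates.
rng = random.Random(0)
pi = [0.8, 0.2]
V = [[0.0, 0.0], [0.9, 0.9]]
W = [[0.5, 0.5], [1.0, 1.0]]
beta_params = [[(10, 10), (10, 10)], [(10, 10)]]
n_reads = [[30, 25], [40]]
k, counts = simulate_snp(pi, V, W, beta_params, n_reads, rng)
```

The configuration is drawn once per dataset and shared by its replicates, while p_{idj} is drawn separately for each replicate, which mirrors the indexing in the caption. The binomial draw is written as a sum of Bernoulli trials so that the sketch depends only on the Python standard library.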
Wei |