Quality control for paired-end sequencing. The y-axis shows the number of pairs in each of the following categories: 1) good pair: the reads from both ends of a fragment map to the same chromosome, and their distance and orientation are consistent with the reference genome; 2) unpaired on the forward strand and unpaired on the reverse strand: orphan reads from one end of the sequencing and from the other end, respectively. The two ends are reported separately because, for some technologies, read efficiency and accuracy differ between the two ends; 3) different chromosome: the two ends of the same fragment map to different chromosomes of the reference genome; 4) wrong orientation: the two ends map to the same chromosome, but their relative orientation is inconsistent with the reference genome; 5) < defined range: pairs whose distance is shorter than the expected library fragment range; and 6) > defined range: pairs whose distance is longer than the expected library fragment range. In this example, more than one third of the pairs have a shorter-than-expected distance, indicating a library quality issue.
Dai et al. BMC Genomics 2010 11(Suppl 4):S7 doi:10.1186/1471-2164-11-S4-S7
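
The categories in the caption can be read as a decision procedure applied to each pair. The following is a minimal Python sketch of that reading, not the authors' implementation. The `ReadEnd` record, the `classify_pair` and `tally` functions, the label strings, the 200-500 bp default range, and the assumption that a consistent orientation means the leftmost end is on the "+" strand and the rightmost on the "-" strand are all hypothetical choices made for illustration. The "unmapped" label for pairs where neither end maps is also an addition; the caption does not describe that case.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass
class ReadEnd:
    """One end of a pair (hypothetical record, not a real library type)."""
    chrom: Optional[str]  # None when this end did not map
    pos: int = 0          # leftmost mapped coordinate
    strand: str = "+"     # "+" or "-"
    length: int = 0       # mapped read length


def classify_pair(end1: ReadEnd, end2: ReadEnd,
                  min_dist: int = 200, max_dist: int = 500) -> str:
    """Assign a pair to one of the caption's categories.

    min_dist and max_dist stand in for the expected library fragment
    range; the defaults are placeholders, not values from the paper.
    """
    if end1.chrom is None and end2.chrom is None:
        return "unmapped"  # not a caption category; added for completeness
    if end2.chrom is None:
        return "unpaired_forward"   # assumed: only the first end mapped
    if end1.chrom is None:
        return "unpaired_reverse"   # assumed: only the second end mapped
    if end1.chrom != end2.chrom:
        return "different_chromosome"

    # Assumed convention: ends face each other, "+" upstream of "-".
    left, right = sorted((end1, end2), key=lambda e: e.pos)
    if not (left.strand == "+" and right.strand == "-"):
        return "wrong_orientation"

    # Outer distance spanned by the two ends on the reference.
    distance = right.pos + right.length - left.pos
    if distance < min_dist:
        return "below_defined_range"
    if distance > max_dist:
        return "above_defined_range"
    return "good_pair"


def tally(pairs: Iterable[Tuple[ReadEnd, ReadEnd]]) -> Counter:
    """Count pairs per category, i.e. the bar heights in the plot."""
    return Counter(classify_pair(a, b) for a, b in pairs)


if __name__ == "__main__":
    example = [
        (ReadEnd("chr1", 100, "+", 50), ReadEnd("chr1", 350, "-", 50)),
        (ReadEnd("chr1", 100, "+", 50), ReadEnd("chr1", 120, "-", 50)),
        (ReadEnd("chr1", 100, "+", 50), ReadEnd("chr2", 350, "-", 50)),
    ]
    for category, count in tally(example).items():
        print(category, count)
```

The checks are ordered so that each pair falls into exactly one category: mapping status is tested first, then chromosome, then orientation, and distance is compared with the expected range only for pairs that pass the earlier tests.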