Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsinchu 30010, Taiwan

Department of Computer Science, National Tsing Hua University, Hsinchu 30013, Taiwan

Abstract

Background

Genome rearrangements are studied on the basis of genome-wide analysis of gene orders and are important in the evolution of species. In the last two decades, a variety of rearrangement operations, such as reversals, transpositions, block-interchanges, translocations, fusions and fissions, have been proposed to evaluate the differences between gene orders in two or more genomes. Usually, the computational studies of genome rearrangements are formulated as problems of sorting permutations by rearrangement operations.

Result

In this article, we study a sorting problem by cut-circularize-linearize-and-paste (CCLP) operations, which aims to find a minimum number of CCLP operations to sort a signed permutation representing a chromosome. A CCLP is a genome rearrangement operation that cuts a segment out of a chromosome, circularizes the segment into a temporary circle, linearizes the temporary circle as a linear segment, possibly inverts the linearized segment, and pastes it into the remaining chromosome. The CCLP operation can model many well-known rearrangements, such as reversals, transpositions and block-interchanges, as well as others not reported in the biological literature. In addition, it actually occurs in the immune response of higher animals. To distinguish the other CCLP operations from reversals, we call them non-reversal CCLP operations. In this study, we use permutation groups in algebra to design an

Conclusion

The algorithm we propose in this study is simple enough to be implemented easily with 1-dimensional arrays, and it is useful in studies of phylogenetic tree reconstruction and of the human immune response to tumors.

Background

Genome rearrangements are studied on the basis of genome-wide analysis of gene orders and are important in the evolution of species.

Recently, great attention has been paid to the study of genome rearrangements using block-interchanges, since block-interchanges contain transpositions as a special case and, currently, the computational models involving block-interchanges are more tractable than those involving transpositions. More recently, Yancopoulos et al. defined a double cut and join (DCJ) operation that can model all the rearrangement operations described previously.

**Illustration of a cut-circularize-linearize-and-paste operation.** A modified cut-circularize-linearize-and-paste operation that can model seven different kinds of rearrangement, where the cutting site of the temporary circle with genes 2, 3 and 4 can be either

• Case I – reversal:

As illustrated in Figure

• Case II – transposition:

The temporary circle is cut in a new place (e.g., the

• Case III – two consecutive, adjacent reversals:

The temporary circle is cut in a new place (e.g., the

• Case IV – transposition:

The temporary circle is cut in the same place as it was joined and then pasted back to the chromosome at a new site (e.g., the

• Case V – transversal:

The temporary circle is cut in the same place as it was joined, and then inverted and pasted back to the chromosome at a new site (e.g., the

• Case VI – block-interchange:

The temporary circle is cut in a new place (e.g., the

• Case VII – two consecutive, overlapping reversals:

The temporary circle is cut in a new place (e.g., the

All these seven rearrangements described above are simply called
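The cut, circularize, linearize and paste steps described above can be sketched on a list of signed integers. This is only an illustrative sketch under stated assumptions, not the algorithm of this article: the function name `cclp`, its parameters (`start`, `end`, `cut`, `invert`, `paste_at`) and the 0-based list indexing are hypothetical choices introduced here.

```python
def cclp(chromosome, start, end, cut=0, invert=False, paste_at=None):
    """Apply one cut-circularize-linearize-and-paste step to a signed gene list.

    Hypothetical helper (not from the article):
    - chromosome[start:end] is cut out and closed into a temporary circle;
    - the circle is reopened `cut` positions after the place where it was joined
      (cut=0 reopens it at the same place);
    - the linear segment is optionally inverted (order reversed, signs flipped);
    - it is pasted into the remaining chromosome at index `paste_at`
      (default: the site it was cut from).
    """
    segment = chromosome[start:end]
    if not segment:
        raise ValueError("the segment to cut must be non-empty")
    remainder = chromosome[:start] + chromosome[end:]

    # Linearize: rotating the list is equivalent to cutting the circle elsewhere.
    k = cut % len(segment)
    linear = segment[k:] + segment[:k]

    if invert:
        linear = [-gene for gene in reversed(linear)]

    site = start if paste_at is None else paste_at
    if not 0 <= site <= len(remainder):
        raise ValueError("paste_at is outside the remaining chromosome")
    return remainder[:site] + linear + remainder[site:]


chromosome = [1, 2, 3, 4, 5, 6]  # the temporary circle holds genes 2, 3 and 4

reversal = cclp(chromosome, 1, 4, invert=True)             # [1, -4, -3, -2, 5, 6]
new_cut = cclp(chromosome, 1, 4, cut=1)                    # [1, 3, 4, 2, 5, 6]
new_site = cclp(chromosome, 1, 4, paste_at=2)              # [1, 5, 2, 3, 4, 6]
new_cut_and_site = cclp(chromosome, 1, 4, cut=1, paste_at=2)  # [1, 5, 3, 4, 2, 6]
```

In this toy example, the same cut with inversion at the original site acts as a reversal, a new cut or a new paste site alone acts as a transposition, and a new cut combined with a new paste site exchanges the non-adjacent blocks 2 and 5, that is, a block-interchange.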

Preliminaries

Below, we introduce some definitions about the basics of permutation groups, as well as a couple of lemmas from Huang and Lu **1** = (1)(2)⋯(

The ^{–1}, is a permutation such that ^{–1} = ^{–1}**1**. The ^{–1}.

As demonstrated in _{1}, _{2}) be a 2-cycle and _{1} and _{2} belong to the same cycle of _{1} and _{2} belong to two different cycles of

In fact, any permutation _{c}_{c}_{c}_{c}^{–1}|| = ||
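The split and join effects of a 2-cycle, and the norm of a permutation, can be checked numerically. The sketch below is illustrative only and rests on two assumptions: the composition convention (αβ)(x) = α(β(x)), and the standard fact that the minimum number of 2-cycles needed to express a permutation equals the number of elements minus the number of its disjoint cycles (fixed points counted as cycles). The helper names `cycles`, `norm`, `compose`, `from_cycle` and `two_cycle` are hypothetical.

```python
def cycles(perm):
    """Return the disjoint cycles of a permutation stored as a dict x -> perm[x]."""
    seen, result = set(), []
    for x in sorted(perm):
        if x in seen:
            continue
        cycle, y = [], x
        while y not in seen:
            seen.add(y)
            cycle.append(y)
            y = perm[y]
        result.append(tuple(cycle))
    return result


def norm(perm):
    """Minimum number of 2-cycles whose product is perm: |E| minus number of cycles."""
    return len(perm) - len(cycles(perm))


def compose(alpha, beta):
    """Assumed convention: (alpha beta)(x) = alpha(beta(x))."""
    return {x: alpha[beta[x]] for x in beta}


def from_cycle(cycle, elements):
    """Build the permutation on `elements` that consists of one given cycle."""
    perm = {x: x for x in elements}
    for i, x in enumerate(cycle):
        perm[x] = cycle[(i + 1) % len(cycle)]
    return perm


def two_cycle(a, b, elements):
    """The 2-cycle (a, b) as a permutation on `elements`."""
    return from_cycle((a, b), elements)


E = [1, 2, 3, 4, 5]
beta = from_cycle((1, 2, 3, 4), E)          # beta = (1, 2, 3, 4)(5)

split = compose(two_cycle(1, 3, E), beta)   # 1 and 3 lie in the same cycle of beta
join = compose(two_cycle(1, 5, E), beta)    # 1 and 5 lie in different cycles of beta

print(cycles(beta), norm(beta))    # [(1, 2, 3, 4), (5,)] 3
print(cycles(split), norm(split))  # [(1, 2), (3, 4), (5,)] 2
print(cycles(join), norm(join))    # [(1, 2, 3, 4, 5)] 4
```

Multiplying by a 2-cycle therefore changes the norm by exactly one: a split lowers it and a join raises it.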

**Lemma 1**_{1}, _{2}, _{k}_{1}, _{2}, _{k} appear in the same cycle of β in the order of e_{1}, _{2}, _{k} if and only if_{1}, _{2}, …, _{k}

It is required to further extend the definition of ^{2} = **1** and Γ^{–1} = Γ. If a cycle contains no ^{+} denote a strand of a DNA molecule ^{–} = Γ · (^{+})^{–1} is the ^{+}, representing another strand of ^{+} and ^{–} are disjoint. For the purpose of modeling reversals using the permutation groups, the DNA molecule ^{+} and ^{–} (i.e., ^{+}^{–}^{–}^{+}), as demonstrated in
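The relation between the two strands can be illustrated with a small computation. This is a hedged sketch, assuming that Γ = (1, −1)(2, −2)⋯(n, −n), that products are composed as (αβ)(x) = α(β(x)), and that the reverse-complement strand is obtained by conjugating the inverse of the first strand with Γ. All helper names are hypothetical.

```python
def from_cycle(cycle, elements):
    """Build the permutation on `elements` that consists of one given cycle."""
    perm = {x: x for x in elements}
    for i, x in enumerate(cycle):
        perm[x] = cycle[(i + 1) % len(cycle)]
    return perm


def inverse(perm):
    """Inverse permutation: swap the roles of keys and values."""
    return {y: x for x, y in perm.items()}


def compose(alpha, beta):
    """Assumed convention: (alpha beta)(x) = alpha(beta(x))."""
    return {x: alpha[beta[x]] for x in beta}


def cycle_through(perm, start):
    """List the cycle of perm that contains `start`, beginning at `start`."""
    out, y = [start], perm[start]
    while y != start:
        out.append(y)
        y = perm[y]
    return tuple(out)


n = 3
E = [sign * i for i in range(1, n + 1) for sign in (1, -1)]

gamma = {x: -x for x in E}                  # assumed: Gamma = (1,-1)(2,-2)(3,-3)
identity = {x: x for x in E}

pi_plus = from_cycle((1, -2, 3), E)         # one strand, as a cycle on E
pi_minus = compose(gamma, compose(inverse(pi_plus), gamma))

print(compose(gamma, gamma) == identity)    # True: Gamma is its own inverse
print(cycle_through(pi_minus, -3))          # (-3, 2, -1)
```

The second strand lists the same genes in reverse order with opposite signs, and it moves none of the elements moved by the first strand, so the two cycles are disjoint.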

**Lemma 2 **^{–1}^{–1}^{–1}.

Actually, ^{–1} are ^{–1} according to Lemma 2.

**Lemma 3**

Note that in Lemma 3, (

**Lemma 4. **

Algorithmic result

In this section, we design an efficient algorithm on the basis of the permutation groups that sorts a given chromosome ^{–1} and ^{–1} such that these 2-cycles perform as a sequence of CCLP operations to optimally transform

**Lemma 5. **^{+}^{–}^{–1}^{–1}^{–1}

^{+}^{–}_{1}, _{2}, _{n}_{n}_{n}_{–1}, …, –_{1}). The assumption (^{+}, and hence ^{–}. Hence, both (_{i}_{i}_{+1} <_{k}_{–1} is the maximum in _{k}_{–1} + 1 is not in ^{–1}(_{1}) = _{k}_{–1}) = _{k}_{–1} + 1, meaning that _{1} and _{k}_{–1} + 1 are adjacent in ^{–1}. In other words, there are two adjacent elements _{1} and _{k–}_{1} + 1 in ^{–1} such that (_{1},_{k}_{–1} + 1), as well as (_{k}_{–1} + 1), _{1})), acts on _{k}_{–1} + 1),_{1}))(_{1}, _{k}_{–1} + 1)_{k}_{–1} + 1),_{1}))(_{1}, _{k}_{–1} + 1)^{–1} such that (_{k}_{–1} + 1 must be in _{k}_{–1} + 1),_{1}))(_{1},_{k}_{–1} + 1)_{1}, _{k}_{–1} + 1) is not admissible. Further suppose that _{j}_{j}_{j}_{j}_{k}_{–1} + 1 (since _{k}_{–1} + 1 is also in _{j}_{–1} and _{j}^{–1} because ^{–1}(_{j}_{–1}) = _{j}

**Case 1.**_{j}_{j}_{j}_{j}_{j}_{j}_{k}_{–1} cannot be the maximum in _{j}_{k}_{–1} + 1 and hence –_{j}_{k}_{–1} which contradicts our assumption that _{k}_{–1} is the maximum in _{j}_{j}_{–1}, _{j}_{j}_{j}_{–1}))(_{j}_{–1},_{j}

**Case 2.**_{j}_{j}_{j}_{j}_{j}_{j}_{–1},_{j}_{j}_{j–}_{1}))(–_{j}_{–1}, _{j}

**Case 3. **_{j}_{j}_{j}_{j}_{j}_{j}_{–1},_{j}_{j}_{j}_{–1}))(–_{j}_{–1}, _{j}

**Case 4.**_{j}^{–1} such that (

**Case 5. **_{j}_{j}

According to the above discussion, we have completed the proof of this lemma.

**Theorem 1.**

^{–1} can be expressed as a composition of 2^{–1}|| ≤ 2^{–1}|| = |_{c}^{–1}), which is based on the lemma proposed in _{c}^{–1}) ≤ 2

Assume that there are at least two adjacent elements ^{–1} such that (^{–1} to rearrange ^{–1} such that (^{–1} by 4 and a reversal by 2. Since non-reversal CCLP operations are weighted 2 and reversals are weighted 1, Algorithm 1 decreases the norm of ^{–1} by 1 at the weight of

**Theorem 2. **

^{–1} and ^{–1}

It is worth mentioning here that our algorithm is applicable to both circular and linear chromosomes. Actually, using a discussion similar to that in

Conclusion

In this article, we have introduced and studied the sorting problem by CCLP operations, where CCLP is a cut-circularize-linearize-and-paste operation that can model several known and unknown rearrangements. In addition, we have proposed an

Competing interests

The authors declare that they have no competing interests.

Authors’ contributions

CLL conceived of this study, designed and analyzed its algorithm and drafted the manuscript. KHH and KTC participated in the design and analysis of the algorithm and the draft of the manuscript. All authors read and approved the final manuscript.

Acknowledgements

This article has been published as part of