NP-hardness proof
The proof has the following structure. First, the Common Superstring with Multiplicities (CSM) problem is formulated. This problem is shown to be NP-hard by reducing the Shortest Common Superstring (SCS) problem to it. Then CSM is reduced to the de Bruijn Superwalk with Multiplicities problem.
Let S be a string over an alphabet ∑, and let L_{c}(S) denote the number of occurrences of the character c ∈ ∑ in S. The Common Superstring with Multiplicities problem is the following: given strings S_{1}, S_{2}, ..., S_{n} and nonnegative integers l_{c} for all c ∈ ∑ (given in unary notation), decide whether there exists a string S such that:
 all strings S_{1}, S_{2}, ..., S_{n} are substrings of S,
 L_{c} (S) = l_{c} for each c ∈ ∑.
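The two conditions can be checked mechanically for a candidate string. The following is a minimal sketch, not part of the original proof; the function name `is_csm_certificate` and the dictionary representation of the integers l_{c} are assumptions made for illustration.

```python
from collections import Counter


def is_csm_certificate(candidate, strings, counts):
    """Check the two CSM conditions for a candidate string.

    `strings` are S_1, ..., S_n and `counts` maps each character c of the
    alphabet to the required number of occurrences l_c.
    """
    # Condition 1: every S_i is a substring of the candidate.
    if any(s not in candidate for s in strings):
        return False
    occurrences = Counter(candidate)
    # The candidate may only use characters of the alphabet (the keys of counts).
    if any(c not in counts for c in occurrences):
        return False
    # Condition 2: L_c(candidate) = l_c for each character c.
    return all(occurrences[c] == counts[c] for c in counts)
```

For instance, "0011" contains both "01" and "11" and has two zeroes and two ones, so it is a certificate for the instance ({"01", "11"}, l_{0} = l_{1} = 2).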
Theorem 1. The Common Superstring with Multiplicities problem is NP-hard for |∑| = 2.
Proof. To prove this, we take an instance of the Shortest Common Superstring problem with ∑ = {0, 1}, which is NP-hard [8], and transform it into an instance of the Common Superstring with Multiplicities problem with the same answer. Let the original instance of the SCS problem be {S'_{1}, S'_{2}, ..., S'_{n}}, l' (this instance asks whether there exists a superstring of S'_{1}, S'_{2}, ..., S'_{n} having length at most l').
Let us define T_{0} = 000111 and T_{1} = 001011. These strings have been selected in such a way that each of them contains the same number of zeroes and ones and they do not overlap: no nonempty proper suffix of any T_{c} (c ∈ {0, 1}) is equal to a proper prefix of any T_{c} (c ∈ {0, 1}).
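Both properties of T_{0} and T_{1} are small enough to verify exhaustively. The sketch below does so; the helper name `has_border` is an illustrative assumption, not terminology from the paper.

```python
T = {"0": "000111", "1": "001011"}


def has_border(a, b):
    """True if some nonempty proper suffix of a equals a proper prefix of b."""
    limit = min(len(a), len(b)) - 1
    return any(a[-i:] == b[:i] for i in range(1, limit + 1))


# Each block contains three zeroes and three ones.
balanced = all(t.count("0") == t.count("1") == 3 for t in T.values())

# No block overlaps itself or the other block, in either order.
non_overlapping = not any(
    has_border(a, b) for a in T.values() for b in T.values()
)
```

Both `balanced` and `non_overlapping` evaluate to true for these two blocks.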
Then, let S_{k} = T(S'_{k}) and l_{0} = l_{1} = 3l', where T(c_{1}c_{2}...c_{k}) = T_{c_{1}}T_{c_{2}}...T_{c_{k}}. The following lemmas formulate several properties of these instances of the SCS and CSM problems. Equivalence of these instances is shown in Lemmas 3 and 7.
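The transformation itself is a character-by-character substitution. A minimal sketch follows, assuming strings are represented as Python `str` values over "0" and "1"; the names `encode` and `scs_to_csm` are hypothetical.

```python
T_BLOCKS = {"0": "000111", "1": "001011"}


def encode(s):
    """The map T: replace every character c of s by the block T_c."""
    return "".join(T_BLOCKS[c] for c in s)


def scs_to_csm(strings, max_len):
    """Build the CSM instance (S_1..S_n, l_0, l_1) from the SCS instance
    (S'_1..S'_n, l'), with S_k = T(S'_k) and l_0 = l_1 = 3 l'."""
    encoded = [encode(s) for s in strings]
    counts = {"0": 3 * max_len, "1": 3 * max_len}
    return encoded, counts
```

For the SCS instance ({"01", "10"}, l' = 3), this yields the strings 000111001011 and 001011000111 together with l_{0} = l_{1} = 9.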
Lemma 1. L_{0}(T(S')) = L_{1}(T(S')) = 3|S'|.
Proof. It follows directly from the definition of T.
Lemma 2. If S'_{1} is a substring of S'_{2}, then T(S'_{1}) is a substring of T(S'_{2}).
Proof. It follows directly from the definition of T.
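Lemmas 1 and 2 can be spot-checked on all short binary strings. The sketch below is only a sanity check under that restriction, not a replacement for the proofs; all function names are illustrative.

```python
from itertools import product

T_BLOCKS = {"0": "000111", "1": "001011"}


def encode(s):
    """The map T: replace every character c of s by the block T_c."""
    return "".join(T_BLOCKS[c] for c in s)


def binary_strings(max_len):
    """All binary strings of length 0..max_len."""
    for n in range(max_len + 1):
        for bits in product("01", repeat=n):
            yield "".join(bits)


def lemma1_holds(s):
    """L_0(T(s)) = L_1(T(s)) = 3|s|."""
    t = encode(s)
    return t.count("0") == t.count("1") == 3 * len(s)


def lemma2_holds(a, b):
    """If a is a substring of b, then T(a) is a substring of T(b)."""
    return a not in b or encode(a) in encode(b)
```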
Lemma 3. If the answer for the original instance of the SCS problem is positive, then the answer for the instance of the CSM problem is also positive.
Proof. If the answer for the instance of the SCS problem is positive, then there exists a string S' of length l'' ≤ l' such that S' is a superstring of S'_{1}, S'_{2}, ..., S'_{n}. Then, let S = T(S'0^{l' - l''}). Because |S'0^{l' - l''}| = |S'| + |0^{l' - l''}| = l'' + (l' - l'') = l', we have L_{0}(S) = L_{1}(S) = 3l' (see Lemma 1). Moreover, all S_{i} are substrings of T(S') (see Lemma 2), which is a prefix of S. Hence the answer to the instance of CSM is indeed positive.
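The padding construction from this proof can be written out directly. This is a minimal sketch; the function name `csm_certificate_from_scs` is an assumption introduced here.

```python
T_BLOCKS = {"0": "000111", "1": "001011"}


def encode(s):
    """The map T: replace every character c of s by the block T_c."""
    return "".join(T_BLOCKS[c] for c in s)


def csm_certificate_from_scs(superstring, max_len):
    """Given a superstring S' with |S'| <= l', return S = T(S' 0^(l' - |S'|))."""
    padding = max_len - len(superstring)
    if padding < 0:
        raise ValueError("superstring is longer than the allowed length")
    return encode(superstring + "0" * padding)
```

For example, with S'_{1} = 01, S'_{2} = 10, l' = 4 and the superstring S' = 010, the result is T(0100), a string of length 24 with twelve zeroes and twelve ones that contains both T(01) and T(10).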
Lemma 4. Let S'_{1} and S'_{2} be two strings such that there is a suffix of T(S'_{1}) equal to a prefix of T(S'_{2}). Then the following holds:
 the length of that suffix is a multiple of 6,
 if the length of the suffix is 6k, then the suffix of length k of S'_{1} is equal to the prefix of length k of S'_{2}.
Proof. Suppose that the length of the suffix is equal to 6k + i, 0 < i < 6. Let c_{1} be the last character of S'_{1} and c_{2} be the character at the (k + 1)th position of S'_{2} (positions are numbered starting from one). Then, the suffix of T_{c_{1}} of length i would be equal to the prefix of T_{c_{2}} of the same length.
As mentioned before, no nonempty proper suffix of any T_{c} (c ∈ {0, 1}) is equal to a proper prefix of any T_{c} (c ∈ {0, 1}), which is a contradiction. Therefore, the length of the suffix is a multiple of 6. The second statement follows from T_{0} and T_{1} both having length 6 and T_{0} ≠ T_{1}.
Lemma 5. Let S'_{1} and S'_{2} be two strings such that T(S'_{1}) is a substring of T(S'_{2}). Then the following statements hold:
 each occurrence of T(S'_{1}) in T(S'_{2}) starts at a position which is congruent to 1 modulo 6,
 if T(S'_{1}) occurs at position 6k + 1 in T(S'_{2}), then S'_{1} occurs as a substring of S'_{2} at position k + 1.
Proof. The proof is analogous to that of Lemma 4.
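As with Lemma 4, the statements of Lemma 5 can be checked exhaustively on short strings. The sketch below uses 0-based indices, so an occurrence at position 6k + 1 in the text corresponds to index 6k in the code; the function names are assumptions.

```python
from itertools import product

T_BLOCKS = {"0": "000111", "1": "001011"}


def encode(s):
    """The map T: replace every character c of s by the block T_c."""
    return "".join(T_BLOCKS[c] for c in s)


def occurrences(pattern, text):
    """All 0-based start indices at which pattern occurs in text."""
    return [
        i for i in range(len(text) - len(pattern) + 1) if text.startswith(pattern, i)
    ]


def check_lemma5(max_a=3, max_b=4):
    """Check Lemma 5 for nonempty binary strings a (up to max_a) and b (up to max_b)."""
    def words(max_len):
        return [
            "".join(p) for n in range(1, max_len + 1) for p in product("01", repeat=n)
        ]

    for a in words(max_a):
        for b in words(max_b):
            for pos in occurrences(encode(a), encode(b)):
                # 0-based index 6k corresponds to 1-based position 6k + 1.
                if pos % 6 != 0:
                    return False
                k = pos // 6
                if b[k:k + len(a)] != a:
                    return False
    return True
```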
Lemma 6. Let S'_{1}, S'_{2}, ..., S'_{n} be a set of strings, and let S be a superstring of T(S'_{1}), T(S'_{2}), ..., T(S'_{n}) such that T(S'_{1}), T(S'_{2}), ..., T(S'_{n}) occur in S at positions i_{1}, i_{2}, ..., i_{n} respectively (if some T(S'_{k}) occurs in S at multiple positions, only one position is recorded) and every character of S is covered by at least one of those occurrences. Then the following statements hold:
 i_{1}, i_{2}, ..., i_{n} are all congruent to 1 modulo 6,
 the length of S is a multiple of 6,
 there exists a string S' such that S = T(S'). Strings S'_{1}, S'_{2}, ..., S'_{n} occur in S' at positions i'_{1}, i'_{2}, ..., i'_{n}, where i_{k} = 6i'_{k} - 5 for k = 1, 2, ..., n.
Proof. Suppose the contrary. Let i_{k} be the smallest of i_{1}, i_{2}, ..., i_{n} which is not congruent to 1 modulo 6. Then, if the i_{k}th character of S is covered by some T(S'_{k'}) such that i_{k'} < i_{k}, we have a contradiction because i_{k'} is not congruent to i_{k} modulo 6 (by the minimality of i_{k}, i_{k'} is congruent to 1 modulo 6), but either T(S'_{k}) and T(S'_{k'}) overlap, or T(S'_{k}) is a substring of T(S'_{k'}), which would violate either Lemma 4 or Lemma 5. If the i_{k}th character of S is not covered by any T(S'_{k'}) such that i_{k'} < i_{k}, then the (i_{k} - 1)th character of S must be covered by the last character of some T(S'_{k'}). But the length of T(S'_{k'}) is a multiple of 6, so i_{k} must be congruent to i_{k'} modulo 6, which leads to a contradiction.
The last character of S is also covered by the last character of some T(S'_{k}). Because i_{k} is congruent to 1 modulo 6 and the length of T(S'_{k}) is a multiple of 6, the length of S is also a multiple of 6.
To prove the last point, it is enough to notice that for j = 1, 7, ..., |S| - 5, the substring of S starting at position j and having length 6 is either T_{0} or T_{1}. This follows from the fact that the jth character of S is covered by an occurrence of T(S'_{k}) for some k, but restrictions on the lengths of T(S'_{k}) and on i_{k} mean that the whole substring of length 6 would be covered by T(S'_{k}). Moreover, the position at which the substring of length 6 occurs in T(S'_{k}) is congruent to 1 modulo 6, therefore that substring is either T_{0} or T_{1} by the definition of T.
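The last statement of Lemma 6 says that S can be cut into blocks of length 6, each equal to T_{0} or T_{1}. A minimal sketch of the corresponding inverse map follows; the name `decode` and the convention of returning `None` for strings not of the form T(S') are assumptions.

```python
T_BLOCKS = {"0": "000111", "1": "001011"}
DECODE = {block: c for c, block in T_BLOCKS.items()}


def encode(s):
    """The map T: replace every character c of s by the block T_c."""
    return "".join(T_BLOCKS[c] for c in s)


def decode(s):
    """Return S' with T(S') = s, or None if s is not of that form."""
    if len(s) % 6 != 0:
        return None
    # Blocks start at 0-based indices 0, 6, 12, ... (positions 1, 7, 13, ...).
    blocks = [s[j:j + 6] for j in range(0, len(s), 6)]
    if any(block not in DECODE for block in blocks):
        return None
    return "".join(DECODE[block] for block in blocks)
```

For example, S = T(010) decodes to S' = 010. In S, T(10) occurs at position 7 = 6 · 2 - 5, and 10 occurs in S' at position 2, matching the relation between i_{k} and i'_{k}.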
Lemma 7. If the answer for the instance of the CSM problem is positive, then the answer for the original instance of the SCS problem is also positive.
Proof. If the answer for the instance of the CSM problem is positive, then there exists a string S of length l_{0} + l_{1} = 6l' which is a superstring of S_{1}, S_{2}, ..., S_{n}. Let S'' be the shortest common superstring of these strings. Then |S''| ≤ 6l' and each character of S'' is covered by an occurrence of one of S_{1}, S_{2}, ..., S_{n}. Recall that S_{k} = T(S'_{k}). By Lemma 6, there exists a string S' such that S'' = T(S') and S'_{1}, S'_{2}, ..., S'_{n} are substrings of S'. Also, the relation |S'| = |S''|/6 ≤ 6l'/6 = l' holds. Therefore, the answer for the original instance of the SCS problem is also positive.
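Together, Lemmas 3 and 7 state that the two instances have the same answer. On very small inputs this can be observed by brute force, as in the sketch below. Both solvers are exponential-time illustrations written for this example only; they are not algorithms from the paper.

```python
from itertools import combinations, product

T_BLOCKS = {"0": "000111", "1": "001011"}


def encode(s):
    """The map T: replace every character c of s by the block T_c."""
    return "".join(T_BLOCKS[c] for c in s)


def scs_answer(strings, max_len):
    """Brute force: is there a binary superstring of length at most max_len?"""
    for n in range(max_len + 1):
        for bits in product("01", repeat=n):
            candidate = "".join(bits)
            if all(s in candidate for s in strings):
                return True
    return False


def csm_answer(strings, zeros, ones):
    """Brute force: is there a superstring with exactly the given character counts?"""
    n = zeros + ones
    for one_positions in combinations(range(n), ones):
        chars = ["0"] * n
        for p in one_positions:
            chars[p] = "1"
        candidate = "".join(chars)
        if all(s in candidate for s in strings):
            return True
    return False
```

For the strings 01 and 10, both problems have a negative answer for l' = 2 and a positive answer for l' = 3.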
Theorem 2. The de Bruijn Superwalk with Multiplicities problem is NP-hard for any fixed |∑| ≥ 2 and any positive integer k.
Proof. Consider the graph with one vertex and two loops (see Figure 1). An instance of the Common Superstring with Multiplicities problem with ∑ = {0, 1} can be translated into an instance of the Superwalk with Multiplicities problem on this graph in the following way:
Figure 1. A graph on which the Common Superwalk with Multiplicities problem is NP-hard.
 each S_{k} is directly translated into a walk, by representing 0 as an occurrence of edge 0 and 1 as an occurrence of edge 1 in the walk,
 the multiplicity of edge 0 is set to l_{0}, and the multiplicity of edge 1 is set to l_{1}.
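The translation described in these two steps is a relabelling of characters as edges. A minimal sketch, assuming walks are represented as lists of edge labels and the instance is given as in the CSM definition; the name `csm_to_superwalk` is hypothetical.

```python
def csm_to_superwalk(strings, counts):
    """Translate a binary CSM instance into walks on the graph with one
    vertex and two loops, labelled 0 and 1.

    Returns the list of walks (each a list of edge labels) and the required
    multiplicity of each edge.
    """
    # Character 0 becomes a traversal of loop 0, character 1 of loop 1.
    walks = [[int(c) for c in s] for s in strings]
    multiplicities = {0: counts["0"], 1: counts["1"]}
    return walks, multiplicities
```

Because the graph has a single vertex, every sequence of edge labels is a valid walk, so substrings correspond exactly to subwalks.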
To complete the proof, we need to embed this graph into a de Bruijn graph with the given k. This can be done in a straightforward manner (see Figure 2). Edge 0 is mapped to a loop, while edge 1 is mapped to a cycle of length k + 1.
Figure 2. Embedding of the graph from Figure 1 into a de Bruijn graph.
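One way to realize such an embedding is sketched below. Since the figure is not reproduced here, the details are assumptions: vertices of the de Bruijn graph are taken to be binary k-mers and edges (k + 1)-mers, the loop is placed at the vertex 0^k, and the cycle of length k + 1 runs through the rotations of 0^k1. The function names are hypothetical.

```python
def embedded_edges(k):
    """Return the loop edge and the cycle edges as (k+1)-mers.

    An edge e leads from vertex e[:k] to vertex e[1:]. The cycle starts and
    ends at vertex 0^k and consists of the k + 1 rotations of 0^k 1.
    """
    loop = "0" * (k + 1)
    base = "0" * k + "1"
    cycle = [base[i:] + base[:i] for i in range(k + 1)]
    return loop, cycle


def embed_walk(bits, k):
    """Map a walk on the one-vertex graph (a binary string) to a list of
    de Bruijn edges: 0 becomes the loop, 1 becomes one pass of the cycle."""
    loop, cycle = embedded_edges(k)
    edges = []
    for c in bits:
        edges.extend([loop] if c == "0" else cycle)
    return edges


def is_walk(edges, k):
    """Check that consecutive edges share a vertex."""
    return all(e[1:] == f[:k] for e, f in zip(edges, edges[1:]))
```

Under these assumptions, for k = 2 the string 01 maps to the edge sequence 000, 001, 010, 100. The loop edge would receive multiplicity l_{0} and every edge of the cycle multiplicity l_{1}.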