Table 4

Analysis of the 12 proteins from the UniProtKB/Swiss-Prot database with a very long fuzzy tandem repeat

| Protein Acc # | Protein ID | Length (aa) | PTRStalker | XSTREAM | T-REKS | TRUST | RADAR |
|---|---|---|---|---|---|---|---|
| Q8IVF2 | AHNK2_HUMAN | 5795 | 165-x-24 [720-4666] | 165-x-23 [774-4617] | * | ** | 163-x-31 [289-5529] |
| Q9N4M4 | ANC1_CAEEL | 8545 | 915-x-6 [3000-8491] | 903-x-4.27 [4342-8199] | 58-x-4 [2336-2567] | ** | *** |
| P08519 | APOA_HUMAN | 4548 | 1495-x-3 [0-4486] | 114-x-37 [7-4220] | 114-x-24 [1501-4125] | 114-x-39 [18-4523] | 111-x-38 [17-4282] |
| P20930 | FILA_HUMAN | 4061 | 1339-x-3 [32-4051] | 323-x-11 [268-3902] | * | 324-x-12 [82-3935] | *** |
| Q54CU4 | COLA_DICDI | 11103 | 433-x-17 [1175-8554] | 430-x-17 [1257-8691] | * | ** | 424-x-22 [301-9409] |
| Q8R0W0 | EPIPL_MOUSE | 6548 | 515-x-8 [2000-6548] | 515-x-8 [2067-6529] | * | ** | *** |
| Q9Y6R7 | FCGBP_HUMAN | 5405 | 1367-x-3 [1000-5102] | 1201-x-3 [1100-4811] | * | 1201-x-5 [21-5405] | 394-x-13 [444-5382] |
| P05790 | FIBH_BOMMO | 5263 | 1049-x-5 [1-5247] | 168-x-30 [152-5221] | 8-x-19 [3362-3495] | ** | *** |
| Q9UKN1 | MUC12_HUMAN | 5478 | 1548-x-3 [74-4719] | 1557-x-2 [446-3569] | 25-x-8 [2049-2280] | 28-x-151 [215-5123] | *** |
| Q8WXI7 | MUC16_HUMAN | 22152 | 156-x-61 [12038-21555] | 156-x-61 [12047-21567] | 156-x-17 [12420-15000] | ** | 153-x-61 [12046-21559] |
| Q6PZE0 | MUC19_MOUSE | 7524 | 652-x-9.6 [1071-7372] | 163-x-36.4 [1281-7214] | * | ** | *** |
| Q8WZ42 | TITIN_HUMAN | 34350 | 1082-x-4 [22186-26525] | 28-x-6 [11428-11596] | 10-x-26 [11445-11686] | ** | 395-x-28 [20001-29694] |

Analysis of the 12 proteins from the UniProtKB/Swiss-Prot database for which a tandem repeat (TR) of length ≥ 4000 aa has been detected by PTRStalker. For each protein (row) and each algorithm that returned at least one result (column), we report the longest TR found by that algorithm above the threshold of 100 aa. For each TR we report the period -x- repeat number and the [interval spanned]. Failure to report a TR is marked with "*". HHRep and HHRepID are not listed here because they report only pairs of homologous substrings and therefore cannot report multi-repeating units. Entries in the TRUST column marked "**" could not be completed because of the excessive memory required (see Additional file 1). Entries in the RADAR column marked "***" correspond to the absence of a TR cluster in the output, although many interspersed repeats may be found.
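To make the entry notation concrete, the following is a minimal sketch of how one table entry could be parsed, for example 165-x-24 with interval [720-4666] (Q8IVF2, PTRStalker). The function name `parse_tr` and the returned field names are hypothetical and not part of any of the tools listed. The sketch also assumes that the interval end points are inclusive, which the caption does not state.

```python
import re

# Hypothetical pattern for an entry written as "period-x-copies [start-end]".
# The repeat number may be fractional (e.g. 4.27); stray spaces are tolerated.
ENTRY = re.compile(r"(\d+)-x-(\d+(?:\.\d+)?)\s*\[(\d+)\s*-\s*(\d+)\]")


def parse_tr(cell):
    """Parse one table entry; return None for '*', '**', '***' or other text."""
    match = ENTRY.fullmatch(cell.strip())
    if match is None:
        return None
    period = int(match.group(1))
    copies = float(match.group(2))
    start, end = int(match.group(3)), int(match.group(4))
    return {
        "period": period,
        "copies": copies,
        "start": start,
        "end": end,
        # Length implied by the notation: period times repeat number.
        "nominal_length": period * copies,
        # Length of the reported interval, ASSUMING inclusive end points.
        "span": end - start + 1,
    }


if __name__ == "__main__":
    print(parse_tr("165-x-24 [720-4666]"))
    print(parse_tr("**"))
```

Comparing `nominal_length` with `span` gives a rough consistency check on an entry. For fuzzy repeats the two values need not agree exactly, since individual copies can differ in length.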

Pellegrini et al. BMC Bioinformatics 2012 13(Suppl 3):S8   doi:10.1186/1471-2105-13-S3-S8
