Table 1

Summary statistics for the mosaic classes of the human genome

Class pair   A       C       G       T       Mean length (bases)   CpG proportion   Proportion of genome
(1)          (2)     (3)     (4)     (5)     (6)                   (7)              (8)
1            0.169   0.234   0.104   0.492   5.4                   0.006            0.143
2            0.230   0.220   0.243   0.307   84.7                  0.011            0.134
3            0.314   0.139   0.118   0.428   71.5                  0.003            0.132
4            0.235   0.281   0.133   0.351   64.9                  0.005            0.120
5            0.279   0.175   0.185   0.361   55.3                  0.004            0.116
6            0.215   0.165   0.188   0.432   504.7                 0.005            0.092
7            0.183   0.336   0.205   0.276   55.6                  0.009            0.069
8            0.177   0.392   0.194   0.237   33.6                  0.029            0.044
9            0.167   0.379   0.236   0.218   48.2                  0.027            0.042
10           0.145   0.242   0.309   0.303   18.2                  0.017            0.022
11           0.199   0.453   0.240   0.108   14.4                  0.056            0.017
12           0.175   0.021   0.079   0.725   14.4                  0.000            0.014
13           0.252   0.359   0.126   0.263   10.5                  0.003            0.012
14           0.125   0.454   0.252   0.169   25.8                  0.096            0.009
15           0.217   0.357   0.124   0.301   71.4                  0.010            0.008
16           0.223   0.045   0.261   0.471   14.0                  0.007            0.007
17           0.003   0.066   0.006   0.925   11.9                  0.000            0.006
18           0.099   0.402   0.071   0.429   5.1                   0.024            0.006
19           0.021   0.434   0.023   0.522   33.5                  0.003            0.005


Each row describes a pair of classes, referred to as the a and b classes, each with the reverse-complement properties of the other. The a class is the one with the higher proportion of T+C, and its base frequencies are shown in columns 2 to 5. The frequencies for the b class are the same but with the A and T frequencies interchanged and the C and G frequencies interchanged. Column 6 gives the mean length of the class in bases and column 7 gives the proportion of doublets within the class that are CpG; these two quantities are the same for both the a and b classes. Column 8 gives the proportion of bases in the genome that fall within either class of the pair, so column 8 totals 1.0 up to rounding. The class pairs are numbered in decreasing order of the proportion of bases they contain. All values have been calculated from the fitted HMM, whose parameters are given in Additional File 1.
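As an illustration of how the b-class frequencies follow from a row of the table, the short Python sketch below swaps the A/T and C/G frequencies of class pair 1 and sums column 8. It is not part of the original analysis: the function and variable names are hypothetical, and the numbers are simply copied from the table.

```python
from typing import Dict


def b_class_frequencies(a_class: Dict[str, float]) -> Dict[str, float]:
    """Return the b-class base frequencies for a given a class.

    The b class is the reverse complement of the a class, so the
    A and T frequencies swap and the C and G frequencies swap.
    """
    return {
        "A": a_class["T"],
        "C": a_class["G"],
        "G": a_class["C"],
        "T": a_class["A"],
    }


# Class pair 1, a class (columns 2 to 5 of the first row of the table).
class_1a = {"A": 0.169, "C": 0.234, "G": 0.104, "T": 0.492}
class_1b = b_class_frequencies(class_1a)

# The a class is, by definition, the member of the pair richer in T+C.
tc_a = class_1a["T"] + class_1a["C"]
tc_b = class_1b["T"] + class_1b["C"]

# Column 8 for class pairs 1 to 19, copied from the table.
genome_proportions = [
    0.143, 0.134, 0.132, 0.120, 0.116, 0.092, 0.069, 0.044, 0.042, 0.022,
    0.017, 0.014, 0.012, 0.009, 0.008, 0.007, 0.006, 0.006, 0.005,
]
total = sum(genome_proportions)

print(class_1b)
print(f"T+C: a class = {tc_a:.3f}, b class = {tc_b:.3f}")
print(f"Column 8 total = {total:.3f}")
```

The column 8 total comes to 0.998 rather than exactly 1.0 because each entry is rounded to three decimal places.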

Evans BMC Genomics 2010 11:286   doi:10.1186/1471-2164-11-286
