Table 5

Prediction performance on the singular peptide set (SP) using training sets with and without homologs.

Allelic variant          SR AUC  ALL AUC  AUC reduction [1]  # peptide reduction [2]  % peptide reduction [3]
HLA-DPA1*0103-DPB1*0201  0.787   0.797     0.010             801                      0.571
HLA-DPA1*01-DPB1*0401    0.809   0.801    -0.008             797                      0.596
HLA-DPA1*0201-DPB1*0101  0.764   0.735    -0.029             795                      0.568
HLA-DPA1*0201-DPB1*0501  0.587   0.640     0.053             824                      0.584
HLA-DPA1*0301-DPB1*0402  0.744   0.772     0.028             805                      0.572
HLA-DQA1*0101-DQB1*0501  0.850   0.821    -0.029             1155                     0.664
HLA-DQA1*0102-DQB1*0602  0.667   0.719     0.052             1036                     0.636
HLA-DQA1*0301-DQB1*0302  0.569   0.756     0.187             1123                     0.653
HLA-DQA1*0401-DQB1*0402  0.632   0.551    -0.081             1116                     0.656
HLA-DQA1*0501-DQB1*0201  0.587   0.652     0.065             1069                     0.645
HLA-DQA1*0501-DQB1*0301  0.764   0.766     0.002             1087                     0.644
HLA-DRB1*0101            0.777   0.781     0.004             2923                     0.455
HLA-DRB1*0301            0.782   0.786     0.004             579                      0.338
HLA-DRB1*0401            0.682   0.709     0.027             548                      0.310
HLA-DRB1*0404            0.805   0.818     0.013             103                      0.179
HLA-DRB1*0405            0.765   0.748    -0.017             533                      0.337
HLA-DRB1*0701            0.793   0.810     0.017             570                      0.327
HLA-DRB1*0802            0.672   0.622    -0.050             503                      0.331
HLA-DRB1*0901            0.669   0.651    -0.018             478                      0.314
HLA-DRB1*1101            0.809   0.799    -0.010             590                      0.329
HLA-DRB1*1302            0.712   0.733     0.021             510                      0.323
HLA-DRB1*1501            0.712   0.719     0.007             598                      0.338
HLA-DRB3*0101            0.829   0.838     0.009             514                      0.342
HLA-DRB4*0101            0.762   0.745    -0.017             510                      0.335
HLA-DRB5*0101            0.774   0.798     0.024             571                      0.323
H-2-IAb                  0.816   0.833     0.017             114                      0.173
Average                  0.737   0.748     0.011             779                      0.444

The "ALL" column indicates the 5-fold cross-validation performance on this subset when training with the entire dataset. The "SR" column indicates the 5-fold cross-validation performance on this subset when training with the sequence-similarity-reduced dataset.

1. AUC reduction = AUC all - AUC SR

2. # peptide reduction = # peptide all - # peptide SR

3. % peptide reduction = (# peptide all - # peptide SR)/# peptide all, reported as a fraction rather than a percentage
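The three footnote definitions can be sketched in a few lines of Python. This is a minimal illustration only: the function names are hypothetical, and the peptide counts in the usage example are made-up values, because the table reports only the differences and not the underlying "all" and "SR" counts. The AUC values in the usage example are taken from the first two rows of the table.

```python
from typing import Tuple


def auc_reduction(auc_all: float, auc_sr: float) -> float:
    """Footnote 1: AUC reduction = AUC all - AUC SR, rounded to 3 decimals."""
    return round(auc_all - auc_sr, 3)


def peptide_reduction(n_all: int, n_sr: int) -> Tuple[int, float]:
    """Footnotes 2 and 3: absolute and fractional peptide reduction.

    n_all and n_sr are the training-set sizes before and after
    sequence-similarity reduction (hypothetical parameter names).
    """
    if n_all <= 0:
        raise ValueError("n_all must be positive")
    removed = n_all - n_sr
    return removed, removed / n_all


# AUC values from the HLA-DPA1*0103-DPB1*0201 and HLA-DPA1*01-DPB1*0401 rows.
first_row = auc_reduction(0.797, 0.787)
second_row = auc_reduction(0.801, 0.809)

# Illustrative counts only; these are NOT taken from the paper.
removed, fraction = peptide_reduction(1000, 400)
```

A positive AUC reduction means that training on the entire dataset scored higher than training on the similarity-reduced dataset; a negative value means the reduced training set scored higher.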

Wang et al. BMC Bioinformatics 2010 11:568   doi:10.1186/1471-2105-11-568
