Table 1

Summary statistics for the results of automated classification of chemical entities in the test set

Tested Category                 Total Inferences  Direct Inferences  Correct Inferences
Acetamides                      13                4 (30.8)           13 (100.0)
Acetic Anhydrides               11                5 (45.5)           9 (81.8)
Acrylamides                     20                4 (20.0)           19 (95.0)
Alcohols                        9                 4 (44.4)           9 (100.0)
Aldehydes                       6                 2 (33.3)           6 (100.0)
Alkadienes                      8                 5 (62.5)           8 (100.0)
Amides                          2                 1 (50.0)           2 (100.0)
Amines                          9                 0 (0.0)            9 (100.0)
Amino Alcohols                  7                 3 (42.9)           7 (100.0)
Aminopyridines                  12                3 (25.0)           10 (83.3)
Anhydrides                      5                 3 (60.0)           5 (100.0)
Aza Compounds                   3                 1 (33.3)           3 (100.0)
Benzaldehydes                   11                4 (36.4)           11 (100.0)
Benzyl Alcohols                 10                4 (40.0)           10 (100.0)
Benzylamines                    10                4 (40.0)           10 (100.0)
Boron Compounds                 8                 5 (62.5)           7 (87.5)
Butylamines                     11                5 (45.5)           11 (100.0)
Carbodiimides                   2                 1 (50.0)           2 (100.0)
Carboxylic Acids                9                 0 (0.0)            4 (44.4)
Chlorohydrins                   11                5 (45.5)           10 (90.9)
Cyanates                        4                 3 (75.0)           4 (100.0)
Cyclohexylamines                6                 3 (50.0)           6 (100.0)
Diazonium Compounds             15                5 (33.3)           10 (66.7)
Ethers                          6                 5 (83.3)           6 (100.0)
Ethylamines                     4                 2 (50.0)           4 (100.0)
Fatty Alcohols                  2                 2 (100.0)          2 (100.0)
Formamides                      7                 5 (71.4)           7 (100.0)
Glycols                         6                 2 (33.3)           3 (50.0)
Guanidines                      15                5 (33.3)           15 (100.0)
Hydrazines                      11                3 (27.3)           11 (100.0)
Hydroxylamines                  14                5 (35.7)           13 (92.9)
Imides                          21                5 (23.8)           21 (100.0)
Imines                          7                 2 (28.6)           7 (100.0)
Isocyanates                     10                5 (50.0)           7 (70.0)
Ketones                         6                 5 (83.3)           6 (100.0)
Lactams                         17                4 (23.5)           17 (100.0)
Lactones                        10                5 (50.0)           10 (100.0)
Methylamines                    7                 3 (42.9)           7 (100.0)
Nitrates                        6                 2 (33.3)           6 (100.0)
Nitriles                        8                 5 (62.5)           8 (100.0)
Nitrites                        5                 5 (100.0)          5 (100.0)
Nitro Compounds                 22                5 (22.7)           17 (77.3)
Nitroso Compounds               13                4 (30.8)           11 (84.6)
Organic Compounds               2                 1 (50.0)           2 (100.0)
Organophosphorus Compounds      18                5 (27.8)           16 (88.9)
Organoselenium Compounds        6                 3 (50.0)           6 (100.0)
Organosilicon Compounds         6                 5 (83.3)           6 (100.0)
Organothiophosphorus Compounds  8                 5 (62.5)           8 (100.0)
Peroxides                       5                 5 (100.0)          5 (100.0)
Phenols                         8                 5 (62.5)           8 (100.0)
Total                           452               182 (40.3)         419 (92.7)


Direct inferences are class inferences that are identical to the annotations in the test set. Correct inference counts include the direct inferences as well as inferences that a curator deemed correct. A lack of direct inferences does not indicate an error; it indicates only that another class had a definition that matched a given molecule more closely than its original class did. More than one inference was possible for a given molecule. For each category, the counts of direct and correct inferences are followed in parentheses by their percentage of the total inferences for that category.
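
As an illustration of how the percentages reported in Table 1 relate to the counts, the following minimal Python sketch recomputes them for three rows. The helper name `pct` and the `rows` dictionary are hypothetical and introduced only for this example; rounding to one decimal place is an assumption based on the precision shown in the table, not a description of the authors' own tooling.

```python
def pct(count: int, total: int) -> float:
    """Percentage of total inferences, rounded to one decimal place (assumed)."""
    return round(100.0 * count / total, 1)


# (total, direct, correct) counts copied from three rows of Table 1
rows = {
    "Acetamides": (13, 4, 13),
    "Carboxylic Acids": (9, 0, 4),
    "Total": (452, 182, 419),
}

for name, (total, direct, correct) in rows.items():
    # Direct inferences are a subset of correct inferences, so direct <= correct.
    print(f"{name}: direct {direct} ({pct(direct, total)}), "
          f"correct {correct} ({pct(correct, total)})")
```

Because direct inferences are counted among the correct inferences, the direct percentage for a category can never exceed its correct percentage.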

Chepelev et al. BMC Bioinformatics 2012 13:3   doi:10.1186/1471-2105-13-3
