Table 4

Average sentence, base NP, and full NP lengths (in tokens)

Subdomain                Sentence length  Average Base NP length  Average Full NP length

Vascular Diseases        28.665           1.803                   3.580
Physiology               26.663           1.793                   3.410
Molecular Biology        26.330           1.844                   3.436
Environmental Health     26.101           1.790                   3.470
Rheumatology             26.016           1.805                   3.447
Biochemistry             25.981           1.846                   3.569
Geriatrics               25.920           1.768                   3.427
Botany                   25.874           1.835                   3.415
Ethics                   25.842           1.655                   3.172
Science                  25.840           1.812                   3.403
Microbiology             25.704           1.834                   3.430
Tropical Medicine        25.536           1.788                   3.524
Medicine                 25.498           1.800                   3.466
Genetics                 25.433           1.827                   3.424
Pulmonary Medicine       25.330           1.795                   3.475
Virology                 25.191           1.860                   3.500
Biotechnology            25.077           1.859                   3.518
Cell Biology             25.073           1.790                   3.251
Neoplasms                24.983           1.849                   3.467
Pharmacology             24.930           1.791                   3.485
Veterinary Medicine      24.788           1.757                   3.544
PMC                      24.736           1.805                   3.439
Public Health            24.712           1.755                   3.383
Critical Care            24.611           1.802                   3.471
Genetics, Medical        24.535           1.836                   3.480
Psychiatry               24.482           1.752                   3.412
Communicable Diseases    24.462           1.785                   3.438
Embryology               24.393           1.819                   3.316
Complementary Therapies  24.162           1.749                   3.340
Obstetrics               24.159           1.754                   3.467
Pediatrics               23.870           1.739                   3.449
Gastroenterology         23.837           1.793                   3.477
Education                23.653           1.719                   3.303
Medical Informatics      23.579           1.785                   3.365
Biomedical Engineering   23.510           1.835                   3.635
Therapeutics             23.478           1.749                   3.399
Neurology                23.033           1.787                   3.358
Endocrinology            22.679           1.799                   3.401
Newswire                 19.128           1.603                   3.067
Ophthalmology            17.326           1.763                   3.366


Lippincott et al. BMC Bioinformatics 2011 12:212   doi:10.1186/1471-2105-12-212
