Email updates

Keep up to date with the latest news and content from BMC Medical Research Methodology and BioMed Central.

Open Access Highly Accessed Software

An R package for analyzing and modeling ranking data

Paul H Lee12* and Philip LH Yu2

Author Affiliations

1 School of Public Health/Department of Community Medicine, The University of Hong Kong, Room 624-627, Core F, Cyberport 3, 100 Cyberport Road, Hong Kong, Hong Kong

2 Department of Statistics and Actuarial Science, The University of Hong Kong, Hong Kong, Hong Kong

For all author emails, please log on.

BMC Medical Research Methodology 2013, 13:65  doi:10.1186/1471-2288-13-65

Published: 14 May 2013

Abstract

Background

In medical informatics, psychology, market research and many other fields, researchers often need to analyze and model ranking data. However, there is no statistical software that provides tools for the comprehensive analysis of ranking data. Here, we present pmr, an R package for analyzing and modeling ranking data with a bundle of tools. The pmr package enables descriptive statistics (mean rank, pairwise frequencies, and marginal matrix), Analytic Hierarchy Process models (with Saaty’s and Koczkodaj’s inconsistencies), probability models (Luce model, distance-based model, and rank-ordered logit model), and the visualization of ranking data with multidimensional preference analysis.

Results

Examples of the use of package pmr are given using a real ranking dataset from medical informatics, in which 566 Hong Kong physicians ranked the top five incentives (1: competitive pressures; 2: increased savings; 3: government regulation; 4: improved efficiency; 5: improved quality care; 6: patient demand; 7: financial incentives) to the computerization of clinical practice. The mean rank showed that item 4 is the most preferred item and item 3 is the least preferred item, and significance difference was found between physicians’ preferences with respect to their monthly income. A multidimensional preference analysis identified two dimensions that explain 42% of the total variance. The first can be interpreted as the overall preference of the seven items (labeled as “internal/external”), and the second dimension can be interpreted as their overall variance of (labeled as “push/pull factors”). Various statistical models were fitted, and the best were found to be weighted distance-based models with Spearman’s footrule distance.

Conclusions

In this paper, we presented the R package pmr, the first package for analyzing and modeling ranking data. The package provides insight to users through descriptive statistics of ranking data. Users can also visualize ranking data by applying a thought multidimensional preference analysis. Various probability models for ranking data are also included, allowing users to choose that which is most suitable to their specific situations.

Keywords:
Distance-based model; Luce model; Multidimensional preference analysis; Visualization; Weighted distance