
A data mining approach for grouping and analyzing trajectories of care using claim data: the example of breast cancer

Nicolas Jay1,2*, Gilles Nuemi3, Maryse Gadreau5 and Catherine Quantin3,4

Author Affiliations

1 Université de Lorraine, LORIA UMR 7503, F-54000, Nancy, France

2 CHU de Nancy, Département d’information médicale, F-54000, Nancy, France

3 CHRU Dijon, Service de Biostatistique et d’Informatique Médicale (DIM), F-21000, Dijon, France

4 Inserm, U866, Univ de Bourgogne, F-21000, Dijon, France

5 Université de Bourgogne, F-21000, Dijon, France


BMC Medical Informatics and Decision Making 2013, 13:130  doi:10.1186/1472-6947-13-130

Published: 30 November 2013



With the increasing burden of chronic diseases, analyzing and understanding trajectories of care is essential for efficient planning and fair allocation of resources. We propose an approach based on mining claim data to support the exploration of trajectories of care.


Trajectories of care for breast cancer were clustered with Formal Concept Analysis. Data were exported from the French national casemix system, which covers all inpatient admissions in the country. Patients admitted for breast cancer surgery in 2009 were selected, and each patient's trajectory of care was reconstructed from all hospitalizations occurring within one year after surgery. The main diagnoses of these hospitalizations were used to produce morbidity profiles. Cumulative hospital costs were computed for each profile.
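As an illustration only, the grouping step can be sketched on a toy example. The patients, diagnosis labels, and costs below are hypothetical and are not taken from the study; the brute-force enumeration of formal concepts is a minimal sketch suited to a handful of patients, not the algorithm used by the authors. Each formal concept pairs a set of patients (the extent) with the set of diagnoses they all share (the intent), and a mean cumulative cost is then computed per concept.

```python
from itertools import combinations

# Hypothetical formal context: patient -> main diagnoses observed
# in the year after surgery (labels are illustrative only).
context = {
    "p1": {"breast_cancer", "chemotherapy"},
    "p2": {"breast_cancer", "chemotherapy", "palliative_care"},
    "p3": {"breast_cancer"},
    "p4": {"breast_cancer", "chemotherapy"},
}

# Hypothetical cumulative hospital cost per patient, in euros.
costs = {"p1": 9000.0, "p2": 26000.0, "p3": 7000.0, "p4": 11000.0}

ALL_ATTRS = frozenset().union(*context.values())


def extent(attrs):
    """Patients whose diagnoses include every attribute in attrs."""
    return frozenset(p for p, a in context.items() if attrs <= a)


def intent(patients):
    """Diagnoses shared by all given patients (all attributes if none given)."""
    if not patients:
        return ALL_ATTRS
    return frozenset(set.intersection(*(context[p] for p in patients)))


def formal_concepts():
    """Enumerate all (extent, intent) pairs by brute force.

    Every concept intent is the set of attributes shared by some group
    of patients, so intersecting the rows of every patient subset finds
    them all. This is exponential and only sensible for toy data.
    """
    intents = {intent(frozenset())}
    for r in range(1, len(context) + 1):
        for group in combinations(context, r):
            intents.add(intent(frozenset(group)))
    return {(extent(i), i) for i in intents}


def mean_cost(patients):
    """Mean cumulative cost over a set of patients, or None if empty."""
    if not patients:
        return None
    return sum(costs[p] for p in patients) / len(patients)


concepts = formal_concepts()
for ext, itt in sorted(concepts, key=lambda c: len(c[1])):
    print(sorted(itt), sorted(ext), mean_cost(ext))
```

In this sketch a patient can belong to several concepts, because concepts are ordered by inclusion of their diagnosis sets. How patients are assigned to a single class in the study is not described in this abstract and is not modelled here.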


A total of 57,552 patients were automatically grouped into 19 classes. The resulting profiles were clinically meaningful and economically relevant. The mean cost per trajectory was 9,600€. Severe conditions were generally associated with higher costs. The lowest costs (6,957€) were observed for patients with in situ carcinoma of the breast, and the highest (26,139€) for patients hospitalized for palliative care.


Formal Concept Analysis can be applied to claim data to produce an automatic classification of care trajectories. This flexible approach takes advantage of routinely collected data and can be used to set up cost-of-illness studies.

Keywords: Data mining; Formal concept analysis; Claim data; Trajectory of care; Cancer