Open Access Technical advance

Using n-gram analysis to cluster heartbeat signals

Yu-Chen Huang1,3, Hanjun Lin1,3, Yeh-Liang Hsu1,3* and Jun-Lin Lin2

Author Affiliations

1 Department of Mechanical Engineering, Yuan Ze University, Taoyuan, Taiwan

2 Department of Information Management, Yuan Ze University, Taoyuan, Taiwan

3 Gerontechnology Research Center, Yuan Ze University, Taoyuan, Taiwan

BMC Medical Informatics and Decision Making 2012, 12:64  doi:10.1186/1472-6947-12-64

Published: 8 July 2012



Biological signals may carry specific characteristics that reflect the basic dynamics of the body. In particular, heartbeat signals carry signatures that are related to human physiological mechanisms. In recent years, many researchers have shown that representations based on non-linear symbolic sequences can reveal dynamic information that is otherwise hidden. This kind of symbolization has proved useful for predicting life-threatening cardiac diseases.


This paper presents an improved method called the “Adaptive Interbeat Interval Analysis (AIIA) method”. The AIIA method uses the Simple K-Means algorithm for symbolization, which offers a new way to represent subtle variations between two interbeat intervals without human intervention. After symbolization, it uses the n-gram algorithm to generate different kinds of symbolic sequences. Each symbolic sequence stands for a variation phase. Finally, the symbolic sequences are categorized by classic classifiers.
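
The symbolization and n-gram steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that the clustered quantity is the difference between consecutive interbeat intervals, it uses a small hand-written one-dimensional k-means in place of the Simple K-Means implementation used in the paper, and all function names and sample values are hypothetical.

```python
from collections import Counter
import random


def interval_differences(rr):
    """Differences between consecutive interbeat intervals (assumed feature)."""
    return [b - a for a, b in zip(rr, rr[1:])]


def kmeans_1d(values, k, iterations=50, seed=0):
    """Plain 1-D k-means; returns the k centroids in ascending order."""
    distinct = sorted(set(values))
    if len(distinct) < k:
        raise ValueError("need at least k distinct values")
    rng = random.Random(seed)
    centroids = rng.sample(distinct, k)
    for _ in range(iterations):
        # Assignment step: group each value with its nearest centroid.
        groups = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centroids[i]))
            groups[nearest].append(v)
        # Update step: move each centroid to the mean of its group.
        # An empty group keeps its previous centroid.
        updated = [sum(g) / len(g) if g else centroids[i]
                   for i, g in enumerate(groups)]
        if updated == centroids:
            break
        centroids = updated
    return sorted(centroids)


def symbolize(values, centroids):
    """Replace each value with the index of its nearest centroid."""
    return [min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            for v in values]


def ngram_counts(symbols, n):
    """Count every run of n consecutive symbols."""
    return Counter(tuple(symbols[i:i + n])
                   for i in range(len(symbols) - n + 1))


# Hypothetical interbeat intervals in seconds, for illustration only.
rr = [0.80, 0.82, 0.79, 1.10, 0.60, 0.81, 0.80, 0.83, 0.78, 1.05, 0.62, 0.80]
diffs = interval_differences(rr)
centroids = kmeans_1d(diffs, k=3)
symbols = symbolize(diffs, centroids)
trigram_features = ngram_counts(symbols, n=3)
```

In a sketch like this, the n-gram counts of each recording would serve as the feature vector passed to a classifier. The number of clusters and the value of n are the two parameters varied in the experiments reported below.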


In the experiments presented in this paper, the AIIA method achieved 91% accuracy (3-gram, 26 clusters) in distinguishing among patients with atrial fibrillation (AF), patients with congestive heart failure (CHF), and healthy people. It also achieved 87% accuracy (3-gram, 26 clusters) in classifying patients with apnea.


The two experiments presented in this paper demonstrate that the AIIA method can categorize different heart diseases. Both experiments obtained their best classification results with the Bayesian Network classifier. In future work, the concept of the AIIA method could be extended to the categorization of other physiological signals, and more features could be added to improve accuracy.