Skip to main content

Automatic onlinespike sorting with singular value decomposition and fuzzy C-mean clustering

Abstract

Background

Understanding how neurons contribute to perception, motor functions and cognition requires the reliable detection of spiking activity of individual neurons during a number of different experimental conditions. An important problem in computational neuroscience is thus to develop algorithms to automatically detect and sort the spiking activity of individual neurons from extracellular recordings. While many algorithms for spike sorting exist, the problem of accurate and fast online sorting still remains a challenging issue.

Results

Here we present a novel software tool, called FSPS (Fuzzy SPike Sorting), which is designed to optimize: (i) fast and accurate detection, (ii) offline sorting and (iii) online classification of neuronal spikes with very limited or null human intervention. The method is based on a combination of Singular Value Decomposition for fast and highly accurate pre-processing of spike shapes, unsupervised Fuzzy C-mean, high-resolution alignment of extracted spike waveforms, optimal selection of the number of features to retain, automatic identification the number of clusters, and quantitative quality assessment of resulting clusters independent on their size. After being trained on a short testing data stream, the method can reliably perform supervised online classification and monitoring of single neuron activity. The generalized procedure has been implemented in our FSPS spike sorting software (available free for non-commercial academic applications at the address: http://www.spikesorting.com) using LabVIEW (National Instruments, USA). We evaluated the performance of our algorithm both on benchmark simulated datasets with different levels of background noise and on real extracellular recordings from premotor cortex of Macaque monkeys. The results of these tests showed an excellent accuracy in discriminating low-amplitude and overlapping spikes under strong background noise. The performance of our method is competitive with respect to other robust spike sorting algorithms.

Conclusions

This new software provides neuroscience laboratories with a new tool for fast and robust online classification of single neuron activity. This feature could become crucial in situations when online spike detection from multiple electrodes is paramount, such as in human clinical recordings or in brain-computer interfaces.

Background

Electrophysiological recording of single neuron activity represents a fundamental tool for investigating brain functions. Since a recording electrode often picks-up spikes from more than one neuron, a spike sorting technique is needed to identify and separate spikes of different neurons [1, 2]. Most currently available computational procedures provide accurate sorting and classification, but are often highly interactive and time-consuming and require specific experience and subjective judgments. Fast automatic methods are available [35], but they are usually not as accurate as the offline ones and often suffer from problems such as false match or double match, spike overlap and errors in classification [1, 6]. Solving the tradeoff between automation, speed and accuracy of spike sorting is thus a crucial challenge in systems neuroscience.

Here, we aim at contributing to the progress of the field by achieving accurate, fast and fully automated spike sorting. To this purpose, we present a new method (and a software package) based on the Fuzzy C-mean (FCM) classification of spike waveforms in the low-dimensional feature space of Principal Components (PCs). Many currently used offline spike sorting algorithms already employ Principal Component Analysis (PCA) as a preprocessing step to compress the dimensionality of the patterns to be clustered [1, 7]. However, its practical application for fast and automatic separation of neurons is still limited for many reasons. In fact, it is commonly reported that the performance of the PCA heavily depends on the accuracy of the waveform alignment [8], thus requiring strong computations and human supervision of the clustering results [9]. In addition, eigenvectors accounting for the largest variance of the data are not necessarily providing the best separation of the spike classes [10]. Finally, it is often pointed out that PCA, in its basic configuration, is a static technique, not suitable for monitoring of non-stationary behaviour [11], while in vivo single unit activity represents mostly non-stationary system with nontrivial dynamics [12].

In this article, we develop and present a spike sorting software called FSPS (Fuzzy SPike Sorting) which is designed to overcome these limitations. The FSPS software (whose architecture is sketched in Figure 1) increases the robustness and speed of PCA-based spike sorting means of several steps. First, it carefully preprocesses the data to improve the alignment of spike shapes. Second, it uses a Partial Single Value Decomposition (PSVD) preprocessing technique to extract PCs [13]. This technique is computationally efficient because it exploits previously computed Single Value Decomposition (SVD) for the further dynamic low-rank approximation of new coming waveforms by means of series of simple matrix operations on the output eigenvectors and at the same times reduces the noise in the components to be sorted [14, 15]. Third, the algorithm uses the information obtained during SVD to classify the neuronal waveforms by means of FCM clustering [1619]. The unsupervised nature of FCM and its ability to detect clusters of different shapes makes it particularly useful for online sorting because of its robustness to non-stationary recordings, responsible of the smeared clusters in the high-dimensional feature space. Fourth, to control the accuracy of neuron isolation, the software provides several objective isolation quality measures, including the L ratio measure which allows a comparison of cross-laboratory clustering [20].

Figure 1
figure 1

Functional architecture of the FSPS™ programme. The main menu (1) allows the user to initiate new recordings (2), visualize raw data or results (3) and to get fast access to any stage of the data processing (4 and 5). All other modules of the programme are loop-structured and prompt the user to execute a procedure when necessary. Dynamic links between modules and storage of critical parameters during SVD of data matrix and unsupervised FCM allows the program to solve the invariance problem during PCA, automatically select features to be used in online fuzzy classification, choose the best settings for clustering (new or previously calculated for each recording site) and remove noise. Elements included in the dotted block are crucial parameters obtained during multivariate analysis and unsupervised FCM, which are stored externally from the programme for use during fuzzy classification. These are updated when necessary. Output classes (namely, labelled groups of PCs), together with participating elements, thus represent different single neurons of specific action potential waveform, and then indicate their location in the raw signal. Elements marked with dashed line (VI 1 and VI 2) are two separated Virtual Instruments (VI) working simultaneously either on one computer with multi-core processor architecture or two different single-processor computers connected to a LAN. Solid blocks and lines inside VI 1 element – parts of the programme involved in the online procedures; dotted blocks and lines – parts that run once to get the prototype, before starting online classification.

The article is organized as follows. First, we describe the spike sorting method used in our approach. We then evaluate its accuracy using simulated spike datasets at different background noise levels proposed by Quiroga et. al. (2004). Then we demonstrate the performance of our method by applying it to the analysis of real extracellular recordings from the macaque premotor cortex, and we investigate the robustness of the algorithm to sample size and inhomogeneities between cluster size to deal with the non-stationarity of the data during the recording session. Finally, we illustrate how our method of spike sorting can be implemented to monitor online the activity of single neurons during electrophysiological recordings.

Implementation

The basic strategy of FSPS software is to provide accurate and trustable classification with a minimal supervision and, more importantly, without specific software knowledge (like Python scripting, Matlab toolboxes, C++ etc.). The software supports a large variety of digital acquisition (DAQ) systems (including low-cost ones) and simplifies electrophysiology setup by using the flexible graphical user interface (GUI) of Virtual Instruments (VIs). The spike sorting algorithm was entirely implemented within graphical programming language LabVIEW 2009 (National Instruments, USA), whose DAQ hardware and interfaces became very popular in electrophysiology labs over the last decade. Besides, we choose this software platform for its ability to control the experimental protocol and data acquisition while being able to run the analysis fast and online using threaded dataflow methodology [21]. It is also reported that many LabVIEW subroutines shows considerable outperformance when compared to their identical counterpart written in MATLAB (MathWorks, USA) [22]. FSPS high-level schema is sketched in Figure 1. The program allows triggered (Figure 2) and continuous (Figure 3A) acquisition from one or more electrodes simultaneously and offers the user the choice to set all parameters of the analysis automatically or manually, both in case of “test” acquisition and online classification (Figure 3B). Besides, the software has the following advanced features: band-pass signal filtering; automatic detection of spikes with evaluation of background noise level and automatic threshold selection; extraction and alignment of spike waveforms; removal of constant DC offset, false positive and noisy spikes; pre-processing with computationally efficient PCA; automatic determination the number of PCs to retain; automatic determination of the number of clusters to be found; offline fuzzy clustering analysis; online fuzzy classification; 2D and 3D visualization tools; quantitative quality assessment of resulting clusters, basic statistics, PSTH, measurements of some clinical parameters of spike trains etc. Additional file 1. The software allows simultaneous visualization/monitoring of activity of several isolated neurons and provides online acoustic feedback about one selected neuron. It has import/export features and allows synchronization of the acquisition with external devices (e.g. digital videorecorders, stimulators etc.). The application is available at http://www.spikesorting.com in the Download section.

Figure 2
figure 2

Interface window for triggered data acquisition and visualization. An offline module of the FSPS programme shows the acquired signal and consents instant acquisition of the two first Principal Components (PC1 and PC2, after full SVD) for each triggered acquisition session. Here we showed the organization of the user-interface window with neural polyspikes acquired during twelve triggered sessions of goal-directed movement in a monkey. Raw data from the electrode and additional digital hardware information were retrieved and visualized (as shown under each session). The resulting histogram of spike occurrence (on the basis of external hardware threshold discriminator) is shown below the trials. The protocol information used later for the analysis is shown to the left of the histogram, while the control buttons to start, stop, overwrite and scroll the trials is shown to its right.

Figure 3
figure 3

Fully automated online classifier. A - After the period of “TEST” acquisition, which starts once for each recording site, the classification of newly arriving spikes is continual. The control panel on the left-hand side permits some adjustments to automatic thresholding and spline interpolation parameters, depending on the digitalization rate of acquisition. Numerical information about the number of spikes and single units, as well as their waveforms, are instantly available to the researcher; B – interface window of online monitoring. This is the most innovative part of our procedure. To ensure good performance we gave the option of sharing most resource-dependent processes, like extraction of PCs and their FCM-classification, between two different processors. In order to achieve this goal we created two separated subprograms (VI 1 and VI 2), running in parallel and linked via UDP protocol, for the transmission of the reduced number of extracted features (PCs) together with a time of spike occurrence, which is reverse-reconstructed and reproduced right after event classification. These two VIs, being processor-dependent, can be run on the same computer when a limited number of recording electrodes is considered. Since the latest versions of LabVIEW (LabVIEW-2009 or higher) can effectively treat multi-core processor architecture and parallel-loop execution, the FSPS software can run on the same computer, sharing the power of multi-core processor (Intel Core i5-2430 M, 2.4 GHz). However, an Ethernet connection may also be useful when experimental conditions necessitate distant online monitoring.

Simulation

Simulated extracellular recordings were used to test the spike sorting procedures. These simulated data were the ones used in [10] and are available online at http://vis.caltech.edu/~rodri/Wave_clus/Wave_clus_home.htm. Briefly, simulated signals consisted of spike shapes of three neurons compiled from recordings in the neocortex and basal ganglia. For generating background noise, spikes randomly selected from the database were superimposed at random times and amplitudes. Next, a train of three distinct spike shapes was superimposed on the noise signal at random times. The amplitude of the three spike classes was normalized to have a peak value of 1. The noise level was determined from its standard deviation, which was equal to 0.05, 0.1, 0.15, and 0.2 relative to the amplitude of the spike classes. There were four different example simulations, number from one to four in order of increasing sorting difficulty (see Ref [10]).

Real multi-unit recordings

Electrophysiological recordings were made from freely behaving, partially restrained, macaque monkey (Macaca fascicularis). All experimental protocols were approved by the Veterinarian Animal Care and Use Committee of the University of Ferrara, by the Italian Ministry of Health and complied with the European laws on the use of laboratory animals. The surgical procedure were the same as previously described [23]. Multi-unit recordings were performed by using varnished (Sivamid 595, ELANTAS Deatech S.r.l., Italy) tungsten microelectrodes with impedance 0.15–1.5MΩ (measured at 1 kHz), slowly inserted in the cortex by a hydraulic microdrive (Kopf Instruments, CA, USA; step resolution, 10 μm). Recorded signal was amplified 10,000 times (BAK Electronics, Germantown MD, USA), filtered by a dual variable filter VBF-8 (KEMO Ltd., Backenham, UK) (bandwidth 300–5000 Hz) and digitized (USB-6229, National Instruments, USA) at 10 kHz. During online classification, isolated spike shapes of selected unit are flashed on computer display and reproduced by a sound device to provide experimenters with audio feedback on neuron response.

Spike detection and waveform extraction

The detection of individual spikes in the sampled signal was performed with LabVIEW Peak Detector VI that fits a quadratic polynomial to sequential data points. This algorithm interpolates between sequential data points to find the peak time and reduces errors caused by asynchronous sampling of rapidly changing waveforms, ultimately providing a better alignment of spike shapes. To determine the significance of each peak, the quadratic fit of the peak was tested against the threshold level (Thr), automatically adjusted for each recording site [10]:

T h r = 4 σ noise ; σ noise = m e d i a n x 0.6745
(1)

where x is the bandpass-filtered signal and σ noise is an estimate of the standard deviation of the background noise [24]. Whereas peaks with amplitude lower than the threshold were ignored, peaks higher than threshold were considered for further analysis as follows. Once a significant peak was detected, the whole waveform was collected (eight samples before the peak and ten samples after it, which with our sampling frequency resulted in a total duration of 1.8 ms) and was then interpolated twice to obtain 36 samples for each waveform with cubic spline interpolation method [25]. Six samples at the beginning and at the end of each interpolated shape were then removed, thus leading to 24 sample waveforms (1.2 ms, see Figure 4). These parameters were empirically found to be a good compromise between sampling as many points as possible to record all the important phases of action potential, and keeping the number of spike parameters compact to facilitate further analysis. A n×24 indexed array was then filled with these peak data, rejecting spikes that violate a minimum refractory period after the preceding threshold crossing in order to reduce false positives (see Additional file 2).

Figure 4
figure 4

Visualization of raw signal and collection of spike waveforms. A - information covering each trial of movement execution to monitor the uniformity of trials, where black vertical hatches represent spike occurrence detected using external hardware threshold discriminator; blue line is the infrared signal from the analogue IR-pair; thick coloured horizontal lines are retrieved from the set of digital sensors representing different kinematic parts of executed movement. B - the visualization of one trial in 3 sec multi-unit recordings, where spikes determined by software discriminator are marked with red dots. C - an expanding view of the same raw signal, in which the presence of different kinds of spikes is evident; D - extracted 1.8 ms of spikes waveforms, aligned to the peak of action potential by spline interpolation method; E - the same waveforms with 6 truncated samples at the ends, yielded 1.2 ms waveforms to be filled by the data matrix.

Extraction of Principal Components

Spike extraction as described above gave origin to a real-valued data matrix A(n×24) containing n rows to-be-analyzed spike shapes. Before performing further analysis, we centered the collected waveforms by subtracting the row average from every element in a row, which is referred to as centering across the second mode[26] in order to remove constant terms in the data. It can be expressed as:

X = A n 1 T ,
(2)

where A is a real-valued data matrix A(n×24) containing spike shapes, n is a vector holding nth row average in the nth element, 1 is a n-vector of ones and X is a matrix holding the centering data.

We then extracted the PCs of the spike waveforms using Singular Value Decomposition (SVD). SVD is a factorization approach of a given matrix, and constitutes a powerful computational tool commonly used in many engineering and biomedical applications [27, 28]. SVD is analytically presented in standard textbooks on linear algebra and multivariate statistics (see [29] for an extensive review of the method). Let’s consider the matrix X to be of rank r, in which the rank refers to the maximum number of linearly independent row vectors. This factorization approach captures the most important properties of the matrix X, as it allows to decompose the matrix X into the product of three matrices:

X = U S V T ,
(3)

where U is a n × 24 left orthogonal matrix of singular vectors giving the principal component scores which represent the spike waveforms in term of the PCs; V is a 24 × 24 orthogonal matrix detailing the spike profile and mapping the vector space; and S is a 24 × 24 diagonal matrix where the diagonal elements are the singular values of X.

Despite the good quality the standard SVD algorithm (Eq. 3) is computationally very expensive. For an n × 24 user-item matrix, the SVD decomposition requires a run-time of O(n)3[14]. However, in our application it runs only once for each recording session during the period of test acquisition in unsupervised clustering analysis, while for the online supervised classification we use much more efficient algorithm of spike waveforms projection into the lower-dimensional space using the PSVD, as follows:

U k = X V k S k 1 ,
(4)

where U k , V k , S k 1 are PSVD component matrices with k features, where k < r of the matrix rank. We have then reduced the problem to a lower dimensional one, the maximum number of linear independent row vectors being restricted to k. Such low-rank approximation of the original space filters out the small singular values that introduce “noise” subspaces and considerably improves the computational efficiency [14, 27]. Once the SVD decomposition is done, the projection process involves only a dot product computation, which takes O(1) time, since k is a constant (Figure 5). The LabVIEW program code for centering the matrix and SVD is presented in Figure 6.

Figure 5
figure 5

Schematic diagram of the Partial SVD application for extraction of PCs. White elements represent standard Partial SVD decomposition used in offline or “TEST” clustering. Grey elements show an online classification strategy in which the new SVD can be expressed as a product of old subspaces. These are small-matrix operations, and therefore rapid.

Figure 6
figure 6

LabVIEW programme code for centering the X matrix across the 2ndmode and SVD procedure. IX matrix containing spike shapes of the data set; II – centreing the matrix across the 2nd mode; III – SVD procedure and resulting matrices.

Determining the number of PCs to retain

A crucial problem in multivariate data analysis is the number of components to retain when applying PCA. This should be determined considering the tradeoff between dimensionality and the loss of data information [30]. Both underextraction and overextraction may have consequences that adversely impact the efficiency and meaning of PCA [31], resulting in classification errors. Our software solves this problem by automatically using the Scree Test Optimal Coordinates (n oc ) method, as numerical approach to the Cattell’s scree test introduced in Ref [32]. For online classification we maintain the parameter obtained during test acquisition phase until our measures of goodness of clustering detect that a new training phase is needed because the data have changed their properties too much (see Measures of cluster quality section).

FCM clustering and classification

Though SVD is a powerful tool for characterizing spike waveforms, it does not help to identify the neurons. It is merely a clustering technique wherein the dataset is divided into distinct clusters, which are ultimately interpreted as different single units. We have used the FCM approach based on the classical ISODATA method, using the selected above features/PCs as input variables for clustering. FCM is one of the best known and the most widely used fuzzy clustering algorithms [33]. However, due to the unsupervised nature it requires that the desired number of clusters is specified in advance. If this choice does not correspond to the actual number of clusters, the results of FCM deteriorate. In FSPS software we implemented an algorithm determining the number of clusters automatically and without supervision. To do so, we used histogram-based methods of dataset segmentation which are widely used in real-time pattern recognition systems [34]. The basic idea of algorithm we implemented in our FSPS software rests on the assumption that local densities and the number of peaks on histogram showing the distribution of 1 -norm values for every left singular vector in the U k corresponds to particular clusters (Figure 7). 1 -norm is considered to be generalized length (or magnitude) of the vector and calculated using the following equation:

X = x 0 + x 1 + + x k 1 ,
(5)

where X is input vector and X is a 1 -norm.

Figure 7
figure 7

Distribution of 1 -norm values of input singular vectors.

Then, to construct the data-histogram we used the optimal bin size width W for the most efficient unbiased estimation of the probability density function [35], where:

W = 3.49 σ N 1 / 3 ,
(6)

where σ is the standard deviation of the distribution and N is the number of available samples, which corresponds to the total number of spikes or singular vectors in U k matrix and their 1 -norm values in our specific case. The histogram was than thresholded to eliminate the noise content (low-amplitude peaks in the histogram which do not correspond to any distinct cluster) by using the value of first lower valley after the mode of 1 -norm distribution. Thus, the number of peaks on histogram above the selected threshold was used as indication of the number of objects to be used for FCM clustering technique. However, in the interactive mode, FSPS leaves the possibility to choose the number of clusters manually, according to visual examination of the clustering results and expert judgment.

To perform FCM clustering (whose details are described in Additional file 2), one has to specify also an exponent m (m > 1.0), which determines the degree of fuzziness of the resulting clustering process. As m→1 the fuzziness of the clustering result tends to the results derived with the well known ISODATA method [36]. As m→∞ the membership values of all the objects to each cluster tend to the reciprocal of the number of classes 1/c. The analysis of our data obtained during recordings, the distribution of PCs and measurements of cluster quality showed that, with m = 1.1 the FCM algorithm is performing clustering correctly on both real and simulated data (see Results). Besides, this value was consistently found to lead to good results with the well- and poor- separated classes and we were able to classify the cells with acceptable accuracy. Thus, we set this default value of m in our program, although it can be changed by the user if needed.

Measures of cluster quality

Since clustering algorithms define clusters that are not known a priori, it is fundamental to define a performance criterion to quantify the goodness of the resulting partition. In our FSPS software we have implemented the standard and popular figures of merit associated with FCM introduced by J.C.Bezdek (1981), such as the Partition Coefficient (pc), the Partition Entropy (pe) and Proportion Exponent (pex) that make use only of membership values and have the advantage of being easy to compute (see Additional file 2). However, these measures are often subject to numerical instability during the quantification of overlapping clusters of unequal size [37]. Therefore, we also included in our software one recently introduced objective validity index L ratio [20]. This parameter allows to obtain stable cluster evaluation from a particular recording site, and to take into account the clusters with a larger number of spikes. The evaluation of L ratio changes during single unit recordings was also used as measure of classification performance to monitor the stability of data acquisition. The threshold L ratio value is set to 5 as default, and can be modified by the user (we recommend a choice in the range 3–6). Once L ratio becomes bigger than its threshold value, the program alerts the user that it is advised to recompute full SVD and to update FCM prototype because classification is deteriorating.

Finally, for each isolated unit the FSPS software allows to compute a number of standard quantities and statistics of interest to the neurophysiologist, such as Peri-Stimulus Time Histograms (PSTHs), raster plots and inter-spike interval (ISI) and some clinically important indices that measured tonic and phasic activity, including burst index (BI) and pause index (PI) [38].

Results

Performance on simulated data

In order to validate our spike sorting approach and to compare it with other known algorithms, we tested it on simulated datasets described by Quiroga et. al. (2004) and compared to their already published results obtained by superparamagnetic clustering (SPC) and K-mean clustering techniques applied to different spike features (wavelets, PCA, using the first three PCs, and the whole spike shape) [10]. The dataset contains two types of spike shapes: noisy “non-overlapping” spike shapes, which were generated by taking the target waveform and adding noise, and “overlapping spikes” which were generated overlapping spikes with a latency shorter than 0.7 ms and then adding noise. Performance was quantified in terms of number of classification errors.

Table 1 shows the number of classification errors of the FSPS algorithm and the other tested algorithms when detecting and sorting noisy non-overlapping spikes. FSPS gave the lowest number of false matching spikes in most simulated datasets and did not exceed 2% up to noise level 0.2 in all examples with exception of Example 4, where 7.2% of mismatches were detected. However, even in this case the outcome of FSPS technique was still in 4,5-8,8 times better compared to other methods. The advantage of FSPS becomes apparent when spike shapes are more similar (Table 1, Examples 3 and 4, considered more difficult for clustering), while our results were competitive with those obtained using K-means or SPS clustering on wavelets in Examples 1 and 2, where spike shapes of three simulated neurons were markedly different. A nice feature of the performance of our FSPS algorithm was that it degraded gracefully with increasing noise, in part due to the better outlier identification of fuzzy clustering, and the performance was reasonably good also in the case of overlapping spikes (Table 2). The reason for this improved performance is probably due to better pre-processing strategy that we employed rather than the different clustering procedure. In particular, we verified the alignment procedure and the implementation of PSVD on the clustering performance. Figure 8 illustrates this point by depicting results of classification after clustering of simulated Example 2 with noise level 0.15, a dataset that was particularly difficult to cluster with the traditional 3 PCs clustering method [10]. With our procedure, the distribution of first three PCs at the fragments A and B for the datasets without (Figure 8A) and with (Figure 8B) overlapping spikes demonstrates three clean, compact and well distinguished clusters. The presence of overlapping spikes in the dataset B (763 out of 3411, that is 22,4%) creates less distant and more shaped clusters having complex outliers.

Table 1 Number of classification errors and noise levels, obtained using FSPS, SPC and K-means, in all simulated examples
Table 2 Number of classification errors for all simulated examples and overlapping spike shapes
Figure 8
figure 8

Result of classification of the simulated dataset from Example 2, noise level 0.15. A - The projection of the first three PCs of the dataset not containing overlapping spikes; B - the same, but with overlapped spikes. Three dense clusters are shown in both cases. C and D - original spike shapes of detected clusters in the datasets without and with overlapping spikes, respectively. Different spike events are shown in different colours, according to the outcome of the clustering algorithm. The number of correctly identified spikes is indicated. Values in brackets indicate the total number of spikes in each class. The spike shape shown in the lower part of the figure has been incorrectly classified and therefore marked in black.

Description of real multi-unit data

We then tested our spike sorting algorithm and its quality of separation using real polyspikes recorded from premotor area F5 of a Macaque monkey. To demonstrate the functionality of our method in a realistic situation we present spike sorting results of two different datasets, obtained from the same recording site, but in different experimental conditions, thus introducing the complexity of non-stationarity in the data. First dataset, referred to as A, was acquired when monkey performed goal-directed grasping movements in full vision, while other recordings (referred to as dataset B) were done 30 min after A during the same grasping in the dark. Each dataset contains raw signals of twelve 3 s trials of each of the two conditions with a 10-12 s intertrial interval. The total duration of acquisition of one dataset was therefore 2–3 min. The structure of the two datasets and the result of spike identification are shown in Additional file 2: Table S1 (dataset A) and Additional file 2: Table S2 (dataset B). In the waveform extraction phase, 475 and 295 false positive peaks exceeding the threshold value were automatically removed from datasets A and B, respectively. Thus, 6191 spikes out of 6666 identified in the dataset A and 6379 out of 6674 in the dataset B were processed. Dataset A was used at first as test acquisition for the unsupervised spike sorting, creation of SVD model (U k , S k , and V k ) and to obtain the parameters that were necessary to further supervised spikes classification of the dataset B.

SVD and pre-processing results on real multiunit recordings

The implementation of full SVD (Eq.2) on the dataset A gave origin to matrices U(6191×24), S(24×24) and VT (6191×24). The elaboration of singular values in diagonal S matrix by algorithm determining the optimal number of features detected n oc = 4 PCs accounting 67,3% of total variance. Then, the software calculates 1 -norm values for every left singular vector (contains PCs) composing the matrix U 6191 × 4 . The distribution of these 1 -norm values showed four detectable peaks on the line-graph histogram (see Figure 9A), thus four clusters available at the PCs feature space (Figure 10A,B). The matrix VT and other parameters for the input of FCM clustering algorithm found on dataset A were kept in memory and then used to classify the spikes of dataset B. The L ratio measure of quality of clustering remained below the threshold value of 5 (Table 3) and so the software algorithm did not detect the need to repeat the testing phase of the clustering algorithm until the end of experiment B. The 3D scatter plot of first three PCs of dataset B shows same four clusters (Figure 10C). While two dense and partially overlapping clusters, located in the leftmost part of each plot, seem to be almost identical, the other two become more spread and shifted in 3D space in the dataset B with respect to dataset A, as it follows also from the Figure 11 representing the time course of clusters. We projected these clusters onto PCs axis of maximum variance and plotted this projection for each trial that were recorded sequentially. Figure 11B shows that a negative trend occurs in the projection, while Figure 11C shows the positive one. The solid and dash lines represents the centroid of the cluster as it changes over time for dataset A and B, respectively. The points in the figure represent a single spike waveform changing shape slowly and continuously. The changes in the internal structure of the dataset B become more evident considering the distribution of 1 -norm values of singular vectors depicted in the line-histogram at the Figure 9B as Classes 1 and 4 become completely overlapped. Despite this fact all clusters are still well recognized and associated correctly by our approach thanks to cluster information previously learned in dataset A (Figure 10D).

Figure 9
figure 9

Distribution of 1 -norm values of input singular vectors for every Class. This figure shows the frequency distribution of 1 -norm values of singular vectors for each determined class in datasets A and B, respectively. Different classes are shown in different colours, according to the outcome of the clustering algorithm.

Figure 10
figure 10

Rotable 3D scatter-plots of first three PC scores. The Figure shows an example automatic clustering of dataset A (fragments A and B are conditions before and after clustering analysis) and automatic classification of dataset B (fragments C and D are conditions before and after clustering). Each dot represents a spike situated in the 3D PC feature space. Four dense clusters are clearly visible and classified. Noise-near waveforms and their PCs located far away from cluster centres, as evaluated by Mahalanobis distance, are marked as red dots.

Table 3 Performance of FSPS™ clustering/classification of datasets A and B with their respective quality measures
Figure 11
figure 11

Time course of clusters over recording trials. This Figure shows how the waveform moves along the axis of maximum variance. The trend represents the spike waveform changing shape due to electrode drift. Clusters 1–4 are shown in Fragments A-D, respectively.

To prove that the outcome of PCs clustering analysis/classification of the datasets is successful as well as obtained classes are assigned same single units, the results were backwards applied to the raw signals to build rasters and histograms describing individual neuron response (Figure 12). The firing properties were consistent across the two datasets suggesting that the units classified in dataset B are the same as those discovered in dataset A, and so the FSPS software can accurately track neurons despite non-stationarities in the data. Besides, the robustness of our method is demonstrated by comparing clustering results of two types of high-amplitude discharges, isolated as Class 2 and Class 3 in four PCs features space, and having specific reciprocal electrophysiological behaviour (see PSTHs and rasters in Figure 12).

Figure 12
figure 12

Validation of waveforms in determined Classes with corresponding histograms and rasters. This Figure shows how the outcome of classification of datasets A and B, respectively, can be backwards applied to the raw signals to build rasters and histograms describing individual neurons’ response during the experimental task. Notes: 0 – all spikes, Unclassified; 1 – Class 1, 2 – Class 2; 3 – Class 3; 4 – Class 4.

Despite the fact that the amplitude of their discharge has been mismatched in dataset B, thus making unfeasible amplitude-based sorting, they are still well separated by our technique because it takes into account the whole profile of spike shape. The analysis of spike times and ISI histograms of isolated neurons in both datasets shows no multi-unit contamination (Figure 13).

Figure 13
figure 13

Effects of sorting in inter-spike interval (ISI) distributions. Fragments A–D show the true ISI distribution of the timestamps for classes 1–4 of dataset A, respectively, determined by PCs clustering analysis Fragments E–H show the ISI distributions for the timestamps attributed to classes 1–4 of dataset B, determined by the PCs classification procedure. I – dataset A, II – dataset B. Given examples shows no events in the refractory period.

Exploring the limits of classification and its reliability for the online spike sorting

It is known that FCM algorithms have a tendency to find clusters of comparable size because they use a sum of squared errors objective function and approximately equal cluster populations result in smaller values of this objective function [17, 39]. This might become a serious problem in online applications, where the algorithm may be applied to relatively small stretches of data and so random fluctuations of spike rates may make the relative size of clusters at a given time very disproportionate.

To investigate this issue, in this section we evaluated the robustness of datasets classification to disparity in cluster size, by progressively eliminating spikes in a cluster and computing the performance of the clustering algorithm as function of the class saturation, i.e. of the fraction of spikes left in the cluster. Results are shown in Figure 14, showing the simulated data and the real datasets A and B. Color lines show the true positive rate (i.e. the percentage of spikes retaining true cluster membership) when reducing the size of a particular cluster while the size of other clusters remains unchanged. The accuracy of classification of the simulated Example 2 with noise level 0.15 is shown in Figure 14A. Unbalanced decrease of clusters up to 40% of their original size shows still high classification accuracy (right-hand side of the graph). Further cluster decrease shows minor deterioration of classification accuracy due to drifting of smaller clusters toward lager adjacent ones. An abrupt and pronounced deterioration in the partitioning of the data was found only when clusters 1, 2 or 3 remain less than 34,7%, 18,5% or 14,1%, of their original size, respectively.

Figure 14
figure 14

Limits of classification capability for the simulated and real datasets. A - ROC graph showing the performance of the FCM-classifier during balanced (dotted line) and unbalanced modification (solid lines) of classes in the simulated dataset; B – the same for the real dataset A; C - the same for the real dataset B.

Figure 14B shows the results obtained with real dataset A. The classifier is still performing well and true positive rate is higher 80% if it remains at least 20,1%, 4,9%, 1,1% and 5,1% spikes, respectively to Class 1, Class 2 Class 3 and Class 4. Figure 14C shows the outcome of the classification real dataset B. The true positive rate for each class remains still at higher than 80%, if classes 1, 2, 3 or 4 contain at least 18,9%, 16,1%, 13,3% and 12,5% spikes, respectively.

An additional exploratory test in which all clusters were modified in a balanced and “Uniform” manner showed excellent performance in simulated, real A and real B datasets (black dotted line in Figure 14A-C). In all these cases the classification accuracy was completely independent of the density of the clusters and the true positive rate was at its maximal initial value until the each cluster contained at least one spike.

Discussion

The present work is devoted at addressing the challenges involved in balancing the different needs for accuracy, speed and automation in spike sorting.

The first point of discussion regards the selection of the optimal spike-shape features to be used for sorting [1, 4042]. Here we chose the multivariate techniques for selection of spike features. The multivariate approach has proven successful in many industrial online applications [15, 43]. In this work, it is shown that neurophysiological research does not constitute an exception. We improved the performance of PC based classification by using a careful alignment of spike shapes and PSVD to reduce noise and computational time and to select the optimal number of components to be used, thereby choosing a reduced variable sets as inputs for the clustering algorithms. In agreement with other studies [8, 4446], we have indeed found that when careful alignment and PCA were implemented, the results of classification of simulated datasets showed high tolerance to the noise, with better performance in comparison with similar methods.

Another goal of this paper was to demonstrate the applicability and efficacy of our technique for online isolation of single neurons. Indeed, a straight PCA approach is not sufficient to describe the changing pattern of neural activity adequately [11, 13]. There are mainly two reasons for this. First, the neuronal activity displays a non-stationary behaviour [47]. Second, PCA is not an optimal method for feature extraction when the features are used in a supervised classifier [27]. Finally, PCA is a computationally intensive preprocessing technique, making hard its use in real-time processing. Despite these conceptual difficulties our way of implementation of the mentioned algorithms into LabVIEW environment allowed us to run online classification with small and acceptable delay as far as our experimental conditions are concerned. There are several examples of applications of multivariate statistical online monitoring (and modeling) associated with PCA and FCM overcoming problems associated with non-stationarity [4850].

A main focus of this article was the accuracy assessment of the FCM, which have been incorporated into FSPS software to produce both crisp and fuzzy classifications. The main advantage of FCM implementation is that fuzzy classification gives faster detection and smoother control than crisp classification. The importance of this becomes more evident during online classification. The tests reported in this article suggest that the FCM clustering copes well with the problems generated by the non-stationarity of the real data. It is interesting to note that the FCM procedure is based on an iterative clustering algorithm and can thus be regarded as an essentially unsupervised classifier. However, we also implemented a partially supervised mode, benefitting from partition and membership function previously obtained from training dataset. Consistently with this, the results of our tests performed on datasets prove that the implementation of FCM overcomes the problem of sensitivity for unequal cluster sizes, which is crucial for correct online classification. Thus, together with PCs extracted by PSVD from accurately preprocessed spike waveforms, FCM becomes a versatile noise tolerant technique for the sorting of neuronal action potentials having even small variation in their discharge.

Any kind of clustering or classification needs an objective measure of its quality. Although we implemented in the FSPS software most conventional indices associated with FCM, including partition coefficient, partition entropy and proportion exponent, our tests showed that the index L ratio was superior to classic FCM indexes and so was implemented as the default quality measure in our software. In terms of real recordings we found L ratio useful not only to determine whether the quality of a cluster is within acceptable limits, but also to control the stability of recordings, to predict the future behavior of the neuron and check whether the SVD model is going out of control.

An important feature of the FSPS software is that it is implemented entirely within LabVIEW. The latter constitutes one of the most frequently used programming languages for the data acquisition, analysis, control and visualization. LabVIEW is often faster than many other high-level programming languages used in neuroscience, such as MATLAB [22], and it is far better equipped for the development of experimental and clinically-oriented spike sorting applications [51]. The upshot is that the entire spike time acquisition process can be run within a single environment, which has the all-important added benefit of simplifying experimental procedures.

In recent years there have been successful attempts at creating a cross-platform GUI for data visualization, navigation and spike sorting features within another software environment using the Python framework [52, 53]. Python applications, like “SpikeSort”, “Spikepy”, “spyke” and “OpenElectrophy”, provide adequate tools for the exploration of data and offline spike sorting, while “NeurOnline” provides the means for online spike sorting. However, the LabVIEW code we used is far more convenient because it rarely calls the Operating System (OS) directly, so it can be used with different OSs without the need for major modifications. Moreover, LabVIEW supports thousands of hardware devices and, in addition to the popular desktop OSs (Windows, Mac, and Linux), it can target several embedded real-time controllers, ARM microprocessors, and field-programmable gate arrays (FPGAs), allowing the deployment of our FSPS code with the most appropriate hardware platform without the need to learn new toolchains.

All mentioned properties contributed to the creation of a fast, powerful, user-friendly and stand-alone multi-platform software, designed for clustering/classification of neural data. Moreover, our online approach will help physiologist to overcome new challenges in experimental electrophysiological research.

Conclusions

We believe that the software developed here complements existing spike sorting toolboxes and will be a useful tool for fast on- and offline sorting of spike trains with limited supervision or fully automated. Because of these properties, our tool will be particularly useful for the analysis of large parallel recordings (where human supervision is practically impossible or inconvenient) and will therefore be important for improving our understanding of population codes [5456] and for online applications such as the decoding of neural ensembles to control Brain Computer Interfaces or for clinical applications.

Availability and requirements

Project name: Neurolab

Project home page: http://www.spikesorting.com

Operating system(s): It was tested on Windows XP, Windows Vista, Windows 7

Programming language: NI LabVIEW 2009

Other requirements: for running in the online mode, the requirements are as follows. Hardware: Digital acquisition board from National Instruments (PCI or USB). Additional software: 1. LabVIEW Run-Time Engine 2009 for Windows 2000/7/7 x64/Vista/Vista x64/XP - (32-bit Standard RTE) - available free at: http://www.ni.com. 2. NI-DAQmx Run-Time Engine 9.3 or higher - (Core) for Windows 7 64 bit/7 x86/Server 2003 R2 (32-bit)/XP x86/Vista x64/Vista x86/Server 2008 R2 (64-bit) - available free at: http://www.ni.com;

License: FSPS software is distributed under Creative Commons Public License (CCPL BY-NC-ND) and can be used for non-commercial academic applications providing they properly reference this work in any publication that uses results generated by FSPS software.

Any restrictions to use by non-academics: Commercial License needed.

Abbreviations

BI:

Burst index

DAQ:

Digital Acquisition

FCM:

Fuzzy C-mean

FPGA(s):

Field-programmable Gate Array(s)

FSPS:

Fuzzy SPike Sorting

GUI:

Graphical User Interface

ISI:

Inter-spike Interval

OS(s):

Operating System(s)

PC(s):

Principal Component(s)

PCA:

Principal Component Analysis

PI:

pause index

PSTH(s):

Peri-Stimulus Time Histogram(s)

PSVD:

Partial Single Value Decomposition

SPC:

Superparamagnetic Clustering

SVD:

Single Value Decomposition

VI(s):

Virtual Instrument(s).

References

  1. Lewicki MS: A review of methods for spike sorting: the detection and classification of neural action potentials. Network. 1998, 9: R53-R78. 10.1088/0954-898X/9/4/001.

    Article  CAS  PubMed  Google Scholar 

  2. Quiroga RQ: Spike sorting. Scholarpedia. 2007, 12: 3583.

    Article  Google Scholar 

  3. Kim KH: Improved Algorithm for Fully-automated Neural Spike Sorting based on Projection Pursuit and Gaussian Mixture Model. Int J Contr Autom Syst. 2006, 4: 705-713.

    Google Scholar 

  4. Sato T, Suzuki T, Mabuchi K: Fast automatic template matching for spike sorting based on Davies-Bouldin validation indices. Conf Proc IEEE Eng Med Biol Soc. 2007, 2007: 3200-3203.

    PubMed  Google Scholar 

  5. Vargas-Irwin C, Donoghue JP: Automated spike sorting using density grid contour clustering and subtractive waveform decomposition. J Neurosci Methods. 2007, 164: 1-18. 10.1016/j.jneumeth.2007.03.025.

    Article  PubMed Central  PubMed  Google Scholar 

  6. Bar-Gad I, Ritov Y, Vaadia E, Bergman H: Failure in identification of overlapping spikes from multiple neuron activity causes artificial correlations. J Neurosci Methods. 2001, 107: 1-13. 10.1016/S0165-0270(01)00339-9.

    Article  CAS  PubMed  Google Scholar 

  7. Adamos DA, Kosmidis EK, Theophilidis G: Performance evaluation of PCA-based spike sorting algorithms. Comput Methods Programs Biomed. 2008, 91: 232-244. 10.1016/j.cmpb.2008.04.011.

    Article  PubMed  Google Scholar 

  8. Jung HK, Choi JH, Kim T: Solving alignment problems in neural spike sorting using frequency domain PCA. Neurocomputing. 2006, 69: 975-978. 10.1016/j.neucom.2005.06.006.

    Article  Google Scholar 

  9. Balasubramanian K, Obeid I: Fuzzy logic-based spike sorting system. J Neurosci Methods. 2011, 198: 125-134. 10.1016/j.jneumeth.2011.03.016.

    Article  PubMed  Google Scholar 

  10. Quiroga RQ, Nadasdy Z, Ben-Shaul Y: Unsupervised spike detection and sorting with wavelets and superparamagnetic clustering. Neural Comput. 2004, 16: 1661-1687. 10.1162/089976604774201631.

    Article  PubMed  Google Scholar 

  11. Jolliffe IT: Principal component analysis. 2002, New York: Springer, 2

    Google Scholar 

  12. Muresan RC, Pipa G, Wheeler DW: Single-unit recordings revisited: Activity in recurrent microcircuits. Artificial Neural Networks: Biological Inspirations - Icann 2005, Pt 1, Proceedings. 2005, 3696: 153-159. 10.1007/11550822_25.

    Google Scholar 

  13. Berrar DP, Dubitzky W, Granzow M: A practical approach to microarray data analysis. 2003, Kluwer Academic Publishers, Boston, MA

    Book  Google Scholar 

  14. Berry MW, Dumais ST, O’Brien GW: Using linear algebra for intelligent information retrieval. SIAM Rev. 1995, 37: 573-595. 10.1137/1037127.

    Article  Google Scholar 

  15. Brand M: Fast Online SVD Revisions for Lightweight Recommender Systems. SIAM International Conference on Data Mining. 2003, 37-46.

    Google Scholar 

  16. Dunn JC: A fuzzy relative of the ISODATA process and its use in detecting compact, well-separated clusters. J Cybern. 1973, 3: 32-57. 10.1080/01969727308546046.

    Article  Google Scholar 

  17. Bezdek JC: Pattern recognition with fuzzy objective function algorithms. 1981, Plenum Press, New York

    Book  Google Scholar 

  18. Bezdek JC, Ehrlich R, Full W: Fcm - the Fuzzy C-Means Clustering-Algorithm. Comput Geosci. 1984, 10: 191-203. 10.1016/0098-3004(84)90020-7.

    Article  Google Scholar 

  19. Cannon RL, Dave JV, Bezdek JC: Efficient Implementation of the Fuzzy C-Means Clustering Algorithms. IEEE Trans Pattern Anal Mach Intell. 1986, 8: 248-255.

    Article  CAS  PubMed  Google Scholar 

  20. Schmitzer-Torbert N, Jackson J, Henze D, Harris K, Redish AD: Quantitative measures of cluster quality for use in extracellular recordings. Neuroscience. 2005, 131: 1-11. 10.1016/j.neuroscience.2004.09.066.

    Article  CAS  PubMed  Google Scholar 

  21. Johnston WM, Hanna JRP, Millar RJ: Advances in dataflow programming languages. ACM Comput Surv. 2004, 36: 1-34. 10.1145/1013208.1013209.

    Article  Google Scholar 

  22. Gutierrez-Castrejon R, Duelk M: Using LabVIEW (TM) for advanced nonlinear optoelectronic device simulations in high-speed optical communications. Comput Phys Commun. 2006, 174: 431-440. 10.1016/j.cpc.2005.11.002.

    Article  CAS  Google Scholar 

  23. Gentilucci M, Fogassi L, Luppino G, Matelli M, Camarda R, Rizzolatti G: Somatotopic representation in inferior area 6 of the macaque monkey. Brain Behav Evol. 1989, 33: 118-121. 10.1159/000115912.

    Article  CAS  PubMed  Google Scholar 

  24. Donoho DL, Johnstone IM: Ideal Spatial Adaptation by Wavelet Shrinkage. Biometrika. 1994, 81: 425-455. 10.1093/biomet/81.3.425.

    Article  Google Scholar 

  25. De Boor C: A practical guide to splines: with 32 figures. 2001, New York: Springer, Revth edition

    Google Scholar 

  26. Bro R, Smilde AK: Centering and scaling in component analysis. J Chemometr. 2003, 17: 16-33. 10.1002/cem.773.

    Article  CAS  Google Scholar 

  27. Gabbiani F, Cox SJ: Mathematics for neuroscientists. 2010, Amsterdam: Elsevier/Academic Press, 1

    Google Scholar 

  28. Eldén L: Matrix methods in data mining and pattern recognition. 2007, Society for Industrial and Applied Mathematics, Philadelphia, PA

    Book  Google Scholar 

  29. Rencher AC, Christensen WF: Methods of multivariate analysis. 2012, Hoboken, New Jersey: Wiley, Third

    Book  Google Scholar 

  30. Hayton JC, Allen DG, Scarpello V: Factor retention decisions in exploratory factor analysis: A tutorial on parallel analysis. Organ Res Meth. 2004, 7: 191-205. 10.1177/1094428104263675.

    Article  Google Scholar 

  31. Costello AB, Osborne J: Best practices in exploratory factor analysis: four recommendations for getting the most from your analysis. Practical Assess Res Eval. 2005, 10: 1-9.

    Google Scholar 

  32. Raiche G, Riopel M, Blais J-G: Non graphical solutions for the Cattell's scree test. 2006, International Annual meeting of the Psychometric Society

    Google Scholar 

  33. Höppner F, Klawonn F, Kruse R, Runkler T: Fuzzy Cluster Analysis: methods for classification, data analysis and image recognition. 1999, John Wiley & Sons Ltd., Chinchester

    Google Scholar 

  34. Shapiro LG, Stockman GC: Computer vision. 2001, Prentice Hall, Upper Saddle River, NJ

    Google Scholar 

  35. Scott DW: Optimal and Data-Based Histograms. Biometrika. 1979, 66: 605-610. 10.1093/biomet/66.3.605.

    Article  Google Scholar 

  36. Ball GH, Hall DJ: A clustering technique for summarizing multivariate data. Behav Sci. 1967, 12: 153-155. 10.1002/bs.3830120210.

    Article  CAS  PubMed  Google Scholar 

  37. Wang SR, Sun HJ, Jiang QS: FCM-based model selection algorithms for determining the number of clusters. Pattern Recognition. 2004, 37: 2027-2037. 10.1016/j.patcog.2004.03.012.

    Article  Google Scholar 

  38. Favre J, Taha JM, Baumann T, Burchiel KJ: Computer analysis of the tonic, phasic, and kinesthetic activity of pallidal discharges in Parkinson patients. Surg Neurol. 1999, 51: 665-672. 10.1016/S0090-3019(99)00030-0. discussion 672–663

    Article  CAS  PubMed  Google Scholar 

  39. Bezdek JC: Cluster validity with fuzzy sets. J Cybern. 1974, 3: 58-73.

    Article  Google Scholar 

  40. Horton PM, Nicol AU, Kendrick KM, Feng JF: Spike sorting based upon machine learning algorithms (SOMA). J Neurosci Methods. 2007, 160: 52-68. 10.1016/j.jneumeth.2006.08.013.

    Article  CAS  PubMed  Google Scholar 

  41. Letelier JC, Weber PP: Spike sorting based on discrete wavelet transform coefficients. J Neurosci Methods. 2000, 101: 93-106. 10.1016/S0165-0270(00)00250-8.

    Article  CAS  PubMed  Google Scholar 

  42. Luczak A, Narayanan NS: Spectral representation-analyzing single-unit activity in extracellularly recorded neuronal data without spike sorting. J Neurosci Methods. 2005, 144: 53-61. 10.1016/j.jneumeth.2004.10.009.

    Article  PubMed  Google Scholar 

  43. Dudzic M, Miletic I, Quinn S, Vaculik V, Champagne M: An industrial perspective on implementing on-line applications of multivariate statistics. J Process Contr. 2004, 14: 821-836. 10.1016/j.jprocont.2004.02.001.

    Article  Google Scholar 

  44. Madisetti V: Wireless, networking, radar, sensor array processing, and nonlinear signal processing. The electrical engineering handbook series. 2010, Boca Raton, FL: CRC Press, 1-16. VII, 2

    Google Scholar 

  45. Nash JC, Shlien S: Simple Algorithms for the Partial Singular Value Decomposition. Comput J. 1987, 30: 268-275. 10.1093/comjnl/30.3.268.

    Article  Google Scholar 

  46. Zviagintsev A, Perelman Y, Ginosar R: Algorithms and architectures for low power spike detection and alignment. J Neural Eng. 2006, 3: 35-42. 10.1088/1741-2560/3/1/004.

    Article  PubMed  Google Scholar 

  47. Snider RK, Bonds AB: Classification of non-stationary neural signals. J Neurosci Methods. 1998, 84: 155-166. 10.1016/S0165-0270(98)00110-1.

    Article  CAS  PubMed  Google Scholar 

  48. Kim SI, Yoon UC, Kim JS, Kim JS, Kim IY: Adaptable fuzzy C-Means for improved classification as a preprocessing procedure of brain parcellation. J Digit Imaging. 2001, 14: 238-240. 10.1007/BF03190353.

    Article  PubMed Central  PubMed  Google Scholar 

  49. Rosen C, Yuan Z: Supervisory control of wastewater treatment plants by combining principal component analysis and fuzzy c-means clustering. Water Sci Technol. 2001, 43: 147-156.

    CAS  PubMed  Google Scholar 

  50. Teppola P, Mujunen SP, Minkkinen P: Adaptive Fuzzy C-Means clustering in process monitoring. Chemometr Intell Lab Syst. 1999, 45: 23-38. 10.1016/S0169-7439(98)00087-2.

    Article  CAS  Google Scholar 

  51. Stewart CM, Newlands SD, Perachio AA: Spike detection, characterization, and discrimination using feature analysis software written in LabVIEW. Comput Methods Programs Biomed. 2004, 76: 239-251. 10.1016/j.cmpb.2004.07.001.

    Article  CAS  PubMed  Google Scholar 

  52. Spacek M, Blanche T, Swindale N: Python for large-scale electrophysiology. Front Neuroinform. 2008, 2: 9.

    Article  PubMed Central  PubMed  Google Scholar 

  53. Garcia S, Fourcaud-Trocme N: OpenElectrophy: An Electrophysiological Data- and Analysis-Sharing Framework. Front Neuroinform. 2009, 3: 14.

    Article  PubMed Central  PubMed  Google Scholar 

  54. Brown EN, Kass RE, Mitra PP: Multiple neural spike train data analysis: state-of-the-art and future challenges. Nat Neurosci. 2004, 7: 456-461. 10.1038/nn1228.

    Article  CAS  PubMed  Google Scholar 

  55. Buzsaki G: Large-scale recording of neuronal ensembles. Nat Neurosci. 2004, 7: 446-451. 10.1038/nn1233.

    Article  CAS  PubMed  Google Scholar 

  56. Quiroga RQ, Panzeri S: Extracting information from neuronal populations: information theory and decoding approaches. Nat Rev Neurosci. 2009, 10: 173-185.

    Article  Google Scholar 

Download references

Acknowledgements

This work has been supported by RobotCub (ROBotic Open-Architecture Technology for Cognition, Understanding and Behaviour), IST-004370, and by the BMI Project of RBCS Department at the Italian Institute of Technology. The authors thank Stefano Panzeri for his feedback on software development and manuscript writing.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Andriy Oliynyk.

Additional information

Competing interests

The authors declare that they have no competing interests.

Authors’ contributions

AO - conceived, refined and implemented the algorithm, developed and designed software, performed single-unit recordings, analyzed simulated data, evaluated results, wrote the draft of the paper and created the website. CB - contributed to the theoretical developments of spike sorting algorithm. FM - tested the software on simulated data. LF - conceived the project, refined the software requirements and co-wrote the paper. All authors read, commented upon, and approved the final manuscript.

Electronic supplementary material

Additional file 1: This file contains the FSPS Manual and Installation Notes.(PDF 49 KB)

12868_2012_2779_MOESM2_ESM.pdf

Additional file 2: This file contains supplemental text, figure and tables, which corroborate the findings presented in the main text. (PDF 2 MB)

Authors’ original submitted files for images

Rights and permissions

This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Reprints and permissions

About this article

Cite this article

Oliynyk, A., Bonifazzi, C., Montani, F. et al. Automatic onlinespike sorting with singular value decomposition and fuzzy C-mean clustering. BMC Neurosci 13, 96 (2012). https://doi.org/10.1186/1471-2202-13-96

Download citation

  • Received:

  • Accepted:

  • Published:

  • DOI: https://doi.org/10.1186/1471-2202-13-96

Keywords