.., xN}, where N is the number of data points. θ(t + 1) is then set to this value, and the above procedure is repeated until a stable solution is obtained for a given value of m. Each data point xn is assigned to the cluster for which its membership variable z takes the largest value. If, however, this value is smaller than a critical value zth, the spike is regarded as not belonging to any cluster and is discarded. The solutions obtained for various values of m are then compared using the minimum message length

(MML) criterion (Wallace & Freeman, 1987; Figueiredo & Jain, 2000; Shoham et al., 2003). Namely, we calculate the following penalized log-likelihood Fm for different values of m (1) where Np is the number of parameters per component distribution (see Supporting information, Appendix S1). The second term penalizes solutions with large m, i.e. many clusters. The value of m that maximizes Fm is chosen. VB is a general technique for approximating the posterior probability distribution of continuous variables. It calculates an approximate distribution of the posterior, assuming

that the random variables are mutually independent. This assumption significantly reduces the computational cost. Thus, in VB, we alternately update the probability distributions of z and θ, each independently of the other, according to (2). We implemented our spike-sorting algorithm in C++ and ran it in a 64-bit GNU/Linux environment (Sun Fire X4600 M2; quad-core AMD Opteron 8384 × 8). The program used the double-precision, single-instruction-multiple-data (SIMD)-oriented fast Mersenne Twister

pseudo-random number-generating algorithm (Saito & Matsumoto, 2008a,b). The algorithm was optimized for parallel computation in an OpenMP environment. The performance of the program remained stable without customization for individual data sets. Unless otherwise stated, the results shown in this article were obtained with the same set of parameter values. We compared the performance of the following 24 (= 2 × 3 × 4) combinations: the CWM or MXH filter for spike detection; PCA, the Haar wavelet or the CDF97 wavelet for feature extraction; and EM or VB applied to the normal mixture model or Student’s t mixture model (NEM, REM, NVB and RVB) for spike clustering. We first demonstrated the excellent performance of our RVB clustering method using artificial data. The performance of the spike-sorting methods was then tested using data obtained by simultaneous extracellular and intracellular recordings (Harris et al., 2000; Henze et al., 2000; data are available at http://crcns.org/data-sets/hc/hc-1). In these data, the correct sequence of spikes was known for at least the single neuron recorded intracellularly, and therefore the correct answers for spike sorting were partially known. Using this information, we examined the accuracy and robustness of the different methods.
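
As an illustration of the comparison described above, the following minimal Python sketch enumerates the 24 method combinations and scores one sorted cluster against intracellular ground truth. It is not the implementation used in this study: the function names, the one-to-one matching rule and the 0.5 ms tolerance window are assumptions made for the example, and the false-negative and false-positive rates are a generic choice of accuracy measures.

```python
from itertools import product

# Method labels are taken from the text; the three stages follow the 2 x 3 x 4 count.
DETECTION = ["CWM", "MXH"]
FEATURES = ["PCA", "Haar", "CDF97"]
CLUSTERING = ["NEM", "REM", "NVB", "RVB"]


def method_grid():
    """Return every (detection, feature, clustering) combination."""
    return list(product(DETECTION, FEATURES, CLUSTERING))


def count_matches(sorted_times, truth_times, tol_ms):
    """Count one-to-one matches between two ascending lists of spike times (ms).

    A sorted spike matches a ground-truth spike if they differ by at most
    tol_ms. The tolerance is a hypothetical parameter of this sketch.
    """
    i = j = matched = 0
    while i < len(sorted_times) and j < len(truth_times):
        diff = sorted_times[i] - truth_times[j]
        if abs(diff) <= tol_ms:
            matched += 1
            i += 1
            j += 1
        elif diff < 0:
            i += 1  # sorted spike is too early to match any remaining truth spike
        else:
            j += 1  # truth spike was missed by the sorter
    return matched


def error_rates(sorted_times, truth_times, tol_ms=0.5):
    """Return (false_negative_rate, false_positive_rate) for one cluster."""
    if not sorted_times or not truth_times:
        raise ValueError("both spike trains must be non-empty")
    s = sorted(sorted_times)
    t = sorted(truth_times)
    hits = count_matches(s, t, tol_ms)
    false_negative = (len(t) - hits) / len(t)  # intracellular spikes not recovered
    false_positive = (len(s) - hits) / len(s)  # cluster spikes with no intracellular partner
    return false_negative, false_positive
```

For example, `error_rates([10.2, 19.9, 35.0, 40.4, 50.0], [10.0, 20.0, 30.0, 40.0])` matches three of the four ground-truth spikes, giving a false-negative rate of 0.25 and a false-positive rate of 0.4. Looping over `method_grid()` and calling `error_rates` on the output of each combination would yield one pair of rates per method.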