Computational models of audition typically include a bank of filters and a population of neurons. For example, the inner ear is often modeled as a Gammatone filterbank followed by an ensemble of Integrate-and-Fire neurons. Alternatively, neurons in primary auditory cortex (A1) are modeled as receiving input from spectro-temporal receptive fields (STRFs) that capture the frequency and amplitude selectivity of subcortical neurons. How much information about sounds is embedded in these models? How much information is required to maintain the intelligibility of natural and synthetic sound stimuli at various levels of the auditory system? Is intelligibility maintained in auditory models built from biological neuron models? Addressing these questions requires evaluating auditory models not only with metrics such as the signal-to-noise ratio (SNR) but also with intelligibility tests. Intelligibility is a speech invariant that reflects the degree of speech understanding.
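The filterbank-plus-neuron structure described above can be sketched in a few lines. The following is a minimal illustration, not the implementation used in this work: the function names, the peak normalization of the impulse response, and the neuron parameters (bias, kappa, delta) are assumptions chosen for readability. The bandwidth uses the common Glasberg-Moore equivalent rectangular bandwidth formula.

```python
import numpy as np

def erb(fc):
    """Equivalent rectangular bandwidth in Hz (Glasberg-Moore formula)."""
    return 24.7 * (4.37 * fc / 1000.0 + 1.0)

def gammatone_ir(fc, fs, order=4, duration=0.05):
    """Impulse response of a gammatone filter centered at fc (Hz)."""
    t = np.arange(int(duration * fs)) / fs
    b = 1.019 * erb(fc)
    g = t ** (order - 1) * np.exp(-2 * np.pi * b * t) * np.cos(2 * np.pi * fc * t)
    return g / np.max(np.abs(g))  # peak normalization (illustrative choice)

def gammatone_filterbank(x, fs, center_freqs):
    """Filter x with one gammatone filter per center frequency."""
    return np.stack(
        [np.convolve(x, gammatone_ir(fc, fs))[: len(x)] for fc in center_freqs]
    )

def iaf_encode(u, fs, bias=2.0, kappa=1.0, delta=0.005):
    """Ideal integrate-and-fire encoder: integrate (u + bias) / kappa and
    emit a spike, subtracting delta, whenever the threshold delta is reached.
    Assumes bias > max|u| so that the integrand stays positive."""
    dt = 1.0 / fs
    v = 0.0
    spikes = []
    for k, uk in enumerate(u):
        v += dt * (uk + bias) / kappa
        if v >= delta:
            spikes.append((k + 1) * dt)
            v -= delta
    return np.array(spikes)

def encode_sound(x, fs, center_freqs):
    """Encode each filterbank channel into a spike train (one neuron per channel)."""
    bank = gammatone_filterbank(x, fs, center_freqs)
    trains = []
    for channel in bank:
        u = channel / (np.max(np.abs(channel)) + 1e-12)  # scale to |u| <= 1
        trains.append(iaf_encode(u, fs))
    return trains
```

With a constant zero input, this encoder fires at a rate of bias / (kappa * delta), which is 400 spikes per second for the default values; the stimulus modulates the spike times around that rate.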
Our model consists of a sound stimulus processed by a Gammatone filterbank or by a multichannel Gabor STRF. Each channel provides the additive input to a Hodgkin-Huxley neuron. Unlike Integrate-and-Fire neuron models, conductance-based models exhibit behaviors that lead to a complex relationship between input (stimulus) and output (spike train). To measure the amount of information contained in the spike trains, we derive the best estimate of the original stimulus, namely the one that minimizes a quadratic cost functional. While this methodology provides a measure of information at various stages of subcortical or cortical processing, it does not reveal the degree of intelligibility of the sound stimuli contained in the spike trains. Consequently, we provide support for evaluating the stimulus estimates with intelligibility tests.
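To make the conductance-based neuron concrete, the sketch below simulates a single Hodgkin-Huxley neuron driven by an additive input current. It is an illustration under stated assumptions rather than the model configuration used here: it uses the standard squid-axon parameter set with rest near -65 mV, forward-Euler integration, and a hypothetical function name `hh_spike_times`. None of these choices is specified in the text above.

```python
import math

def _vtrap(x, y):
    """Evaluate x / (exp(x / y) - 1), using a series expansion near x = 0."""
    if abs(x / y) < 1e-6:
        return y * (1.0 - x / (2.0 * y))
    return x / (math.exp(x / y) - 1.0)

def _rates(v):
    """Opening and closing rates (1/ms) of the m, h, n gates at voltage v (mV)."""
    a_m = 0.1 * _vtrap(-(v + 40.0), 10.0)
    b_m = 4.0 * math.exp(-(v + 65.0) / 18.0)
    a_h = 0.07 * math.exp(-(v + 65.0) / 20.0)
    b_h = 1.0 / (1.0 + math.exp(-(v + 35.0) / 10.0))
    a_n = 0.01 * _vtrap(-(v + 55.0), 10.0)
    b_n = 0.125 * math.exp(-(v + 65.0) / 80.0)
    return a_m, b_m, a_h, b_h, a_n, b_n

def hh_spike_times(current, dt=0.01):
    """Forward-Euler simulation of a Hodgkin-Huxley neuron.

    current: injected current samples (uA/cm^2), one per time step dt (ms).
    Returns spike times in ms, detected as upward crossings of 0 mV.
    """
    c_m, g_na, g_k, g_l = 1.0, 120.0, 36.0, 0.3   # uF/cm^2, mS/cm^2
    e_na, e_k, e_l = 50.0, -77.0, -54.387         # reversal potentials, mV
    v = -65.0
    a_m, b_m, a_h, b_h, a_n, b_n = _rates(v)
    # Start the gating variables at their steady-state values for v.
    m, h, n = a_m / (a_m + b_m), a_h / (a_h + b_h), a_n / (a_n + b_n)
    spikes = []
    for k, i_ext in enumerate(current):
        a_m, b_m, a_h, b_h, a_n, b_n = _rates(v)
        i_ion = (g_na * m ** 3 * h * (v - e_na)
                 + g_k * n ** 4 * (v - e_k)
                 + g_l * (v - e_l))
        v_new = v + dt * (i_ext - i_ion) / c_m
        m += dt * (a_m * (1.0 - m) - b_m * m)
        h += dt * (a_h * (1.0 - h) - b_h * h)
        n += dt * (a_n * (1.0 - n) - b_n * n)
        if v < 0.0 <= v_new:
            spikes.append((k + 1) * dt)
        v = v_new
    return spikes
```

In a multichannel setting, one would add a constant bias current to each filterbank channel so that the neuron fires periodically, with the stimulus perturbing the spike times; the bias value is again a modeling choice.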
Furthermore, we formalize intelligibility as an invariant with respect to transformations in the auditory system. We estimate the auditory signal at different layers of the encoding and auditory processing architecture. Estimation error arises from two sources: the intrinsic deviation in the phase approximation of the Hodgkin-Huxley neurons and noise due to large-scale numerical evaluations. Intelligibility tests are performed on the reconstructed speech stimuli. We find that a small subset of neurons lacks the collective firing frequency needed to preserve a high degree of intelligibility of typical auditory scenes, whereas a large population of neurons yields a close approximation of the original sound.
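The estimation error discussed above is commonly summarized by the SNR of the reconstructed stimulus, the metric that intelligibility tests are meant to complement. A minimal sketch follows, assuming the original stimulus and its estimate are sampled on the same time grid; the function name `snr_db` is hypothetical.

```python
import numpy as np

def snr_db(u, u_hat):
    """SNR in dB of an estimate u_hat relative to the original signal u."""
    u = np.asarray(u, dtype=float)
    u_hat = np.asarray(u_hat, dtype=float)
    err = u - u_hat
    return 10.0 * np.log10(np.sum(u ** 2) / np.sum(err ** 2))
```

A high SNR does not by itself guarantee that the reconstructed speech is understandable, which is why the listening tests are reported alongside it.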
This work establishes that models of the auditory system with a sufficiently large population of biologically plausible neurons and broad filterbank support maintain the intelligibility of natural and synthetic sound stimuli. For weak auditory stimuli, Hodgkin-Huxley neurons encode sounds with approximately the same intelligibility as Integrate-and-Fire neurons, so the two can be used interchangeably in computational modeling.
The work presented here was supported by the AFOSR under grant number FA9550-09-1-0350.