The cochlea is an organ that sits behind the eardrum on each side of
the head. Its function is to convert the sound-induced mechanical
vibrations of the eardrum into electrical impulses that travel up the
parallel fibers of the auditory nerve to the brain, where they are
processed and understood.
A low-power silicon
implementation of the cochlea could potentially be used to develop
better cochlear implants, higher-functionality hearing aids, novel
consumer products, and robust front ends for speech recognition
systems.
Subthreshold analog CMOS is a particularly interesting technology for the development of intelligent sensing and recognition systems. First used commercially by Vittoz in the development of Swiss watches, it has more recently been extended and popularized by Mead [19] for use in circuits that mimic biological systems in function and sometimes in form. The technology uses standard, and therefore inexpensive, CMOS transistors, operating with continuous currents and voltages rather than the binary voltages used in digital circuits. The current levels are tiny, comparable to those of a transistor in its off state, so large circuits can be constructed in this medium with very little power consumption.
Subthreshold analog VLSI implementations of cochlea-circuits were
pioneered by Lyon and Mead [18] using a cascade of second-order
resonant sections. Subsequent cascade implementations have been
reported in the literature by Lyon [17] and by Liu et al. [13].
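As a rough illustration of what a cascade of second-order sections does, the following sketch computes the gain from the input to a given tap of an idealized linear cascade. It is not the circuit described in this report: the section transfer function H(s) = 1 / (tau^2 s^2 + tau s / Q + 1), the quality factor Q = 0.8, the 10 kHz starting frequency, and the spacing of 6 sections per octave are all assumptions chosen only for illustration.

```python
import math

def center_frequencies(f_first, sections_per_octave, n_sections):
    """Exponentially spaced center frequencies in Hz, highest first."""
    ratio = 2.0 ** (-1.0 / sections_per_octave)
    return [f_first * ratio ** k for k in range(n_sections)]

def section_gain(f, fc, q):
    """Magnitude of H(s) = 1 / (tau^2 s^2 + tau s / q + 1) at frequency f,
    where tau = 1 / (2 pi fc)."""
    x = f / fc  # normalized frequency, equal to omega * tau
    return 1.0 / abs(complex(1.0 - x * x, x / q))

def tap_gain(f, fcs, tap, q):
    """Gain from the cascade input to the output of section `tap` (0-based):
    the product of the gains of sections 0..tap."""
    gain = 1.0
    for fc in fcs[: tap + 1]:
        gain *= section_gain(f, fc, q)
    return gain

def best_frequency(fcs, tap, q, grid):
    """Frequency on `grid` at which the given tap has its largest gain."""
    return max(grid, key=lambda f: tap_gain(f, fcs, tap, q))

# Hypothetical parameters, chosen only for illustration.
Q = 0.8
FCS = center_frequencies(10_000.0, 6, 60)
# Logarithmic grid from 10 Hz to about 20 kHz, 24 points per octave.
GRID = [10.0 * 2.0 ** (i / 24.0) for i in range(24 * 11 + 1)]

for tap in (10, 30, 50):
    print(tap, round(FCS[tap], 1), round(best_frequency(FCS, tap, Q, GRID), 1))
```

In this idealized model the best frequency of a tap falls as the tap number increases and lies below the center frequency of that tap's own section, because every section between the input and the tap shapes the signal.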
Our implementation is also a cascade implementation. It differs from the
previously reported cascade implementations in two important respects.
First, it is adjustable so that cascades with finer frequency
resolution can be constructed without incurring a penalty in terms of
the delay to a section with a given best frequency. Second, this
implementation is nonlinear and implicitly models the saturating
active mechanism of the outer hair cells that gives rise to basilar
membrane response curves that become less sharply tuned as the input
amplitude is increased.
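The delay penalty mentioned in the first point can be motivated with a back-of-the-envelope calculation on a plain, non-adjustable cascade. The sketch below is an illustration under stated assumptions rather than a model of the present circuit: each section is taken to be a second-order low-pass stage with time constant tau and quality factor Q, whose low-frequency group delay is tau / Q, and the numerical values (10 kHz first section, 312.5 Hz target, Q = 0.8) are hypothetical.

```python
import math

def dc_delay_to_frequency(f_first, f_target, sections_per_octave, q):
    """Sum of the low-frequency group delays (tau_k / q) of every section
    from the first one down to the section centered on f_target."""
    n_last = round(sections_per_octave * math.log2(f_first / f_target))
    ratio = 2.0 ** (-1.0 / sections_per_octave)
    total = 0.0
    for k in range(n_last + 1):
        fc = f_first * ratio ** k          # center frequency of section k
        tau = 1.0 / (2.0 * math.pi * fc)   # its time constant
        total += tau / q                   # its low-frequency group delay
    return total

# Hypothetical values: compare 6 and 12 sections per octave.
d6 = dc_delay_to_frequency(10_000.0, 312.5, 6, 0.8)
d12 = dc_delay_to_frequency(10_000.0, 312.5, 12, 0.8)
print(d6, d12, d12 / d6)
```

Under these assumptions, doubling the number of sections per octave nearly doubles the accumulated delay to the section with the same center frequency, since about twice as many sections with similar time constants lie in the signal path.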
Earlier work under this project had focused on the design and analysis
of a circuit to model the behavior of a small length of the basilar
membrane in the cochlea. During the past year we were able to
successfully cast this design into silicon and construct a cochlea
circuit by cascading 60 of these sections in series. The circuit is
fabricated in a standard 2-micron CMOS process through MOSIS and
occupies approximately 15% of the area of a standard 4.6 mm ×
6.8 mm chip. The circuit is estimated to consume on the order of 300
nanowatts, not including off-chip communication requirements.
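For a sense of scale, the figures above can be turned into rough per-section averages. The short calculation below assumes the chip measures 4.6 mm by 6.8 mm and that area and power are divided evenly among the 60 sections. The report does not make that claim, so the results are order-of-magnitude averages only.

```python
# Figures quoted in the text.
N_SECTIONS = 60
CHIP_WIDTH_MM = 4.6
CHIP_HEIGHT_MM = 6.8
AREA_FRACTION = 0.15       # "approximately 15%"
TOTAL_POWER_NW = 300.0     # "on the order of 300 nanowatts"

# Assumption: area and power are shared evenly among the sections.
circuit_area_mm2 = AREA_FRACTION * CHIP_WIDTH_MM * CHIP_HEIGHT_MM
area_per_section_mm2 = circuit_area_mm2 / N_SECTIONS
power_per_section_nw = TOTAL_POWER_NW / N_SECTIONS

print(f"circuit area      ~ {circuit_area_mm2:.2f} mm^2")
print(f"area per section  ~ {area_per_section_mm2:.3f} mm^2")
print(f"power per section ~ {power_per_section_nw:.0f} nW")
```

This gives roughly 4.7 mm^2 for the whole cascade, about 0.08 mm^2 per section, and about 5 nW per section.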
Experimental results demonstrate that the circuit is in fact adjustable, allowing cochleas with different resolutions to be simulated while maintaining the same gain and delay characteristics. An important aspect of this adjustability is that the variation of the cochlea-circuit from its nominal behavior, caused by circuit imperfections, is reduced when the circuit is adjusted so that each section simulates a smaller rather than a larger length of the basilar membrane. We suspect that a similar statistical mechanism is at work in the biological cochlea.
Experimental results also demonstrate that there are qualitative similarities between the nonlinear response of the cochlea-circuit to large inputs and that of the biological cochlea. As with the vibration of the basilar membrane in a biological cochlea, the gain-versus-frequency curves at a given section of the cochlea-circuit become flatter and broader as the input amplitude is increased.
Figure 7 shows a schematic of the overall cochlea circuit, and Figure 8 shows the impulse response at every fifth section. As expected, the time period and delay increase exponentially as the section number increases.
Figure 8: Normalized impulse response at every fifth tap.
The design has several limitations. First, it is not
universally stable in its small-signal linear region. This becomes an
important consideration given the high degree of variation present in
subthreshold circuits. A second limitation is the component count.
Each section in the current design uses seven wide-range
transconductance amplifiers and one buffer amplifier for driving the
signal off the chip. Given the result that adjusting the circuit to
get more sections per octave results in a lower variance system, one
would like sections with the fewest possible components in order to
limit the area requirements of high-resolution cochlea-circuits. A
third limitation arises from the observation that this particular
circuit was not designed with the goal of variance reduction in mind,
which would have led to different choices of building blocks (such as
simple rather than wide-range transconductance amplifiers), biasing
techniques and layout.
Experience with the first chip has led to several ongoing efforts. A second chip incorporating two 60-section cochlea-circuits and associated hair cell circuits has been designed and fabricated [1], but only partially tested as of this writing. It is meant to provide inputs to a cross-correlator circuit, so that the combined system can locate sound sources in the horizontal plane.
A third chip, which has been designed but not fabricated, is a finite-element implementation of the same cochlear model that formed the basis of this work. It models the bulk of the fluid in the cochlear chambers using a two-dimensional resistive network based on the same principle used by Watts et al. [21]. More importantly, it has a circuit for the basilar membrane that explicitly incorporates a saturating active outer hair cell model. While this design has an even higher component count than the present design, it has the advantage of modeling the cochlea in form as well as function, so that it will be easier to incorporate other features such as feedback to the outer hair cells from higher processing centers.
Another effort that is in a much earlier stage is aimed at developing robust feature detectors and incorporating them onto the cochlea chip. The goal of this undertaking is to develop inexpensive but highly functional front ends for speech recognition systems.
An implicit result of our research is the viability of subthreshold analog VLSI as a medium for implementing dense analog computational systems. We designed a circuit to provide performance akin to the results obtained from a mathematical model, laid it out using vanilla design tools, fabricated it using a common process available through MOSIS, and obtained working chips on the first attempt. This should encourage other researchers to use this technology with a greater degree of confidence. However, we caution that the design rules for obtaining low-variance systems based on subthreshold circuits are more stringent than those used in typical digital or analog circuit designs. Some of these rules are described in Vittoz [20], but because of the relative infancy of this design form, they do not appear to be widely known.