WO1997007624A1 - Echo cancelling using signal preprocessing in an acoustic environment - Google Patents

Info

Publication number
WO1997007624A1
WO1997007624A1 PCT/SE1996/001037 SE9601037W WO9707624A1 WO 1997007624 A1 WO1997007624 A1 WO 1997007624A1 SE 9601037 W SE9601037 W SE 9601037W WO 9707624 A1 WO9707624 A1 WO 9707624A1
Authority
WO
WIPO (PCT)
Prior art keywords
echo
signals
environment
source
signal
Prior art date
Application number
PCT/SE1996/001037
Other languages
French (fr)
Inventor
Ingvar Claesson
Mattias Dahl
Sven Nordebo
Original Assignee
Telefonaktiebolaget Lm Ericsson (Publ)
Priority date
Filing date
Publication date
Application filed by Telefonaktiebolaget Lm Ericsson (Publ)
Priority to AU68413/96A
Publication of WO1997007624A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M9/00 Arrangements for interconnection not involving centralised switching
    • H04M9/08 Two-way loud-speaking telephone systems with means for conditioning the signal, e.g. for suppressing echoes for one or both directions of traffic
    • H04M9/085 Two-way loud-speaking telephone systems with means for conditioning the signal, e.g. for suppressing echoes for one or both directions of traffic using digital techniques

Definitions

  • the present invention relates to echo suppression in a communications system, and more particularly to techniques that employ adaptive filtering to cancel or suppress echoes in a communications system, such as in telephony systems.
  • a communications system such as in telephony systems.
  • one problem that often needs to be addressed is the existence of echoes that can arise, for example, when a signal representing a talker's voice is received at a listener's station and then retransmitted back to the original talker. Because of delays introduced by the system, the talker will hear his or her own voice as an echo that occurs shortly after the words were actually spoken.
  • Circumstances that may give rise to such an echo include the existence of an acoustic coupling between the loudspeaker and the microphone at the listener's side of the connection.
  • This problem is particularly pronounced in "hands-free" communication equipment, such as a speakerphone, because the acoustic coupling between the loudspeaker and the microphone is particularly strong in such equipment.
  • hands-free mobile telephone equipment e.g., cellular telephone equipment
  • a cellular telephone's microphone might be mounted on the sun visor, while the loudspeaker may be a dash-mounted unit, or may alternatively be one that is associated with the car's stereo equipment.
  • FIG. 1a shows an apparatus, such as hands-free mobile telephony equipment, for providing two-way (full duplex) communication between a far end user (not shown) and a near end user.
  • the speech signal from the far end user, denoted by "far end speech" signal 101 in the figure, is transmitted towards the near end listener through an electronic channel.
  • the far end speech signal 101 is a digital signal that is supplied to a digital-to-analog (D/A) converter and amplifier 103.
  • the resulting amplified analog signal is then supplied to one or more loudspeakers which generate an acoustic signal 107 that propagates within, and is distorted by, the near-end acoustic environment.
  • loudspeaker 105 only one of the possibly plural loudspeakers (i.e., loudspeaker 105) is illustrated in the Figures and described throughout the remainder of this disclosure.
  • the invention should be considered to be applicable to situations in which more than one loudspeaker (or other type of echo source) is present.
  • a distorted version of the acoustic signal is thus received by a single microphone 111 that generates an analog signal that is amplified and converted into a digital signal by the microphone amplifier and analog-to-digital converter 113. If the resulting signal, denoted by “echo signal” 115 in the figure, is transmitted back to the far end speaker, he or she will, due to the time delay in the closed loop, perceive it as an annoying echo.
  • an echo canceler (EC) 117 is introduced into the system.
  • the echo canceler 117 is a filter that closely approximates the filtering that the near-end acoustic environment performs on the acoustic signal 107.
  • the echo canceler 117 is able to generate an echo replica 119 directly from the far end speech signal 101.
  • the echo replica 119 is thereafter subtracted from the echo signal 115, resulting in a residual echo signal 121 that is much less annoying for the far end speaker.
  • the filter characteristics (e.g., the tap weights in a finite impulse response, or FIR, filter) of the echo canceler 117 are dynamically adjusted by, for example, minimizing the energy of the residual echo signal 121, in order to adapt to changes in the near-end acoustic environment. Such changes may occur whenever the physical environment is altered, such as when a window or car door is opened or closed.
  • A more formal depiction of the acoustic echo problem is shown in FIG. 1b.
  • the near-end acoustic environment and the analog electronic components have been combined so that they may be illustrated as the non-linear function f(s(k)) 150, where k is a running discrete time index and s(k) denotes the far end speech.
  • the echo signal, denoted by e(k), can be written as $e(k) = f(s(k))$.
  • g(k) is generated as an output of a finite impulse response (FIR) filter, and is determined by the equation $g(k) = \sum_{t=1}^{N} g_t\, s(k-t)$.
  • $\{g_1, \ldots, g_N\}$ are filter weights, which may alternatively be fixed or dynamically changed to adapt to new conditions.
  • the echo replica is given by a weighted sum of past values of the far end speech.
  • N is typically on the order of 200-2000.
  • the filter weights are dynamically determined in accordance with any of a number of well-known algorithms, such as the Normalized Least Mean Squares algorithm, which operates by minimizing the energy of the residual echo signal r(k).
  • the conventional echo cancellation solution reduces the echo level by approximately 10-20 dB. This is far from sufficient for communication of high audible quality. Moreover, very long FIR filters are required to achieve even this level of echo cancellation.
  • something on the order of 4000 FIR taps would be required for certain speakerphone applications.
  • the conventional echo suppression techniques give imperfect performance as a result of the non-linearities in the echo path and in the analog electronics.
  • the speed of sound heavily depends on the absolute temperature.
  • the loudspeaker is well known to introduce different kinds of distortion.
  • an echo suppression apparatus comprising a plurality of sensors and processing means.
  • the plurality of sensors, which may be microphones, are disposed in an environment for receiving, at each of the sensors, signals, wherein each of the signals includes a component derived from an echo source in the environment.
  • the processing means utilizes spatial and/or temporal information obtained from the plurality of sensors in order to transform the received signals into a processed signal having a reduced component derived from the echo source.
  • the echo source may be a loudspeaker having input signals derived from an electronic signal
  • the echo suppression apparatus further comprises means for generating an echo replica from the electronic signal, wherein the echo replica approximates the reduced component derived from the echo source.
  • Means are also provided for subtracting the echo replica from the processed signal.
  • the processor and the echo replica generating means may be adaptive filters, such as finite impulse response (FIR) filters whose tap weights are dynamically determined.
  • FIR finite impulse response
  • methods for reducing the echo component include receiving signals at each of a plurality of sensors that are disposed in the environment. The received signals are then processed to reduce the echo component. The processing may include the use of spatial and/or temporal information to reduce the echo component.
  • FIGS. 1a and 1b are depictions of a prior art echo canceler for suppressing echoes in an apparatus for providing two-way communication between a far end speaker and a near end speaker;
  • FIGS. 2a and 2b are block diagrams of an echo canceler in accordance with one embodiment of the present invention
  • FIG. 3 is a detailed block diagram of a preferred pre-processor for use in the invention
  • FIG. 4 is a block diagram of a hardware configuration for adaptively determining tap weights for use in a pre-processor in accordance with the invention.
  • FIG. 2a A block diagram of an echo canceler in accordance with one embodiment of the present invention is shown in FIG. 2a.
  • the inventive echo canceler is based on the observation that the acoustic echo generally originates from a limited set of acoustic sources (e.g., one or a few loudspeakers).
  • acoustic sources e.g., one or a few loudspeakers
  • echo suppression can be achieved using information about the spatial localization of the sources.
  • Temporal filtering may also be applied to suppress the echo signal being generated by the several acoustic sensors, as will be described in greater detail below.
  • a far-end speech signal 101 is supplied to a D/A converter and amplifier 103 as described in the BACKGROUND section, in order to produce an analog signal that is in turn supplied to a loudspeaker 105.
  • the loudspeaker generates an acoustic signal which propagates through a number of acoustic echo paths 203 to generate a corresponding number of acoustic echo signals 205. Because of non-uniformities in the near-end environment, each one of the acoustic echo signals 205 is, in general, different in one or more ways (e.g., degree of distortion, delay, etc.) from each of the other acoustic echo signals 205.
  • a plurality of sensors such as the microphones 201, are disposed at different locations within the near end environment in order to be able to receive different ones of the acoustic echo signals 205.
  • the sensors need not be spaced to conform with any particular geometric pattern, since, in accordance with the invention, the filters (described below) will always perform a best least square compromise.
  • a total width (aperture) of about 0.3-0.6m has been found to be sufficient.
  • the output from each of the microphones 201 is fed to a corresponding one of a plurality of microphone amplifiers and A/D converters 207, whose digital outputs are each supplied to a corresponding input of a pre-processor 209.
  • the pre-processor 209 makes use of the spatial and temporal information that is inherent in the collection of acoustic echo signals 205 in order to assist in the suppression of the echo. For example, if the location of the loudspeaker is known, then the pre-processor 209 may be designed to filter out (or at least attenuate) all sounds coming from that direction, while allowing sounds from all other directions to pass.
  • the spatial information is utilized to distinguish between the acoustic echo signal (most of which emanates from the direction of the loudspeaker) and all other acoustic signals. Temporal information may also be utilized to filter out those sounds which most likely emanated from the loudspeaker.
  • the pre-processor generates a single pre-processed echo signal 215 that includes a wanted part (e.g., the near-end user's voice) in addition to having a reduced echo signal component.
  • the inventive echo canceler may also include a conventional echo canceler (EC) 211, which is preferably a finite impulse response (FIR) filter.
  • EC echo canceler
  • the tap weights of the FIR filter are preferably set so that the output of the EC 211 is an echo replica 213 that closely approximates the echo component of the pre-processed echo 215.
  • the tap weights are adjusted dynamically by means of well-known techniques such as the Normalized Least Mean Squares algorithm.
  • the echo replica 213 is then subtracted from the pre-processed echo signal 215, and the resulting residual echo signal 217 is transmitted to the far-end speaker (not shown).
  • FIG. 2b is a more formalized depiction of the inventive echo canceler.
  • the near-end acoustic environment and the analog electronic components (A/D and D/A convertors, amplifiers, etc.) associated with each of M acoustic echo paths 203 have been combined in order to allow their depiction as the non-linear functions $f_\ell(s(k))$ 150, where $1 \le \ell \le M$, k is a running discrete time index and s(k) denotes the far end speech.
  • the echo signals, denoted by $e_\ell(k)$, can be written as $e_\ell(k) = f_\ell(s(k))$.
  • the multi-channel measurements are then pre-processed and reduced to a single channel signal (denoted by h(k) below) by the pre-processor 209.
  • the pre-processor 209 preferably comprises a number, M, of FIR filters 309-1, ..., 309-M in one-to-one correspondence with the M microphone signals, as illustrated in FIG. 3.
  • the pre-processor 209 preferably performs both spatial and temporal filtering in accordance with techniques described in B.D. Van Veen and K.M. Buckley, "Beamforming: A Versatile Approach to Spatial Filtering", IEEE ASSP Magazine, pp. 4-24, April 1988, which is hereby incorporated herein by reference. Consequently, the signal h(k) preferably consists of a filtered sum of measurements, for example, with M filters of equal length N.
  • $e_\ell(k-p)$ is the echo measured by channel $\ell$ (that is, the digital signal after amplification and A/D conversion) at time k-p.
  • M denotes the number of measurement channels, and is typically in the range of from 1 to 8.
  • N denotes the number of filter taps in each of the pre-processor FIR filters 309-1, ..., 309-M. Typical values for N are in the range from 64 to 256.
  • h(k) consists of the temporally and spatially weighted sum of measurements. Note that it is preferable for M to be greater than 1, so that the spatial information associated with the echo may be utilized in the filtering process.
  • the signals from the M sensors are subjected to spatial processing to eliminate those sound components that derive from the direction of the loudspeaker 105, while passing all other sound components.
  • Temporal processing may also be applied to eliminate those sound components having frequency characteristics that match the frequency characteristics of sounds produced by the loudspeaker 105.
  • the pre-processor weights, indexed by channel $\ell$ and tap p, can be determined by different techniques for beamformer design. The weights are preferably fixed once they have been determined. Fixed beamformer weights can be obtained, for example, by using a least squares criterion to solve an overdetermined system of linear equations, as described at page 12 of the above-cited Van Veen and Buckley article.
  • the array response vectors can be calculated if an accurate mathematical model is available for describing the sound field, array geometry, amplifier characteristics and other pertinent factors. This is an extremely difficult task for a microphone array in an automobile compartment due to nearfield considerations, reflections, channel matching, and the like.
  • the response vectors can also be measured in the actual environment, but this is also a complicated matter, making it difficult to obtain accurate results.
  • a preferred method for obtaining fixed weights is to use an adaptive modelling technique as depicted in FIG. 4.
  • the use of adaptive algorithms also has the advantage of enabling one to more easily design a system that cancels out noise (e.g., road and tire noise) in addition to the echo.
  • the beamformer design is performed in the actual acoustic environment in which it will be used. All of the electronic equipment that will subsequently be used in the echo cancelling system should be disposed in the acoustic environment during the weight-determination phase of the design. This equipment includes, but is not limited to, the array of microphones 401-1, . . ., 401-M and the telephone's hands-free loudspeaker 405. For purposes of determining the weights, a second loudspeaker 425 is also disposed within the acoustic environment at a location that approximates that from which a telephone user's voice would originate.
  • a signal, $S_{speech}$, which simulates the voice of a typical user, is simultaneously fed to an adaptive algorithm as a desired signal, and to the second loudspeaker 425.
  • a typical jamming echo noise signal, $S_n$, is fed to the hands-free loudspeaker 405.
  • the weights of the pre-processor are then adapted to correct filter weights according to any suitable criterion, such as the Least Mean Squares criterion using the well-known Least Mean Squares algorithm.
  • suitable adaptive algorithms are described in the above-referenced Van Veen and Buckley article, beginning at page 17.
  • the pre-processor 209, which receives signals from a plurality of sensors (e.g., the microphones 201), may be used alone to cancel an echo.
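
The filtered-sum pre-processor and the adaptive weight design summarised in the list above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the weight layout `weights[channel][tap]`, the tap index running from 0 to N-1, the plain LMS update, and the function names `filter_and_sum`, `train_preprocessor` and `white_noise` are all hypothetical. The desired signal plays the role of the simulated user speech fed to the second loudspeaker, and the microphone signals contain both that speech and the jamming signal.

```python
import random


def white_noise(count, seed):
    """Hypothetical test signal: reproducible Gaussian white noise."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(count)]


def filter_and_sum(weights, taps):
    """h(k): sum over channels and taps of weight times delayed sample."""
    return sum(w * x
               for w_ch, x_ch in zip(weights, taps)
               for w, x in zip(w_ch, x_ch))


def train_preprocessor(mics, desired, num_taps=4, step=0.01):
    """Adapt one FIR filter per microphone with the LMS algorithm.

    mics:    list of M sample lists, one per microphone channel
    desired: the simulated near-end speech used as the desired signal
    Returns the final weights and the error signal over time.
    """
    num_channels = len(mics)
    weights = [[0.0] * num_taps for _ in range(num_channels)]
    taps = [[0.0] * num_taps for _ in range(num_channels)]
    errors = []
    for k, d_k in enumerate(desired):
        # Push the newest sample of every channel into its delay line.
        for ch in range(num_channels):
            taps[ch] = [mics[ch][k]] + taps[ch][:-1]
        h_k = filter_and_sum(weights, taps)
        err = d_k - h_k
        # LMS: move each weight along the negative error gradient.
        for ch in range(num_channels):
            weights[ch] = [w + step * err * x
                           for w, x in zip(weights[ch], taps[ch])]
        errors.append(err)
    return weights, errors
```

Once the error has settled, the weights would be frozen and used as the fixed pre-processor. The toy mixing used to exercise this sketch (speech plus jammer on one microphone, speech minus jammer on the other) is an idealisation; a real compartment adds delays and reflections that longer filters must absorb.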

Abstract

An echo suppression apparatus has a number of sensors, such as microphones, disposed in an environment. Each sensor receives a signal. Each of the signals includes a component derived from an echo source in the environment. The echo source may be one or more loudspeakers which, in combination with the microphones, are part of a hands-free communication device. The echo suppression apparatus further includes a processor for processing the received signals based on spatial and/or temporal information in order to transform the received signals into a processed signal having a reduced component derived from the echo source. In accordance with another aspect of the invention, the echo source is one or more loudspeakers having input signals derived from an electronic signal, and the echo suppression apparatus may further include an echo canceler component for generating an echo replica from the electronic signal, wherein the echo replica approximates the reduced component derived from the echo source. The echo replica is then subtracted from the processed signal in order to further reduce the amount of echo that is perceived by a far-end user. The processor and the echo canceler component may, for example, each be adaptive finite impulse response (FIR) filters whose tap weights are dynamically determined.

Description

ECHO CANCELLING USING SIGNAL PREPROCESSING IN AN ACOUSTIC ENVIRONMENT
BACKGROUND
The present invention relates to echo suppression in a communications system, and more particularly to techniques that employ adaptive filtering to cancel or suppress echoes in a communications system, such as in telephony systems. In conventional land-based as well as in mobile communications systems, one problem that often needs to be addressed is the existence of echoes that can arise, for example, when a signal representing a talker's voice is received at a listener's station and then retransmitted back to the original talker. Because of delays introduced by the system, the talker will hear his or her own voice as an echo that occurs shortly after the words were actually spoken. Circumstances that may give rise to such an echo include the existence of an acoustic coupling between the loudspeaker and the microphone at the listener's side of the connection. This problem is particularly pronounced in "hands-free" communication equipment, such as a speakerphone, because the acoustic coupling between the loudspeaker and the microphone is particularly strong in such equipment.
The problem must also be addressed where hands-free mobile telephone equipment (e.g., cellular telephone equipment) is utilized. In an automobile, for example, a cellular telephone's microphone might be mounted on the sun visor, while the loudspeaker may be a dash-mounted unit, or may alternatively be one that is associated with the car's stereo equipment. With these components mounted in this fashion, a cellular phone user may carry on a conversation without having to hold a cellular unit or its handset. However, sound from the loudspeaker may also be picked up by the microphone, thereby returning the far end speaker's own voice to him or her after some delay.
The echo problem will now be described in greater detail with reference to FIG. 1a, which shows an apparatus, such as hands-free mobile telephony equipment, for providing two-way (full duplex) communication between a far end user (not shown) and a near end user. The speech signal from the far end user, denoted by "far end speech" signal 101 in the figure, is transmitted towards the near end listener through an electronic channel. The far end speech signal 101 is a digital signal that is supplied to a digital-to-analog (D/A) converter and amplifier 103. The resulting amplified analog signal is then supplied to one or more loudspeakers which generate an acoustic signal 107 that propagates within, and is distorted by, the near-end acoustic environment. For simplicity of explanation, only one of the possibly plural loudspeakers (i.e., loudspeaker 105) is illustrated in the Figures and described throughout the remainder of this disclosure. However, the invention should be considered to be applicable to situations in which more than one loudspeaker (or other type of echo source) is present.
Continuing with the discussion, a distorted version of the acoustic signal, denoted by "acoustic echo" 109 in the figure, is thus received by a single microphone 111 that generates an analog signal that is amplified and converted into a digital signal by the microphone amplifier and analog-to-digital converter 113. If the resulting signal, denoted by "echo signal" 115 in the figure, is transmitted back to the far end speaker, he or she will, due to the time delay in the closed loop, perceive it as an annoying echo.
In order to suppress the echo signal 115, an echo canceler (EC) 117 is introduced into the system. In conventional systems, the echo canceler 117 is a filter that closely approximates the filtering that the near-end acoustic environment performs on the acoustic signal 107. In this way, the echo canceler 117 is able to generate an echo replica 119 directly from the far end speech signal 101. The echo replica 119 is thereafter subtracted from the echo signal 115, resulting in a residual echo signal 121 that is much less annoying for the far end speaker. Typically, the filter characteristics (e.g., the tap weights in a finite impulse response, or FIR, filter) of the echo canceler 117 are dynamically adjusted by, for example, minimizing the energy of the residual echo signal 121, in order to adapt to changes in the near-end acoustic environment. Such changes may occur whenever the physical environment is altered, such as when a window or car door is opened or closed.
A more formal depiction of the acoustic echo problem is shown in FIG. 1b. In this figure, the near-end acoustic environment and the analog electronic components (A/D and D/A convertors, amplifiers, etc.) have been combined so that they may be illustrated as the non-linear function f(s(k)) 150, where k is a running discrete time index and s(k) denotes the far end speech. Thus, the echo signal, denoted by e(k), can be written as
$$e(k) = f(s(k)) \tag{1}$$
Denoting the output from the echo canceler 117 (i.e., the echo replica) by g(k), the residual echo, r(k), is given by
$$r(k) = e(k) - g(k) \tag{2}$$
In almost every practical echo canceler implementation, g(k) is generated as an output of a finite impulse response (FIR) filter, and is determined by the equation
$$g(k) = \sum_{t=1}^{N} g_t\, s(k-t) \tag{3}$$
where $\{g_1, \ldots, g_N\}$ are filter weights, which may alternatively be fixed or dynamically changed to adapt to new conditions. Thus, the echo replica is given by a weighted sum of past values of the far end speech. In telephony applications, N is typically on the order of 200-2000. In order to make the echo canceler 117 adaptive, the filter weights are dynamically determined in accordance with any of a number of well-known algorithms, such as the Normalized Least Mean Squares algorithm, which operates by minimizing the energy of the residual echo signal r(k).
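
To make equations (2) and (3) concrete, the following is a minimal sketch of an FIR echo canceler adapted with a Normalized Least Mean Squares update. It is an illustration under assumptions, not the implementation of any particular product: the echo path is simulated as a short linear filter (the text above stresses that the real path is non-linear), and the names `simulate_echo`, `nlms_echo_canceler`, `white_noise`, `step` and `eps` are hypothetical.

```python
import random


def white_noise(count, seed):
    """Hypothetical stand-in for far-end speech: Gaussian white noise."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(count)]


def simulate_echo(far_end, path):
    """Assumed linear echo path: e(k) = sum_t path[t-1] * s(k-t)."""
    history = [0.0] * len(path)      # s(k-1) ... s(k-N)
    echo = []
    for s_k in far_end:
        echo.append(sum(h * x for h, x in zip(path, history)))
        history = [s_k] + history[:-1]
    return echo


def nlms_echo_canceler(far_end, mic, num_taps=8, step=0.5, eps=1e-8):
    """Return (residual, weights) after adapting over the whole signal."""
    weights = [0.0] * num_taps       # g_1 ... g_N
    history = [0.0] * num_taps       # s(k-1) ... s(k-N)
    residual = []
    for s_k, e_k in zip(far_end, mic):
        # Echo replica g(k): weighted sum of past far-end samples, eq. (3).
        replica = sum(w * x for w, x in zip(weights, history))
        r_k = e_k - replica          # residual echo, eq. (2)
        # NLMS: gradient step normalized by the energy in the delay line.
        energy = sum(x * x for x in history) + eps
        gain = step * r_k / energy
        weights = [w + gain * x for w, x in zip(weights, history)]
        residual.append(r_k)
        history = [s_k] + history[:-1]
    return residual, weights
```

With a noise-free, linear simulated path the weights settle on the path coefficients and the residual decays towards zero. A real acoustic path would leave a larger residual, which is the shortcoming the text goes on to describe.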
For acoustic echoes originating from hands-free operation of a mobile telephone, the conventional echo cancellation solution reduces the echo level by approximately 10-20 dB. This is far from sufficient for communication of high audible quality. Moreover, very long FIR filters are required to achieve even this level of echo cancellation. As an example, something on the order of 4000 FIR taps would be required for certain speakerphone applications. In general, the conventional echo suppression techniques give imperfect performance as a result of the non-linearities in the echo path and in the analog electronics. For example, the speed of sound heavily depends on the absolute temperature. In addition, the loudspeaker is well known to introduce different kinds of distortion.
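
For a sense of scale, a reduction of 10-20 dB corresponds to the residual carrying between one tenth and one hundredth of the original echo energy. The short sketch below shows one common way to compute such a figure from sampled signals; this energy ratio in decibels is often called echo return loss enhancement (ERLE), a term the text itself does not use, and the function names are hypothetical.

```python
import math


def db_to_energy_ratio(db):
    """Convert a level difference in dB to an energy (power) ratio."""
    return 10.0 ** (db / 10.0)


def echo_reduction_db(echo, residual):
    """Echo reduction in dB: 10*log10(echo energy / residual energy)."""
    echo_energy = sum(x * x for x in echo)
    residual_energy = sum(x * x for x in residual)
    if residual_energy == 0.0:
        raise ValueError("residual energy is zero; reduction is unbounded")
    return 10.0 * math.log10(echo_energy / residual_energy)
```

For example, a residual whose amplitude is one tenth of the echo amplitude corresponds to a reduction of about 20 dB.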
The above description is an overview of prior art echo cancellation techniques. In practice, more sophisticated algorithms are used. Such algorithms may include features like voice switching and pole-zero modeling (that is, the use of infinite impulse response filters). A more complete description of prior art echo cancelers may be found in M.M. Sondhi & W. Kellermann, "Adaptive echo cancellation for speech signals", Advances in speech signal processing, (S. Furui & M.M. Sondhi eds. 1992), New York. Despite the increased sophistication, however, all of the above-described prior art methods for acoustic echo cancellation lead to a residual echo having one or more severe artifacts, such as an insufficient reduction of the echo level, and introduction of artifacts in the residual echo or in the speech of the near end speaker. Furthermore, some of these conventional techniques involve algorithms that exhibit slow convergence and computational problems, and entail high computational complexity.
SUMMARY
It is therefore an object of the present invention to provide an echo suppression technique and apparatus that gives improved performance over conventional techniques. It is another object of the invention to provide an echo canceler that avoids distortion of the speech from the near end speaker.
It is yet another object of the invention to provide an echo canceler that produces a remaining residual echo without annoying artifacts.
It is still another object of the invention to provide an echo canceler having a computational complexity on the order of that of conventional echo cancellation algorithms. It is yet another object of the invention to provide an echo canceler that avoids slow convergence and high computational complexity.
In accordance with one aspect of the present invention, the foregoing and other objects are achieved in an echo suppression apparatus comprising a plurality of sensors and processing means. The plurality of sensors, which may be microphones, are disposed in an environment for receiving, at each of the sensors, signals, wherein each of the signals includes a component derived from an echo source in the environment. The processing means utilizes spatial and/or temporal information obtained from the plurality of sensors in order to transform the received signals into a processed signal having a reduced component derived from the echo source.
In accordance with another aspect of the invention, the echo source may be a loudspeaker having input signals derived from an electronic signal, and the echo suppression apparatus further comprises means for generating an echo replica from the electronic signal, wherein the echo replica approximates the reduced component derived from the echo source. Means are also provided for subtracting the echo replica from the processed signal.
In yet other aspects of the invention, the processor and the echo replica generating means may be adaptive filters, such as finite impulse response (FIR) filters whose tap weights are dynamically determined.
In still other aspects of the invention, methods for reducing the echo component are also disclosed, which methods include receiving signals at each of a plurality of sensors that are disposed in the environment. The received signals are then processed to reduce the echo component. The processing may include the use of spatial and/or temporal information to reduce the echo component.
BRIEF DESCRIPTION OF THE DRAWINGS
The objects and advantages of the invention will be understood by reading the following detailed description in conjunction with the drawings in which:
FIGS. 1a and 1b are depictions of a prior art echo canceler for suppressing echoes in an apparatus for providing two-way communication between a far end speaker and a near end speaker;
FIGS. 2a and 2b are block diagrams of an echo canceler in accordance with one embodiment of the present invention;
FIG. 3 is a detailed block diagram of a preferred pre-processor for use in the invention; and
FIG. 4 is a block diagram of a hardware configuration for adaptively determining tap weights for use in a pre-processor in accordance with the invention.
DETAILED DESCRIPTION
The various features of the invention will now be described with respect to the figures, in which like parts are identified with the same reference characters. In the following discussion, the invention is described in the context of reducing echoes from an acoustic environment. However, this context is intended to be merely illustrative. Those having ordinary skill in the art will readily be able to adapt inventive principles for the purpose of reducing non-acoustic echoes as well.
A block diagram of an echo canceler in accordance with one embodiment of the present invention is shown in FIG. 2a. The inventive echo canceler is based on the observation that the acoustic echo generally originates from a limited set of acoustic sources (e.g., one or a few loudspeakers). Thus, by using several acoustic sensors (e.g., several microphones), echo suppression can be achieved using information about the spatial localization of the sources. Temporal filtering may also be applied to suppress the echo signal being generated by the several acoustic sensors, as will be described in greater detail below.
As shown in FIG. 2a, a far-end speech signal 101 is supplied to a D/A converter and amplifier 103 as described in the BACKGROUND section, in order to produce an analog signal that is in turn supplied to a loudspeaker 105. The loudspeaker generates an acoustic signal which propagates through a number of acoustic echo paths 203 to generate a corresponding number of acoustic echo signals 205. Because of non-uniformities in the near-end environment, each one of the acoustic echo signals 205 is, in general, different in one or more ways (e.g., degree of distortion, delay, etc.) from each of the other acoustic echo signals 205.
In accordance with one aspect of the invention, a plurality of sensors, such as the microphones 201, are disposed at different locations within the near end environment in order to be able to receive different ones of the acoustic echo signals 205. The sensors need not be spaced to conform with any particular geometric pattern, since, in accordance with the invention, the filters (described below) will always perform a best least-squares compromise. In practice, a total width (aperture) of about 0.3-0.6 m has been found to be sufficient. When applied to hands-free telephone equipment in an automobile, there is no need to sample the car interior more densely than once every 0.1 m.
The output from each of the microphones 201 is fed to a corresponding one of a plurality of microphone amplifiers and A/D converters 207, whose digital outputs are each supplied to a corresponding input of a pre-processor 209. As will be explained in greater detail below, the pre-processor 209 makes use of the spatial and temporal information that is inherent in the collection of acoustic echo signals 205 in order to assist in the suppression of the echo. For example, if the location of the loudspeaker is known, then the pre-processor 209 may be designed to filter out (or at least attenuate) all sounds coming from that direction, while allowing sounds from all other directions to pass. That is, the spatial information is utilized to distinguish between the acoustic echo signal (most of which emanates from the direction of the loudspeaker) and all other acoustic signals. Temporal information may also be utilized to filter out those sounds which most likely emanated from the loudspeaker.
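The idea of attenuating sound from a known loudspeaker direction can be illustrated with a deliberately idealised two-microphone sketch. It is not the pre-processor of this disclosure: it assumes free-field propagation, an echo that reaches the second microphone exactly `delay` samples after the first, and a talker positioned so that the voice reaches both microphones simultaneously. The function name `steer_null` and all signal names are hypothetical.

```python
import random

def steer_null(x1, x2, delay):
    """Delay-and-subtract: cancel a source that reaches mic 2 `delay` samples after mic 1."""
    out = []
    for k in range(len(x2)):
        delayed = x1[k - delay] if k >= delay else 0.0
        out.append(x2[k] - delayed)
    return out

random.seed(0)
n, delay = 200, 3
echo = [random.uniform(-1, 1) for _ in range(n)]    # sound from the loudspeaker
voice = [random.uniform(-1, 1) for _ in range(n)]   # sound from the near-end talker

# Idealised propagation (assumption): the echo hits mic 1 first and mic 2
# `delay` samples later; the talker's voice hits both microphones at once.
x1 = [echo[k] + voice[k] for k in range(n)]
x2 = [(echo[k - delay] if k >= delay else 0.0) + voice[k] for k in range(n)]

# The echo term cancels; what remains depends only on the talker's voice.
y = steer_null(x1, x2, delay)
```

In this toy setting the output contains no echo at all, while the voice passes through in a comb-filtered form (`voice[k] - voice[k - delay]`). Real rooms and car interiors have reflections and near-field effects, which is why the text relies on multi-tap filters per channel rather than a single delay.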
The pre-processor generates a single pre-processed echo signal 215 that includes a wanted part (e.g., the near-end user's voice) in addition to having a reduced echo signal component. In order to further reduce the echo component of the pre-processed echo signal 215, the inventive echo canceler may also include a conventional echo canceler (EC) 211, which is preferably a finite impulse response (FIR) filter. The tap weights of the FIR filter are preferably set so that the output of the EC 211 is an echo replica 213 that closely approximates the echo component of the pre-processed echo signal 215. In a preferred embodiment of the invention, the tap weights are adjusted dynamically by means of well-known techniques such as the Normalized Least Mean Squares algorithm. The echo replica 213 is then subtracted from the pre-processed echo signal 215, and the resulting residual echo signal 217 is transmitted to the far-end speaker (not shown).

The various aspects of the invention will now be described in greater detail with reference to FIG. 2b, which is a more formalized depiction of the inventive echo canceler. In the figure, the near-end acoustic environment and the analog electronic components (A/D and D/A converters, amplifiers, etc.) associated with each of M acoustic echo paths 203 have been combined in order to allow their depiction as the non-linear functions $f_\ell(s(k))$ 150, where $1 \le \ell \le M$, $k$ is a running discrete time index and $s(k)$ denotes the far-end speech. Thus, the echo signals, denoted by $e_\ell(k)$, can be written as

$$e_\ell(k) = f_\ell(s(k)), \quad \ell = 1, \ldots, M \qquad (4)$$
The multi-channel measurements are then pre-processed and reduced to a single channel signal (denoted by $h(k)$ below) by the pre-processor 209. The pre-processor 209 preferably comprises a number, $M$, of FIR filters 309-1, ..., 309-M in one-to-one correspondence with the $M$ microphone signals, as illustrated in FIG. 3. Each of the $M$ FIR filters 309-1, ..., 309-M has $N_\ell$ filter taps, where $\ell = 1, \ldots, M$. Thus, it is not a requirement that all of the $M$ FIR filters 309-1, ..., 309-M have the same number of taps. The pre-processor 209 preferably performs both spatial and temporal filtering in accordance with techniques described in B.D. Van Veen and K.M. Buckley, "Beamforming: A Versatile Approach to Spatial Filtering", IEEE ASSP Magazine, pp. 4-24, April 1988, which is hereby incorporated herein by reference. Consequently, the signal $h(k)$ preferably consists of a filtered sum of measurements, for example, with $M$ filters of equal length $N$,
$$h(k) = \sum_{\ell=1}^{M} \sum_{p=1}^{N} \omega_{\ell p}\, e_\ell(k-p) \qquad (5)$$
where $e_\ell(k-p)$ is the echo measured by channel $\ell$ (that is, the digital signal after amplification and A/D conversion) at time $k-p$. As indicated above, $M$ denotes the number of measurement channels, and is typically in the range from 1 to 8. Also, $N$ denotes the number of filter taps in each of the pre-processor FIR filters 309-1, ..., 309-M. Typical values for $N$ are in the range from 64 to 256. Thus, in a preferred embodiment of the invention, $h(k)$ consists of the temporally and spatially weighted sum of measurements. Note that it is preferable for $M$ to be greater than 1, so that the spatial information associated with the echo may be utilized in the filtering process. That is, the signals from the $M$ sensors are subjected to spatial processing to eliminate those sound components that derive from the direction of the loudspeaker 105, while passing all other sound components. Temporal processing may also be applied to eliminate those sound components having frequency characteristics that match the frequency characteristics of sounds produced by the loudspeaker 105.

The weights $\omega_{\ell p}$ can be determined by any of a number of beamformer design techniques, and are preferably fixed once they have been determined. Fixed beamformer weights can be obtained, for example, by using a least squares criterion to solve an overdetermined system of linear equations, as described at page 12 of the above-cited Van Veen and Buckley article. However, this approach requires the determination of the array response vectors over a dense set of spatial-frequency points. The array response vectors can be calculated if an accurate mathematical model is available for describing the sound field, array geometry, amplifier characteristics and other pertinent factors. This is an extremely difficult task for a microphone array in an automobile compartment due to nearfield considerations, reflections, channel matching, and the like. The response vectors can also be measured in the actual environment, but this is also a complicated matter, making it difficult to obtain accurate results.
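The filtered sum of equation (5) can be written out directly. The following is a minimal sketch, assuming all channels share the same number of taps and that samples before time zero are treated as zero; the function name `preprocess` and the toy numbers are illustrative only and do not come from the disclosure.

```python
def preprocess(channels, weights, k):
    """Evaluate h(k) = sum_l sum_{p=1..N} w[l][p-1] * e_l(k - p).

    channels[l] is the sampled signal of measurement channel l and
    weights[l][p-1] is the fixed tap weight for channel l and lag p.
    Samples before time 0 are taken as zero (an assumption of this sketch).
    """
    h = 0.0
    for e_l, w_l in zip(channels, weights):
        for p, w in enumerate(w_l, start=1):
            if k - p >= 0:
                h += w * e_l[k - p]
    return h

# Toy case: M = 2 channels, N = 2 taps per channel.
channels = [[1.0, 2.0, 3.0, 4.0], [0.5, 0.5, 0.5, 0.5]]
weights = [[1.0, -1.0], [2.0, 0.0]]

# h(3) = 1*3 + (-1)*2 + 2*0.5 + 0*0.5
h3 = preprocess(channels, weights, 3)
```

Note that the inner sum starts at lag 1, as in equation (5), so the output at time `k` depends only on earlier samples of each channel.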
A preferred method for obtaining fixed weights is to use an adaptive modelling technique as depicted in FIG. 4. In addition to being easier to work with, the use of adaptive algorithms also has the advantage of enabling one to more easily design a system that cancels out noise (e.g., road and tire noise) in addition to the echo.
The beamformer design is performed in the actual acoustic environment in which it will be used. All of the electronic equipment that will subsequently be used in the echo cancelling system should be disposed in the acoustic environment during the weight-determination phase of the design. This equipment includes, but is not limited to, the array of microphones 401-1, . . ., 401-M and the telephone's hands-free loudspeaker 405. For purposes of determining the weights, a second loudspeaker 425 is also disposed within the acoustic environment at a location that approximates that from which a telephone user's voice would originate.
A signal, $S_{speech}$, which simulates the voice of a typical user, is simultaneously fed to an adaptive algorithm as a desired signal, and to the second loudspeaker 425. At the same time, a typical jamming echo noise signal, $S_n$, is fed to the hands-free loudspeaker 405. The weights of the pre-processor are then adapted toward the correct filter weights according to any suitable criterion, such as the least mean squares criterion using the well-known Least Mean Squares (LMS) algorithm. Alternatively, suitable adaptive algorithms are described in the above-referenced Van Veen and Buckley article, beginning at page 17.
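The weight-determination procedure can be sketched with a plain LMS update. This is an illustration under strong assumptions rather than the procedure of FIG. 4 itself: the acoustic paths are modelled as hypothetical memoryless gains, white noise stands in for both the simulated voice and the jamming signal, and the desired signal is delayed by one sample because equation (5) uses only past samples. The name `lms_design` and the step size are illustrative choices.

```python
import random

def lms_design(mics, desired, n_taps, mu):
    """Adapt weights w[l][p-1] so the filtered sum of equation (5) tracks `desired`."""
    w = [[0.0] * n_taps for _ in mics]
    for k in range(n_taps, len(desired)):
        # h(k): sum over channels l and lags p of w[l][p-1] * x_l(k - p)
        h = sum(w[l][p - 1] * mics[l][k - p]
                for l in range(len(mics)) for p in range(1, n_taps + 1))
        err = desired[k] - h
        # LMS step: move each weight along its input sample, scaled by the error
        for l in range(len(mics)):
            for p in range(1, n_taps + 1):
                w[l][p - 1] += mu * err * mics[l][k - p]
    return w

random.seed(1)
n = 3000
speech = [random.uniform(-1, 1) for _ in range(n)]  # stands in for the simulated voice
jam = [random.uniform(-1, 1) for _ in range(n)]     # stands in for the jamming echo signal

# Hypothetical memoryless paths, chosen so that an exact solution exists.
mic1 = [speech[k] + jam[k] for k in range(n)]
mic2 = [0.5 * speech[k] - jam[k] for k in range(n)]
desired = [0.0] + speech[:-1]   # voice delayed by one sample

weights = lms_design([mic1, mic2], desired, n_taps=2, mu=0.05)
```

For these toy paths the jamming signal cancels when both first-lag weights are equal, and the voice is passed with unit gain when they are both 2/3; the second-lag weights are not needed and settle near zero. Once adaptation has settled, the weights would be frozen and used as the fixed pre-processor weights.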
Referring back now to FIG. 2a, the pre-processor 209, which receives signals from a plurality of sensors (e.g., the microphones 201), may be used alone to cancel an echo. However, in accordance with another aspect of the invention, the pre-processed echo signal $h(k)$ 215 is then preferably used in order to form the residual echo signal $r(k)$ 217 as

$$r(k) = h(k) - g(k) \qquad (6)$$

where $g(k)$ is the output from a conventional echo canceler 211 as previously described.
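Equation (6), combined with an adaptive FIR echo canceler, can be sketched as follows. This is a minimal single-talk illustration that assumes the echo remaining after pre-processing is a short linear FIR function of the far-end signal, with no near-end speech or noise present; the echo path coefficients, the function name `nlms_cancel` and the parameter values are hypothetical.

```python
import random

def nlms_cancel(far_end, h, n_taps, mu=0.5, eps=1e-8):
    """Return (residual, taps), where residual[k] = h(k) - g(k) and g(k) is an
    FIR echo replica whose taps are adapted by Normalized LMS."""
    taps = [0.0] * n_taps
    residual = []
    for k in range(len(h)):
        # Most recent n_taps far-end samples, newest first (zeros before time 0)
        x = [far_end[k - i] if k - i >= 0 else 0.0 for i in range(n_taps)]
        g = sum(t * xi for t, xi in zip(taps, x))   # echo replica g(k)
        r = h[k] - g                                 # residual, equation (6)
        norm = sum(xi * xi for xi in x) + eps        # input power normalisation
        taps = [t + mu * r * xi / norm for t, xi in zip(taps, x)]
        residual.append(r)
    return residual, taps

random.seed(2)
n = 2000
far_end = [random.uniform(-1, 1) for _ in range(n)]
echo_path = [0.5, -0.3, 0.2]   # hypothetical echo path left after pre-processing
h = [sum(c * far_end[k - i] for i, c in enumerate(echo_path) if k - i >= 0)
     for k in range(n)]

residual, taps = nlms_cancel(far_end, h, n_taps=4)
```

In this noise-free toy case the taps approach the assumed echo path and the residual decays toward zero. With near-end speech present, adaptation would normally need to be slowed or paused during double talk, which this sketch does not attempt.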
The invention has been described with reference to a particular embodiment. However, it will be readily apparent to those skilled in the art that it is possible to embody the invention in specific forms other than those of the preferred embodiment described above. This may be done without departing from the spirit of the invention. For example, application of the invention is not limited only to hands-free telephone equipment in an automobile. Instead, the invention may be applied to cancel echoes in many other situations. An incomplete list of examples includes eliminating echoes that arise in connection with domestic speakerphones, teleconference studios and spoken-commanded computers. Moreover, the use of multiple sensors and pre-processing as taught in this disclosure is not limited to the cancellation of only acoustic echoes. Rather, it may be applied as well in situations where it is desired to remove any type of echo signal, such as those existing in the form of electromagnetic radiation. All that is required is that the sensors be appropriate for the type of signals to be detected.
Thus, the preferred embodiment is merely illustrative and should not be considered restrictive in any way. The scope of the invention is given by the appended claims, rather than the preceding description, and all variations and equivalents which fall within the range of the claims are intended to be embraced therein.

Claims

WHAT IS CLAIMED IS:
1. An echo suppression apparatus comprising: a plurality of sensors disposed in an environment for receiving signals at each of the sensors, wherein each of the signals includes a component derived from an echo source in the environment; and means for processing the received signals in order to transform the received signals into a processed signal having a reduced component derived from the echo source.
2. The echo suppression apparatus of claim 1, wherein the processing means includes means for using spatial information to process the received signals.
3. The echo suppression apparatus of claim 1, wherein the processing means includes means for using temporal information to process the received signals.
4. The echo suppression apparatus of claim 3, wherein the processing means further includes means for using spatial information to process the received signals.
5. The echo suppression apparatus of claim 1, wherein the environment is an acoustic environment, and wherein the signals are acoustic signals.
6. The echo suppression apparatus of claim 5, wherein: the acoustic environment is a passenger compartment of an automobile; each of the sensors is a microphone belonging to hands-free communications equipment; and the echo source is a loudspeaker belonging to the hands-free communications equipment.
7. The echo suppression apparatus of claim 1, wherein: the echo source is a loudspeaker having input signals derived from an electronic signal; and the echo suppression apparatus further comprises: means for generating an echo replica from the electronic signal, wherein the echo replica approximates the reduced component derived from the echo source; and means for subtracting the echo replica from the processed signal.
8. The echo suppression apparatus of claim 7, wherein the echo replica generating means is an adaptive filter.
9. A method for suppressing an echo component in signals that are received from an environment, the method comprising the steps of: receiving the signals at each of a plurality of sensors disposed in the environment, wherein at least one of the signals includes a component derived from an echo source in the environment; and processing the received signals in order to transform the received signals into a processed signal having a reduced component derived from the echo source.
10. The method of claim 9, wherein the step of processing includes using spatial information to process the received signals.
11. The method of claim 9, wherein the step of processing includes using temporal information to process the received signals.
12. The method of claim 11, wherein the step of processing further includes using spatial information to process the received signals.
13. The method of claim 9, wherein the environment is an acoustic environment, and wherein the signals are acoustic signals.
14. The method of claim 13, wherein: the acoustic environment is a passenger compartment of an automobile; each of the sensors is a microphone belonging to hands-free communications equipment; and the echo source is a loudspeaker belonging to the hands-free communications equipment.
15. The method of claim 9, wherein: the echo source is a loudspeaker having input signals derived from an electronic signal; and the method further comprises the steps of: generating an echo replica from the electronic signal, wherein the echo replica approximates the reduced component derived from the echo source; and subtracting the echo replica from the processed signal.
16. The method of claim 15, wherein the step of generating the echo replica comprises using an adaptive filter to generate the echo replica.
PCT/SE1996/001037 1995-08-21 1996-08-21 Echo cancelling using signal preprocessing in an acoustic environment WO1997007624A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU68413/96A AU6841396A (en) 1995-08-21 1996-08-21 Echo cancelling using signal preprocessing in an acoustic environment

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US51747395A 1995-08-21 1995-08-21
US08/517,473 1995-08-21

Publications (1)

Publication Number Publication Date
WO1997007624A1 true WO1997007624A1 (en) 1997-02-27

Family

ID=24059953

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/SE1996/001037 WO1997007624A1 (en) 1995-08-21 1996-08-21 Echo cancelling using signal preprocessing in an acoustic environment

Country Status (2)

Country Link
AU (1) AU6841396A (en)
WO (1) WO1997007624A1 (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2001093554A2 (en) * 2000-05-26 2001-12-06 Koninklijke Philips Electronics N.V. Method and device for acoustic echo cancellation combined with adaptive beamforming
US6707910B1 (en) 1997-09-04 2004-03-16 Nokia Mobile Phones Ltd. Detection of the speech activity of a source
EP1542444A2 (en) * 2003-12-12 2005-06-15 Motorola, Inc. An echo canceler circuit and method
US7599483B2 (en) 2003-12-12 2009-10-06 Temic Automotive Of North America, Inc. Echo canceler circuit and method
WO2013086353A1 (en) * 2011-12-07 2013-06-13 Ndt Technologies Magnetic inspection device and method

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US3922488A (en) * 1972-12-15 1975-11-25 Ard Anstalt Feedback-cancelling electro-acoustic transducer apparatus
US5121426A (en) * 1989-12-22 1992-06-09 At&T Bell Laboratories Loudspeaking telephone station including directional microphone

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6707910B1 (en) 1997-09-04 2004-03-16 Nokia Mobile Phones Ltd. Detection of the speech activity of a source
WO2001093554A2 (en) * 2000-05-26 2001-12-06 Koninklijke Philips Electronics N.V. Method and device for acoustic echo cancellation combined with adaptive beamforming
WO2001093554A3 (en) * 2000-05-26 2002-10-03 Koninkl Philips Electronics Nv Method and device for acoustic echo cancellation combined with adaptive beamforming
EP1542444A2 (en) * 2003-12-12 2005-06-15 Motorola, Inc. An echo canceler circuit and method
EP1542444A3 (en) * 2003-12-12 2007-10-31 Motorola, Inc. An echo canceler circuit and method
US7599483B2 (en) 2003-12-12 2009-10-06 Temic Automotive Of North America, Inc. Echo canceler circuit and method
US7680265B2 (en) 2003-12-12 2010-03-16 Continental Automotive Systems, Inc. Echo canceler circuit and method
US8238546B2 (en) 2003-12-12 2012-08-07 Continental Automotive Systems, Inc. Echo canceler circuit and method
WO2013086353A1 (en) * 2011-12-07 2013-06-13 Ndt Technologies Magnetic inspection device and method
US9103798B2 (en) 2011-12-07 2015-08-11 Ndt Technologies, Inc. Magnetic inspection device and method

Also Published As

Publication number Publication date
AU6841396A (en) 1997-03-12

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AL AM AT AU AZ BB BG BR BY CA CH CN CU CZ DE DK EE ES FI GB GE HU IL IS JP KE KG KP KR KZ LK LR LS LT LU LV MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK TJ TM TR TT UA UG UZ VN AM AZ BY KG KZ MD RU TJ TM

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): KE LS MW SD SZ UG AT BE CH DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA

DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
121 Ep: the epo has been informed by wipo that ep was designated in this application
REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

122 Ep: pct application non-entry in european phase
NENP Non-entry into the national phase

Ref country code: CA