EP0918317A1

EP0918317A1 - Frequency filtering method using a Wiener filter applied to noise reduction of audio signals

Info

Publication number: EP0918317A1
Application number: EP98402894A
Authority: EP
Inventors: Dominique Pastor; Gérard Reynaud; Pierre-Albert Breton
Original assignee: Thales Avionics SAS
Current assignee: Thales Avionics SAS
Priority date: 1997-11-21
Filing date: 1998-11-20
Publication date: 1999-05-26
Anticipated expiration: 2018-11-20
Also published as: DE69817507D1; US6445801B1; FR2771542B1; EP0918317B1; FR2771542A1; JPH11265198A

Abstract

The noisy signal is converted into the frequency domain by Fourier analysis (1). A noise model is built (2), and its spectral density, its energy level and a coefficient reflecting its statistical dispersion are estimated. In parallel, similar estimates are made for the noisy signal (3). The two results are then combined in a Wiener filter (4) that applies energy compensation and noise overestimation. The signal is then reconstructed (5).

Description

The present invention relates to a frequency filtering method using a Wiener filter.

It applies in particular, although not exclusively, to the denoising of sound signals containing speech picked up in noisy environments and, more generally, to the denoising of any sound signal.

The main fields concerned are telephone and radiotelephone communications, speech recognition, sound pickup on board civil or military aircraft and, more generally, on board any noisy vehicle, on-board intercoms, etc.

By way of nonlimiting example, in the case of an aircraft, the noise comes from the engines, the air conditioning, the ventilation of on-board equipment, or aerodynamic noise. All of this noise is picked up, at least partially, by the microphone into which the pilot or another crew member speaks. Moreover, for this type of application in particular, one characteristic of the noise is that it varies greatly over time: it depends strongly on the operating regime of the engines (take-off phase, steady state, etc.). The useful signals, that is, the signals representing conversations, also have a particular feature: they are most often of short duration.

Finally, whatever the application envisaged, certain particular features emerge when one considers "voicing". As is known, voicing relates to elementary characteristics of segments of speech, and more precisely to the vowels and to some of the consonants: "b", "d", "g", "j", etc. These letters are characterized by an audio signal with a pseudo-periodic structure.

In speech processing, it is common to consider that stationary regimes, in particular the voicing mentioned above, are established over durations of between 10 and 20 ms. This time interval is characteristic of the elementary phenomena of speech production and is referred to below as a frame.

Denoising methods therefore usually take this important characteristic of sound signals containing speech into account.

These methods generally comprise the following main steps: splitting the audio signal to be denoised into frames; processing these frames with a Fourier transform (or a similar transform) to move into the frequency domain; the denoising proper, by digital filtering; and a processing step that is the dual of the first, an inverse Fourier transform, to return to the time domain. The last step is a reconstruction of the signal, which can be obtained by multiplying each frame by a weighting window.
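The generic chain described above (framing, Fourier transform, per-channel filtering, inverse transform, windowed reconstruction) can be sketched as follows. This is only an illustration under assumptions that are not in the text: frames overlap by 50 %, the weighting window is a periodic Hann window, and the names `denoise_by_frames` and `gain_fn` are hypothetical.

```python
import numpy as np

def denoise_by_frames(u, frame_len, gain_fn):
    """Frame -> FFT -> per-channel gain -> inverse FFT -> windowed overlap-add."""
    hop = frame_len // 2                      # assumption: 50 % overlap between frames
    # Periodic Hann window: two copies shifted by frame_len / 2 sum to 1.
    window = np.hanning(frame_len + 1)[:-1]
    out = np.zeros(len(u))
    for start in range(0, len(u) - frame_len + 1, hop):
        frame = u[start:start + frame_len]
        spectrum = np.fft.rfft(frame)                 # to the frequency domain
        gain = gain_fn(np.abs(spectrum) ** 2)         # digital filtering step
        filtered = np.fft.irfft(gain * spectrum, n=frame_len)
        out[start:start + frame_len] += window * filtered   # weighting window
    return out
```

With a gain of one on every channel, the samples covered by two overlapping frames are returned unchanged, which is a convenient sanity check before plugging in an actual filter.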

One of the digital filters most widely used for this type of application is the Wiener filter, in particular the so-called optimal Wiener filter. It has the advantage of processing successive frames in a differentiated manner.

In other words, and more generally, optimal Wiener filtering lies at the heart of the optimal signal processing methods based on second-order statistical characteristics, and hence on the notion of correlation.

Wiener filtering allows signals to be separated by decorrelation. Its importance is linked to the simplicity of the theoretical calculations. It can also be applied to a multitude of particular processes and, as regards the preferred application targeted by the invention, to the extraction of noise polluting a speech signal.

However, in the known art, a classic problem encountered when denoising by Wiener filtering is the presence of a noise, called musical noise, which degrades the perception of the denoised signal. This musical noise is due to fluctuations in the spectral densities of the noise present in the input signal. For some frames, the spectral density of the noise is higher, on at least one frequency channel, than that of the noise model used in these techniques. In that case, the mechanisms inherent in Wiener filtering cause a residual noise to appear on the denoised signal. This residual noise is particularly unpleasant perceptually because of its instability. When listening to a speech signal, one hears residual noises in the form of "gurgling", akin to distortions attributable to the high variability of the noise polluting the denoised speech signal, or "useful" signal.
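The mechanism can be illustrated with a small simulation. The sketch below is not taken from the patent: it assumes white Gaussian noise, a periodogram as the per-frame spectral density estimate, and a noise model equal to the average of those periodograms. It counts the channels of noise-only frames on which the frame density exceeds the model, so that a classical Wiener gain lets isolated noise components through.

```python
import numpy as np

rng = np.random.default_rng(0)
n_frames, frame_len = 200, 128
noise = rng.standard_normal((n_frames, frame_len))        # noise-only frames

# Per-frame spectral density estimate (periodogram) and averaged noise model.
frame_psd = np.abs(np.fft.rfft(noise, axis=1)) ** 2 / frame_len
model_psd = frame_psd.mean(axis=0)

# Classical Wiener gain 1 - model / frame, floored at zero.
gain = np.maximum(1.0 - model_psd / frame_psd, 0.0)

# Fraction of (frame, channel) pairs where noise leaks through the filter.
fraction_leaking = np.mean(gain > 0.0)
print(f"channels with residual noise: {fraction_leaking:.0%}")
```

Because the set of leaking channels changes from frame to frame, the residue is heard as short, unstable tones rather than as steady background noise.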

The invention therefore aims to overcome the drawbacks of the filtering methods of the known art, in particular the main drawback just recalled: the presence of a parasitic residual noise, called "musical noise", in the denoised signal. More generally, in its main application, the invention aims to increase the intelligibility of speech.

In order to strongly attenuate the effects of musical noise, the invention takes advantage of the following two experimental observations:

the probability of musical noise is higher when the estimate of the spectral densities of the noise is unstable from one frame to the next;
the probability of musical noise being present is higher when the estimate of the spectral density of the noise is low compared with its real spectral density.

According to a main characteristic of the invention, the Wiener filter used for the digital filtering is modified in an optimized manner by introducing an energy compensation term intended to overestimate the noise level. In addition, this compensation term is adaptive.

The subject of the invention is therefore a frequency filtering method for denoising noisy sound signals consisting of so-called useful sound signals mixed with noise signals, the method comprising at least a step of splitting said sound signals into a series of identical frames of a determined length and a frequency filtering step using a Wiener filter, characterized in that it further comprises the following steps:

construction, from said noisy signals, of a noise model over a determined number N of said frames, N lying between predetermined minimum and maximum bounds;
application of a Fourier transform to said N frames;
estimation, for each frame of said model, of the spectral density of that frame;
estimation of the average spectral density of said noise model;
calculation, from these two estimates, of a statistical overestimation coefficient, said statistical coefficient being equal to the maximum ratio, over said N frames of the noise model, between the maximum of the spectral density of a given frame of said noise model and the maximum of the estimated spectral density of the noise model;
estimation, for each frame of said signals to be denoised, of its spectral density; and
modification, for each frame of said signals to be denoised, of the coefficients of said Wiener filter so that the following relation is verified: W (ν) = β 1 - α · max · γ x ν γ u ν , relation in which α and β are predetermined fixed coefficients, called static energy compensation coefficient and exponential attenuation coefficient, respectively, ν describes the set of frequency channels of said Fourier transform, γ _u (ν) being the estimated of the spectral density of the frame to be denoised, γ _x (ν) is said spectral density of the noise model, and maxi said statistical overestimation coefficient, modifying the static energy compensation coefficient α.
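The statistical overestimation coefficient and the overestimated noise term of the relation can be sketched as follows. This is an illustration under stated assumptions, not the claimed implementation: the average spectral density of the model is taken as the plain mean of the frame densities, a small floor guards against division by zero, and the function names are hypothetical. The coefficient β is deliberately left out, since the way it enters the relation is not reproduced here; only the term 1 - α · maxi · γ_x(ν) / γ_u(ν) is evaluated.

```python
import numpy as np

def overestimation_coefficient(noise_frame_psds):
    """Statistical overestimation coefficient ("maxi") of a noise model.

    noise_frame_psds: array of shape (N, n_channels) holding the estimated
    spectral density of each of the N noise-model frames.
    """
    model_psd = noise_frame_psds.mean(axis=0)       # assumption: mean of frame densities
    peak_per_frame = noise_frame_psds.max(axis=1)   # maximum over channels, per frame
    # Largest per-frame peak relative to the peak of the averaged model.
    return float(peak_per_frame.max() / model_psd.max())

def compensated_term(frame_psd, model_psd, maxi, alpha):
    """Evaluate 1 - alpha * maxi * gamma_x(nu) / gamma_u(nu) on every channel."""
    ratio = model_psd / np.maximum(frame_psd, 1e-12)   # assumption: guard against zero
    return 1.0 - alpha * maxi * ratio
```

Since the peak of an average can never exceed the largest individual peak, the coefficient computed this way is always at least 1, so it can only increase the static coefficient α.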

The invention will be better understood, and other characteristics and advantages will become apparent, on reading the following description with reference to the appended figures, in which:

FIG. 1 illustrates, in block-diagram form, the main steps of the method according to the invention;
FIG. 2 schematically illustrates a Wiener filter of the known art;
FIG. 3 is a diagram illustrating the spectral density of a noise model and the spectral densities γ_u of each frame of this noise model;
FIGS. 4a and 4b are comparative diagrams illustrating these same parameters with overestimation of the spectral density of the noise model;
FIG. 5 is a diagram illustrating these same parameters with adaptive overestimation of the spectral density of the noise model;
FIG. 6 shows a typical example of a signal from a noisy sound pickup;
FIG. 7 is a flowchart showing the steps of a particular method of searching for a noise model;
and FIG. 8 is a detailed flowchart showing the steps of the digital filtering method according to a preferred embodiment of the invention.

The main phases and steps of the method according to the invention will now be described with reference to the block diagram of FIG. 1. Each block, referenced 0 to 5, represents a phase of the method, which can itself be subdivided into elementary steps.

In what follows, to fix ideas and without limiting the scope of the invention in any way, the context is the processing of noisy speech. As indicated above, it is common to consider that stationary regimes, in particular voicing, are established over durations of between 10 and 20 ms, a time interval characteristic of the elementary phenomena of speech production and referred to below as a frame.

As in the known art, the method of the invention comprises a step of splitting the audio signal to be denoised into frames (block 0).

In practice, digital techniques are used. The frame signals are therefore not "continuously evolving" signals but discrete signals obtained by sampling. The signals are assumed to be sampled with period T_e before digital processing. It is then common to take 2^p samples per signal frame, choosing p so that 2^p T_e is of the order of magnitude of the duration D of a frame. For example, for a sampling frequency of 10 kHz, frames of 12.8 ms are often chosen, so that each frame contains 128 points, which is a power of two. The number of samples in a frame is denoted LGtrame below. The relation D = LGtrame × T_e is therefore satisfied. The frame-splitting step, as indicated in FIG. 1, is thus preceded by a digitization step by sampling.
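The arithmetic of this choice can be checked with a few lines. The helper below is a hypothetical illustration: it picks the power of two nearest (on a logarithmic scale) to the desired frame duration, which is one possible reading of "of the order of magnitude of D".

```python
import math

def frame_length(sampling_rate_hz, target_duration_s):
    """Return (p, LGtrame, D) with LGtrame = 2**p samples per frame."""
    # Nearest power of two, on a log scale, to the desired number of samples.
    p = round(math.log2(target_duration_s * sampling_rate_hz))
    lg_trame = 2 ** p
    duration = lg_trame / sampling_rate_hz        # D = LGtrame * T_e
    return p, lg_trame, duration

p, lg_trame, duration = frame_length(10_000, 0.0128)
print(p, lg_trame, duration)
```

For the 10 kHz example of the text this gives p = 7 and LGtrame = 128, that is, a frame of 12.8 ms.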

By convention, the input signal is denoted u(t), the useful signal s(t) and the disturbing noise x(t), such that:

$$u(t) = s(t) + x(t)$$ in continuous time,

$$u(kT_e) = s(kT_e) + x(kT_e)$$ in discrete time.

The digitization and frame-splitting steps (block 0) are common to the known art. The digital samples thus created are stored in a circular buffer memory of the "FIFO" ("first in, first out") type so that they can be read as successive frames.
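A minimal sketch of such a buffer is given below. It assumes non-overlapping frames and uses a Python `deque` in place of a hardware circular memory; the class name `FrameFifo` and its methods are hypothetical.

```python
from collections import deque

class FrameFifo:
    """First-in, first-out sample store read out as fixed-length frames."""

    def __init__(self, frame_len):
        self.frame_len = frame_len
        self.samples = deque()

    def push(self, new_samples):
        # Newly digitized samples are appended at the tail of the queue.
        self.samples.extend(new_samples)

    def pop_frame(self):
        # A frame is delivered only once enough samples have accumulated.
        if len(self.samples) < self.frame_len:
            return None
        return [self.samples.popleft() for _ in range(self.frame_len)]
```

The oldest samples always leave first, so successive calls to `pop_frame` return consecutive frames in acquisition order.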

The frames read in succession then undergo a series of independent processing steps, along two paths that can be described as "parallel".

The operations carried out in block 1 consist in identifying segments of the signal to be denoised that contain only noise. The output of this block is a sequence of digital samples representative of the noise alone. In other words, a noise model is built from the noisy signals, or more precisely from the frames read in succession (block 0). Many methods can be used, and an example of a noise model search method is explained below.

In block 2, three steps are carried out which consist, starting from the samples supplied by block 1, in performing:

the estimation of the average spectral density of the noise (for example by averaged spectrum and smoothed correlogram);
the determination of the average energy of the noise model;
and the determination of a coefficient reflecting the statistical dispersion of the noise.
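The first two estimates of block 2 can be sketched as follows. The sketch makes assumptions that the text does not impose: the spectral density of a frame is estimated by a periodogram, the average density is the mean of these periodograms (no correlogram smoothing), and the energy of a frame is the sum of its squared samples. The dispersion coefficient is left out of this sketch, and the function name is hypothetical.

```python
import numpy as np

def noise_model_statistics(noise_frames):
    """noise_frames: array of shape (N, LGtrame) of time-domain noise-only frames."""
    frame_len = noise_frames.shape[1]
    # Periodogram of every frame of the noise model.
    frame_psds = np.abs(np.fft.rfft(noise_frames, axis=1)) ** 2 / frame_len
    mean_psd = frame_psds.mean(axis=0)                        # average spectrum
    mean_energy = np.mean(np.sum(noise_frames ** 2, axis=1))  # average frame energy
    return frame_psds, mean_psd, mean_energy
```

The per-frame densities are returned as well, since the dispersion coefficient is computed from them together with the average density.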

The above steps, and in particular the last one, which constitutes one of the main characteristics of the invention, are detailed below.

In the "parallel" branch, block 3 comprises a step of estimating the spectral density of the current signal frame and computing its energy.

In block 4, according to another essential characteristic of the invention, the coefficients of the frequency filter that denoises the signal are determined in the manner detailed below. As indicated, the method of the invention is based on energy compensation and noise overestimation.

Finally, in block 5, the denoised time-domain signal is reconstructed, ensuring the best possible continuity between frames. In applications other than the main application targeted by the invention, the signals can be used as they are by various processes such as automatic speech recognition. In itself, this phase of the method is common to the known art, and there is no need to detail the method used to reconstruct or exploit the signals at the output of block 4.

According to the main characteristic of the invention, the method makes it possible to modify and optimize the coefficients of the Wiener filter used for the denoising phase proper (block 4), so as to eliminate or, at the very least, strongly attenuate the so-called "musical" parasitic noises.

As recalled above, these noises are attributable to two main causes:

a/ the probability of musical noise is higher when the estimate of the spectral densities of the noise is unstable from one frame to the next;

b/ the probability of musical noise being present is higher when the estimate of the spectral density of the noise is low compared with the real spectral density of the noise.

According to the invention, in relation to cause a/, the dispersion is quantified by a coefficient resulting from the analysis carried out in block 2, based on the noise model built in block 1.

Similarly, in relation to cause b/, to reduce the influence of the spectral density of the noise, in particular when it is low, the method according to the invention overestimates this spectral density, introducing a degree of adaptivity in order to optimize the perception of the denoised signal.

Before describing the method of the invention in more detail, it is useful to briefly recall the characteristics of a Wiener filter according to the known art.

FIG. 2 illustrates very schematically a Wiener filter used to denoise a noisy signal U(n).

By way of nonlimiting examples, Wiener filters are described in the following books, which may usefully be consulted:

Yves THOMAS: "Signaux et systèmes linéaires", éditions MASSON (1994); and
François MICHAUT: "Méthodes adaptatives pour le signal", édition HERMES (1992).

In FIG. 2 the following conventions have been adopted:

U(n): discrete Fourier transform of the observed random process, that is, the noisy signal;
S(n): discrete Fourier transform of the "desired" process, to be estimated by linear filtering of U(n);
X(n): discrete Fourier transform of the additive noise polluting the useful signal;
Ŝ(n): estimate of S(n) expressed in the Fourier domain, with ε = Ŝ - S the estimation error (S being the real denoised signal); and
W(z): estimation filter expressed in the frequency domain.

The optimal Wiener filter minimizes the distance between the random variables S(n) and Ŝ(n), measured by the mean square error J:

$$J = E\left[\left(S(n) - \hat{S}(n)\right)^2\right]$$

Minimizing this criterion amounts to making the estimation error orthogonal to the observed signal, which is expressed by the orthogonality principle:

$$E\left[\varepsilon(n)\,U^*(n)\right] = 0$$

Denoting by:

γ_S: the spectral density of the useful signal, and
γ_X: the spectral density of the parasitic noise,

the Wiener filter is described by the following relation:

$$W(n) = \frac{\gamma_S(n)}{\gamma_S(n) + \gamma_X(n)}$$

Taking into account the independence of S(n) and X(n), the following relation is obtained:

$$\gamma_U = \gamma_S + \gamma_X$$

in which γ_U represents the spectral density of the observed signal.

The relation describing the Wiener filter therefore finally becomes:

$$W(n) = \frac{\gamma_S(n)}{\gamma_S(n) + \gamma_X(n)} = 1 - \frac{\gamma_X(n)}{\gamma_U(n)}$$
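The step from the orthogonality principle to the two formulations of the filter can be sketched as follows. This is only an outline under assumptions that the text does not spell out: the estimate is taken to be Ŝ(n) = W(n)·U(n), the observation is U(n) = S(n) + X(n), and S and X are zero-mean and independent, so that E[S X*] = 0.

```latex
\begin{aligned}
E\left[\varepsilon(n)\,U^{*}(n)\right] = 0
  &\;\Rightarrow\; W(n)\,E\left[|U(n)|^{2}\right] = E\left[S(n)\,U^{*}(n)\right] \\
  &\;\Rightarrow\; W(n)\,\gamma_U(n) = \gamma_S(n) \\
W(n) &= \frac{\gamma_S(n)}{\gamma_U(n)}
      = \frac{\gamma_U(n) - \gamma_X(n)}{\gamma_U(n)}
      = 1 - \frac{\gamma_X(n)}{\gamma_U(n)}
\end{aligned}
```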

In practice, it is this second formulation of the Wiener filter that is used, since it involves only directly accessible terms, namely, on the one hand, the noisy signal received from block 3 and, on the other hand, the noise, previously determined by the calculation of the noise model (block 1).

It should be noted that the coefficients W(n) of the Wiener filter are always non-negative. If computational artifacts produce a negative value for a coefficient, that coefficient is set to zero.
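As an illustration of the second formulation together with the clamping rule, here is a minimal Python sketch. The function name `wiener_gains`, the handling of channels with no observed energy, and the numerical values are assumptions made for the example; they do not come from the description.

```python
def wiener_gains(gamma_x, gamma_u):
    """Return W(n) = 1 - gamma_x(n) / gamma_u(n) per channel, clamped at zero."""
    gains = []
    for gx, gu in zip(gamma_x, gamma_u):
        if gu <= 0.0:
            # Assumption: a channel with no observed energy gets a zero gain.
            gains.append(0.0)
            continue
        w = 1.0 - gx / gu
        # Estimation errors can make the coefficient negative; force it to zero.
        gains.append(max(0.0, w))
    return gains


# Hypothetical spectral densities for four frequency channels.
noise_model = [1.0, 2.0, 4.0, 1.0]   # gamma_x
noisy_frame = [4.0, 2.0, 3.0, 10.0]  # gamma_u
print(wiener_gains(noise_model, noisy_frame))
```

In the third channel the noise estimate exceeds the observed density, so the raw coefficient would be negative and is set to zero instead.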

According to the known art, suppressing additive noise by a spectral subtraction method, as performed by a Wiener filter, leads to the creation of so-called "musical" noise. To avoid the appearance of this parasitic noise, which is unpleasant to listen to and harmful to speech intelligibility, or at least to prevent it as far as possible, according to an essential characteristic of the invention the coefficients of the Wiener filter are modified using parameters determined in blocks 2 and 3, in the manner that will now be detailed.

When the input signal contains only noise, additional "musical noise" is present because, in practice, the estimate of the ratio γ_x/γ_u fluctuates at each frequency, although in theory this ratio should be equal to unity at all frequencies. It is these estimation errors that produce attenuating filters whose coefficients vary randomly, across frequencies and over time.

To fix ideas, consider the example of denoising a signal consisting of noise alone, sampled at 44 kHz. The spectral density γ_x of a noise model chosen from this signal is determined, together with the spectral densities γ_u of each frame (of length LGtrame) of this noise.

The variation of these two parameters is shown as curves in the diagram of Figure 3, as a function of the FFT (fast Fourier transform) channel number. To plot the curves, the frame length was assumed to be 128 samples, i.e. LGtrame = 128.

This diagram clearly shows that the shapes of the two curves γ_x and γ_u are similar, but that the two estimates differ markedly in amplitude. The main peak of γ_u, located at the frequency 2.75 kHz (64 FFT channels corresponding to 22 kHz, i.e. half the sampling frequency), has an amplitude approximately seven times greater than that of γ_x at the same frequency. This is the main reason for the presence of "musical" noise. When, for certain frequencies denoted ν, γ_u(ν) is much greater than γ_x(ν), this means, in theory, that the frame contains not only noise but also some other signal component. In this case, Wiener filtering according to the known art denoises the corresponding frame as if it contained useful speech signal, which leaves noise residues.
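The correspondence between FFT channels and frequencies used in this example can be checked with a few lines of Python. The helper name `channel_frequency_hz` is hypothetical, and the sketch assumes uniformly spaced channels with a spacing equal to the sampling frequency divided by the frame length.

```python
def channel_frequency_hz(channel, sampling_rate_hz, frame_length):
    """Frequency associated with an FFT channel, assuming uniform channel spacing."""
    return channel * sampling_rate_hz / frame_length


SAMPLING_RATE_HZ = 44_000  # 44 kHz, as in the example
LG_TRAME = 128             # frame length in samples

n_channels = LG_TRAME // 2                     # 64 channels up to half the sampling frequency
spacing_hz = SAMPLING_RATE_HZ / LG_TRAME       # width of one channel
peak_channel = round(2750 / spacing_hz)        # channel holding the 2.75 kHz peak

print(n_channels, spacing_hz, peak_channel)
print(channel_frequency_hz(n_channels, SAMPLING_RATE_HZ, LG_TRAME))
```

Under these assumptions each channel is 343.75 Hz wide, so the 2.75 kHz peak falls in channel 8 and channel 64 corresponds to 22 kHz.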

To avoid this parasitic effect, the method according to the invention modifies the coefficients of the Wiener filter in an optimized way and introduces an energy compensation term that artificially overestimates the noise level, with different levels of adaptivity for this compensation.

The coefficients of the modified Wiener filter obey the following relation:

$$W(\nu) = \left[1 - \alpha \cdot \mathrm{maxi} \cdot \frac{E_x}{E_u} \cdot \frac{\gamma_x(\nu)}{\gamma_u(\nu)}\right]^{\beta}$$

Referring again to relation (7), it is easy to see that four new terms have been introduced, namely:

β: exponential attenuation coefficient;

α: static energy compensation coefficient;

E_x/E_u: energy weighting ratio; and

maxi: statistical overestimation coefficient derived from the statistical analysis of the noise, based on a noise model established during the phase of the method corresponding to block 1.
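To make the role of the four terms concrete, the following Python sketch evaluates a modified gain per channel. It is an illustration only: the function name and default values are hypothetical, and it assumes that β acts as an exponent applied to the clamped bracketed term, which is one reading of the relation above.

```python
def modified_wiener_gains(gamma_x, gamma_u, e_x, e_u,
                          alpha=10.0, maxi=1.0, beta=0.5):
    """Per-channel gain [1 - alpha * maxi * (E_x/E_u) * gamma_x/gamma_u] ** beta.

    Assumptions of this sketch: beta is an exponent, and the bracketed term
    is clamped at zero before the exponent is applied.
    """
    overestimation = alpha * maxi * (e_x / e_u)
    gains = []
    for gx, gu in zip(gamma_x, gamma_u):
        if gu <= 0.0:
            gains.append(0.0)  # no observed energy in this channel
            continue
        base = 1.0 - overestimation * gx / gu
        gains.append(max(0.0, base) ** beta)
    return gains


# Hypothetical densities: a strong channel and a channel close to the noise level.
print(modified_wiener_gains([1.0, 1.0], [100.0, 5.0], e_x=1.0, e_u=1.0))
```

With these values the strong channel is almost untouched, while the channel whose density is only five times the noise estimate is forced to zero by the overestimation.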

Each of these terms will now be explained.

The exponential attenuation coefficient β is a term commonly used in the literature on digital filtering and, more particularly, on denoising. A typical value of this parameter is 0.5.

By way of nonlimiting example, reference may be made to the article by L. Arslan, A. McCree and V. Viswanathan entitled "New Methods for adaptive noise suppression", IEEE, May 1995, pages 812-815.

The static energy compensation coefficient α makes it possible to overestimate the noise and is particularly relevant when suppressing noise alone. Indeed, a typical value of α = 10 applied to the example of Figure 3 raises the estimate of the average noise spectrum γ_x by approximately 10 dB, which lowers the residual noise level, since the coefficients of the Wiener filter cannot be negative: any coefficient that would be negative is forced to zero.
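The figure of about 10 dB follows directly from expressing the multiplication by α on a decibel scale. The short worked equation below assumes that the spectral densities are power quantities, so that the factor 10·log10 applies:

```latex
10\log_{10}\!\bigl(\alpha\,\gamma_x(\nu)\bigr)
  = 10\log_{10}\alpha + 10\log_{10}\gamma_x(\nu)
  = 10\ \mathrm{dB} + 10\log_{10}\gamma_x(\nu)
  \qquad \text{for } \alpha = 10
```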

However, although this modification is very effective in eliminating noise alone, it in turn poses problems when the frames to be denoised contain useful signal. If this useful signal is much more energetic than the noise, the multiplier α does not degrade it. In the opposite case, however, there may be frequencies ν at which a useful signal frame has an energy that is not negligible but is close to that of the noise at the same frequencies. In this case, multiplying γ_x(ν) by α forces the Wiener coefficients W(ν) to zero and therefore makes the signal energy disappear at these frequencies.

This problem is illustrated in Figures 4a and 4b. In these figures, the following conventions have been adopted:

γ_u: spectral density of the signal frame considered (a signal frame of low energy relative to the noise); and

γ_x: spectral density of the chosen noise model (block 1).

The curve of Figure 4a shows that the energy of the signal in the frequency band Δν, represented by the spectral density γ_x, is not negligible.

Referring to Figure 4b, it can be seen that multiplying γ_x by the parameter α = 10 makes α·γ_x greater than γ_u in the band Δν. It follows that the Wiener gain is zero for this frequency band, which no longer appears in the denoised frame.

The energy weighting ratio described below makes it possible to reduce this distortion in the denoised signal.

As indicated above, the denoising of noise alone is correct, but it can be too harsh in the parts containing useful signal.

In a preferred variant of the invention, this drawback is remedied by varying the coefficient α according to whether or not some useful signal is present in the signal to be denoised. Advantageously, α remains close to a typical value of 10 when the noisy signal contains only noise, and varies between 0 and 10 when a useful signal is present in the noisy signal. A degree of adaptivity is thus advantageously introduced.

This is the function assigned to the ratio E_x/E_u, which multiplies α in relation (8), where E_x is the average energy of the noise model and E_u is the energy of the current frame. The coefficients of the Wiener filter can thus change at each frame, in a manner that depends on how strongly (in terms of energy) the speech signal is present.

If E_x ≅ E_u, then α ≅ 10 and the frame is treated as noise alone. It is correctly denoised.

If, on the contrary, E_x << E_u, this means that the frame considered is very energetic relative to the noise and that this part of the signal must be attenuated as little as possible.

This third modification is illustrated by Figure 5. In this figure, the signal frame considered is the same as that used for Figures 4a and 4b, with α = 10 and E_x/E_u = 0.2.

Thanks to this weighting of the coefficient α by E_x/E_u, the frequency band Δν' in which the useful signal is eliminated (i.e. the frequencies at which the values of γ_x are greater than those of γ_u) is much narrower than when γ_x is multiplied by the coefficient α = 10 alone.
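The narrowing of the eliminated band can be reproduced on toy data. The sketch below is an assumption-laden illustration: the densities are invented, maxi is taken as 1, and a channel is counted as eliminated when the overestimated noise density reaches or exceeds the density of the frame, which is the condition under which the gain is forced to zero.

```python
def zeroed_channels(gamma_x, gamma_u, factor):
    """Indices of channels whose gain is forced to zero (factor * gamma_x >= gamma_u)."""
    return [k for k, (gx, gu) in enumerate(zip(gamma_x, gamma_u))
            if factor * gx >= gu]


# Hypothetical densities of a weak speech frame over six channels.
gamma_x = [1.0, 1.0, 1.0, 1.0, 1.0, 1.0]   # noise model
gamma_u = [1.5, 3.0, 6.0, 8.0, 12.0, 2.5]  # current frame

alpha = 10.0
energy_ratio = 0.2  # E_x / E_u, the value quoted for Figure 5

static_band = zeroed_channels(gamma_x, gamma_u, alpha)
weighted_band = zeroed_channels(gamma_x, gamma_u, alpha * energy_ratio)
print(static_band, weighted_band)
```

With α alone, five of the six channels are eliminated; once α is weighted by E_x/E_u = 0.2, the effective factor drops to 2 and only the channel closest to the noise level is removed.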

This type of filter is therefore effective both at eliminating degraded signal segments in which speech is absent and at reducing the distortion inflicted on the useful speech signal.

The probability of generating "musical noise" is also linked, as indicated above, to the variance of the estimates of the noise spectral density over all the frames.

Indeed, the more the estimated spectral densities of the noise vary from one frame to the next, the more likely "musical" noise is to form.

According to another important aspect of the invention, the value of the overestimation coefficient is made dependent on the statistical properties of the noise. To do this, a coefficient, called maxi below, is introduced; it is proportional to the dispersion of the noise spectral density values.

The overestimation coefficient then becomes:
α = α · maxi, with maxi satisfying the following relation:

$$\mathrm{maxi} = \max_{1 \le i \le N} \left( \frac{\max_{\nu} \gamma_i(\nu)}{\max_{\nu} \gamma_x(\nu)} \right)$$

in which:

N is the number of frames of the noise model;
ν runs over all the frequency channels, i.e. LGtrame/2 channels;
γ_i(ν) is the spectral density of the i-th frame of the noise model in channel ν; and
γ_x(ν) is the spectral density of the noise model.

The coefficient maxi is equal to the maximum, over all the frames of the noise model, of the ratio between the maximum of the spectral density of the noise-model frame considered and the maximum of the estimated spectral density of the noise model.

In other words, this coefficient characterizes the maximum disparity of the noise for the frequency channels carrying significant energy. Multiplied by the coefficient α, it provides additional attenuation proportional to this disparity.
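The verbal definition of maxi given above translates into a few lines of Python. In this sketch the function names are hypothetical, the frame densities are invented, and the noise-model density γ_x is assumed to be the channel-by-channel average of the per-frame densities, which is one plausible estimator rather than the one prescribed by the description.

```python
def noise_model_density(frame_densities):
    """Channel-by-channel average of the per-frame densities (assumed estimator of gamma_x)."""
    n_frames = len(frame_densities)
    return [sum(column) / n_frames for column in zip(*frame_densities)]


def maxi_coefficient(frame_densities, gamma_x):
    """Largest ratio, over the frames, of the frame's peak density to the model's peak density."""
    model_peak = max(gamma_x)
    return max(max(gamma_i) / model_peak for gamma_i in frame_densities)


# Hypothetical spectral densities of N = 3 noise frames over 3 channels.
frames = [
    [1.0, 4.0, 2.0],
    [2.0, 6.0, 1.0],
    [0.0, 2.0, 3.0],
]
gamma_x = noise_model_density(frames)
print(gamma_x, maxi_coefficient(frames, gamma_x))
```

Here the averaged model peaks at 4.0 while the second frame peaks at 6.0, so maxi is 1.5 and the overestimation coefficient α would be increased by half.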

To compute some of the parameters used in modifying the coefficients of the Wiener filter, a noise model is needed (block 1 of Figure 1).

Building a noise model from a noisy signal is a conventional operation in itself. However, the specific method used for this operation may be a method of the known art, but may also be an original method.

A method for building a noise model will now be described with reference to Figures 6 and 7. It is particularly suited to the main applications targeted by the method of the invention, in particular the denoising of noisy speech signals.

The method is based on a permanent, automatic search for a noise model. This search is carried out on the samples of the signal u(t), digitized and stored in an input buffer memory. This memory can simultaneously hold all the samples of several frames of the input signal (at least 2 frames and, in the general case, N frames).

The noise model sought consists of a succession of several frames whose energy stability and relative energy level suggest that they contain ambient noise rather than a speech signal or some other disturbing noise. How this automatic search is carried out is described below.

When a noise model is found, all the samples of the N successive frames representing this noise model are kept in memory, so that the spectrum of this noise can be analyzed and used for denoising. The automatic noise search nevertheless continues on the input signal u(t), in order to find a more recent and more suitable model if there is one, either because it represents the ambient noise better or because the ambient noise has changed. The more recent noise model is stored in place of the previous one if comparison with the previous one shows that it is more representative of the ambient noise.

The starting postulates for the automatic construction of a noise model are as follows:

the noise to be eliminated is the ambient background noise,
ambient noise has a relatively stable energy in the short term,
speech is most often preceded by a breathing noise from the pilot, which must not be confused with ambient noise; but this breathing noise dies away a few hundred milliseconds before the first actual speech emission, so that only the ambient noise is found just before the speech emission,
and finally, noises and speech add up in terms of signal energy, so that a signal containing speech or a disturbing noise, including breathing into the microphone, necessarily contains more energy than a signal of ambient noise alone.

It follows that the following simple assumption is made: ambient noise is a signal with a minimum energy that is stable in the short term. Short term means a few frames; the practical example given below shows that the number of frames used to assess the stability of the noise is 5 to 20. The energy must be stable over several frames, failing which the signal must be assumed to contain speech or a noise other than ambient noise. It must be minimal, failing which the signal is considered to contain breathing or phonetic elements of speech that resemble noise but are superimposed on the ambient noise.

Figure 6 shows a typical pattern of the change over time in the energy of a microphone signal at the start of a speech emission: a phase of breathing noise, which dies away for a few tens to a few hundreds of milliseconds to leave the ambient noise alone, after which a high energy level indicates the presence of speech, before the signal finally returns to the ambient noise.

The automatic search for ambient noise then consists in finding at least N1 successive frames (for example N1 = 5) whose energies are close to one another, that is to say for which the ratio between the signal energy contained in a frame and the signal energy contained in the preceding frame or, preferably, the preceding frames lies within a determined range of values (for example between 1/3 and 3). When such a succession of frames of relatively stable energy has been found, the numerical values of all the samples of these N frames are stored. This set of N×P samples constitutes the current noise model. It is used in the denoising. Analysis of the following frames continues. If another succession of at least N1 successive frames meeting the same energy stability conditions (frame energy ratios within a determined range) is found, the average energy of this new succession of frames is compared with the average energy of the stored model, and the stored model is replaced by the new succession if the ratio between the average energy of the new succession and the average energy of the stored model is below a determined replacement threshold, which may be 1.5 for example.
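The search and replacement rule just described can be sketched in Python as follows. This is a simplified illustration, not the flowchart of Figure 7: the function names are hypothetical, a model is closed as soon as exactly N1 stable frames have been collected, each new frame is compared with all the frames already in the candidate succession, and frames of zero energy simply restart the search.

```python
def frame_energy(frame):
    """Signal energy of a frame: sum of the squares of its samples."""
    return sum(s * s for s in frame)


def mean_energy(frames):
    """Average frame energy of a succession of frames."""
    return sum(frame_energy(f) for f in frames) / len(frames)


def find_noise_model(frames, model=None, n1=5, low=1 / 3, high=3.0,
                     replace_threshold=1.5):
    """Scan the frames in order and return the noise model finally retained (or None)."""
    run = []  # candidate succession of energy-stable frames
    for frame in frames:
        energy = frame_energy(frame)
        if energy <= 0:
            run = []  # simplification: a silent frame restarts the search
            continue
        # Stable if the energy ratio to every frame already in the run is in [low, high].
        stable = all(low <= energy / frame_energy(prev) <= high for prev in run)
        if not stable:
            run = [frame]  # start a new candidate succession from this frame
            continue
        run.append(frame)
        if len(run) >= n1:
            # Keep the new succession if it is less energetic, or not much more
            # energetic, than the stored model.
            if model is None or mean_energy(run) / mean_energy(model) < replace_threshold:
                model = list(run)
            run = []
    return model


# Hypothetical signal: breathing, then ambient noise, then speech (4 samples per frame).
breathing = [[10.0] * 4 for _ in range(5)]
ambient = [[1.0] * 4 for _ in range(5)]
speech = [[5.0] * 4 for _ in range(5)]
model = find_noise_model(breathing + ambient + speech)
print(mean_energy(model))
```

On this toy signal the breathing phase is first adopted as the model, then replaced by the quieter ambient noise; the speech frames, although stable, are rejected because their average energy is far above 1.5 times that of the stored model.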

Because a noise model is replaced by a more recent model that is less energetic or not much more energetic, the noise model as a whole locks onto the permanent ambient noise. Even before speech that is preceded by breathing, there is a phase in which the ambient noise alone is present for long enough to be taken into account as the active noise model. This phase of ambient noise alone, after the breathing, is brief. The number N1 is chosen relatively small, so that there is time to realign the noise model with the ambient noise after the breathing phase.

If the ambient noise changes slowly, the change will be taken into account because the threshold for comparison with the stored model is greater than 1. If it increases more rapidly, the change may not be taken into account, so it is preferable to provide for a reinitialization of the noise-model search from time to time. For example, in an aircraft stationary on the ground, the ambient noise will be relatively low, and the noise model must not remain frozen during take-off at what it was at standstill merely because a noise model is replaced only by a model that is less energetic or not much more energetic. The reinitialization methods envisaged are explained below.

Figure 7 shows a flowchart of the operations for automatically searching for an ambient noise model.

The input signal u(t), sampled at the frequency F_e = 1/T_e and digitized by an analog-to-digital converter, is stored in a buffer memory capable of holding all the samples of at least 2 frames.

The number of the current frame in a noise model search operation is denoted n and is tracked by a counter as the search proceeds. When the search is initialized, n is set to 1. This number n is incremented as a model of several successive frames is built up. When the current frame n is analyzed, the model already comprises, by hypothesis, n-1 successive frames that meet the conditions imposed for being part of a model.

We first consider the case of a first model elaboration, no previous model having been built. Subsequent elaborations are dealt with afterwards.

The signal energy of the frame is calculated by summing the squares of the digital values of the samples of the frame. It is kept in memory.

The next frame, of rank n = 2, is then read, and its energy is calculated in the same way. It is also kept in memory.

The ratio between the energies of the two frames is calculated. If this ratio lies between two thresholds S and S', one greater than 1 and the other less than 1, the energies of the two frames are considered to be close and the two frames may be part of a noise model. The thresholds S and S' are preferably the inverse of each other (S' = 1/S), so that defining one suffices to obtain the other. For example, a typical value is S = 3, S' = 1/3. If the frames may be part of the same noise model, the samples that compose them are stored to start building the model, and the search continues iteratively by incrementing n by one.

If the ratio between the energies of the first two frames falls outside the imposed interval, the frames are declared incompatible and the search is reinitialized by resetting n to 1.
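By way of illustration only, the energy calculation and the closeness test on two frames can be sketched as follows in Python. The function names `frame_energy` and `energies_close` are hypothetical and do not appear in the description; the rejection of frames of zero energy and the inclusion of the bounds 1/S and S in the interval are likewise assumptions.

```python
def frame_energy(samples):
    """Energy of a frame: sum of the squares of its sample values."""
    return sum(x * x for x in samples)


def energies_close(energy_a, energy_b, s=3.0):
    """True if the ratio energy_b / energy_a lies between S' = 1/S and S.

    S = 3 is the typical value given in the text. Rejecting frames of
    zero energy is an assumption (the ratio would be undefined).
    """
    if energy_a <= 0 or energy_b <= 0:
        return False
    ratio = energy_b / energy_a
    return 1.0 / s <= ratio <= s


first = frame_energy([1, 2, 3])    # 1 + 4 + 9 = 14
second = frame_energy([2, 2, 4])   # 4 + 4 + 16 = 24
compatible = energies_close(first, second)  # ratio 24/14 lies inside [1/3, 3]
```

Because S' = 1/S, the test does not depend on which of the two frames is taken as the reference.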

If the search continues, the rank n of the current frame is incremented and, in an iterative procedure loop, the energy of the next frame is calculated and compared with the energy of the previous frame or frames, using the thresholds S and S'.

It should be noted here that two types of comparison are possible for adding a frame to n-1 previous frames that have already been found homogeneous in energy. The first type compares the energy of frame n only with the energy of frame n-1. The second type compares the energy of frame n with that of each of frames 1 to n-1. The second approach yields a more homogeneous model, but has the drawback of not coping well enough with cases where the noise level rises or falls rapidly.

Thus, the energy of the frame of rank n is compared with the energy of the frame of rank n-1 and possibly with that of other previous frames (not necessarily all of them).
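The difference between the two types of comparison can be illustrated by a short Python sketch. The function `is_homogeneous` and its `compare_all` flag are hypothetical names introduced here for illustration; the sketch works on frame energies that are assumed to have been calculated already.

```python
def is_homogeneous(energy_n, previous_energies, s=3.0, compare_all=False):
    """Test frame n against frame n-1 only, or against frames 1 to n-1."""
    # First type: only the last energy; second type: every previous energy.
    references = previous_energies if compare_all else previous_energies[-1:]
    return all(
        ref > 0 and 1.0 / s <= energy_n / ref <= s
        for ref in references
    )


# Noise energy doubling from frame to frame (a rapidly rising level).
previous = [1.0, 2.0, 4.0]
accepted_vs_last = is_homogeneous(8.0, previous)                   # 8/4 = 2
accepted_vs_all = is_homogeneous(8.0, previous, compare_all=True)  # 8/1 = 8
```

With S = 3, the new frame is accepted by the first type of comparison but rejected by the second, which is the drawback mentioned for rapidly changing noise levels.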

If the comparison shows that there is no homogeneity with the previous frames, because the energy ratio does not lie between 1/S and S, two cases are possible:

either n is less than or equal to a minimum number N1 below which the model cannot be regarded as representative of the ambient noise because the duration of homogeneity is too short (for example N1 = 5); in this case the model under construction is abandoned and the search is reinitialized from the start by resetting n to 1;
or n is greater than the minimum number N1. In this case, since a lack of homogeneity has now been found, it is considered that speech may be starting after a phase of homogeneous noise, and all the samples of the n-1 homogeneous noise frames that preceded the lack of homogeneity are kept as the noise model. This model remains stored until a more recent model is found that also appears to represent ambient noise. The search is reinitialized in any case by resetting n to 1.

However, the comparison of frame n with the previous ones may instead show that the frame is still homogeneous in energy with the previous frame or frames. In that case, either n is less than a second number N2 (for example N2 = 20), which represents the maximum desired length of the noise model, or n has become equal to N2. The number N2 is chosen so as to limit the computation time in the subsequent operations of estimating the noise spectral density.

If n is less than N2, the homogeneous frame is added to the previous ones to help build the noise model, n is incremented and the next frame is analyzed.

If n is equal to N2, the frame is likewise added to the n-1 previous homogeneous frames, and the model of n homogeneous frames is stored for use in noise elimination. The search for a model is also reinitialized by resetting n to 1.
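The search loop described above (first model search) can be sketched as follows. This is a minimal illustration and not the flowchart of figure 7: the function name `find_noise_models` is hypothetical, the input is a list of frame energies assumed to be already calculated, only the first type of comparison (frame n against frame n-1) is used, and it is assumed that the frame which breaks the homogeneity becomes the frame of rank 1 of the next search.

```python
def find_noise_models(energies, s=3.0, n1=5, n2=20):
    """Return (first_frame_index, frame_count) for each candidate model."""
    models = []
    start = 0  # index of the frame holding rank n = 1 in the current search
    for idx in range(1, len(energies)):
        n = idx - start + 1  # rank of the frame being analyzed
        if n == 1:
            continue  # first frame of a new search: nothing to compare yet
        previous, current = energies[idx - 1], energies[idx]
        homogeneous = previous > 0 and 1.0 / s <= current / previous <= s
        if not homogeneous:
            if n > n1:
                # Long enough run: keep the n-1 homogeneous frames as a model.
                models.append((start, n - 1))
            # Otherwise the run is too short and is simply abandoned.
            start = idx  # assumption: the offending frame restarts the search
        elif n == n2:
            # Maximum length reached: store the model of n frames.
            models.append((start, n))
            start = idx + 1  # the next frame read will have rank 1
    return models


# Six quiet frames, one loud frame (possible start of speech), three quiet.
models = find_noise_models([1.0] * 6 + [100.0] + [1.0] * 3)  # [(0, 6)]
```

In this example the seventh frame has rank n = 7 > N1 when the homogeneity breaks, so the six preceding frames are kept as a candidate model; the short runs that follow never exceed N1 and yield nothing.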

The preceding steps concern the first model search. Once a model has been stored, however, it can be replaced at any time by a more recent model.

The replacement condition is again an energy condition, but this time it bears on the average energy of the model rather than on the energy of each frame.

Consequently, if a possible model has just been found, with N frames where N1 < N < N2, the average energy of this model is calculated as the sum of the energies of the N frames divided by N, and is compared with the average energy of the N' frames of the previously stored model.

If the ratio between the average energy of the new candidate model and the average energy of the model currently in force is less than a replacement threshold SR, the new model is considered better and is stored in place of the previous one. Otherwise, the new model is rejected and the old one remains in force.

The threshold SR is preferably slightly greater than 1.

If the threshold SR were less than or equal to 1, the least energetic homogeneous frames would be stored each time, which is consistent with regarding the ambient noise as the energy level below which the signal never falls. However, this would eliminate any possibility of the model evolving if the ambient noise began to increase.

If the threshold SR were too far above 1, there would be a risk of failing to distinguish ambient noise from other disturbing noises (breathing), or even from certain phonemes that resemble noise (hissing or hushing consonants, for example). Eliminating noise on the basis of a noise model aligned on breathing or on hissing or hushing consonants could then impair the intelligibility of the denoised signal.

In a preferred example, the threshold SR is approximately 1.5. Above this threshold the old model is kept; below it, the old model is replaced by the new one. In both cases, the search is reinitialized by starting again with the reading of a first frame of the input signal u(t) and by setting n to 1.
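The replacement rule can be sketched as follows, purely as an illustration. The function name `select_active_model` is hypothetical, each model is represented here only by the list of its frame energies, and the use of `None` to signal that no model has yet been stored is an assumption.

```python
def select_active_model(candidate_energies, active_energies, sr=1.5):
    """Return the frame energies of the model to keep in force."""
    if active_energies is None:
        return candidate_energies  # first elaboration: nothing to compare with
    mean_candidate = sum(candidate_energies) / len(candidate_energies)
    mean_active = sum(active_energies) / len(active_energies)
    # Same test as "ratio of average energies < SR", written as a product
    # so that an active model of zero energy does not cause a division error.
    if mean_candidate < sr * mean_active:
        return candidate_energies
    return active_energies


active = [1.0] * 10
slightly_louder = select_active_model([1.2] * 6, active)  # ratio 1.2: replaced
much_louder = select_active_model([2.0] * 6, active)      # ratio 2.0: rejected
```

With SR = 1.5, a candidate that is 20 % more energetic on average replaces the stored model, whereas a candidate twice as energetic is rejected.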

To make the elaboration of the noise model more reliable, the search for a model can be inhibited when speech is detected in the useful signal. The digital signal processing techniques commonly used in speech detection make it possible to identify the presence of speech on the basis of the characteristic periodicity spectra of certain phonemes, in particular phonemes corresponding to vowels or voiced consonants.

The purpose of this inhibition is to prevent certain sounds from being taken for noise when they are in fact useful phonemes, to prevent a noise model based on these sounds from being stored, and to prevent the noise suppression that follows the elaboration of the model from then tending to suppress all similar sounds.

Furthermore, it is desirable to reinitialize the model search from time to time, so that the model can be updated when increases in ambient noise have not been taken into account because SR is not much greater than 1.

Ambient noise can indeed increase significantly and rapidly, for example while the engines of an aircraft or of another air, land or sea vehicle are accelerating. The threshold SR, however, requires that the previous noise model be kept when the average noise energy increases too quickly.

There are various ways to remedy this situation, but the simplest is to reinitialize the model periodically, by searching for a new model and imposing it as the active model regardless of how it compares with the previously stored model. The period can be based on the average duration of utterances in the application envisaged; for example, utterances last a few seconds on average for an aircraft crew, and reinitialization can take place with a period of a few seconds.

The method of elaborating a noise model (figure 1: block 1) and, more generally, the method according to the invention can be implemented on non-specialized computers provided with the necessary computation programs and receiving, via a suitable port, the digitized signal samples as supplied by an analog-to-digital converter.

It can also be implemented on a specialized computer based on digital signal processors, which makes it possible to process a larger number of digital signals more quickly.

The computers are associated, as is well known, with various types of memory, static and dynamic, for storing the programs and intermediate data, as well as with circulating memories of the "FIFO" type. Finally, the system comprises an analog-to-digital converter for digitizing the signals u(t), and a digital-to-analog converter, if needed, when the denoised signals are to be used in analog form.

In conclusion, and to describe the method of the invention in more detail, the steps can be divided up differently from what was described with reference to figure 1 (which illustrates the method in a more condensed way). Figure 8 is a diagram summarizing all the steps of the filtering method according to the invention, in a preferred embodiment.

These steps fall into a first subset of steps for determining the parameters that depend on the noise model, and a second subset of steps for determining the parameters that depend only on the current frame of the signal to be denoised.

The first subset comprises an initial step of selecting a noise model suited to the specific application, advantageously a noise model determined by the method described above with reference to figures 6 and 7.

This first subset of steps comprises two branches.

In the first branch, the energy of each frame of the noise model is calculated (in the time domain), and the average of these frame energies is then calculated, which gives an estimate of the average energy of the model, i.e. the parameter E_x.

In the second branch, a Fourier transform is applied to the frames of the noise model so as to pass into the frequency domain. The spectral density of frame i (with i = 1 .. N) of the noise model in frequency channel ν, i.e. γ_i(ν), and the spectral density of the noise model in frequency channel ν, i.e. γ_x(ν), are then determined in turn. From these two parameters, the statistical coefficient maxi is determined such that it satisfies relation (9). The parameter γ_x(ν) is also used to calculate one of the other coefficients of the Wiener filter.
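A minimal sketch of this second branch is given below. It rests on several assumptions that are not taken from this passage: the spectral density of a frame is estimated by a plain periodogram (squared modulus of the discrete Fourier transform divided by the frame length), γ_x(ν) is taken as the average of the γ_i(ν) over the N frames, and the coefficient maxi is computed as worded in claim 1, namely the largest ratio, over the N frames, between the maximum of γ_i(ν) and the maximum of γ_x(ν). The function names are hypothetical, and a naive transform is used only to keep the example self-contained.

```python
import cmath


def periodogram(frame):
    """Spectral density estimate of one frame: |DFT|^2 / p per channel."""
    p = len(frame)
    density = []
    for k in range(p):
        x_k = sum(
            x * cmath.exp(-2j * cmath.pi * k * t / p)
            for t, x in enumerate(frame)
        )
        density.append(abs(x_k) ** 2 / p)
    return density


def noise_model_statistics(frames):
    """Return (gamma_x, maxi) for a noise model made of N frames.

    Assumes the model is not entirely silent (max of gamma_x > 0).
    """
    gammas = [periodogram(frame) for frame in frames]  # gamma_i(nu)
    n_frames = len(gammas)
    gamma_x = [
        sum(g[k] for g in gammas) / n_frames  # average over the N frames
        for k in range(len(gammas[0]))
    ]
    peak_x = max(gamma_x)
    maxi = max(max(g) / peak_x for g in gammas)
    return gamma_x, maxi


gamma_x, maxi = noise_model_statistics([[1, 0, 0, 0], [2, 0, 0, 0]])
```

Under these assumptions maxi is never less than 1, since the highest peak among the individual frames cannot be lower than the peak of their average; it measures how far a single frame of the model can exceed the averaged model.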

The second subset of steps also comprises two branches.

In the first branch, the energy of the current frame, E_u, is determined, and in the second branch the spectral density of the current frame, γ_u, is estimated.

From these two parameters and from the parameters γ_x and E_x determined previously, the coefficients [E_x/E_u] and [γ_x(ν)/γ_u(ν)] are obtained.

All the coefficients of the Wiener filter, in accordance with relation (8), are therefore determined at the end of these steps. The coefficients α and β are predetermined fixed coefficients, typically equal to 10 and 0.5, respectively.
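The combination of these coefficients into the filter gains can be sketched as follows. The form used, W(ν) = [1 - α · maxi · (E_x/E_u) · γ_x(ν)/γ_u(ν)]^β, is a reading of the relation given in the claims, in which β acts as an exponent; the clamping of negative values to zero before raising to the power β, the handling of empty channels and the function name `wiener_gains` are assumptions made for this illustration.

```python
def wiener_gains(gamma_x, gamma_u, e_x, e_u, maxi, alpha=10.0, beta=0.5):
    """Gain W(nu) per frequency channel for the current frame."""
    # Overestimation factor: static coefficient alpha, modified by maxi and
    # weighted by the energy ratio E_x / E_u of the current frame.
    weight = alpha * maxi * (e_x / e_u)
    gains = []
    for g_x, g_u in zip(gamma_x, gamma_u):
        if g_u <= 0:
            gains.append(0.0)  # assumption: an empty channel is fully attenuated
            continue
        base = 1.0 - weight * g_x / g_u
        gains.append(max(base, 0.0) ** beta)  # assumption: clamp at zero
    return gains


# Channel 0 is dominated by the useful signal, channel 1 contains noise only.
gains = wiener_gains([1.0, 1.0], [1000.0, 1.0], e_x=1.0, e_u=1.0, maxi=1.0)
```

In this example the first channel is left almost untouched (gain close to 1) while the second is fully attenuated. When E_u is much larger than E_x, the weighting E_x/E_u reduces the overestimation, so that a frame carrying strong useful signal is attenuated less.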

On reading the foregoing, it is readily apparent that the invention achieves the aims it set itself.

It should be clear, however, that the invention is not limited to the embodiments explicitly described, in particular with reference to figures 1 to 8.

In particular, the numerical examples were given only to describe the invention more precisely; they are essentially tied to the specific application envisaged. As such, they are a simple technological choice within the reach of a person skilled in the art.

Furthermore, as has been noted, the invention is not confined to the filtering of signals containing noisy speech, even though that field is one of its preferred applications.

Claims

Frequency filtering method for denoising noisy sound signals (u(t)) consisting of so-called useful sound signals mixed with noise signals, the method comprising at least one step (0) of cutting said sound signals into a series of identical frames of a determined length and a frequency filtering step (4) using a Wiener filter, characterized in that it further comprises the following steps:

elaboration, from said noisy signals (u(t)), of a noise model (1) over a determined number N of said frames, N lying between predetermined minimum and maximum limits;

applying a Fourier transform to said N frames;

estimation (2), for each frame of said model, of the spectral density of this frame;

estimation (2) of the average spectral density of said noise model;

calculation (2), from these two estimates, of a statistical overestimation coefficient, said statistical coefficient being equal to the maximum ratio, over said N frames of the noise model, between the maximum of the spectral density of a given frame of said noise model and the maximum of the estimated spectral density of the noise model;

estimation (3), for each frame of said signals to be denoised (u(t)), of its spectral density; and

modification (4), for each frame of said signals to be denoised (u(t)), of the coefficients of said Wiener filter so that the following relation is satisfied: $W(\nu) = \left[ 1 - \alpha \cdot \mathrm{maxi} \cdot \frac{\gamma_x(\nu)}{\gamma_u(\nu)} \right]^{\beta}$, a relation in which α and β are predetermined fixed coefficients, called the static energy compensation coefficient and the exponential attenuation coefficient, respectively, ν ranges over the set of frequency channels of said Fourier transform, γ_u(ν) is the estimate of the spectral density of the frame to be denoised, γ_x(ν) is said spectral density of the noise model, and maxi is said statistical overestimation coefficient, which modifies the static energy compensation coefficient α.

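Method according to claim 1, characterized in that said statistical coefficient maxi satisfies the following relationship: $\mathrm{maxi} = \max_{i = 1 .. N} \left[ \frac{\max_{\nu} \gamma_i(\nu)}{\max_{\nu} \gamma_x(\nu)} \right]$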

Method according to one of claims 1 or 2, characterized in that it comprises the following additional steps:

calculation of the average energy E_x of said noise model;

calculation, for each frame of said signals to be denoised (u(t)), of the energy E_u of the current frame; and

multiplication of said static energy compensation coefficient α by an energy weighting coefficient equal to the ratio E_x/E_u, so as to selectively modify the coefficients of the Wiener filter for each frame of said signals to be denoised (u(t)) by applying a coefficient that varies continuously between a maximum and a minimum, the maximum being substantially equal to unity when said useful signals are absent from said signals to be denoised (u(t)) and the minimum being substantially equal to zero when the energy of said useful signals is much greater than the energy of said noise signals, and in that said coefficients of the Wiener filter satisfy the following relation: $W(\nu) = \left[ 1 - \alpha \cdot \mathrm{maxi} \cdot \frac{E_x}{E_u} \cdot \frac{\gamma_x(\nu)}{\gamma_u(\nu)} \right]^{\beta}$

Method according to any one of the preceding claims, characterized in that said static energy compensation coefficient α is equal to 10.

Method according to any one of the preceding claims, characterized in that said exponential attenuation coefficient β is equal to 0.5.

Method according to any one of the preceding claims, characterized in that it comprises an initial step (0) consisting in digitizing said signals to be denoised (u(t)) by sampling, each frame comprising p samples.

Method according to claim 6, characterized in that said noise model (1) is obtained by a repetitive search carried out continuously in said signals to be denoised (u(t)), by searching for N successive frames, of p samples each, having the expected characteristics of noise, by storing the corresponding N×p samples to constitute said noise model, and by repeating the search to find a new noise model and either storing the new model in place of the previous one or keeping the previous model, according to the respective characteristics of the two models.

Application of the method according to any one of the preceding claims to the denoising of noisy speech signals (u(t)).

Application of the method according to claim 8, characterized in that the duration of said frames is in the range 10 to 20 ms.