asrman: Speech Enhancement Using Kalman Filter

http://dea.brunel.ac.uk/cmsp/Home_Esfandiar/KalmanTime.htm

Speech Enhancement Using Kalman Filter (Time Domain Approach)

The use of the Kalman filter for speech enhancement in the form presented here was first introduced by Paliwal (1987) [3]. The method is best suited to the reduction of white noise, in keeping with the Kalman assumptions. In deriving the Kalman equations it is normally assumed that the observation noise (the additive noise present in the observation vector) is uncorrelated and normally distributed, which makes this noise white. Different methods have, however, been developed to adapt the Kalman approach to colored noise [4].
It is assumed that the speech signal is stationary during each frame, that is, the AR model of the speech remains the same across the segment. To fit the one-dimensional speech signal to the state-space model of the Kalman filter, we introduce the state vector:

x(k)=[x(k-p+1) x(k-p+2) x(k-p+3) ... x(k)]^T (1)

where x(k) is the speech sample at time k. The speech signal is contaminated by additive white noise n(k):

y(k)=x(k)+n(k) (2)
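
Equation 2 can be reproduced for experiments by adding white Gaussian noise to a clean signal at a chosen input SNR. The following is a minimal sketch rather than part of the original implementation: the function names `add_white_noise` and `snr_db`, the fixed seed, and the sine-wave stand-in for speech are all assumptions made for illustration.

```python
import math
import random


def add_white_noise(x, snr_db_target, seed=0):
    """Return (y, R): y(k) = x(k) + n(k) as in equation 2, and the noise variance R."""
    rng = random.Random(seed)
    signal_power = sum(v * v for v in x) / len(x)
    # Noise variance that yields the requested SNR in dB.
    R = signal_power / (10.0 ** (snr_db_target / 10.0))
    sigma = math.sqrt(R)
    y = [v + rng.gauss(0.0, sigma) for v in x]
    return y, R


def snr_db(clean, noisy):
    """Measured SNR in dB of a noisy signal relative to its clean version."""
    signal_power = sum(a * a for a in clean) / len(clean)
    noise_power = sum((b - a) ** 2 for a, b in zip(clean, noisy)) / len(clean)
    return 10.0 * math.log10(signal_power / noise_power)


# Toy stand-in for speech (an assumption): a 200 Hz tone sampled at 8 kHz.
x = [math.sin(2.0 * math.pi * 200.0 * k / 8000.0) for k in range(8000)]
y, R = add_white_noise(x, snr_db_target=0.0)
print("noise variance R:", R)
print("measured input SNR (dB):", snr_db(x, y))
```

The returned variance R is the same quantity that appears as the observation noise variance in the filter.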

The speech signal can be modelled as an AR process of order p:

x(k) = Σ_{i=1..p} a_i x(k-i) + u(k) (3)

where the a_i are the AR (LP) coefficients and u(k) is the prediction error, which is assumed to be normally distributed, ~N(0,Q). Substituting equation 1 into equation 3 we get:

x(k)=Ax(k-1)+Gu(k) (4)

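where A is the p×p companion matrix of the AR model: its first p-1 rows shift the state vector up by one sample and its last row holds the AR coefficients in reverse order,

A=[0 1 0 ... 0; 0 0 1 ... 0; ... ; 0 0 0 ... 1; a_p a_(p-1) a_(p-2) ... a_1]

and

G=[0 0 ... 0 1]^T

G has length p (the LP order). The observation equation is: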

y(k)=Hx(k)+n(k) (5)

H=G^T

As stated earlier, n(k) has a Gaussian distribution ~N(0,R). The rest of the formulation of this filter is the same as in the general case.
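
As a quick check of the state-space form, the sketch below builds A, G and H from a set of AR coefficients and confirms that one step of equation 4 reproduces the AR recursion of equation 3. The helper names `state_space` and `step`, and the AR(2) coefficients used, are illustrative assumptions and are not taken from the original Matlab code.

```python
def state_space(a):
    """Build A, G, H of equations 4 and 5 from a = [a_1, ..., a_p]."""
    p = len(a)
    A = [[0.0] * p for _ in range(p)]
    for i in range(p - 1):
        A[i][i + 1] = 1.0                       # shift rows: keep past samples
    A[p - 1] = [float(c) for c in reversed(a)]  # last row: [a_p ... a_1]
    G = [0.0] * (p - 1) + [1.0]
    H = list(G)                                 # H = G^T
    return A, G, H


def step(A, G, state, u):
    """One application of equation 4: x(k) = A x(k-1) + G u(k)."""
    p = len(state)
    return [sum(A[i][j] * state[j] for j in range(p)) + G[i] * u
            for i in range(p)]


a = [1.5, -0.7]                  # hypothetical stable AR(2) model
A, G, H = state_space(a)
prev = [0.2, 1.0]                # state at k-1: [x(k-2), x(k-1)]
nxt = step(A, G, prev, u=0.1)    # state at k:   [x(k-1), x(k)]

# Equation 3 evaluated directly for comparison.
direct = a[0] * prev[1] + a[1] * prev[0] + 0.1
print(nxt, direct)
```

The first element of the new state is simply the previous sample, and the last element is the newly generated sample, which is also what H picks out in equation 5.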

There are several methods for extracting the LP model parameters from noisy data [5]. In this demonstration, however, these parameters are assumed to be given, so that we can assess the potential of the Kalman filter for speech enhancement without worrying about their extraction or the effect of estimation error on the system. Other methods either calculate the LP model parameters first and then use them to de-noise the speech signal, or iteratively estimate and correct these values while enhancing the speech (EM algorithm). A pre-cleaning block (such as simple spectral subtraction) may also be used to obtain an estimate of these values.

The initial value of x is the noisy data, and the a posteriori estimate error covariance matrix is initialised as a diagonal matrix with value R. The LP coefficients are calculated for segments that may or may not overlap. When segments overlap, special care should be taken to guarantee the continuity of the filter parameters (e.g. store the filter parameters at the point midway through the segment where the next segment will start, so that these values can be used when moving on to the next segment).

It was mentioned in [3] that using x(k-p+1) as calculated at time k gives better performance than using the value that was filtered for the first time (i.e. x(k-p+1) calculated at time k-p+1), since more information is incorporated in calculating it. This implementation results in a delayed Kalman filter.
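
The procedure described above can be sketched for a single frame as follows. This is a minimal illustration under stated assumptions, not the Matlab implementation referred to on this page: the AR coefficients, Q and R are taken as known, the function name `kalman_enhance` and its `delayed` flag are hypothetical, and the test signal is a synthetic AR(2) process rather than speech. The state is initialised with the noisy data and the covariance with R on the diagonal, as in the text; with `delayed=True` the output for sample k-p+1 is read at time k.

```python
import random


def kalman_enhance(y, a, Q, R, delayed=True):
    """Estimate x from noisy samples y, given AR coefficients a = [a_1..a_p]."""
    p, n = len(a), len(y)
    out = [float(v) for v in y]            # untouched samples stay noisy
    if n <= p:
        return out
    row = [float(c) for c in reversed(a)]  # last row of A: [a_p ... a_1]
    x = [float(v) for v in y[:p]]          # initial state: noisy data
    P = [[R if i == j else 0.0 for j in range(p)] for i in range(p)]
    for k in range(p, n):
        # Predict: x = A x,  P = A P A^T + Q G G^T (A is shift + AR row).
        x = x[1:] + [sum(row[j] * x[j] for j in range(p))]
        AP = P[1:] + [[sum(row[m] * P[m][j] for m in range(p))
                       for j in range(p)]]
        P = [r[1:] + [sum(r[m] * row[m] for m in range(p))] for r in AP]
        P[p - 1][p - 1] += Q
        # Update with H = [0 ... 0 1]: only the last state is observed.
        s = P[p - 1][p - 1] + R
        K = [P[i][p - 1] / s for i in range(p)]
        innov = y[k] - x[p - 1]
        x = [x[i] + K[i] * innov for i in range(p)]
        last = P[p - 1][:]
        P = [[P[i][j] - K[i] * last[j] for j in range(p)] for i in range(p)]
        if delayed:
            out[k - p + 1] = x[0]          # x(k-p+1) as estimated at time k
        else:
            out[k] = x[p - 1]              # x(k) as estimated at time k
    if delayed:
        for i in range(1, p):              # flush the tail of the last state
            out[n - p + i] = x[i]
    return out


def mse(u, v):
    return sum((s - t) ** 2 for s, t in zip(u, v)) / len(u)


# Synthetic test data (an assumption): AR(2) signal plus white noise.
rng = random.Random(1)
a, Q, R = [1.5, -0.7], 1.0, 9.0
clean = [0.0, 0.0]
for k in range(2, 4000):
    clean.append(a[0] * clean[-1] + a[1] * clean[-2] + rng.gauss(0.0, Q ** 0.5))
noisy = [v + rng.gauss(0.0, R ** 0.5) for v in clean]

est_delayed = kalman_enhance(noisy, a, Q, R, delayed=True)
est_plain = kalman_enhance(noisy, a, Q, R, delayed=False)
print("MSE noisy:  ", mse(noisy, clean))
print("MSE plain:  ", mse(est_plain, clean))
print("MSE delayed:", mse(est_delayed, clean))
```

Because H selects only the last state element, the gain and covariance update reduce to operations on the last row and column of P, so no matrix inversion is needed. For real speech the coefficients would change from frame to frame, and the state and covariance would have to be carried over between segments as described above.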

Figure 1. A sample output of the Kalman filter, input SNR = 0 dB: a complete sentence (left) and a vowel section shown in detail (right).

You can also listen to this sample file by clicking on the links below:
Clean Noisy Restored
An implementation of this method in Matlab may be downloaded here. You will also need to download this toolbox which contains files used in the m file. Note that in this implementation you need to provide the clean signal to the function as well as the noisy signal.

Sunday, November 13, 2011
