Multitask Noisy Speech Enhancement System
- Speech band equalizer
- Dynamics processing
- Noise gate
- Signal level limiter
- Clipping restoration
- Noise reduction
- Noise whitening
- Blind deconvolution
- Spectrum analyser
- Time stretching
- Spectral expander
- Fourier corrector
- Neural network corrector
- Joint approximation
- Homomorphic approximation
The blind deconvolution module removes linear distortions introduced into the signal by the transmission channel. These distortions modify the signal spectrum: some frequency bands are amplified and others are attenuated. The result is unnatural-sounding speech.
The blind deconvolution module uses the same algorithm as the noise whitening module. The main difference is that a speech signal (the pattern) is used in the processing instead of a part of the recording that contains only noise. Ideally, the pattern should contain only the speech of the same speaker as in the processed recording, and it should not contain any noise. The aim of the processing is to match the spectrum of the parts of the recording that contain speech to the spectrum of the pattern.
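The spectrum matching described above can be sketched as a per-frequency gain. This is only an illustration under stated assumptions, not the module's actual implementation: the gain is taken as the square root of the ratio of the pattern power spectrum to the speech power spectrum of the recording, and the names `matching_gain`, `apply_gain` and `max_gain_db` are hypothetical.

```python
import numpy as np

def matching_gain(pattern_power, speech_power, max_gain_db=20.0, eps=1e-12):
    """Per-bin amplitude gain that maps speech_power onto pattern_power.

    Assumption: both inputs are one-sided averaged power spectra of equal
    length. max_gain_db is a hypothetical safety limit on boost and cut.
    """
    pattern_power = np.asarray(pattern_power, dtype=float)
    speech_power = np.asarray(speech_power, dtype=float)
    gain = np.sqrt((pattern_power + eps) / (speech_power + eps))
    limit = 10.0 ** (max_gain_db / 20.0)
    return np.clip(gain, 1.0 / limit, limit)

def apply_gain(frame, gain):
    """Apply the gain to one real-valued frame in the frequency domain."""
    spectrum = np.fft.rfft(frame)
    return np.fft.irfft(spectrum * gain, n=len(frame))
```

A band in which the recording carries four times less power than the pattern receives an amplitude gain of 2, and a band with four times more power receives a gain of 0.5. In a complete system the frames would be windowed and overlap-added, which this sketch omits.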
The algorithm works as follows:
The averaging of the noise spectrum may be described by the equation:
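A plausible form of this estimate, written in assumed notation ($X_m(k)$ is the spectrum of segment $m$ at frequency bin $k$, and $M$ is the number of segments), is the average of the segment energies:

```latex
\hat{P}(k) = \frac{1}{M} \sum_{m=0}^{M-1} \left| X_m(k) \right|^2
```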
The equation above describes a power spectrum estimate. The power spectrum of a stationary random signal x_n is related to its correlation by the discrete Fourier transform:
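In standard notation (the symbols are assumptions, chosen here for illustration), with $r_{xx}[\ell]$ the autocorrelation of a real signal at lag $\ell$, this relation is usually written as:

```latex
P_{xx}(\omega) = \sum_{\ell=-\infty}^{\infty} r_{xx}[\ell]\, e^{-j\omega \ell},
\qquad r_{xx}[\ell] = E\{ x_n\, x_{n+\ell} \}
```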
The average signal power, band-limited to the Nyquist frequency, is given by:
and the power spectral density (PSD) is:
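One common way to write both quantities, assuming $P_{xx}(\omega)$ denotes the power spectrum obtained from the correlation $r_{xx}$, $\omega$ is the normalized angular frequency and $f_s$ is the sampling frequency (other normalizations are in use):

```latex
P = E\{ x_n^2 \} = r_{xx}[0]
  = \frac{1}{2\pi} \int_{-\pi}^{\pi} P_{xx}(\omega)\, d\omega
  = \int_{-f_s/2}^{f_s/2} S_{xx}(f)\, df,
\qquad
S_{xx}(f) = \frac{1}{f_s}\, P_{xx}\!\left( \frac{2\pi f}{f_s} \right)
```

With this convention the PSD integrated from $-f_s/2$ to $f_s/2$ gives the average power.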
In practice, the averaged spectrum is obtained by calculating the spectra of the segmented signal (segments may overlap and may be time-weighted using window functions). The averaged signal spectrum is the average spectral energy over all segments. The spectrum of each segment is calculated using the discrete cosine transform. The averaged spectrum is additionally smoothed using a moving average filter in the cepstrum domain in order to avoid extreme variations of the spectrum magnitude.
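The segment averaging and smoothing can be sketched as follows. This is a minimal illustration with two deliberate substitutions: the FFT (`np.fft.rfft`) stands in for the discrete cosine transform, and low-quefrency liftering stands in for the moving average filter in the cepstrum domain, whose exact form is not given here. The names `frame_len`, `hop` and `n_keep` are hypothetical parameters.

```python
import numpy as np

def averaged_power_spectrum(x, frame_len=512, hop=256):
    """Average the windowed power spectra of overlapping segments of x."""
    x = np.asarray(x, dtype=float)
    if len(x) < frame_len:
        raise ValueError("signal is shorter than one frame")
    window = np.hanning(frame_len)          # time weighting of each segment
    n_frames = 1 + (len(x) - frame_len) // hop
    acc = np.zeros(frame_len // 2 + 1)
    for i in range(n_frames):
        seg = x[i * hop : i * hop + frame_len] * window
        acc += np.abs(np.fft.rfft(seg)) ** 2  # spectral energy of the segment
    return acc / n_frames

def cepstral_smooth(power, n_keep=30, eps=1e-12):
    """Smooth a one-sided power spectrum by keeping low-quefrency terms.

    Assumption: this liftering replaces the moving average filter
    mentioned in the text.
    """
    power = np.asarray(power, dtype=float)
    n_fft = 2 * (len(power) - 1)
    cepstrum = np.fft.irfft(np.log(power + eps), n=n_fft)
    lifter = np.zeros(n_fft)
    lifter[:n_keep] = 1.0
    lifter[n_fft - n_keep + 1:] = 1.0       # mirrored half keeps symmetry
    return np.exp(np.fft.rfft(cepstrum * lifter).real)
```

A smaller `n_keep` gives a smoother spectral envelope; a value close to `frame_len // 2` leaves the averaged spectrum almost unchanged.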
Before processing with the blind deconvolution module starts, a speech pattern has to be selected and opened in the application from a separate sound file. The following module parameters may be set by the user during processing.
To obtain the best results with this module, it is recommended to make, if possible, a separate recording of the speaker present in the restored recording, in a low-noise environment, using a high-quality microphone and recorder.
© 2004 Multimedia Systems Department, Gdansk University of Technology and Air Force Academy in Deblin