Multitask Noisy Speech Enhancement System

www.denoise.net

Time stretching

The time stretching module lengthens a recording without changing the voice pitch (the glottal tone frequency stays the same). Playback time can be extended by up to 100%, that is, to twice the original duration. Two time stretching methods are available: interbuffer overlapping and intrabuffer overlapping. In addition, cepstral analysis is used to estimate the glottal tone frequency so that phase continuity is maintained.

Interbuffer overlapping

The signal is divided into segments of 1024 samples each. These segments are moved apart on the timeline, and the gaps between them are filled with estimated samples. Each estimated sample is a linearly weighted average of samples from the preceding and the following signal segment, which minimizes phase distortion and maintains signal continuity.
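The gap-filling step can be illustrated with a minimal sketch. It is not the module's actual code: the function name `stretch_interbuffer`, the choice to fill each gap from the tail of the preceding segment and the head of the following one, and the exact weighting ramp are all assumptions made for illustration.

```python
def stretch_interbuffer(signal, seg_len=1024, factor=0.5):
    """Lengthen `signal` by inserting an estimated gap after each segment.

    Hypothetical sketch: `factor` is the gap length as a fraction of the
    segment length (0.0 to 1.0, matching the 0% to 100% range in the text).
    """
    if not 0.0 <= factor <= 1.0:
        raise ValueError("factor must be between 0.0 and 1.0")
    gap = int(round(seg_len * factor))
    segments = [signal[i:i + seg_len] for i in range(0, len(signal), seg_len)]
    out = []
    for k, seg in enumerate(segments):
        out.extend(seg)
        if k + 1 == len(segments):
            break  # no gap after the last segment
        nxt = segments[k + 1]
        n = min(gap, len(seg), len(nxt))
        tail = seg[len(seg) - n:]   # last n samples of the preceding segment
        head = nxt[:n]              # first n samples of the following segment
        for i in range(n):
            w = (i + 1) / (n + 1)   # linear weight rising from 0 towards 1
            out.append((1.0 - w) * tail[i] + w * head[i])
    return out
```

With `factor=0.5`, four segments of 1024 samples grow from 4096 to 5632 samples (three gaps of 512 samples). In this sketch the filled gaps join seamlessly only when the gap length is a whole number of signal periods.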

If the signal is periodic or quasi-periodic (a glottal tone is present), the stretching factor depends on the fundamental frequency of the signal, so that the stretch is a multiple of the signal period. Cepstral analysis serves both as the glottal tone detector and as the estimator of its frequency.
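One way to read the period constraint is that the requested stretch length is rounded to a whole number of pitch periods whenever a glottal tone has been detected. The helper below is a hypothetical sketch under that reading; the name `snap_to_period`, the rounding rule and the sample rates used in the example are assumptions, not details taken from the module.

```python
def snap_to_period(gap, f0_hz, sample_rate):
    """Round a stretch length (in samples) to a whole number of pitch periods.

    Hypothetical helper: `f0_hz` is the estimated glottal tone frequency,
    or None when the segment is unvoiced.
    """
    if f0_hz is None or f0_hz <= 0:
        return gap                          # unvoiced: keep the requested length
    period = sample_rate / f0_hz            # pitch period in samples
    cycles = max(1, round(gap / period))    # at least one full period
    return int(round(cycles * period))
```

For example, at an assumed sample rate of 8000 Hz and a 100 Hz glottal tone (period 80 samples), a requested length of 512 samples becomes 480 samples, that is, six full periods.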

Intrabuffer overlapping

The signal is divided into segments of 1024 samples each. A copy of each segment is created and shifted in time so that the original segment and its copy partially overlap. Sample values in the overlap region are computed as a linearly weighted average of the sample values in the original segment and its copy.
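The shift-and-overlap step can be sketched as follows. This is an illustrative reading of the description, not the module's implementation: the function names, the use of `factor` as the shift relative to the segment length, and the weighting ramp are assumptions.

```python
def stretch_segment_intrabuffer(seg, factor=0.5):
    """Stretch one segment by overlapping it with a time-shifted copy.

    Hypothetical sketch: the copy is shifted by round(len(seg) * factor)
    samples, so the output is that many samples longer than the input.
    """
    if not 0.0 <= factor <= 1.0:
        raise ValueError("factor must be between 0.0 and 1.0")
    n = len(seg)
    shift = int(round(n * factor))
    overlap = n - shift
    out = list(seg[:shift])                 # original only, before the copy starts
    for i in range(overlap):
        w = (i + 1) / (overlap + 1)         # weight moves from original to copy
        out.append((1.0 - w) * seg[shift + i] + w * seg[i])
    out.extend(seg[overlap:])               # copy only, after the original ends
    return out


def stretch_intrabuffer(signal, seg_len=1024, factor=0.5):
    """Apply the per-segment stretch to consecutive segments of `signal`."""
    out = []
    for start in range(0, len(signal), seg_len):
        out.extend(stretch_segment_intrabuffer(signal[start:start + seg_len], factor))
    return out
```

With `factor=0.5` each 1024-sample segment becomes 1536 samples long. When the shift is a whole number of signal periods, the original and its copy are in phase, so the weighted average in the overlap region reproduces the waveform without discontinuities.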

If the signal is periodic or quasi-periodic (a glottal tone is present), the stretching factor depends on the fundamental frequency of the signal, so that the stretch is a multiple of the signal period. Cepstral analysis serves both as the glottal tone detector and as the estimator of its frequency.

---

The glottal tone detector uses cepstral analysis performed by means of a cosine transform. The instantaneous power cepstrum is computed from a single signal segment (1024 samples). The algorithm searches for a local maximum of the cepstrum in the frequency range of approximately 78 to 719 Hz; in most cases the glottal tone frequency lies within this range. The user may change the frequency range used in the analysis to obtain better results. The sonogram and cepstrum plot of the recording, obtained with the spectrum analyser module, help in following the changes in glottal tone frequency.

In addition, the local cepstrum maximum is tracked by time-windowing the cepstral coefficients. If a speech segment is unvoiced, the detected cepstrum maximum is ignored, according to the detection threshold set by the user. As a result, the glottal tone detector is not affected by distortions that mask the speech signal. The glottal tone frequency is estimated from the local cepstrum maximum found by the detector, using second-order interpolation.
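The detection steps (power cepstrum via a cosine transform, peak search in a frequency range, threshold test, second-order interpolation) can be sketched in plain Python. Everything specific here is an assumption made for illustration: the function names, the Hann window applied to the segment, the sample rate, and the threshold value of 0.1. Tracking of the maximum over time is not covered.

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k], out[k + n // 2] = even[k] + t, even[k] - t
    return out


def power_cepstrum(frame, q_lo, q_hi, eps=1e-12):
    """Power cepstrum at quefrencies q_lo..q_hi (in samples).

    The log power spectrum of a real signal is real and even, so its
    inverse transform reduces to a cosine transform.
    """
    n = len(frame)
    # Hann window (an assumption; the text does not name a window).
    windowed = [s * (0.5 - 0.5 * math.cos(2 * math.pi * i / n))
                for i, s in enumerate(frame)]
    spec = fft(windowed)
    log_pow = [math.log(abs(v) ** 2 + eps) for v in spec[:n // 2 + 1]]
    cos_tab = [math.cos(2 * math.pi * i / n) for i in range(n)]
    ceps = {}
    for q in range(q_lo, q_hi + 1):
        acc = log_pow[0] + log_pow[n // 2] * cos_tab[(q * (n // 2)) % n]
        acc += 2 * sum(log_pow[k] * cos_tab[(k * q) % n]
                       for k in range(1, n // 2))
        ceps[q] = (acc / n) ** 2
    return ceps


def detect_f0(frame, sample_rate, f_lo=78.0, f_hi=719.0, threshold=0.1):
    """Return the estimated glottal tone frequency in Hz, or None if unvoiced."""
    q_lo = max(2, math.ceil(sample_rate / f_hi))
    q_hi = min(len(frame) // 2 - 2, math.floor(sample_rate / f_lo))
    ceps = power_cepstrum(frame, q_lo - 1, q_hi + 1)
    q = max(range(q_lo, q_hi + 1), key=lambda i: ceps[i])
    if ceps[q] < threshold:
        return None                      # peak too weak: treat as unvoiced
    # Second-order (parabolic) interpolation around the peak.
    a, b, c = ceps[q - 1], ceps[q], ceps[q + 1]
    denom = a - 2 * b + c
    offset = 0.5 * (a - c) / denom if denom != 0 else 0.0
    offset = max(-0.5, min(0.5, offset))
    return sample_rate / (q + offset)
```

For a 1024-sample pulse train with a period of 80 samples at an assumed sample rate of 16000 Hz, `detect_f0` should return a value close to 200 Hz, while a single click, which has no periodic structure, should return `None`. Because the cepstrum away from quefrency zero does not depend on signal gain, a fixed threshold on the peak value is a workable choice in this sketch.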

The time stretching module window lets the user set the stretching factor (0% - 100%) and the expected glottal tone frequency (low voice - high voice). Additional parameters can be set in the advanced mode: overlapping method, phase correction, weighting function (rectangular or triangular), time window, the frequency range used by the glottal tone detector, and the detection threshold.
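For readers who script similar processing, the parameters listed above can be grouped in a small settings object. The class below is purely hypothetical: the field names and default values are placeholders and do not correspond to the module's real interface, and the time window setting is omitted because the text does not specify its form.

```python
from dataclasses import dataclass


@dataclass
class StretchSettings:
    """Hypothetical container for the parameters named in the text."""
    stretch_percent: float = 50.0        # 0 to 100, as stated in the text
    method: str = "interbuffer"          # or "intrabuffer"
    weighting: str = "triangular"        # or "rectangular"
    phase_correction: bool = True        # placeholder default
    f_lo_hz: float = 78.0                # detector range from the text
    f_hi_hz: float = 719.0
    detection_threshold: float = 0.1     # placeholder value

    def __post_init__(self):
        if not 0.0 <= self.stretch_percent <= 100.0:
            raise ValueError("stretch_percent must be between 0 and 100")
        if self.method not in ("interbuffer", "intrabuffer"):
            raise ValueError("unknown overlapping method")
        if self.weighting not in ("rectangular", "triangular"):
            raise ValueError("unknown weighting function")
        if not 0.0 < self.f_lo_hz < self.f_hi_hz:
            raise ValueError("frequency range must satisfy 0 < f_lo_hz < f_hi_hz")

    @property
    def length_ratio(self):
        """Output duration divided by input duration (1.0 to 2.0)."""
        return 1.0 + self.stretch_percent / 100.0
```

A stretching factor of 100% corresponds to `length_ratio == 2.0`, that is, twice the original playback time.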