For speech processing we want to assume the signal is short-time stationary and perform a Fourier transform on small blocks of it. The solution is to multiply the signal by a window function that is zero outside some defined range.

The rectangular window of length N samples is defined as:

$$ w(n) = \begin{cases} 1 & 0 \le n < N \\ 0 & \text{otherwise} \end{cases} $$

However, truncating the signal abruptly can create discontinuities at the block edges, as illustrated in figure 14.

Figure 14: A waveform truncated with a rectangular window
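
A minimal MATLAB/Octave sketch of what the rectangular window does at the block edge. The frequency f (50 samples per period) and the lengths n and m are illustrative assumptions, not values taken from the figure. It compares the jump created by truncation with the ordinary sample-to-sample change of the underlying sine.

```matlab
n = 512;                              % length of the underlying signal (assumed)
m = 128;                              % window length in samples (assumed)
f = 2 * pi / 50;                      % radians per sample, 50 samples per period (assumed)

x = sin(f * (1:n));                   % underlying sine wave
w = [ones(1, m), zeros(1, n - m)];    % rectangular window: 1 inside, 0 outside
y = w .* x;                           % truncated block

natural_step = abs(x(m + 1) - x(m));  % change the sine makes on its own
trunc_step   = abs(y(m + 1) - y(m));  % jump introduced by the window edge

fprintf('natural step: %.3f, step after truncation: %.3f\n', natural_step, trunc_step);
```

Because m is not a whole number of periods here, the block ends away from a zero crossing and the window edge forces an abrupt drop to zero.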

One way to avoid discontinuities at the ends is to taper the signal to zero or near zero there, which reduces the mismatch.

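The most common window in speech analysis is the Hamming window:

$$ w(n) = 0.54 - 0.46 \cos\left(\frac{2 \pi n}{N - 1}\right), \quad 0 \le n < N $$

with w(n) = 0 elsewhere, where N is the window length in samples. This is simply a raised cosine and is plotted in figure 15.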

Figure 15: The Hamming window
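
The raised-cosine shape can be checked with a short MATLAB/Octave sketch. It assumes the common 0.54/0.46 coefficients with an N - 1 denominator (the symmetric form that hamming(m) returns by default) and builds the window directly, so no toolbox is needed. The length m = 128 is an illustrative choice.

```matlab
m = 128;                                   % window length in samples (assumed)
k = (0:m-1)';                              % sample index, column vector like hamming(m)
w = 0.54 - 0.46 * cos(2 * pi * k / (m - 1));

% The taper ends near zero (at 0.08), not exactly at zero.
fprintf('first: %.2f, last: %.2f, peak: %.4f\n', w(1), w(end), max(w));
```

The end values of 0.08 are why the signal is tapered to "near zero" rather than exactly zero.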

As stated in section 4.1.4, multiplying the signal by a window function in the time domain is equivalent to convolving the signal's spectrum with the window's spectrum in the frequency domain.
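
For the discrete Fourier transform, this equivalence takes the form of a circular convolution scaled by 1/N, and it can be verified numerically. The sketch below is a check under assumed values: the test signal, its phase offset, and the length N = 16 are arbitrary choices for illustration.

```matlab
N = 16;                                       % block length (assumed)
n = 0:N-1;
x = sin(2 * pi * 3 * n / N + 0.4);            % arbitrary test signal (assumed)
w = 0.54 - 0.46 * cos(2 * pi * n / (N - 1));  % Hamming taper

X = fft(x);
W = fft(w);
Y_direct = fft(x .* w);                       % transform of the windowed signal

% Circular convolution of the two spectra, scaled by 1/N
Y_conv = zeros(1, N);
for k = 0:N-1
  for l = 0:N-1
    Y_conv(k + 1) = Y_conv(k + 1) + X(l + 1) * W(mod(k - l, N) + 1);
  end
end
Y_conv = Y_conv / N;

err = max(abs(Y_direct - Y_conv));            % should be at rounding-error level
fprintf('largest difference: %g\n', err);
```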

The rectangular window gives the sharpest main lobe but large side-lobes (ripples); the Hamming window blurs the spectrum in frequency but produces much less leakage. For example:

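n = 512;                 % FFT length (the signal is zero-padded to n)
m = 128;                 % window length in samples
h = hamming(m);          % Hamming window of length m
sr = zeros(1, n);        % rectangular-windowed sine
sh = zeros(1, n);        % Hamming-windowed sine
f = pi / 3;              % frequency in radians per sample

for i = 1:m ; sr(i) =        sin(f * i) ; end
for i = 1:m ; sh(i) = h(i) * sin(f * i) ; end

ar = abs(fft(sr));       % magnitude spectra
ah = abs(fft(sh));

plot(20 * log10(ar(1:n/2)))
hold on                  % keep the first curve when drawing the second
plot(20 * log10(ah(1:n/2)))
hold off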

Figure 16 shows the rectangular and Hamming windowed sine wave. Figure 17 shows the magnitude of the FFT and Figure 18 the magnitude on a dB scale.

Figure 16: rectangular and Hamming windowed sine wave

Figure 17: FFT of rectangular and Hamming windowed sine wave

Figure 18: FFT of rectangular and Hamming windowed sine wave in dB
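
The trade-off between main-lobe width and side-lobe level can also be put into numbers. The following MATLAB/Octave sketch is one way to do it, under stated assumptions: it transforms the two windows themselves (zero-padded to an assumed n = 4096 points), walks down the main lobe to its first minimum, and reports the highest side-lobe beyond that point. The helper names psl and lobe are hypothetical. The rectangular window should come out with its highest side-lobe roughly 13 dB below the main peak, and the Hamming window with side-lobes more than 40 dB down but a main lobe about twice as wide.

```matlab
m = 128;                                   % window length in samples
n = 4096;                                  % zero-padded FFT length (assumed)
k = 0:m-1;
windows = {ones(1, m), 0.54 - 0.46 * cos(2 * pi * k / (m - 1))};
names   = {'rectangular', 'Hamming'};

psl  = zeros(1, 2);                        % peak side-lobe level in dB
lobe = zeros(1, 2);                        % bins from peak to first minimum

for j = 1:2
  A = abs(fft(windows{j}, n));             % fft(x, n) zero-pads x to n points
  A = A(1:n/2) / A(1);                     % normalise the main-lobe peak to 1
  e = 1;
  while A(e + 1) < A(e)                    % descend the main lobe
    e = e + 1;
  end
  lobe(j) = e - 1;
  psl(j)  = 20 * log10(max(A(e:end)));     % highest remaining side-lobe
  fprintf('%-12s main lobe ends at bin %d, peak side-lobe %.1f dB\n', ...
          names{j}, lobe(j), psl(j));
end
```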

Speech Vision Robotics group/Tony Robinson