
Windowing

For speech processing we assume the signal is short-time stationary, i.e. that its properties are approximately constant over a short block, and perform a Fourier transform on each such block. To extract a block, multiply the signal by a window function that is zero outside some defined range.

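The rectangular window is defined as:

\[
w(n) = \begin{cases} 1, & 0 \le n \le N-1 \\ 0, & \text{otherwise} \end{cases}
\]

where N is the window length in samples.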
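As a minimal sketch of what multiplying by a rectangular window does, the Python fragment below cuts one block out of a longer signal. The function names `rectangular_window` and `extract_frame` and the toy signal are illustrative assumptions, not part of the original notes.

```python
def rectangular_window(length):
    """Return the non-zero part of the window: w(n) = 1 for 0 <= n < length."""
    return [1.0] * length


def extract_frame(signal, start, window):
    """Multiply the signal by a window positioned at sample `start`.

    Samples outside the window are implicitly multiplied by zero, so only
    the len(window) samples under the window are returned.
    """
    segment = signal[start:start + len(window)]
    if len(segment) < len(window):
        raise ValueError("window extends past the end of the signal")
    return [w * x for w, x in zip(window, segment)]


signal = [float(k) for k in range(10)]            # toy signal 0, 1, ..., 9
frame = extract_frame(signal, 3, rectangular_window(4))
print(frame)                                      # [3.0, 4.0, 5.0, 6.0]
```

The block is copied unchanged, so whatever values the signal has at the two ends of the block become abrupt edges.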

But consider the discontinuities this can generate, as illustrated in figure 14.

Figure 14: A waveform truncated with a rectangular window

One way to avoid discontinuities at the ends is to taper the signal to zero or near zero, which reduces the mismatch.

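The most common window in speech analysis is the Hamming window:

\[
w(n) = \begin{cases} 0.54 - 0.46 \cos\left( \frac{2 \pi n}{N-1} \right), & 0 \le n \le N-1 \\ 0, & \text{otherwise} \end{cases}
\]

This is simply a raised cosine and is plotted in figure 15.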

Figure 15: The Hamming window
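The taper can be checked numerically. The sketch below assumes the symmetric form of the Hamming window with N-1 in the denominator, which is what MATLAB's `hamming(m)` computes by default; the function name `hamming_window` is an illustrative choice.

```python
import math


def hamming_window(length):
    """Raised-cosine Hamming window: 0.54 - 0.46*cos(2*pi*n/(length-1))."""
    if length < 2:
        return [1.0] * length
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * n / (length - 1))
            for n in range(length)]


w = hamming_window(128)
print(round(w[0], 2), round(w[-1], 2))    # end values: 0.08 0.08
```

The window falls to 0.08 rather than exactly zero at its ends, so a small mismatch remains, but it is far smaller than the unit step left by a rectangular window.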

As stated in section 4.1.4, multiplying the signal by a window function in the time domain is equivalent to convolving the spectrum of the signal with the spectrum of the window in the frequency domain.
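Written out for discrete-time signals, the relation takes the following form. This is a sketch only: the symbols y, X, W, Y and the frequency ω0 are chosen here for illustration and may differ from the notation of section 4.1.4.

```latex
y(n) = x(n)\,w(n)
\quad\Longleftrightarrow\quad
Y(e^{j\omega}) = \frac{1}{2\pi}\int_{-\pi}^{\pi}
    X(e^{j\theta})\,W(e^{j(\omega-\theta)})\,d\theta

% Special case of a sinusoid: the spectrum of x is a pair of impulses,
% so the convolution just places a copy of W at each of them.
x(n) = \sin(\omega_0 n)
\quad\Longrightarrow\quad
Y(e^{j\omega}) = \frac{1}{2j}\left[ W(e^{j(\omega-\omega_0)}) - W(e^{j(\omega+\omega_0)}) \right]
```

For a windowed sine wave the measured spectrum is therefore the spectrum of the window itself, shifted to the frequency of the sine, which is why the shape of the window's main lobe and side-lobes matters.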

The rectangular window gives the sharpest main lobe but large side-lobes (ripples); the Hamming window blurs the spectrum in frequency but produces much less leakage. For example:

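n = 512;                 % FFT length (the signal is zero-padded to n samples)
m = 128;                 % window length in samples
h = hamming(m)';         % Hamming window as a row vector
                         % (Signal Processing Toolbox, or the Octave signal package)
f = pi / 3;              % sine frequency in radians per sample
k = 1:m;                 % sample indices covered by the window

sr = zeros(1, n);        % rectangular windowed sine wave
sh = zeros(1, n);        % Hamming windowed sine wave
sr(k) = sin(f * k);
sh(k) = h .* sin(f * k);

ar = abs(fft(sr));       % magnitude spectra
ah = abs(fft(sh));

% one figure per plot, so that later plots do not overwrite earlier ones
figure; plot(1:n, sr, 1:n, sh)
figure; plot(1:n/2, ar(1:n/2), 1:n/2, ah(1:n/2))
figure; plot(1:n/2, 20 * log10(ar(1:n/2)), 1:n/2, 20 * log10(ah(1:n/2)))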

Figure 16 shows the rectangular and Hamming windowed sine waves. Figure 17 shows the magnitude of the FFT and figure 18 the magnitude on a dB scale.

Figure 16: Rectangular and Hamming windowed sine wave

Figure 17: FFT of rectangular and Hamming windowed sine wave

Figure 18: FFT of rectangular and Hamming windowed sine wave in dB
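The difference in leakage can also be put into numbers rather than read off the plots. The following is a rough Python sketch of the same experiment using only the standard library. It uses a direct DFT sum instead of an FFT, and the helper names (`hamming_window`, `dft_magnitude`, `to_db`) and the choice of "at least 40 bins above the peak" as the far-off region are assumptions made here for illustration.

```python
import cmath
import math


def hamming_window(length):
    """Symmetric Hamming window, 0.54 - 0.46*cos(2*pi*k/(length-1))."""
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * k / (length - 1))
            for k in range(length)]


def dft_magnitude(x, bins):
    """Magnitude of the first `bins` DFT coefficients of x (direct O(N^2) sum)."""
    n = len(x)
    return [abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n)))
            for k in range(bins)]


def to_db(mag, floor=1e-12):
    """Convert magnitudes to dB relative to the largest value."""
    peak = max(mag)
    return [20.0 * math.log10(max(v, floor) / peak) for v in mag]


n, m = 512, 128                       # transform length and window length
f = math.pi / 3                       # sine frequency in radians per sample
window = hamming_window(m)

# Zero-padded signals; the sine is evaluated at 1..m as in the listing above.
sr = [math.sin(f * (t + 1)) for t in range(m)] + [0.0] * (n - m)
sh = [window[t] * math.sin(f * (t + 1)) for t in range(m)] + [0.0] * (n - m)

ar = dft_magnitude(sr, n // 2)
ah = dft_magnitude(sh, n // 2)
ar_db, ah_db = to_db(ar), to_db(ah)

# Largest spectral value well away from the sine frequency.
peak_bin = max(range(n // 2), key=lambda k: ar[k])
far = range(peak_bin + 40, n // 2)
rect_leak = max(ar_db[k] for k in far)
ham_leak = max(ah_db[k] for k in far)

print("peak bin:", peak_bin)
print("far-off leakage, rectangular: %.1f dB" % rect_leak)
print("far-off leakage, Hamming:     %.1f dB" % ham_leak)
```

Both spectra are normalised to their own peak, so the two printed figures are directly comparable: the lower the value, the less energy has leaked into bins far from the true frequency.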



Speech Vision Robotics group/Tony Robinson