Maxkit: DSP 卷積 Convolution

Convolution 最基本的應用是訊號的 filtering，以下會討論兩種最常見的 filtering：average filters 及 Gaussian filters

卷積運算

Convolution 可解釋為：針對兩個時間函數進行數學運算，產生另一個時間函數的過程。

給定兩個函數 \(x(t), h(t)\)，連續時間的卷積定義為：

\(y(t) = \int_{-∞}^{∞}x(τ)h(t-τ)dτ = \int_{-∞}^{∞}h(τ)x(t-τ)dτ\) 或

\(y(t) = x(t) * h(t)\)

卷積運算符合交換率，線性與時間不變性原則，是 LTI 系統。

數位訊號的卷積定義為：

\( y[n] = \sum_{k=-∞}^{∞}x[k] \cdot h[n-k] = \sum_{k=-∞}^{∞}h[k] \cdot x[n-k] \) 或

\( y[n]=x[n]*h[n] \)

離散的卷積運算，也符合交換率

因為任意數位訊號都可以用單位脈衝 Unit Impulse表示

\(x[n]=\sum_{n=-∞}^{∞}x[k]δ[n-k]\)

代入 DSP 系統，可得到

\(y[n] = T\{\sum_{n=-∞}^{∞}x[k]δ[n-k]\} = \sum_{n=-∞}^{∞}x[k] \cdot T\{δ[n-k]\}\)

跟卷積的定義比較

\(h[n-k] = T\{δ[n-k]\}\) ，假設 DSP 系統具有時間不變性，則

\(h[n] = T\{δ[n]\}\)

DSP 的輸入訊號為單位脈衝 Unit Impulse δ[n]，輸出訊號為 h[n]，因此 h[n] 經常稱為脈衝響應 Impulse Response。如果輸入 \( \{x(n)\}, n=1,2,...N\) 長度為 N，脈衝響應 \( \{h(n)\}, n=1,2,...M\) 長度為 M，則卷積運算結果，長度為 \(M+N-1\)

ex: 數位訊號 \(x=\{1,2,4,3,2,1,1\}\) ，脈衝響應為 \(h=\{1,2,3,1,1\}\)，卷積運算

\( y[n] = \sum_{k=-∞}^{∞}x[k] \cdot h[n-k]\)

則 \(y[0] = \sum_{k=-∞}^{∞}x[k] \cdot h[-k] = x[0] \cdot h[0] = 1\)

\(y[1] = \sum_{k=-∞}^{∞}x[k] \cdot h[1-k] = x[0] \cdot h[1] + x[1] \cdot h[0] = 1 \cdot 2 + 2 \cdot 1 = 4\)

\(y[2] = \sum_{k=-∞}^{∞}x[k] \cdot h[2-k] = x[0] \cdot h[2] + x[1] \cdot h[1] + x[2] \cdot h[0] = 1 \cdot 3 + 2 \cdot 2 + 4 \cdot 1 = 11\)

另一種計算方法：

因輸入訊號的數量 N=7，脈衝響應數量為 M = 5，卷積結果長度為 \(M+N-1 = 11\)

計算前先將 \(x[n]\) 兩邊補上 \(M-1=4\) 個 0，稱為 Zero-Padding，再將 h(n) 旋轉 180 度

x	h
0 0 0 0 1 2 4 3 2 1 1 0 0 0 0	1 2 3 1 1
1 1 3 2 1
\(y[0] = 0 \cdot 1 + 0 \cdot 1 + 0 \cdot 3 + 0 \cdot 2 + 1 \cdot 1 = 1 \)

x	h
0 0 0 0 1 2 4 3 2 1 1 0 0 0 0	1 2 3 1 1
1 1 3 2 1
\(y[1] = 0 \cdot 1 + 0 \cdot 1 + 0 \cdot 3 + 1 \cdot 2 + 2 \cdot 1 = 4 \)

最後卷積運算結果為

n	0	1	2	3	4	5	6	7	8	9	10
y[n]	1	4	11	18	23	20	16	10	6	2	1
			✓	✓	✓	✓	✓	✓	✓

實際運算時，會擷取卷積運算的部分結果

n	0	1	2	3	4	5	6
x[n]	1	2	4	3	2	1	1
y[n]	11	18	23	20	16	10	6

numpy 有提供 full, same 兩種卷積運算的結果

import numpy as np

x = np.array( [ 1, 2, 4, 3, 2, 1, 1 ] )
h = np.array( [ 1, 2, 3, 1, 1 ] )
y = np.convolve( x, h, 'full' )
y1 = np.convolve( x, h, 'same' )

print( "x =", x )
print( "h =", h )
print( "Full Convolution y =", y )
print( "Convolution y =", y1 )

執行結果

$ python convolution.py
x = [1 2 4 3 2 1 1]
h = [1 2 3 1 1]
Full Convolution y = [ 1  4 11 18 23 20 16 10  6  2  1]
Convolution y = [11 18 23 20 16 10  6]

交換率、結合率

import numpy as np

x = np.array( [ 1, 2, 4, 3, 2, 1, 1 ] )
h = np.array( [ 1, 1, 1 ] ) / 3

y = np.convolve( x, h, 'full' )
y1 = np.convolve( h, x, 'full' )
z = np.convolve( x, h, 'same' )
z1 = np.convolve( h, x, 'same' )

print( "x =", x )
print( "h =", h )

# 卷積運算滿足 交換率
print( "交換率")
print( "Full Convolution y =", y )
print( "Full Convolution y1 =", y1 )

print( "Convolution z =", z )
print( "Convolution z1 =", z1 )

print( "結合率")
h2 = np.array( [ 1, 2, 1 ] ) / 4
k1 = np.convolve( np.convolve( x, h, 'full' ), h2, 'full' )
k2 = np.convolve( x, np.convolve( h, h2, 'full' ), 'full' )
print( "Full Convolution k1 =", k1 )
print( "Full Convolution k2 =", k2 )

執行結果

$ python convolution.py
x = [1 2 4 3 2 1 1]
h = [0.33333333 0.33333333 0.33333333]
交換率
Full Convolution y = [0.33333333 1.         2.33333333 3.         3.         2.
 1.33333333 0.66666667 0.33333333]
Full Convolution y1 = [0.33333333 1.         2.33333333 3.         3.         2.
 1.33333333 0.66666667 0.33333333]
Convolution z = [1.         2.33333333 3.         3.         2.         1.33333333
 0.66666667]
Convolution z1 = [1.         2.33333333 3.         3.         2.         1.33333333
 0.66666667]
結合率
Full Convolution k1 = [0.08333333 0.41666667 1.16666667 2.16666667 2.83333333 2.75
 2.08333333 1.33333333 0.75       0.33333333 0.08333333]
Full Convolution k2 = [0.08333333 0.41666667 1.16666667 2.16666667 2.83333333 2.75
 2.08333333 1.33333333 0.75       0.33333333 0.08333333]

串聯、並聯

卷積運算的 DSP 系統符合線性時間不變性，是 LTI (Linear Time-Invariant) 系統。LTI 系統可以透過串聯或並聯，組合成另一個 LTI 系統。

串聯可表示為

\( x[n] → h_1[n] → h_2[n] → y[n] = x[n] → h_1[n] * h_2[n] → y[n]\)

\(y[n]=h_2[n]*(h_1[n]*x[n])=(h_2[n]*h_1[n])*x[n]=(h_1[n]*h_2[n])*x[n]\) (卷積運算符合結合率與交換率)

兩個 LTI 系統串聯後，修改先後順序，不影響結果。

並聯可表示為

\(x[n] → h_1[n]+h_2[n] → y[n] \)

\(y[n]=h_1[n]*x[n]+h_2[n]*x[n]=(h_1[n]+h_2[n])*x[n]\) (滿足分配率)

濾波器 filter

卷積最基本的應用是訊號的濾波 filtering，因此脈衝響應 Impulse Response 也經常稱為濾波器 filter。

平均濾波器 Average Filter

\(h[n]=\frac{1}{M} \{1,1,....1 \}, n=0,1,....,M-1\)

ex: \(x={1,2,4,3,2,1,1}, n=0,1,...6\) ，套用大小為 3 的平均濾波器，\(h[n]=\frac{1}{3} \{1,1,1\}, n=0,1,2\)

輸出訊號為 \(y= \{1,\frac{7}{3}, 3, 3, 2, \frac{4}{3}, \frac{2}{3} \}, n=0,1,...6\)

平均濾波器典型用來去除雜訊 (Noise Removal)，也稱為 Smoothing。通常設計 Smoothing 的濾波器，其脈衝響應的總和為 1 ( \(\sum_{n}h[n]=1\))

import numpy as np
import numpy.random as random
import matplotlib.pyplot as plt

# 產生 200 個點
t = np.linspace( 0, 1, 200, endpoint = False )
# 作 sinusoid() 弦波，並加上 均勻雜訊 Uniform Noise
x = 10 * np.cos( 2 * np.pi * 5 * t ) + random.uniform ( -5, 5, 200 )
#  average filter
h = np.ones( 7 ) / 7
# convolution
y = np.convolve( x, h, 'same' )

plt.figure( 1 )
plt.plot( t, x )
plt.xlabel( 't (second)' )
plt.ylabel( 'Amplitude' )

plt.figure( 2 )
plt.plot( t, y )
plt.xlabel( 't (second)' )
plt.ylabel( 'Amplitude' )

plt.show( )

左圖是加上雜訊的原始訊號，右圖是套用 average filter 的結果

高斯濾波器 Gaussian Filter

\(g[n]=e^{-n^2/2𝜎^2} \) 高斯函數的平均值 \(𝜇=0\)，通常介於 -3𝜎 ~ 3𝜎 之間，因此設定高斯濾波器的大小為 6𝜎+1

如果高斯濾波器的 𝜎=1.0 ，則獲取高斯濾波器的數值，介於 -3 ~ 3 之間，濾波器大小為 7，且 \(\sum_{n}g[n]=1\)，為了讓輸出訊號的數值範圍更接近輸入訊號，因此在取高斯函數作為濾波器的係數後，再進一步正規化。

高斯濾波器的典型應用也是去雜訊

import numpy as np
import numpy.random as random
import scipy.signal as signal
import matplotlib.pyplot as plt

# 產生 200 個點
t = np.linspace( 0, 1, 200, endpoint = False )
# 作 sinusoid() 弦波，並加上 均勻雜訊 Uniform Noise
x = 10 * np.cos( 2 * np.pi * 5 * t ) + random.uniform( -5, 5, 200 )

sigma = 3                                      # 標準差
filter_size = 6 * sigma + 1                    # 濾波器大小  19
gauss = signal.gaussian( filter_size, sigma )  # 濾波器係數

sum = np.sum( gauss )                          # 正規化
gauss = gauss / sum

y = np.convolve( x, gauss, 'same' )

plt.figure( 1 )
plt.plot( t, x )
plt.xlabel( 't (second)' )
plt.ylabel( 'Amplitude' )

plt.figure( 2 )
plt.plot( t, y )
plt.xlabel( 't (second)' )
plt.ylabel( 'Amplitude' )

plt.show( )

wav 濾波

一般來說，平均、高斯濾波的結果較平滑，造成輸出訊號的振幅會變小，使得音量變小。加上正規化 Normalization的處理，可調整輸出音訊檔的訊號範圍，介於 -30000~30000 之間。

import numpy as np
import wave
from scipy.io import wavfile
import struct
import scipy.signal as signal
import sys

def average_filtering( x, filter_size ):
    h = np.ones( filter_size ) / filter_size
    y = np.convolve( x, h, 'same' )
    return y

def gaussian_filtering( x, sigma):
    filter_size = 6 * sigma + 1
    gauss = signal.gaussian( filter_size, sigma )

    sum = np.sum( gauss )
    gauss = gauss / sum

    y = np.convolve( x, gauss, 'same' )
    return y

def normalization( x, maximum ):
    x_abs = abs( x )
    max_value = max( x_abs )
    y = x / max_value * maximum
    return y

def main( ):
    inputfile = "input.wav"
    output1 = "output1.wav"
    output2 = "output2.wav"

    # 讀取 wav
    wav = wave.open( inputfile, 'rb' )
    num_channels = wav.getnchannels( )  # 通道數
    sampwidth    = wav.getsampwidth( )      # 樣本寬度
    fs           = wav.getframerate( )          # 取樣頻率(Hz)
    num_frames   = wav.getnframes( )        # 音框數 = 樣本數
    comptype     = wav.getcomptype( )           # 壓縮型態
    compname     = wav.getcompname( )           # 無壓縮
    wav.close( )

    sampling_rate, x = wavfile.read( inputfile )    # 輸入訊號

    ##########################
    # DSP Average Filtering
    print( "Filtering" )
    print( "(1) Average Filtering" )

    # filter_size = eval( input( "Filter Size = " ) )
    filter_size = 31
    y = average_filtering( x, filter_size )

    # 正規化 調整音量
    y = normalization( x, 30000 )

    # 寫入 wav
    wav_file = wave.open( output1, 'w' )
    wav_file.setparams(( num_channels, sampwidth, fs, num_frames, comptype, compname ))

    for s in y:
        wav_file.writeframes( struct.pack( 'h', int ( s ) ) )

    wav_file.close( )

    ##########################
    # DSP Gaussian Filtering
    print( "(2) Gaussian Filtering" )

    # sigma = eval( input( "Sigma = " ) )
    sigma = 31
    y = gaussian_filtering( x, sigma )

    # 正規化 調整音量
    y = normalization( x, 30000 )

    # 寫入 wav
    wav_file = wave.open( output2, 'w' )
    wav_file.setparams(( num_channels, sampwidth, fs, num_frames, comptype, compname ))

    for s in y:
        wav_file.writeframes( struct.pack( 'h', int ( s ) ) )

    wav_file.close( )


if __name__ == "__main__":
    sys.exit(main())