2020/10/26

TensorFlow Handwritten Digit Recognition with a CNN

A model built as an MLP reaches about 96% accuracy. To push accuracy higher, use a CNN (Convolutional Neural Network), the architecture proposed by Yann LeCun.

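Introduction to CNN

A CNN passes an image through convolution operations that turn one image into many (feature maps). The network has two parts:

  1. Convolution and downsampling, which extract image features

    The image goes through a first convolution, a first downsampling, a second convolution, and a second downsampling to extract its features.

  2. Fully connected neural network

    After feature extraction, the result is reshaped into a 1-D vector and fed to a neural network made up of a flatten layer, a hidden layer, and an output layer.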
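
The two stages above can be traced as plain shape arithmetic. The sketch below is only an illustration, assuming the layer sizes used in the listing later in this post (stride-1 convolutions with 'SAME' padding producing 16 and then 36 feature maps, each followed by 2x2 max pooling with stride 2). The helper names `conv_same` and `pool_2x2` are hypothetical and are not TensorFlow functions.

```python
import math

def conv_same(h, w, out_channels, stride=1):
    """Output shape of a 'SAME'-padded convolution: ceil(size / stride)."""
    return math.ceil(h / stride), math.ceil(w / stride), out_channels

def pool_2x2(h, w, c):
    """Output shape of 2x2 max pooling with stride 2 and 'SAME' padding."""
    return math.ceil(h / 2), math.ceil(w / 2), c

shape = (28, 28, 1)                        # one grayscale MNIST image
shape = conv_same(shape[0], shape[1], 16)  # first convolution
shape = pool_2x2(*shape)                   # first downsampling
shape = conv_same(shape[0], shape[1], 36)  # second convolution
shape = pool_2x2(*shape)                   # second downsampling
flat_length = shape[0] * shape[1] * shape[2]
print(shape, flat_length)
```

Under these assumptions the final feature volume is 7x7x36, which is where the 1764-element input of the fully connected part comes from.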

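The pooling layer performs downsampling. Its advantages:

  1. Fewer data points to process: later computation takes less time
  2. Smaller positional differences: where a handwritten digit sits in the image affects recognition, and shrinking the image reduces those differences
  3. Fewer parameters and less computation: this helps control overfitting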
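
To make the downsampling concrete, here is a minimal NumPy sketch of 2x2 max pooling with stride 2 on a single 4x4 image. It is an illustration only, not the TensorFlow implementation; the function name `max_pool_2x2_np` and the sample values are made up for this example, and the function assumes the height and width are even.

```python
import numpy as np

def max_pool_2x2_np(image):
    """2x2 max pooling with stride 2 on a 2-D array with even height and width."""
    h, w = image.shape
    # Split the image into non-overlapping 2x2 blocks, then take each block's maximum
    blocks = image.reshape(h // 2, 2, w // 2, 2)
    return blocks.max(axis=(1, 3))

image = np.array([[1, 3, 2, 0],
                  [4, 2, 1, 1],
                  [0, 1, 5, 6],
                  [2, 2, 7, 8]])
pooled = max_pool_2x2_np(image)
print(pooled)
# [[4 2]
#  [2 8]]
```

Each output value summarizes one 2x2 block, so the 16 input values shrink to 4 while the strongest response in each region is kept.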

TensorFlow CNN

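import tensorflow as tf
import numpy as np

# The graph-mode APIs used below (placeholder, Session) need eager execution
# turned off; this call is required when running on TensorFlow 2.x
tf.compat.v1.disable_eager_execution()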

# STEP 1: load the data
mnist = tf.keras.datasets.mnist
# Tuple of Numpy arrays: (x_train, y_train), (x_test, y_test)

(x_train, y_train), (x_test, y_test) = mnist.load_data()

# Reshape each 28*28 2-D training image into a 1-D array, then cast to float32
# Each image becomes an array of 784 floats
# The training and test sets hold 60000 and 10000 images respectively
# x_train_2D is a 2-D array of shape [60000, 28*28]
x_train_2D = x_train.reshape(60000, 28*28).astype('float32')
x_test_2D = x_test.reshape(10000, 28*28).astype('float32')
print('x_train_2D.shape=', x_train_2D.shape)
# x_train_2D.shape=(60000, 784)

# Normalize the pixel values (0~255); the simplest way is to divide by 255
# x_train_norm is the normalized result, with every value between 0 and 1
x_train_norm = x_train_2D/255
x_test_norm = x_test_2D/255

# One-hot encode the labels. For example, the digit 7 becomes array([0., 0., 0., 0., 0., 0., 0., 1., 0., 0.], dtype=float32), where the value at index 7 is 1

y_train_one_hot_tf=tf.one_hot(y_train,10)
y_test_one_hot_tf=tf.one_hot(y_test,10)

y_train_one_hot = None
y_test_one_hot = None
with tf.compat.v1.Session() as sess:
    init = tf.compat.v1.global_variables_initializer()
    sess.run(init)
    y_train_one_hot = sess.run(y_train_one_hot_tf)
    y_test_one_hot = sess.run(y_test_one_hot_tf)

# Split x_train and y_train into a training part and a validation part
x_train_norm_data = x_train_norm[0:50000]
x_train_norm_validation = x_train_norm[50000:60000]

y_train_one_hot_data = y_train_one_hot[0:50000]
y_train_one_hot_validation = y_train_one_hot[50000:60000]


### Build the model

# Define some shared helper functions first
def weight(shape):
    return tf.Variable(tf.random.truncated_normal(shape, stddev=0.1),
                       name ='W')
# Bias tensor: create a constant first, then wrap it in a Variable
def bias(shape):
    return tf.Variable(tf.constant(0.1, shape=shape)
                       , name = 'b')
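# Convolution; it works like an image filter
#  x is the input image and must be a 4-D tensor
#  W holds the filter weights, which are initialized randomly later
#  strides is the filter step, set to [1,1,1,1] in the format [1, stride, stride, 1]; the filter moves 1 step at a time, left to right and top to bottom
#  padding is 'SAME': the input is zero-padded beyond its borders so that the output image has the same size as the input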
def conv2d(x, W):
    return tf.nn.conv2d(x, W, strides=[1,1,1,1],
                        padding='SAME')

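# Pooling layer for downsampling the image
#  x is the input image and must be a 4-D tensor
#  ksize is the pooling window size, set to [1,2,2,1] in the format [1, height, width, 1], i.e. a window of height 2 and width 2
#  strides is the pooling window step, set to [1,2,2,1] in the format [1, stride, stride, 1], i.e. the window moves 2 steps at a time, left to right and top to bottom
#  A 28x28 image shrinks to 14x14 after max pooling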
def max_pool_2x2(x):
    return tf.nn.max_pool2d(x, ksize=[1,2,2,1],
                          strides=[1,2,2,1],
                          padding='SAME')


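# Input layer
with tf.name_scope('Input_Layer'):
    # The images are fed in through this placeholder
    x = tf.compat.v1.placeholder("float",shape=[None, 784],name="x")
    # Each image in x is a 784-element 1-D vector; reshape x into a 4-D tensor
    # Dim 1 is -1 because the number of images fed through the placeholder varies
    # Dims 2 and 3 are 28, 28 because each image is 28x28
    # Dim 4 is 1 because the images are grayscale; for color (RGB) images it would be 3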
    x_image = tf.reshape(x, [-1, 28, 28, 1])

# CNN Layer 1
# Extracts features; the convolution produces 16 images, still 28x28
with tf.name_scope('C1_Conv'):
    # The filter weights are 5x5
    # Dim 3 is 1 because the input is grayscale
    # Dim 4 is 16 because 16 images are produced
    W1 = weight([5,5,1,16])

    # 16 images are produced, so the shape argument is 16
    b1 = bias([16])

    # Convolution
    Conv1=conv2d(x_image, W1)+ b1
    # ReLU activation function
    C1_Conv = tf.nn.relu(Conv1 )

# Pooling layer for downsampling: shrinks the images from 28x28 to 14x14; there are still 16 of them
with tf.name_scope('C1_Pool'):
    C1_Pool = max_pool_2x2(C1_Conv)

# CNN Layer 2
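# Second convolution: turns the 16 images into 36; convolution does not change the image size, which stays 14x14
with tf.name_scope('C2_Conv'):
    # The filter weights are 5x5
    # Dim 3 is 16 because convolution layer 1 outputs 16 images
    # Dim 4 is 36 because the 16 images are turned into 36
    W2 = weight([5,5,16,36])
    # 36 images are produced, so the shape argument is 36
    b2 = bias([36])
    Conv2=conv2d(C1_Pool, W2)+ b2
    # relu sets negative values to 0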
    C2_Conv = tf.nn.relu(Conv2)

# Pooling layer 2 for downsampling: shrinks the images from 14x14 to 7x7; there are still 36 of them
with tf.name_scope('C2_Pool'):
    C2_Pool = max_pool_2x2(C2_Conv)

# Fully Connected Layer
# Flatten layer: turns the 36 7x7 images into a 1-D vector of length 36x7x7 = 1764, i.e. 1764 floats, used as the input
with tf.name_scope('D_Flat'):
    D_Flat = tf.reshape(C2_Pool, [-1, 1764])

with tf.name_scope('D_Hidden_Layer'):
    W3= weight([1764, 128])
    b3= bias([128])
    D_Hidden = tf.nn.relu(
                  tf.matmul(D_Flat, W3)+b3)

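    # Newer TensorFlow versions ask for `rate` instead of `keep_prob` (rate = 1 - keep_prob),
    # so the earlier form tf.nn.dropout(D_Hidden, keep_prob=0.8) becomes rate=0.2
    # Note: tf.nn.dropout is applied on every run of the graph, so it is also active during the evaluation and prediction below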
    D_Hidden_Dropout= tf.nn.dropout(D_Hidden, rate = 0.2)

# Output layer, 10 neurons
#  y_predict = softmax(D_Hidden_Dropout * W4 + b4)
with tf.name_scope('Output_Layer'):
    # The previous layer D_Hidden has 128 neurons, so dim 1 is 128
    W4 = weight([128,10])
    b4 = bias([10])
    y_predict= tf.nn.softmax(
                 tf.matmul(D_Hidden_Dropout, W4)+b4)


### Set up the optimization step for training
# Train the model with backpropagation
with tf.name_scope("optimizer"):

    y_label = tf.compat.v1.placeholder("float", shape=[None, 10],
                              name="y_label")

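    # Caveat: y_predict is already a softmax output, and
    # softmax_cross_entropy_with_logits applies softmax again; the usual form
    # passes the pre-softmax values tf.matmul(D_Hidden_Dropout, W4)+b4 as logits
    loss_function = tf.reduce_mean(
                      tf.nn.softmax_cross_entropy_with_logits
                         (logits=y_predict ,
                          labels=y_label))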

    optimizer = tf.compat.v1.train.AdamOptimizer(learning_rate=0.0001) \
                    .minimize(loss_function)


### Set up model evaluation
with tf.name_scope("evaluate_model"):
    correct_prediction = tf.equal(tf.argmax(y_predict, 1),
                                  tf.argmax(y_label, 1))
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))


### Train the model

trainEpochs = 30
batchSize = 100
totalBatchs = int(len(x_train_norm_data)/batchSize)
epoch_list=[];accuracy_list=[];loss_list=[];
from time import time

with tf.compat.v1.Session() as sess:
    startTime=time()

    sess.run(tf.compat.v1.global_variables_initializer())

    for epoch in range(trainEpochs):
        for i in range(totalBatchs):
            # batch_x, batch_y = mnist.train.next_batch(batchSize)
            batch_x = x_train_norm_data[i*batchSize:(i+1)*batchSize]
            batch_y = y_train_one_hot_data[i*batchSize:(i+1)*batchSize]

            sess.run(optimizer,feed_dict={x: batch_x,
                                          y_label: batch_y})

        loss,acc = sess.run([loss_function,accuracy],
                            feed_dict={x: x_train_norm_validation,
                                       y_label: y_train_one_hot_validation})

        epoch_list.append(epoch)
        loss_list.append(loss)
        accuracy_list.append(acc)

        print("Train Epoch:", '%02d' % (epoch+1), "Loss=","{:.9f}".format(loss)," Accuracy=",acc)

    duration =time()-startTime
    print("Train Finished takes:",duration)

    ## Evaluate model accuracy on the test set
    print("Accuracy:",
      sess.run(accuracy,feed_dict={x: x_test_norm,
                                   y_label:y_test_one_hot}))
    # First 5000 samples
    print("Accuracy:",
      sess.run(accuracy,feed_dict={x: x_test_norm[:5000],
                                   y_label: y_test_one_hot[:5000]}))
    # Last 5000 samples
    print("Accuracy:",
      sess.run(accuracy,feed_dict={x: x_test_norm[5000:],
                                   y_label: y_test_one_hot[5000:]}))

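    ## Predicted probabilities
    # Stored under a new name so the y_predict tensor is not overwritten
    y_predict_prob=sess.run(y_predict,
                   feed_dict={x: x_test_norm[:5000]})

    ## Predicted classes for the whole test set
    prediction_result=sess.run(tf.argmax(y_predict,1),
                           feed_dict={x: x_test_norm})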

    ## Save the model
    saver = tf.compat.v1.train.Saver()
    save_path = saver.save(sess, "saveModel/CNN_model1")
    print("Model saved in file: %s" % save_path)
    merged = tf.compat.v1.summary.merge_all()
    # Write out the computation graph so it can be visualized in TensorBoard
    train_writer = tf.compat.v1.summary.FileWriter('log/CNN',sess.graph)


# Plot the loss and accuracy curves with matplotlib
import matplotlib.pyplot as plt

fig = plt.gcf()
# fig.set_size_inches(4,2)
plt.plot(epoch_list, loss_list, label = 'loss')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(['loss'], loc='upper left')
plt.savefig('loss.png')


# Start a new figure so the accuracy curve is not drawn on top of the loss plot
fig = plt.figure()
# fig.set_size_inches(4,2)
plt.plot(epoch_list, accuracy_list,label="accuracy" )

plt.ylim(0.8,1)
plt.ylabel('accuracy')
plt.xlabel('epoch')
plt.legend(['accuracy'], loc='upper right')
plt.savefig('accuracy.png')

############
# View several samples together with their labels and predictions
def plot_images_labels_prediction(images,labels,prediction,idx,filename, num=10):
    # Start a new figure so the grid is not drawn over the previous plot
    fig = plt.figure()
    fig.set_size_inches(12, 14)
    if num>25: num=25
    for i in range(0, num):
        ax=plt.subplot(5,5, 1+i)

        # Reshape the 784 values of the image into 28x28
        ax.imshow(np.reshape(images[idx],(28, 28)), cmap='binary')

        # Convert the one-hot label back to a digit
        title= "label=" +str(np.argmax(labels[idx]))
        if len(prediction)>0:
            title+=",predict="+str(prediction[idx])

        ax.set_title(title,fontsize=10)
        ax.set_xticks([]);ax.set_yticks([])
        idx+=1
    plt.savefig(filename)


plot_images_labels_prediction(x_test_norm,
                              y_test_one_hot,
                              prediction_result,0, "result.png", num=10)

# Find the wrong predictions among the first 400 test samples
for i in range(400):
    if prediction_result[i]!=np.argmax(y_test_one_hot[i]):
        print("i="+str(i)+
              "   label=",np.argmax(y_test_one_hot[i]),
              "predict=",prediction_result[i])

Output:

Train Epoch: 01 Loss= 1.604377151  Accuracy= 0.8872
Train Epoch: 02 Loss= 1.547111511  Accuracy= 0.9281
Train Epoch: 03 Loss= 1.525221825  Accuracy= 0.9447
Train Epoch: 04 Loss= 1.516423583  Accuracy= 0.9511
Train Epoch: 05 Loss= 1.507740974  Accuracy= 0.9584
Train Epoch: 06 Loss= 1.503444791  Accuracy= 0.9636
Train Epoch: 07 Loss= 1.496760130  Accuracy= 0.9683
Train Epoch: 08 Loss= 1.494633555  Accuracy= 0.9712
Train Epoch: 09 Loss= 1.492025375  Accuracy= 0.9724
Train Epoch: 10 Loss= 1.491448402  Accuracy= 0.9735
Train Epoch: 11 Loss= 1.488568783  Accuracy= 0.9751
Train Epoch: 12 Loss= 1.488826513  Accuracy= 0.9745
Train Epoch: 13 Loss= 1.485750437  Accuracy= 0.9778
Train Epoch: 14 Loss= 1.484605789  Accuracy= 0.9798
Train Epoch: 15 Loss= 1.483879209  Accuracy= 0.9788
Train Epoch: 16 Loss= 1.482506037  Accuracy= 0.9808
Train Epoch: 17 Loss= 1.482969046  Accuracy= 0.9796
Train Epoch: 18 Loss= 1.481315017  Accuracy= 0.9811
Train Epoch: 19 Loss= 1.480247617  Accuracy= 0.983
Train Epoch: 20 Loss= 1.480669379  Accuracy= 0.9817
Train Epoch: 21 Loss= 1.480412602  Accuracy= 0.9824
Train Epoch: 22 Loss= 1.479805708  Accuracy= 0.983
Train Epoch: 23 Loss= 1.479858279  Accuracy= 0.9827
Train Epoch: 24 Loss= 1.479218960  Accuracy= 0.9834
Train Epoch: 25 Loss= 1.479144573  Accuracy= 0.9829
Train Epoch: 26 Loss= 1.478820801  Accuracy= 0.9838
Train Epoch: 27 Loss= 1.477338433  Accuracy= 0.9857
Train Epoch: 28 Loss= 1.478171706  Accuracy= 0.9847
Train Epoch: 29 Loss= 1.477008104  Accuracy= 0.9856
Train Epoch: 30 Loss= 1.477438688  Accuracy= 0.9845
Train Finished takes: 1763.7836382389069
Accuracy: 0.988
Accuracy: 0.9814
Accuracy: 0.9928

i=18   label= 3 predict= 5
i=290   label= 8 predict= 4
i=321   label= 2 predict= 7
i=359   label= 9 predict= 8
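
The loss in the log above levels off near 1.48 instead of approaching 0, even though accuracy is above 98%. One likely explanation is that the listing passes `y_predict`, which is already a softmax output, to `softmax_cross_entropy_with_logits`, which applies softmax a second time. Under that assumption, even a perfect one-hot prediction cannot push the loss below ln(1 + 9/e), roughly 1.4612. The NumPy sketch below reproduces that floor; the helper names `softmax` and `cross_entropy_with_logits` are written for this illustration and are not the TensorFlow functions.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax of a 1-D array."""
    e = np.exp(z - np.max(z))
    return e / e.sum()

def cross_entropy_with_logits(logits, label):
    """Cross-entropy that applies softmax to its input, like the TensorFlow op."""
    return -np.sum(label * np.log(softmax(logits)))

label = np.zeros(10)
label[7] = 1.0                # one-hot label for the digit 7

perfect_prob = label.copy()   # best possible softmax output: probability 1 for class 7
floor = cross_entropy_with_logits(perfect_prob, label)
print(floor)                  # about 1.4612
```

The logged validation loss of about 1.477 sits just above this floor, which is consistent with the explanation but does not prove it; feeding the pre-softmax values to the loss would be the way to check.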
