Maxkit: CTF Steganography in Audio Data

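CTF (Capture The Flag) is a competition format in the cybersecurity field in which security practitioners compete on technical skill. One category is STEGA (steganography): the flag is hidden inside digital data such as images, audio, or video, and contestants use various tools and techniques to extract it.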

HelloKittyKitty.wav

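The page Nuit du Hack CTF Qualifications: Here, kitty kitty! provides a file named HelloKittyKitty.wav. Played back, it is just a tune of meowing, but opening it in Audacity or Praat shows that the file has two channels, and the left channel contains data that does not look like audio.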

The data is Morse code. Writing down the dots and dashes gives:

..... -... -.-. ----. ..--- ..... -.... ....- ----. -.-. -... ----- .---- ---.. ---.. ..-. ..... ..--- . -.... .---- --... -.. --... ----- ----. ..--- ----. .---- ----. .---- -.-.

Feeding this to an online Morse code decoder yields 5BC925649CB0188F52E617D70929191C
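The same decoding can be done offline. The following is a minimal sketch, assuming only digits and the letters A to F appear (enough for a hexadecimal string); the names `MORSE_TO_CHAR` and `decode_morse` are made up for this illustration.

```python
# International Morse code for the digits and the letters A-F,
# which is all a hexadecimal string needs.
MORSE_TO_CHAR = {
    "-----": "0", ".----": "1", "..---": "2", "...--": "3", "....-": "4",
    ".....": "5", "-....": "6", "--...": "7", "---..": "8", "----.": "9",
    ".-": "A", "-...": "B", "-.-.": "C", "-..": "D", ".": "E", "..-.": "F",
}

def decode_morse(message):
    """Decode space-separated Morse symbols into text."""
    return "".join(MORSE_TO_CHAR[symbol] for symbol in message.split())

# The symbols written down from the left channel, split over three lines.
morse = ("..... -... -.-. ----. ..--- ..... -.... ....- ----. -.-. -... ----- "
         ".---- ---.. ---.. ..-. ..... ..--- . -.... .---- --... -.. --... "
         "----- ----. ..--- ----. .---- ----. .---- -.-.")

print(decode_morse(morse))  # 5BC925649CB0188F52E617D70929191C
```

A symbol outside the table raises `KeyError`, which is a useful hint that a dot or dash was misread from the waveform.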

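This looks like an MD5 hash. MD5 is a one-way hash and cannot actually be decrypted, but the so-called MD5 decoder sites keep a very large database of strings together with their MD5 digests. A reverse lookup on such a site returns valar dohaeris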
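What such a lookup site does can be sketched as a plain dictionary attack: hash every candidate and compare. This is only an illustration; the word list below is hypothetical and tiny, and `md5_hex` and `reverse_lookup` are names invented here.

```python
import hashlib

def md5_hex(text):
    """Return the uppercase MD5 hex digest of a UTF-8 string."""
    return hashlib.md5(text.encode("utf-8")).hexdigest().upper()

def reverse_lookup(target_hash, candidates):
    """Return the first candidate whose MD5 equals target_hash, else None."""
    target = target_hash.upper()
    for candidate in candidates:
        if md5_hex(candidate) == target:
            return candidate
    return None

# Hypothetical word list; a real lookup site holds a far larger table.
wordlist = ["valar morghulis", "valar dohaeris", "winter is coming"]
print(reverse_lookup("5BC925649CB0188F52E617D70929191C", wordlist))
```

The function returns `None` when no candidate matches, so the quality of the word list decides whether the hash can be reversed at all.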

Hear with your eyes

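This link provides sound.wav. Played back, it sounds like birdsong, and the waveform in Audacity looks much like any ordinary audio file. Switching the track to spectrogram view, however, makes the following string directly visible.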

flag: e5353bb7b57578bd4da1c898a8e2d767
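For readers who prefer a script to the Audacity view, a spectrogram is just the magnitude of a short-time Fourier transform. The sketch below assumes NumPy is available; `spectrogram`, `n_fft`, and `hop` are names chosen for this illustration, and a synthetic 1000 Hz tone stands in for sound.wav.

```python
import numpy as np

def spectrogram(samples, n_fft=1024, hop=256):
    """Return STFT magnitudes with shape (frames, n_fft // 2 + 1).

    The input must contain at least n_fft samples.
    """
    samples = np.asarray(samples, dtype=np.float64)
    window = np.hanning(n_fft)  # taper each frame to reduce spectral leakage
    starts = range(0, len(samples) - n_fft + 1, hop)
    frames = np.array([samples[s:s + n_fft] * window for s in starts])
    return np.abs(np.fft.rfft(frames, axis=1))

# Demo on a synthetic tone instead of the challenge file.
fs = 8000
t = np.arange(fs) / fs
tone = np.sin(2 * np.pi * 1000 * t)
spec = spectrogram(tone)
peak_hz = spec[0].argmax() * fs / 1024
print(spec.shape, peak_hz)
```

Plotting the transposed array as an image (time on the x axis, frequency on the y axis) gives the same kind of picture in which the text of the flag shows up.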

dtmf

dial.wav sounds like the DTMF tones produced by dialing on a telephone keypad. A DTMF decoder gives:

4*7# 2*6# 1*2# 2*5# 2*3# 3*6# 2*6# 2*6# 3*6# 2*5# 3*4# 1*2
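Note that `*` and `#` are themselves DTMF keys, so they are part of the decoder output. How a decoder recognises one key can be sketched with the Goertzel algorithm, which measures the power at each of the eight standard DTMF frequencies. This is a toy under stated assumptions: the helper names are invented, the demo synthesises its own tones, and a real decoder would also have to split the recording into individual tone bursts.

```python
import math

LOW_FREQS = (697, 770, 852, 941)       # row frequencies in Hz
HIGH_FREQS = (1209, 1336, 1477, 1633)  # column frequencies in Hz
KEYPAD = ("123A", "456B", "789C", "*0#D")

def goertzel_power(samples, sample_rate, freq):
    """Signal power at one frequency, computed with the Goertzel recurrence."""
    coeff = 2.0 * math.cos(2.0 * math.pi * freq / sample_rate)
    s_prev, s_prev2 = 0.0, 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev * s_prev + s_prev2 * s_prev2 - coeff * s_prev * s_prev2

def detect_key(samples, sample_rate):
    """Return the keypad symbol whose row and column tones are strongest."""
    row = max(range(4),
              key=lambda i: goertzel_power(samples, sample_rate, LOW_FREQS[i]))
    col = max(range(4),
              key=lambda i: goertzel_power(samples, sample_rate, HIGH_FREQS[i]))
    return KEYPAD[row][col]

def synth_key(key, sample_rate=8000, duration=0.05):
    """Synthesize the two-tone burst for one key (demo input only)."""
    for r, keys in enumerate(KEYPAD):
        if key in keys:
            lo, hi = LOW_FREQS[r], HIGH_FREQS[keys.index(key)]
    n = int(sample_rate * duration)
    return [math.sin(2 * math.pi * lo * t / sample_rate) +
            math.sin(2 * math.pi * hi * t / sample_rate) for t in range(n)]

print("".join(detect_key(synth_key(k), 8000) for k in "4*7#"))  # 4*7#
```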

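4*7 means pressing key 7 four times on a T9-style phone keypad (multi-tap input), which gives s. The other groups follow the same pattern: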

s  n  a  k  e  o  n  n  o  k  i  a
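The multi-tap step is mechanical enough to script. A minimal sketch, assuming the usual letter layout on keys 2 to 9 and the `<presses>*<key>#` format seen above; `KEY_LETTERS` and `decode_multitap` are illustrative names.

```python
# Letters printed on a standard phone keypad.
KEY_LETTERS = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}

def decode_multitap(sequence):
    """Decode tokens of the form '<presses>*<key>' separated by '#'."""
    letters = []
    for token in sequence.replace(" ", "").split("#"):
        if not token:
            continue  # tolerate a trailing '#'
        presses, key = token.split("*")
        letters.append(KEY_LETTERS[key][int(presses) - 1])
    return "".join(letters)

dtmf = "4*7# 2*6# 1*2# 2*5# 2*3# 3*6# 2*6# 2*6# 3*6# 2*5# 3*4# 1*2"
print(decode_multitap(dtmf))  # snakeonnokia
```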

godwave.wav

ref: JarvisOJ Misc 部分題解 (JarvisOJ Misc partial write-ups)

JarvisOJ Misc 上帝之音 (JarvisOJ Misc: Voice of God)

[misc]上帝之音 400 ([misc] Voice of God, 400 points)

Opened in Audacity, godwave.wav is a mono WAV with a sample rate of 10000000 Hz. Zoomed in, the data is encoded in the amplitude, in a scheme similar to on-off keying (OOK).

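First, read the WAV file with a program. Printing the samples shows that the signal changes roughly every 64 samples, so the sum of the absolute sample values in each 64-sample block can be used to tell 0 from 1. Printing the resulting bits then shows that they only ever occur as the pairs 01 and 10, which matches Manchester encoding.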
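Before turning to the real file, the idea can be tried on a toy signal. The sketch below is an illustration only: the on and off levels are invented to roughly match the block sums seen in godwave.wav, and the mapping of 10 to bit 1 and 01 to bit 0 is the convention assumed throughout this write-up.

```python
SAMPLES_PER_SYMBOL = 64   # block length observed in godwave.wav
THRESHOLD = 100000        # separates "on" block sums from "off" block sums

def manchester_encode(bits):
    """Map bit 1 to the pair '10' and bit 0 to '01' (convention assumed here)."""
    return "".join("10" if b == "1" else "01" for b in bits)

def ook_modulate(symbols, on_level=20000, off_level=400):
    """Toy modulator: each symbol becomes 64 samples of alternating sign."""
    samples = []
    for s in symbols:
        level = on_level if s == "1" else off_level
        samples.extend(level if i % 2 == 0 else -level
                       for i in range(SAMPLES_PER_SYMBOL))
    return samples

def ook_demodulate(samples):
    """Sum |sample| over each 64-sample block and threshold the sum."""
    symbols = []
    last_start = len(samples) - SAMPLES_PER_SYMBOL
    for start in range(0, last_start + 1, SAMPLES_PER_SYMBOL):
        energy = sum(abs(x) for x in samples[start:start + SAMPLES_PER_SYMBOL])
        symbols.append("1" if energy > THRESHOLD else "0")
    return "".join(symbols)

def manchester_decode(symbols):
    """Turn symbol pairs back into bits; '00' and '11' are invalid and skipped."""
    pairs = (symbols[i:i + 2] for i in range(0, len(symbols) - 1, 2))
    return "".join({"10": "1", "01": "0"}.get(p, "") for p in pairs)

bits = "10001001"  # 0x89, the first byte of a PNG file
recovered = manchester_decode(ook_demodulate(ook_modulate(manchester_encode(bits))))
print(recovered)  # 10001001
```

An "on" block sums to 64 × 20000 = 1280000 and an "off" block to 64 × 400 = 25600, so a threshold of 100000 separates them cleanly.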

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
import wave
import matplotlib.pyplot as plt
import numpy as np

def read_wav_data(filename):
  wav = wave.open(filename,"rb") # open the WAV file as a stream
  frame = wav.getnframes() # number of frames
  channel=wav.getnchannels() # number of channels
  framerate=wav.getframerate() # sample rate in Hz
  sample_width=wav.getsampwidth() # sample width in bytes

  print("frame=%i, channel=%i, framerate=%i, sample_width=%i" %(frame, channel, framerate, sample_width))

  str_data = wav.readframes(frame) # read all frames
  wav.close() # close the stream
  # np.fromstring is deprecated for binary data; this assumes 16-bit samples
  wave_data = np.frombuffer(str_data, dtype = np.int16)
  wave_data = wave_data.reshape(-1, channel) # one column per channel
  wave_data = wave_data.T # transpose so that wave_data[0] is the first channel
  return wave_data, frame, framerate

def wav_show(wave_data, fs): # plot the waveform
  time = np.arange(0, len(wave_data)) * (1.0/fs)  # time of each sample, in seconds
  # plot amplitude against time
  plt.plot(time, wave_data)
  plt.show()

if __name__ == '__main__':
  # frame=1735680, channel=1, framerate=10000000, sample_width=2
  wave_data, frame, framerate = read_wav_data("godwave.wav")
  # wav_show(wave_data[0], framerate)
  # wav_show(wave_data[1], framerate)  # only for a stereo file; delete this line otherwise

  # for i in range(300):
  #   print( "{index} = {val}".format(index = (i+1), val = abs(wave_data[0][i])) )
  # The printout shows a change roughly every 64 samples, so the sum over
  # each 64-sample block can be used to tell 0 from 1.

  string = ''
  norm = 0
  for i in range(frame):
    # int() keeps the running sum from overflowing the int16 sample type
    norm = norm+abs(int(wave_data[0][i]))
    if (i+1) % 64 == 0:
      # print("norm={norm}".format(norm=norm))
      # norm=1244430
      # norm=25330
      # norm=32868
      # norm=1118668
      if norm > 100000:
        string += '1'
      else:
        string += '0'
      # Adding a line break as below makes it easy to see that the data
      # consists only of the pairs 01 and 10, as Manchester encoding requires.
      # if (i+1) % (128*8) == 0:
      #     string += "\n"
      norm = 0
  with open('output.txt','w') as output:
    output.write(string)

# -*- coding: utf-8 -*-
# Manchester-decode output.txt into out.png.
# Convention: the pair "01" is bit 0 and the pair "10" is bit 1, MSB first.
with open('output.txt', 'r') as f:
    data = f.readline().strip()

ans = bytearray()
res = 0    # byte currently being assembled
count = 0  # number of bits collected for that byte

for i in range(0, len(data) - 1, 2):
    pac = data[i:i+2]
    if pac == '01':
        res = (res<<1)|0
        count = count + 1
    elif pac == '10':
        res = (res<<1)|1
        count = count + 1
    # the pairs "00" and "11" are not valid Manchester symbols and are skipped
    if count == 8:
        ans.append(res)
        count = 0
        res = 0

# print("ans={ans}".format(ans=ans))

with open('out.png', 'wb') as f2:
    f2.write(ans)

The Manchester decoding result is a PNG image. Opening it shows a QR code, and a QR code decoder returns:

Raw text	`PCTF{Good_Signal_Analyzer}`
Raw bytes	`41 a5 04 35 44 67 b4 76 f6 f6 45 f5 36 96 76 e6 16 c5 f4 16 e6 16 c7 97 a6 57 27 d0 ec 11 ec 11 ec 11 ec 11`
Barcode format	QR_CODE
Parsed Result Type	TEXT
Parsed Result	`PCTF{Good_Signal_Analyzer}`
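If out.png does not open as an image, a quick check is whether the decoded bytes begin with the 8-byte PNG signature. If every bit comes out inverted instead, the 01/10 convention in the Manchester decoder is probably the wrong way round. A small sketch; `looks_like_png` and `first_bits` are names invented for this illustration.

```python
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

def looks_like_png(data):
    """True if the byte string starts with the 8-byte PNG signature."""
    return data[:8] == PNG_SIGNATURE

def first_bits(data, n_bytes=8):
    """Render the first bytes as bit strings, for comparison with decoded bits."""
    return " ".join(format(b, "08b") for b in data[:n_bytes])

# The bit pattern a correct decode should start with.
print(first_bits(PNG_SIGNATURE))
```

For the real file this would be called on the bytes read back from out.png, for example `looks_like_png(open('out.png', 'rb').read())`.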

godwavefsk.wav

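Opened in Audacity, godwavefsk.wav is a mono WAV with a sample rate of 10000000 Hz. Zoomed in, the layout resembles the on-off keying (OOK) example, except that the two symbol values are distinguished by frequency rather than amplitude, i.e. frequency-shift keying (FSK).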

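Compare each sample with the previous one to detect a change from negative to positive, and count these upward zero crossings within each 64-sample block. If the count exceeds 8, the block is a higher-frequency segment.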
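The zero-crossing count can be tried on a toy signal first. In the sketch below the cycle counts per block (4 and 16) and the amplitude are made up for the demo; only the block length of 64 and the threshold of 8 come from the description above.

```python
import math

SAMPLES_PER_SYMBOL = 64   # block length, as described above
CROSSING_THRESHOLD = 8    # more upward crossings than this means "high frequency"

def fsk_modulate(symbols, cycles_low=4, cycles_high=16, amplitude=10000):
    """Toy modulator: a '1' block carries more sine cycles than a '0' block."""
    samples = []
    for s in symbols:
        cycles = cycles_high if s == "1" else cycles_low
        for i in range(SAMPLES_PER_SYMBOL):
            # the half-sample offset keeps samples away from exact zero
            phase = 2 * math.pi * cycles * (i + 0.5) / SAMPLES_PER_SYMBOL
            samples.append(int(amplitude * math.sin(phase)))
    return samples

def fsk_demodulate(samples):
    """Count negative-to-positive transitions per block and threshold the count."""
    symbols = []
    previous = 0
    crossings = 0
    for i, x in enumerate(samples):
        if previous < 0 and x > 0:
            crossings += 1
        previous = x
        if (i + 1) % SAMPLES_PER_SYMBOL == 0:
            symbols.append("1" if crossings > CROSSING_THRESHOLD else "0")
            crossings = 0
    return "".join(symbols)

symbols = "10010110"
print(fsk_demodulate(fsk_modulate(symbols)))  # 10010110
```

With these demo values a high-frequency block yields 15 or 16 upward crossings and a low-frequency block 3 or 4, so the threshold of 8 sits comfortably between them.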

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
import wave
import matplotlib.pyplot as plt
import numpy as np

def read_wav_data(filename):
  wav = wave.open(filename,"rb") # open the WAV file as a stream
  frame = wav.getnframes() # number of frames
  channel=wav.getnchannels() # number of channels
  framerate=wav.getframerate() # sample rate in Hz
  sample_width=wav.getsampwidth() # sample width in bytes

  print("frame=%i, channel=%i, framerate=%i, sample_width=%i" %(frame, channel, framerate, sample_width))

  str_data = wav.readframes(frame) # read all frames
  wav.close() # close the stream
  # np.fromstring is deprecated for binary data; this assumes 16-bit samples
  wave_data = np.frombuffer(str_data, dtype = np.int16)
  wave_data = wave_data.reshape(-1, channel) # one column per channel
  wave_data = wave_data.T # transpose so that wave_data[0] is the first channel
  return wave_data, frame, framerate

def wav_show(wave_data, fs): # plot the waveform
  time = np.arange(0, len(wave_data)) * (1.0/fs)  # time of each sample, in seconds
  # plot amplitude against time
  plt.plot(time, wave_data)
  plt.show()

if __name__ == '__main__':
  # frame=1988608, channel=1, framerate=10000000, sample_width=2
  wave_data, frame, framerate = read_wav_data("godwavefsk.wav")
  # wav_show(wave_data[0], framerate)
  # wav_show(wave_data[1], framerate)  # only for a stereo file; delete this line otherwise

  # for i in range(300):
  #   print( "{index} = {val}".format(index = (i+1), val = wave_data[0][i]) )

  string = ''
  old_wave_data = 0
  count = 0
  for i in range(frame):
    # Compare with the previous sample: count every change from negative to
    # positive. More than 8 such changes in a block marks a higher-frequency segment.
    if old_wave_data < 0 and wave_data[0][i] > 0:
      # print("old_wave_data={old_wave_data}, wave_data[0][i]={new_wave_data}".format(old_wave_data=old_wave_data, new_wave_data=wave_data[0][i]))
      count = count + 1
    old_wave_data = wave_data[0][i]
    if (i+1) % 64 == 0:
      # print("count={count}".format(count=count))
      if count > 8:
        string += '1'
      else:
        string += '0'
      # Adding a line break as below makes it easy to see that the data
      # consists only of the pairs 01 and 10, as Manchester encoding requires.
      # if (i+1) % (128*8) == 0:
      #     string += "\n"
      count = 0
  with open('output.txt','w') as output:
    output.write(string)

Running the Manchester decoder from the previous example on this output.txt produces a PNG, which again turns out to be a QR code.

Raw text	`CTF{Nice_FSK_D3ModUl47o2}`
Raw bytes	`41 94 35 44 67 b4 e6 96 36 55 f4 65 34 b5 f4 43 34 d6 f6 45 56 c3 43 76 f3 27 d0 ec 11 ec 11 ec 11 ec 11 ec`
Barcode format	QR_CODE
Parsed Result Type	TEXT
Parsed Result	`CTF{Nice_FSK_D3ModUl47o2}`
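The Raw bytes row can be read by hand as well. Assuming the code holds a single byte-mode segment with an 8-bit length field (which is the case for small QR versions), the first 4 bits are the mode indicator 0100, the next 8 bits are the payload length, and the payload follows, shifted by half a byte. The trailing `ec 11` pairs are padding. `parse_qr_byte_mode` is a name invented for this sketch.

```python
def parse_qr_byte_mode(raw_hex):
    """Extract the payload from QR data codewords holding one byte-mode segment."""
    bits = "".join(format(int(h, 16), "08b") for h in raw_hex.split())
    if bits[:4] != "0100":
        raise ValueError("not a byte-mode segment")
    length = int(bits[4:12], 2)  # 8-bit character count (assumed, small versions)
    payload = bits[12:12 + 8 * length]
    return bytes(int(payload[i:i + 8], 2) for i in range(0, len(payload), 8))

# Raw bytes reported by the QR decoder for the godwavefsk.wav result.
raw = ("41 94 35 44 67 b4 e6 96 36 55 f4 65 34 b5 f4 43 34 d6 "
       "f6 45 56 c3 43 76 f3 27 d0 ec 11 ec 11 ec 11 ec 11 ec")

print(parse_qr_byte_mode(raw).decode("ascii"))  # CTF{Nice_FSK_D3ModUl47o2}
```

Applied to the raw bytes of the godwave.wav result, the same function returns `PCTF{Good_Signal_Analyzer}`.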