February 23, 2020

Rust Guessing Game

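A basic number-guessing example: the program generates a random number from 1 to 100 and prompts us to enter a guess. Each guess is reported as too big or too small, and the game ends when the guess is correct.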

Project

First, create an empty guessing game project:

$ cargo new guessing_game
     Created binary (application) `guessing_game` package
$ cd guessing_game

Because src/main.rs already contains a basic hello world example, a new project can be run right away:

$ cargo run
   Compiling guessing_game v0.1.0 (/Users/charley/project/idea/rust/guessing_game)
    Finished dev [unoptimized + debuginfo] target(s) in 1.39s
     Running `target/debug/guessing_game`
Hello, world!

Processing a Guess

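First we need to handle the number the user types in.

Rust brings only the prelude into scope by default. Anything that is not in the prelude has to be imported with use.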

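// Bring the io library into scope
use std::io;

// main() is the program's entry point
// fn declares a new function, () means it takes no parameters, and the body starts after {
// No return type is declared, so the function returns (), the unit type (an empty tuple)
fn main() {
    // println! is a macro that prints a string
    println!("Guess the number!");

    println!("Please input your guess.");

    // let binds a value to a variable; bindings are immutable by default
    // Adding mut makes the variable mutable
    // String is a growable, UTF-8 encoded string type
    let mut guess = String::new();

    // io::stdin() is std::io::stdin(); it returns a handle to standard input
    // Because of the use std::io at the top, the std:: prefix can be omitted here
    // Call the handle's read_line method and pass it a mutable reference to guess

    // read_line() appends the user's input to the &mut String argument and returns an io::Result
    // The standard library has several Result types: the generic Result plus
    // module-specific versions such as io::Result
    // Result encodes error-handling information
    // Its variants are Ok and Err: Ok holds the successful value, Err holds the reason for the failure

    // io::Result has an expect method: on Ok it returns the contained value,
    // on Err it panics and prints the given message
    // Without expect the program still compiles, but the compiler emits a warning
    io::stdin().read_line(&mut guess).expect("Failed to read line");

    // Print the contents of guess; {} is a placeholder
    println!("You guessed: {}", guess);
}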

Run it:

$ cargo run
   Compiling guessing_game v0.1.0 (/Users/charley/project/idea/rust/guessing_game)
    Finished dev [unoptimized + debuginfo] target(s) in 0.70s
     Running `target/debug/guessing_game`
Guess the number!
Please input your guess.
1
You guessed: 1
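
The comments above note that read_line returns an io::Result and that expect panics on Err. As a minimal sketch of the alternative, the function below handles both variants with match instead. The helper name read_guess and the choice to treat end of input as an error are assumptions made for this illustration, not part of the original program.

```rust
use std::io::{self, BufRead};

/// Reads one line from `reader` and returns it without the trailing newline.
/// `read_guess` is a hypothetical helper name used only in this sketch.
fn read_guess<R: BufRead>(reader: &mut R) -> io::Result<String> {
    let mut guess = String::new();
    // read_line appends to the buffer and returns the number of bytes read.
    let bytes_read = reader.read_line(&mut guess)?;
    if bytes_read == 0 {
        // Zero bytes means end of input (assumption: treat it as an error).
        return Err(io::Error::new(io::ErrorKind::UnexpectedEof, "no input"));
    }
    Ok(guess.trim_end().to_string())
}

fn main() {
    let stdin = io::stdin();
    // Handle both variants explicitly instead of panicking with expect.
    match read_guess(&mut stdin.lock()) {
        Ok(guess) => println!("You guessed: {}", guess),
        Err(e) => eprintln!("Failed to read line: {}", e),
    }
}
```

Taking any BufRead rather than stdin directly makes it possible to feed the function a byte slice when experimenting.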

Generating a Secret Number

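The Rust standard library does not include random number functionality, but the rand crate provides it. A crate is a package of Rust code; rand is a library crate, which contains code intended to be used by other programs.

To use a crate, edit Cargo.toml and add rand = "0.4.6" under [dependencies]: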

[package]
name = "guessing_game"
version = "0.1.0"
authors = ["name <email@domain>"]
edition = "2018"

[dependencies]
rand = "0.4.6"

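The 0.4.6 part is a version number in Semantic Versioning (SemVer) format, the standard scheme for version numbers. Writing 0.4.6 is shorthand for ^0.4.6, which means any version whose public API is compatible with 0.4.6. To pin the dependency to exactly 0.4.6, write rand = "=0.4.6".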
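
To make "compatible with 0.4.6" concrete, here is a small sketch of the caret rule as Cargo documents it: for versions below 1.0.0 the minor number acts as the breaking-change boundary. The function caret_matches and the tuple representation are hypothetical names chosen for this illustration, and pre-release tags are ignored.

```rust
/// A (major, minor, patch) version triple.
type Version = (u64, u64, u64);

/// Returns true if `v` satisfies the caret requirement `^req`.
/// Simplified sketch: pre-release and build metadata are not handled.
fn caret_matches(req: Version, v: Version) -> bool {
    if v < req {
        return false; // never older than the requested version
    }
    match req {
        // ^0.0.z allows only exactly 0.0.z
        (0, 0, _) => v == req,
        // ^0.y.z allows patch updates within 0.y
        (0, minor, _) => v.0 == 0 && v.1 == minor,
        // ^x.y.z allows anything below the next major version
        (major, _, _) => v.0 == major,
    }
}

fn main() {
    let req = (0, 4, 6);
    for v in [(0, 4, 6), (0, 4, 9), (0, 5, 0)].iter() {
        println!("{:?} satisfies ^0.4.6: {}", v, caret_matches(req, *v));
    }
}
```

Under this rule rand = "0.4.6" can resolve to a later 0.4.x release but never to 0.5.0.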

Run it again and Cargo downloads the new crate together with its dependency libc:

$ cargo run
    Blocking waiting for file lock on the registry index
    Updating crates.io index
  Downloaded rand v0.3.23
  Downloaded rand v0.4.6
   Compiling libc v0.2.55
   Compiling rand v0.4.6
   Compiling rand v0.3.23
   Compiling guessing_game v0.1.0 (/Users/charley/project/guessing_game)
    Finished dev [unoptimized + debuginfo] target(s) in 20.87s
     Running `target/debug/guessing_game`

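Cargo has a built-in mechanism that makes sure a project keeps using the same versions of its dependencies. The first time the project is built (cargo build), Cargo creates a Cargo.lock file that records the version of every dependency. On later builds Cargo checks Cargo.lock first and uses the versions listed there.

To move to newer crate versions, run cargo update.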

Generating a Random Number

Use rand::Rng to generate a random number:

use std::io;
// The Rng trait provides the methods for generating random numbers
use rand::Rng;

fn main() {
    println!("Guess the number!");

    // 1 is the inclusive lower bound, 101 is the exclusive upper bound
    let secret_number = rand::thread_rng().gen_range(1, 101);

    println!("The secret number is: {}", secret_number);

    println!("Please input your guess.");

    let mut guess = String::new();

    io::stdin().read_line(&mut guess)
        .expect("Failed to read line");

    println!("You guessed: {}", guess);
}

Comparing the Guess to the Secret Number

use std::io;
// Ordering is also an enum; its variants are Less, Greater, and Equal
use std::cmp::Ordering;
use rand::Rng;

fn main() {

    // ---snip---

    println!("You guessed: {}", guess);

    // Compare guess with secret_number
    // This currently fails with a mismatched types error because Rust has a strong, static type system
    // Rust infers guess to be a String, while secret_number is an integer (i32 by default)
    match guess.cmp(&secret_number) {
        Ordering::Less => println!("Too small!"),
        Ordering::Greater => println!("Too big!"),
        Ordering::Equal => println!("You win!"),
    }
}

Compile error:

$ cargo run
   Compiling guessing_game v0.1.0 (/Users/charley/project/guessing_game)
error[E0308]: mismatched types
  --> src/main.rs:22:21
   |
22 |     match guess.cmp(&secret_number) {
   |                     ^^^^^^^^^^^^^^ expected struct `std::string::String`, found integer
   |
   = note: expected type `&std::string::String`
              found type `&{integer}`

error: aborting due to previous error

For more information about this error, try `rustc --explain E0308`.
error: Could not compile `guessing_game`.

Fix the program by converting guess to a number:

// --snip--

    let mut guess = String::new();

    io::stdin().read_line(&mut guess)
        .expect("Failed to read line");

    // Shadow guess with a new binding; trim removes leading and trailing whitespace, including the newline
    // guess is now a u32, so Rust infers secret_number to be a u32 as well
    let guess: u32 = guess.trim().parse()
        .expect("Please type a number!");

    println!("You guessed: {}", guess);

    match guess.cmp(&secret_number) {
        Ordering::Less => println!("Too small!"),
        Ordering::Greater => println!("Too big!"),
        Ordering::Equal => println!("You win!"),
    }
}
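
The two steps above, converting the input and comparing it, can be tried in isolation without the rand dependency. The following is only a sketch: parse_guess and hint are hypothetical helper names, and returning Option instead of panicking is an assumption made to keep the example easy to test.

```rust
use std::cmp::Ordering;

/// Parses raw line input into a u32, or returns None if it is not a number.
fn parse_guess(raw: &str) -> Option<u32> {
    // trim removes the trailing "\n" that read_line leaves in the buffer
    raw.trim().parse().ok()
}

/// Returns the hint the game prints for a guess.
fn hint(guess: u32, secret_number: u32) -> &'static str {
    match guess.cmp(&secret_number) {
        Ordering::Less => "Too small!",
        Ordering::Greater => "Too big!",
        Ordering::Equal => "You win!",
    }
}

fn main() {
    // Compare parsing with and without trim on input that ends in a newline.
    println!("without trim ok: {}", "50\n".parse::<u32>().is_ok());
    println!("with trim: {:?}", parse_guess("50\n"));
    println!("{}", hint(50, 42));
}
```

Both arguments of cmp must have the same type, which is why the String has to become a u32 before the comparison compiles.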

Add a loop so the player can keep guessing:

use std::io;
use std::cmp::Ordering;
use rand::Rng;

fn main() {
    println!("Guess the number!");

    let secret_number = rand::thread_rng().gen_range(1, 101);

    println!("The secret number is: {}", secret_number);

    loop {
        println!("Please input your guess.");

        let mut guess = String::new();

        io::stdin().read_line(&mut guess)
            .expect("Failed to read line");

        let guess: u32 = guess.trim().parse()
            .expect("Please type a number!");

        println!("You guessed: {}", guess);

        match guess.cmp(&secret_number) {
            Ordering::Less => println!("Too small!"),
            Ordering::Greater => println!("Too big!"),
            Ordering::Equal => println!("You win!"),
        }
    }

}

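The program still needs to exit cleanly when the guess is correct, and the conversion of guess should handle invalid input instead of panicking.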

use std::io;
use std::cmp::Ordering;
use rand::Rng;

fn main() {
    println!("Guess the number!");

    let secret_number = rand::thread_rng().gen_range(1, 101);

    // println!("The secret number is: {}", secret_number);

    loop {
        println!("Please input your guess.");

        let mut guess = String::new();

        io::stdin().read_line(&mut guess)
            .expect("Failed to read line");

        // Replace expect with match
        // parse returns a Result, an enum whose variants are Ok and Err
        // On Err, ignore the input and ask for another guess
        let guess: u32 = match guess.trim().parse() {
            Ok(num) => num,
            Err(_) => continue,
        };

        println!("You guessed: {}", guess);

        match guess.cmp(&secret_number) {
            Ordering::Less => println!("Too small!"),
            Ordering::Greater => println!("Too big!"),
            Ordering::Equal => {
                println!("You win!");
                break;
            },
        }
    }

}
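
One way to check the finished loop without typing guesses by hand is to run the same logic over arbitrary input and output streams. This is a sketch under stated assumptions: the function name play, the fixed secret number, and the returned guess count are all additions for illustration and are not in the original program.

```rust
use std::cmp::Ordering;
use std::io::{self, BufRead, Write};

/// Runs the game against any input and output and returns the number of
/// valid guesses made. `play` is a hypothetical name for this sketch.
fn play<R: BufRead, W: Write>(secret_number: u32, input: R, mut output: W) -> io::Result<u32> {
    let mut attempts = 0;
    for line in input.lines() {
        // Skip anything that is not a number, like the match on parse above.
        let guess: u32 = match line?.trim().parse() {
            Ok(num) => num,
            Err(_) => continue,
        };
        attempts += 1;
        match guess.cmp(&secret_number) {
            Ordering::Less => writeln!(output, "Too small!")?,
            Ordering::Greater => writeln!(output, "Too big!")?,
            Ordering::Equal => {
                writeln!(output, "You win!")?;
                break;
            }
        }
    }
    Ok(attempts)
}

fn main() -> io::Result<()> {
    // A fixed secret keeps the sketch free of the rand dependency.
    let stdin = io::stdin();
    let attempts = play(42, stdin.lock(), io::stdout())?;
    println!("Guesses: {}", attempts);
    Ok(())
}
```

Passing a byte slice as input and a Vec<u8> as output lets the whole game be exercised in a unit test.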


February 17, 2020

Media Resource Control Protocol - MRCP

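MRCP is the protocol through which a speech server provides services such as speech recognition and speech synthesis to its clients. MRCP cannot work on its own: it relies on RTSP or SIP to establish the control session and the audio streams. It is a text-based protocol similar to HTTP, and each message has three parts: a first line, headers, and a body.

Like HTTP, MRCP follows a request and response model. For example, an MRCP client sends a request asking to stream audio data to the server for speech recognition, and the server replies with a message containing the port number on which it will receive the data. MRCP itself does not define how the audio is transported, so that part is handled by RTP.

MRCPv2 (RFC 6787) uses SIP to manage the session and the audio streams, whereas MRCPv1 (RFC 4463) relied on RTSP (RFC 2326). Most current discussion concerns MRCPv2. During the MRCPv2 work it was agreed that this way of using RTSP would cause backward-compatibility problems, which Section 3.2 of RFC 4313 (Requirements for Distributed Control of Automatic Speech Recognition (ASR), Speaker Identification/Speaker Verification (SI/SV), and Text-to-Speech (TTS) Resources) forbids. That is why MRCPv2 does not run over RTSP.

MRCPv2 uses SIP to set up the control session and the media sessions for the speech resources, adds support for speaker verification and identification engines, and leaves room for future extensions.

The architecture diagram from the MRCPv2 specification: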

     MRCPv2 client                   MRCPv2 Media Resource Server
|--------------------|            |------------------------------------|
||------------------||            ||----------------------------------||
|| Application Layer||            ||Synthesis|Recognition|Verification||
||------------------||            || Engine  |  Engine   |   Engine   ||
||Media Resource API||            ||    ||   |    ||     |    ||      ||
||------------------||            ||Synthesis|Recognizer |  Verifier  ||
|| SIP  |  MRCPv2   ||            ||Resource | Resource  |  Resource  ||
||Stack |           ||            ||     Media Resource Management    ||
||      |           ||            ||----------------------------------||
||------------------||            ||   SIP  |        MRCPv2           ||
||   TCP/IP Stack   ||---MRCPv2---||  Stack |                         ||
||                  ||            ||----------------------------------||
||------------------||----SIP-----||           TCP/IP Stack           ||
|--------------------|            ||                                  ||
         |                        ||----------------------------------||
        SIP                       |------------------------------------|
         |                          /
|-------------------|             RTP
|                   |             /
| Media Source/Sink |------------/
|                   |
|-------------------|

                      Figure 1: Architectural Diagram

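W3C founded the Voice Browser Working Group (VBWG) in 1999 to study how the web could support speech recognition and DTMF handling. The group then published a web-based voice interface framework whose core is VoiceXML.

The W3C Speech Recognition Grammar Specification (SRGS) is an XML standard for writing speech grammars, the rules that define which words and short phrases can be recognized. Closely related is W3C Semantic Interpretation for Speech Recognition (SISR), which annotates speech grammars with semantic information and forms a basic format for natural language understanding. The W3C Speech Synthesis Markup Language (SSML) is an XML-based way to specify content for speech synthesis, with control over attributes such as volume, pronunciation, pauses, and speaking rate.

SRGS and SSML are complemented by the W3C Pronunciation Lexicon Specification (PLS), which uses a standard phonetic alphabet to specify how words and short phrases are pronounced.

VoiceXML, together with MRCP, can support a variety of third-party speech recognition and synthesis engines.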

MRCPv2 Media Resource Types

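An MRCPv2 server is a kind of SIP server, so it is addressed with a SIP URI (sip:mrcpv2@example.net or sips:mrcpv2@example.net). It can offer the following media processing resources to clients:

  • Basic Synthesizer

    Produces a speech media stream by concatenating audio clips. The speech data is described with a limited subset of the Speech Synthesis Markup Language (SSML); a basic synthesizer must support at least these SSML tags: <speak>, <audio>, <say-as>, <mark>

  • Speech Synthesizer

    Has full TTS capability and must support SSML completely

  • Recorder

    Records audio and provides a URI for the recording. It must support suppressing silence at the beginning and end of a recording; suppressing silence in the middle is optional. If silence is suppressed, timing metadata must be kept so that the actual timestamps of speech in the original media can be recovered

  • DTMF Recognizer

    Extracts Dual-Tone Multi-Frequency (DTMF) digits from the media stream and matches them against a supplied digit grammar

  • Speech Recognizer

    A full speech recognition resource that receives an audio media stream and produces recognition results. It also includes a natural language semantic interpreter that post-processes the recognized data into the semantic data defined by the grammar

  • Speaker Verifier

    Authenticates a speaker by comparing the voice against an existing voice print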

Resource Type   Resource Description
speechrecog     Speech Recognizer
dtmfrecog       DTMF Recognizer
speechsynth     Speech Synthesizer
basicsynth      Basic Synthesizer
speakverify     Speaker Verification
recorder        Speech Recorder
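
A client usually needs to map these resource type tokens to something typed. A minimal sketch in Rust follows; the enum ResourceType and its method names are hypothetical, while the string tokens are the ones listed in the table.

```rust
use std::str::FromStr;

/// The six MRCPv2 resource types from the table above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ResourceType {
    SpeechRecog,
    DtmfRecog,
    SpeechSynth,
    BasicSynth,
    SpeakVerify,
    Recorder,
}

impl ResourceType {
    /// The token used in SDP (a=resource:...) and in channel identifiers.
    fn as_str(self) -> &'static str {
        match self {
            ResourceType::SpeechRecog => "speechrecog",
            ResourceType::DtmfRecog => "dtmfrecog",
            ResourceType::SpeechSynth => "speechsynth",
            ResourceType::BasicSynth => "basicsynth",
            ResourceType::SpeakVerify => "speakverify",
            ResourceType::Recorder => "recorder",
        }
    }
}

impl FromStr for ResourceType {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "speechrecog" => Ok(ResourceType::SpeechRecog),
            "dtmfrecog" => Ok(ResourceType::DtmfRecog),
            "speechsynth" => Ok(ResourceType::SpeechSynth),
            "basicsynth" => Ok(ResourceType::BasicSynth),
            "speakverify" => Ok(ResourceType::SpeakVerify),
            "recorder" => Ok(ResourceType::Recorder),
            other => Err(format!("unknown MRCPv2 resource type: {}", other)),
        }
    }
}

fn main() {
    let parsed: ResourceType = "speechsynth".parse().unwrap();
    println!("{:?} -> {}", parsed, parsed.as_str());
}
```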

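In the MRCPv2 specification, a session is used as follows:

  1. The MRCP client uses SIP and SDP to establish an MRCP control channel with the MRCP server. The channel is uniquely identified by an MRCP channel ID, which the server specifies in the a=channel attribute of its 200 response.

  2. A SIP re-INVITE can add or remove MRCP control channels within a session, so one session can hold several control channels (for example, an ASR channel and a TTS channel at the same time).

  3. Multiple MRCP control channels can share one TCP connection.

  4. An MRCP message carries exactly one MRCP channel ID.

  5. MRCP control messages cannot change the state of the SIP dialog.

  6. MRCP itself does not guarantee reliable delivery, so it must run over TCP or TLS.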

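Resource control channel

MRCPv2 channels are negotiated in the SDP carried by SIP. The client connects to the MRCPv2 server with a SIP INVITE, which creates a SIP dialog. Through SDP the two endpoints negotiate every resource control channel to be set up and establish the media session between the server and the source or sink of audio.

The client needs a separate MRCPv2 resource control channel for each media resource it controls within the SIP dialog, so a unique channel identifier string has to be generated for each channel.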
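
A channel identifier such as 32AECB234338@speechsynth has two parts separated by "@". The sketch below splits and validates one, assuming the channel-id grammar quoted later in this post (alphanumerics on both sides of the "@"). The function name parse_channel_id is hypothetical.

```rust
/// Splits a channel identifier such as "32AECB234338@speechsynth" into
/// its channel ID part and its resource type part.
/// Returns None unless both parts are non-empty and alphanumeric.
fn parse_channel_id(value: &str) -> Option<(&str, &str)> {
    let mut parts = value.splitn(2, '@');
    let id = parts.next()?;
    let resource = parts.next()?;
    let valid = |s: &str| !s.is_empty() && s.chars().all(|c| c.is_ascii_alphanumeric());
    if valid(id) && valid(resource) {
        Some((id, resource))
    } else {
        None
    }
}

fn main() {
    match parse_channel_id("32AECB234338@speechsynth") {
        Some((id, resource)) => println!("id={} resource={}", id, resource),
        None => println!("not a valid channel identifier"),
    }
}
```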

The SDP must contain one "m=" line for each MRCPv2 resource used in the session. The transport type must be "TCP/MRCPv2" or "TCP/TLS/MRCPv2", so the client connects to the MRCPv2 server over TCP or over TLS on top of TCP.
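
As a rough sketch, a client could assemble the media description for one control channel like this. The function name control_channel_offer is hypothetical, and the fixed values (port 9, setup:active, connection:new) are assumptions copied from the INVITE offer shown in this post rather than the only legal choices.

```rust
/// Builds the SDP media description a client might offer for one MRCPv2
/// control channel. Attribute values mirror the INVITE example in this post.
fn control_channel_offer(resource: &str, use_tls: bool, cmid: u32) -> String {
    let transport = if use_tls { "TCP/TLS/MRCPv2" } else { "TCP/MRCPv2" };
    let lines = [
        format!("m=application 9 {} 1", transport),
        "a=setup:active".to_string(),
        "a=connection:new".to_string(),
        format!("a=resource:{}", resource),
        format!("a=cmid:{}", cmid),
    ];
    // SDP lines are terminated with CRLF.
    lines.iter().map(|l| format!("{}\r\n", l)).collect()
}

fn main() {
    print!("{}", control_channel_offer("speechsynth", false, 1));
}
```

The cmid value has to match the a=mid of the audio stream that the resource should use.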

Example:

This example connects to a synthesizer; the server produces a one-way audio stream that is sent to the client.

  1. Create the synthesizer control channel

   C->S:  INVITE sip:mresources@example.com SIP/2.0
          Via:SIP/2.0/TCP client.atlanta.example.com:5060;
           branch=z9hG4bK74bf1
          Max-Forwards:6
          To:MediaServer <sip:mresources@example.com>
          From:sarvi <sip:sarvi@example.com>;tag=1928301774
          Call-ID:a84b4c76e66710
          CSeq:314161 INVITE
          Contact:<sip:sarvi@client.example.com>
          Content-Type:application/sdp
          Content-Length:...

          v=0
          o=sarvi 2890844526 2890844526 IN IP4 192.0.2.12
          s=-
          c=IN IP4 192.0.2.12
          t=0 0
          m=application 9 TCP/MRCPv2 1
          a=setup:active
          a=connection:new
          a=resource:speechsynth
          a=cmid:1
          m=audio 49170 RTP/AVP 0
          a=rtpmap:0 pcmu/8000
          a=recvonly
          a=mid:1

   S->C:  SIP/2.0 200 OK
          Via:SIP/2.0/TCP client.atlanta.example.com:5060;
           branch=z9hG4bK74bf1;received=192.0.32.10
          To:MediaServer <sip:mresources@example.com>;tag=62784
          From:sarvi <sip:sarvi@example.com>;tag=1928301774
          Call-ID:a84b4c76e66710
          CSeq:314161 INVITE
          Contact:<sip:mresources@server.example.com>
          Content-Type:application/sdp
          Content-Length:...

          v=0
          o=- 2890842808 2890842808 IN IP4 192.0.2.11
          s=-
          c=IN IP4 192.0.2.11
          t=0 0
          m=application 32416 TCP/MRCPv2 1
          a=setup:passive
          a=connection:new
          a=channel:32AECB234338@speechsynth
          a=cmid:1
          m=audio 48260 RTP/AVP 0
          a=rtpmap:0 pcmu/8000
          a=sendonly
          a=mid:1

   C->S:  ACK sip:mresources@server.example.com SIP/2.0
          Via:SIP/2.0/TCP client.atlanta.example.com:5060;
           branch=z9hG4bK74bf2
          Max-Forwards:6
          To:MediaServer <sip:mresources@example.com>;tag=62784
          From:Sarvi <sip:sarvi@example.com>;tag=1928301774
          Call-ID:a84b4c76e66710
          CSeq:314161 ACK
          Content-Length:0

Next, keeping the RTP stream above, the client requests an additional resource control channel for a recognizer and switches the audio stream to sendrecv for two-way audio:

   C->S:  INVITE sip:mresources@server.example.com SIP/2.0
          Via:SIP/2.0/TCP client.atlanta.example.com:5060;
           branch=z9hG4bK74bf3
          Max-Forwards:6
          To:MediaServer <sip:mresources@example.com>;tag=62784
          From:sarvi <sip:sarvi@example.com>;tag=1928301774
          Call-ID:a84b4c76e66710
          CSeq:314162 INVITE
          Contact:<sip:sarvi@client.example.com>
          Content-Type:application/sdp
          Content-Length:...

          v=0
          o=sarvi 2890844526 2890844527 IN IP4 192.0.2.12
          s=-
          c=IN IP4 192.0.2.12
          t=0 0
          m=application 9 TCP/MRCPv2 1
          a=setup:active
          a=connection:existing
          a=resource:speechsynth
          a=cmid:1
          m=audio 49170 RTP/AVP 0 96
          a=rtpmap:0 pcmu/8000
          a=rtpmap:96 telephone-event/8000
          a=fmtp:96 0-15
          a=sendrecv
          a=mid:1
          m=application 9 TCP/MRCPv2 1
          a=setup:active
          a=connection:existing
          a=resource:speechrecog
          a=cmid:1

   S->C:  SIP/2.0 200 OK
          Via:SIP/2.0/TCP client.atlanta.example.com:5060;
           branch=z9hG4bK74bf3;received=192.0.32.10
          To:MediaServer <sip:mresources@example.com>;tag=62784
          From:sarvi <sip:sarvi@example.com>;tag=1928301774
          Call-ID:a84b4c76e66710
          CSeq:314162 INVITE
          Contact:<sip:mresources@server.example.com>
          Content-Type:application/sdp
          Content-Length:...

          v=0
          o=- 2890842808 2890842809 IN IP4 192.0.2.11
          s=-
          c=IN IP4 192.0.2.11
          t=0 0
          m=application 32416 TCP/MRCPv2 1
          a=setup:passive
          a=connection:existing
          a=channel:32AECB234338@speechsynth
          a=cmid:1
          m=audio 48260 RTP/AVP 0 96
          a=rtpmap:0 pcmu/8000
          a=rtpmap:96 telephone-event/8000
          a=fmtp:96 0-15
          a=sendrecv
          a=mid:1
          m=application 32416 TCP/MRCPv2 1
          a=setup:passive
          a=connection:existing
          a=channel:32AECB234338@speechrecog
          a=cmid:1

   C->S:  ACK sip:mresources@server.example.com SIP/2.0
          Via:SIP/2.0/TCP client.atlanta.example.com:5060;
           branch=z9hG4bK74bf4
          Max-Forwards:6
          To:MediaServer <sip:mresources@example.com>;tag=62784
          From:Sarvi <sip:sarvi@example.com>;tag=1928301774
          Call-ID:a84b4c76e66710
          CSeq:314162 ACK
          Content-Length:0

Release the recognizer channel and switch the audio back to recvonly:

   C->S:  INVITE sip:mresources@server.example.com SIP/2.0
          Via:SIP/2.0/TCP client.atlanta.example.com:5060;
           branch=z9hG4bK74bf5
          Max-Forwards:6
          To:MediaServer <sip:mresources@example.com>;tag=62784
          From:sarvi <sip:sarvi@example.com>;tag=1928301774
          Call-ID:a84b4c76e66710
          CSeq:314163 INVITE
          Contact:<sip:sarvi@client.example.com>
          Content-Type:application/sdp
          Content-Length:...

          v=0
          o=sarvi 2890844526 2890844528 IN IP4 192.0.2.12
          s=-
          c=IN IP4 192.0.2.12
          t=0 0
          m=application 9 TCP/MRCPv2 1
          a=resource:speechsynth
          a=cmid:1
          m=audio 49170 RTP/AVP 0
          a=rtpmap:0 pcmu/8000
          a=recvonly
          a=mid:1
          m=application 0 TCP/MRCPv2 1
          a=resource:speechrecog
          a=cmid:1


   S->C:  SIP/2.0 200 OK
          Via:SIP/2.0/TCP client.atlanta.example.com:5060;
           branch=z9hG4bK74bf5;received=192.0.32.10
          To:MediaServer <sip:mresources@example.com>;tag=62784
          From:sarvi <sip:sarvi@example.com>;tag=1928301774
          Call-ID:a84b4c76e66710
          CSeq:314163 INVITE
          Contact:<sip:mresources@server.example.com>
          Content-Type:application/sdp
          Content-Length:...

          v=0
          o=- 2890842808 2890842810 IN IP4 192.0.2.11
          s=-
          c=IN IP4 192.0.2.11
          t=0 0
          m=application 32416 TCP/MRCPv2 1
          a=channel:32AECB234338@speechsynth
          a=cmid:1
          m=audio 48260 RTP/AVP 0
          a=rtpmap:0 pcmu/8000
          a=sendonly
          a=mid:1
          m=application 0 TCP/MRCPv2 1
          a=channel:32AECB234338@speechrecog
          a=cmid:1

   C->S:  ACK sip:mresources@server.example.com SIP/2.0
          Via:SIP/2.0/TCP client.atlanta.example.com:5060;
           branch=z9hG4bK74bf6
          Max-Forwards:6
          To:MediaServer <sip:mresources@example.com>;tag=62784
          From:Sarvi <sip:sarvi@example.com>;tag=1928301774
          Call-ID:a84b4c76e66710
          CSeq:314163 ACK
          Content-Length:0

MRCPv2 message

MRCPv2 messages are requests from the client to the server, and responses and asynchronous events from the server to the client. Similar to HTTP, a message consists of one start-line, one or more headers, an empty line that marks the end of the headers, and an optional message body:

generic-message  =    start-line
                      message-header
                      CRLF
                      [ message-body ]

message-body     =    *OCTET

start-line       =    request-line / response-line / event-line

message-header   =  1*(generic-header / resource-header / generic-field)

resource-header  =    synthesizer-header
                 /    recognizer-header
                 /    recorder-header
                 /    verifier-header

ex:

   C->S:   MRCP/2.0 877 INTERPRET 543266
           Channel-Identifier:32AECB23433801@speechrecog
           Interpret-Text:may I speak to Andre Roy
           Content-Type:application/srgs+xml
           Content-ID:<request1@form-level.store>
           Content-Length:661

           <?xml version="1.0"?>
           <!-- the default grammar language is US English -->
           <grammar xmlns="http://www.w3.org/2001/06/grammar"
                    xml:lang="en-US" version="1.0" root="request">
           <!-- single language attachment to tokens -->
               <rule id="yes">
                   <one-of>
                       <item xml:lang="fr-CA">oui</item>
                       <item xml:lang="en-US">yes</item>
                   </one-of>
               </rule>
           <!-- single language attachment to a rule expansion -->
               <rule id="request">
                   may I speak to
                   <one-of xml:lang="fr-CA">
                       <item>Michel Tremblay</item>
                       <item>Andre Roy</item>
                   </one-of>
               </rule>
           </grammar>

   S->C:   MRCP/2.0 82 543266 200 IN-PROGRESS
           Channel-Identifier:32AECB23433801@speechrecog

   S->C:   MRCP/2.0 634 INTERPRETATION-COMPLETE 543266 200 COMPLETE
           Channel-Identifier:32AECB23433801@speechrecog
           Completion-Cause:000 success
           Content-Type:application/nlsml+xml
           Content-Length:441

           <?xml version="1.0"?>
           <result xmlns="urn:ietf:params:xml:ns:mrcpv2"
                   xmlns:ex="http://www.example.com/example"
                   grammar="session:request1@form-level.store">
               <interpretation>
                   <instance name="Person">
                       <ex:Person>
                           <ex:Name> Andre Roy </ex:Name>
                       </ex:Person>
                   </instance>
                   <input>   may I speak to Andre Roy </input>
               </interpretation>
           </result>

The request-line format is:

   request-line   =    mrcp-version SP message-length SP method-name SP request-id CRLF

   method-name    =    generic-method
                  /    synthesizer-method
                  /    recognizer-method
                  /    recorder-method
                  /    verifier-method
                  
   request-id     =    1*10DIGIT

The response-line format is:

   response-line  =    mrcp-version SP message-length SP request-id
                       SP status-code SP request-state CRLF

   status-code    =    3DIGIT

   request-state  =    "COMPLETE"
                  /    "IN-PROGRESS"
                  /    "PENDING"

The event-line format is:

   event-line     =    mrcp-version SP message-length SP event-name
                       SP request-id SP request-state CRLF

   event-name     =    synthesizer-event
                  /    recognizer-event
                  /    recorder-event
                  /    verifier-event

Note that the message format defines separate methods, headers, and events for each of the four resource types: synthesizer, recognizer, recorder, and verifier.
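
The three start-line grammars can be told apart by token count and by whether the third token is numeric. The following is a strict sketch of that idea, not a complete parser: the type StartLine and the function parse_start_line are hypothetical names, and the request-state token is kept as a string without further validation.

```rust
#[derive(Debug, PartialEq)]
enum StartLine {
    Request { length: usize, method: String, request_id: u64 },
    Response { length: usize, request_id: u64, status: u16, state: String },
    Event { length: usize, event: String, request_id: u64, state: String },
}

/// Classifies a start-line according to the three grammars above.
fn parse_start_line(line: &str) -> Option<StartLine> {
    let tokens: Vec<&str> = line.trim_end().split(' ').collect();
    if tokens.first() != Some(&"MRCP/2.0") {
        return None;
    }
    let length: usize = tokens.get(1)?.parse().ok()?;
    match tokens.len() {
        // version, length, method-name, request-id
        4 => Some(StartLine::Request {
            length,
            method: tokens[2].to_string(),
            request_id: tokens[3].parse().ok()?,
        }),
        // Five tokens: a numeric third token means a response, otherwise an event.
        5 => match tokens[2].parse::<u64>() {
            Ok(request_id) => Some(StartLine::Response {
                length,
                request_id,
                status: tokens[3].parse().ok()?,
                state: tokens[4].to_string(),
            }),
            Err(_) => Some(StartLine::Event {
                length,
                event: tokens[2].to_string(),
                request_id: tokens[3].parse().ok()?,
                state: tokens[4].to_string(),
            }),
        },
        _ => None,
    }
}

fn main() {
    println!("{:?}", parse_start_line("MRCP/2.0 877 INTERPRET 543266"));
    println!("{:?}", parse_start_line("MRCP/2.0 82 543266 200 IN-PROGRESS"));
}
```

Because the sketch follows the grammar literally, it rejects the INTERPRETATION-COMPLETE line in the example above, which carries a status code in addition to the tokens the event-line grammar lists.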

Generic Methods, Headers, Result Structure

Methods and headers common to all resources.

MRCPv2 supports two generic methods, which read and write the state of a resource:

   generic-method      =    "SET-PARAMS"
                       /    "GET-PARAMS"

SET-PARAMS

Sent by the client to the server to define parameters for the MRCPv2 resource in the session.

   C->S:  MRCP/2.0 ... SET-PARAMS 543256
          Channel-Identifier:32AECB23433802@speechsynth
          Voice-gender:female
          Voice-variant:3

   S->C:  MRCP/2.0 ... 543256 200 COMPLETE
          Channel-Identifier:32AECB23433802@speechsynth
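
The examples write "..." where the message-length belongs. In MRCPv2 that field counts every octet of the message, including the start-line itself, so the number of digits in the length affects the length. A sketch of one way to compute it follows; build_request is a hypothetical helper, and it leaves adding Content-Length for a body to the caller.

```rust
/// Serializes an MRCPv2 request and fills in message-length.
/// The length includes its own digits, so it is found by iteration.
fn build_request(method: &str, request_id: u64, headers: &[(&str, &str)], body: &str) -> String {
    // Everything that follows the message-length token.
    let mut rest = format!(" {} {}\r\n", method, request_id);
    for (name, value) in headers {
        rest.push_str(&format!("{}:{}\r\n", name, value));
    }
    rest.push_str("\r\n"); // the empty line ends the header section
    rest.push_str(body);

    let fixed = "MRCP/2.0 ".len() + rest.len();
    let mut total = fixed + 1; // start by assuming a one-digit length
    // Grow the estimate until the digit count of `total` is consistent.
    while fixed + total.to_string().len() != total {
        total = fixed + total.to_string().len();
    }
    format!("MRCP/2.0 {}{}", total, rest)
}

fn main() {
    let msg = build_request(
        "SET-PARAMS",
        543256,
        &[
            ("Channel-Identifier", "32AECB23433802@speechsynth"),
            ("Voice-gender", "female"),
            ("Voice-variant", "3"),
        ],
        "",
    );
    print!("{}", msg);
}
```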

GET-PARAMS

Sent by the client to the server to ask for the current session parameters of the MRCPv2 resource.

   C->S:   MRCP/2.0 ... GET-PARAMS 543256
           Channel-Identifier:32AECB23433802@speechsynth
           Voice-gender:
           Voice-variant:
           Vendor-Specific-Parameters:com.example.param1;
                         com.example.param2
   S->C:   MRCP/2.0 ... 543256 200 COMPLETE
           Channel-Identifier:32AECB23433802@speechsynth
           Voice-gender:female
           Voice-variant:3
           Vendor-Specific-Parameters:com.example.param1="Company Name";
                         com.example.param2="124324234@example.com"

MRCPv2 headers are divided into generic headers and resource-specific headers.

Header fields are defined as follows:

   generic-field  = field-name ":" [ field-value ]
   field-name     = token
   field-value    = *LWS field-content *( CRLF 1*LWS field-content)
   field-content  = <the OCTETs making up the field-value
                    and consisting of either *TEXT or combinations
                    of token, separators, and quoted-string>

The generic headers are:

   generic-header      =    channel-identifier
                       /    accept
                       /    active-request-id-list
                       /    proxy-sync-id
                       /    accept-charset
                       /    content-type
                       /    content-id
                       /    content-base
                       /    content-encoding
                       /    content-location
                       /    content-length
                       /    fetch-timeout
                       /    cache-control
                       /    logging-tag
                       /    set-cookie
                       /    vendor-specific

  • Channel-Identifier

    When a control channel is created, the server assigns it a channel ID

   channel-identifier  = "Channel-Identifier" ":" channel-id CRLF
   channel-id          = 1*alphanum "@" 1*alphanum

  • Accept

  • Active-Request-Id-List

    In a request, this header lists the request-ids that the request applies to. In a response, it lists the request-ids that the method affected

       active-request-id-list  =  "Active-Request-Id-List" ":"
                               request-id *("," request-id) CRLF

  • Proxy-Sync-Id

    When a server resource generates a "barge-in-able" event, it also generates a unique tag, which is delivered to the client in this header of the event

       proxy-sync-id    =  "Proxy-Sync-Id" ":" 1*VCHAR CRLF

  • Accept-Charset

    Sent in a request to specify the character sets that are acceptable in the response or in events.

    For example, it can set the character set used for the Natural Language Semantic Markup Language (NLSML) results of a RECOGNITION-COMPLETE event

  • Content-Type

    MRCPv2 supports a restricted set of media types for content, such as speech markup, grammars, and recognition results

       content-type     =    "Content-Type" ":" media-type-value CRLF
    
       media-type-value =    type "/" subtype *( ";" parameter )
    
       type             =    token
    
       subtype          =    token
    
       parameter        =    attribute "=" value
    
       attribute        =    token
    
       value            =    token / quoted-string

  • Content-ID

    An ID or name by which the content can be referenced

  • Content-Base

    Specifies the base URI

       content-base      = "Content-Base" ":" absoluteURI CRLF

  • Content-Encoding

    A modifier for the Content-Type that indicates additional content codings, for example Content-Encoding:gzip

       content-encoding  = "Content-Encoding" ":"
                           *WSP content-coding
                           *(*WSP "," *WSP content-coding *WSP )
                           CRLF

  • Content-Location

       content-location  =  "Content-Location" ":"
                            ( absoluteURI / relativeURI ) CRLF

  • Content-Length

    The length of the message body

       content-length  =  "Content-Length" ":" 1*19DIGIT CRLF

  • Fetch-Timeout

    When a recognizer or synthesizer needs to fetch documents or other resources, this header sets the timeout for the server's fetches over the network

       fetch-timeout       =   "Fetch-Timeout" ":" 1*19DIGIT CRLF

  • Cache-Control

    If the server supports content caching, it follows the HTTP/1.1 caching rules

       cache-control    =    "Cache-Control" ":"
                             [*WSP cache-directive
                             *( *WSP "," *WSP cache-directive *WSP )]
                             CRLF
    
       cache-directive     = "max-age" "=" delta-seconds
                           / "max-stale" [ "=" delta-seconds ]
                           / "min-fresh" "=" delta-seconds
    
       delta-seconds       = 1*19DIGIT

  • Logging-Tag

    Used with the SET-PARAMS and GET-PARAMS methods to set or retrieve the logging tag that the server attaches to the logs it generates

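       logging-tag    = "Logging-Tag" ":" 1*UTFCHAR CRLF

  • Set-Cookie

    Works like HTTP cookies: the client and server exchange cookie values in this header, so the HTTP fetches the server makes on the client's behalf share the client's cookie store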

  • Vendor-Specific Parameters

    ex:

       com.example.companyA.paramxyz=256
       com.example.companyA.paramabc=High
       com.example.companyB.paramxyz=Low
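
Putting the generic-field grammar to work, a header section can be split into name and value pairs at the first ":" of each line. This is a sketch only: parse_headers is a hypothetical helper, header names are kept exactly as written, and joining folded continuation lines with a single space is an assumption borrowed from HTTP-style header folding.

```rust
/// Parses the header section of an MRCPv2 message into (name, value) pairs.
/// Lines starting with a space or tab continue the previous header's value.
fn parse_headers(section: &str) -> Vec<(String, String)> {
    let mut headers: Vec<(String, String)> = Vec::new();
    for line in section.lines() {
        if line.is_empty() {
            break; // the empty line ends the header section
        }
        if line.starts_with(' ') || line.starts_with('\t') {
            // Folded line: append to the value of the previous header.
            if let Some(last) = headers.last_mut() {
                last.1.push(' ');
                last.1.push_str(line.trim());
            }
            continue;
        }
        if let Some(idx) = line.find(':') {
            let name = line[..idx].trim().to_string();
            let value = line[idx + 1..].trim().to_string();
            headers.push((name, value));
        }
    }
    headers
}

fn main() {
    let section = "Channel-Identifier:32AECB23433802@speechsynth\r\nVoice-gender:female\r\n\r\n";
    for (name, value) in parse_headers(section) {
        println!("{} = {}", name, value);
    }
}
```

An empty value, as in the GET-PARAMS request, comes back as an empty string.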

Generic Result Structure

Result data produced by recognizer and verifier resources is delivered in Natural Language Semantics Markup Language (NLSML) format.

ex:

   Content-Type:application/nlsml+xml
   Content-Length:...

   <?xml version="1.0"?>
   <result xmlns="urn:ietf:params:xml:ns:mrcpv2"
           xmlns:ex="http://www.example.com/example"
           grammar="http://theYesNoGrammar">
       <interpretation>
           <instance>
                   <ex:response>yes</ex:response>
           </instance>
           <input>OK</input>
       </interpretation>
   </result>

Resource Discovery

The client queries the server's capabilities with a SIP OPTIONS request.

The server must describe its capabilities in SDP, including the media and transport types (m=application 0 TCP/TLS/MRCPv2 1) and the resources (a=resource:speechsynth).

ex:

   C->S:
        OPTIONS sip:mrcp@server.example.com SIP/2.0
        Via:SIP/2.0/TCP client.atlanta.example.com:5060;
         branch=z9hG4bK74bf7
        Max-Forwards:6
        To:<sip:mrcp@example.com>
        From:Sarvi <sip:sarvi@example.com>;tag=1928301774
        Call-ID:a84b4c76e66710
        CSeq:63104 OPTIONS
        Contact:<sip:sarvi@client.example.com>
        Accept:application/sdp
        Content-Length:0


   S->C:
        SIP/2.0 200 OK
        Via:SIP/2.0/TCP client.atlanta.example.com:5060;
         branch=z9hG4bK74bf7;received=192.0.32.10
        To:<sip:mrcp@example.com>;tag=62784
        From:Sarvi <sip:sarvi@example.com>;tag=1928301774
        Call-ID:a84b4c76e66710
        CSeq:63104 OPTIONS
        Contact:<sip:mrcp@server.example.com>
        Allow:INVITE, ACK, CANCEL, OPTIONS, BYE
        Accept:application/sdp
        Accept-Encoding:gzip
        Accept-Language:en
        Supported:foo
        Content-Type:application/sdp
        Content-Length:...

        v=0
        o=sarvi 2890844536 2890842811 IN IP4 192.0.2.12
        s=-
        i=MRCPv2 server capabilities
        c=IN IP4 192.0.2.12/127
        t=0 0
        m=application 0 TCP/TLS/MRCPv2 1
        a=resource:speechsynth
        a=resource:speechrecog
        a=resource:speakverify
        m=audio 0 RTP/AVP 0 3
        a=rtpmap:0 PCMU/8000
        a=rtpmap:3 GSM/8000

Speech Synthesizer Resource

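The client sends marked-up text, and the server generates the audio stream in real time. Synthesis parameters such as voice characteristics and speaking rate can be specified.

There are two types: speechsynth and basicsynth.

Synthesizer State Machine

A pending SPEAK request can be deleted or stopped.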

   Idle                    Speaking                  Paused
   State                   State                     State
     |                        |                          |
     |----------SPEAK-------->|                 |--------|
     |<------STOP-------------|             CONTROL      |
     |<----SPEAK-COMPLETE-----|                 |------->|
     |<----BARGE-IN-OCCURRED--|                          |
     |              |---------|                          |
     |          CONTROL       |-----------PAUSE--------->|
     |              |-------->|<----------RESUME---------|
     |                        |               |----------|
     |----------|             |              PAUSE       |
     |    BARGE-IN-OCCURRED   |               |--------->|
     |<---------|             |----------|               |
     |                        |      SPEECH-MARKER       |
     |                        |<---------|               |
     |----------|             |----------|               |
     |         STOP           |       RESUME             |
     |          |             |<---------|               |
     |<---------|             |                          |
     |<---------------------STOP-------------------------|
     |----------|             |                          |
     |     DEFINE-LEXICON     |                          |
     |          |             |                          |
     |<---------|             |                          |
     |<---------------BARGE-IN-OCCURRED------------------|
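
The diagram above can also be read as a transition table. The following is a minimal Python sketch of that reading, not an implementation from any MRCP library: the names `TRANSITIONS` and `next_state` are hypothetical, and the table only covers the arrows drawn in the diagram.

```python
# Transition table read off the synthesizer state diagram.
# TRANSITIONS and next_state are illustrative names, not a real MRCP API.
TRANSITIONS = {
    ("idle", "SPEAK"): "speaking",
    ("idle", "STOP"): "idle",
    ("idle", "BARGE-IN-OCCURRED"): "idle",
    ("idle", "DEFINE-LEXICON"): "idle",
    ("speaking", "STOP"): "idle",
    ("speaking", "SPEAK-COMPLETE"): "idle",
    ("speaking", "BARGE-IN-OCCURRED"): "idle",
    ("speaking", "CONTROL"): "speaking",
    ("speaking", "SPEECH-MARKER"): "speaking",
    ("speaking", "RESUME"): "speaking",
    ("speaking", "PAUSE"): "paused",
    ("paused", "PAUSE"): "paused",
    ("paused", "CONTROL"): "paused",
    ("paused", "RESUME"): "speaking",
    ("paused", "STOP"): "idle",
    ("paused", "BARGE-IN-OCCURRED"): "idle",
}


def next_state(state, message):
    """Return the state reached from `state` on a method or event name."""
    try:
        return TRANSITIONS[(state, message)]
    except KeyError:
        raise ValueError(f"{message} is not drawn for state {state}") from None


# Walk one utterance: start speaking, pause, resume, finish.
state = "idle"
for message in ["SPEAK", "PAUSE", "RESUME", "SPEAK-COMPLETE"]:
    state = next_state(state, message)
print(state)  # prints: idle
```

Keeping the arrows in one dictionary makes it easy to compare the sketch against the diagram line by line.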

Synthesizer Methods

   synthesizer-method   =  "SPEAK"
                        /  "STOP"
                        /  "PAUSE"
                        /  "RESUME"
                        /  "BARGE-IN-OCCURRED"
                        /  "CONTROL"
                        /  "DEFINE-LEXICON"

Synthesizer Events

   synthesizer-event    =  "SPEECH-MARKER"
                        /  "SPEAK-COMPLETE"

Synthesizer Header Fields

   synthesizer-header  =  jump-size
                       /  kill-on-barge-in
                       /  speaker-profile
                       /  completion-cause
                       /  completion-reason
                       /  voice-parameter
                       /  prosody-parameter
                       /  speech-marker
                       /  speech-language
                       /  fetch-hint
                       /  audio-fetch-hint
                       /  failed-uri
                       /  failed-uri-cause
                       /  speak-restart
                       /  speak-length
                       /  load-lexicon
                       /  lexicon-search-order

Example:

The text is synthesized and played to the media stream. The resource answers with an IN-PROGRESS response and later sends a SPEAK-COMPLETE event

   C->S: MRCP/2.0 ... SPEAK 543257
         Channel-Identifier:32AECB23433802@speechsynth
         Voice-gender:neutral
         Voice-Age:25
         Prosody-volume:medium
         Content-Type:application/ssml+xml
         Content-Length:...

         <?xml version="1.0"?>
            <speak version="1.0"
                xmlns="http://www.w3.org/2001/10/synthesis"
                xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                xsi:schemaLocation="http://www.w3.org/2001/10/synthesis
                   http://www.w3.org/TR/speech-synthesis/synthesis.xsd"
                xml:lang="en-US">
            <p>
             <s>You have 4 new messages.</s>
             <s>The first is from Stephanie Williams and arrived at
                <break/>
                <say-as interpret-as="vxml:time">0342p</say-as>.
                </s>
             <s>The subject is
                    <prosody rate="-20%">ski trip</prosody>
             </s>
            </p>
           </speak>

   S->C: MRCP/2.0 ... 543257 200 IN-PROGRESS
         Channel-Identifier:32AECB23433802@speechsynth
         Speech-Marker:timestamp=857206027059

   S->C: MRCP/2.0 ... SPEAK-COMPLETE 543257 COMPLETE
         Channel-Identifier:32AECB23433802@speechsynth
         Completion-Cause:000 normal
         Speech-Marker:timestamp=857206027059
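
The three messages in this exchange differ only in the layout of their first line. The sketch below is a hedged illustration of telling them apart, assuming the start-line layouts of MRCPv2 (request: version, length, method, request-id; response: version, length, request-id, status, state; event: version, length, event name, request-id, state). The function name `classify_start_line` is hypothetical, and the message length is treated as an opaque token because the examples elide it as `...`.

```python
def classify_start_line(line):
    """Classify an MRCPv2 start line as request, response, or event."""
    parts = line.split()
    if len(parts) < 4 or parts[0] != "MRCP/2.0":
        raise ValueError(f"not an MRCPv2 start line: {line!r}")
    # parts[1] is the message length; it is not interpreted here.
    if len(parts) == 4:
        return {"kind": "request", "method": parts[2], "request_id": parts[3]}
    if len(parts) == 5 and parts[2].isdigit():
        # A response puts the numeric request-id before the status code.
        return {
            "kind": "response",
            "request_id": parts[2],
            "status": int(parts[3]),
            "state": parts[4],
        }
    if len(parts) == 5:
        # An event puts the event name before the request-id.
        return {
            "kind": "event",
            "event": parts[2],
            "request_id": parts[3],
            "state": parts[4],
        }
    raise ValueError(f"unexpected number of fields: {line!r}")


for line in [
    "MRCP/2.0 ... SPEAK 543257",
    "MRCP/2.0 ... 543257 200 IN-PROGRESS",
    "MRCP/2.0 ... SPEAK-COMPLETE 543257 COMPLETE",
]:
    print(classify_start_line(line)["kind"])
```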

Speech Recognizer Resource

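Receives the voice stream supplied by the client and converts it to text

There are two resource types: speechrecog and dtmfrecog

The capabilities of a recognizer resource are:

  1. Normal Mode Recognition: tries to match the entire speech or DTMF input against the grammar

  2. Hotword Mode Recognition

    Looks for a match against a specific speech grammar or DTMF sequence within the input

  3. Voice Enrolled Grammars

    (optional) Enrollment performs recognition using a particular person's voice. The server maintains a list of contacts containing each person's name and voice. This technique is also called speaker-dependent recognition

  4. Interpretation

    Natural language interpretation

    Takes text as input and produces an interpretation of that text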

Recognizer State Machine

   Idle                   Recognizing               Recognized
   State                  State                     State
    |                       |                          |
    |---------RECOGNIZE---->|---RECOGNITION-COMPLETE-->|
    |<------STOP------------|<-----RECOGNIZE-----------|
    |                       |                          |
    |              |--------|              |-----------|
    |       START-OF-INPUT  |       GET-RESULT         |
    |              |------->|              |---------->|
    |------------|          |                          |
    |      DEFINE-GRAMMAR   |----------|               |
    |<-----------|          | START-INPUT-TIMERS       |
    |                       |<---------|               |
    |------|                |                          |
    |  INTERPRET            |                          |
    |<-----|                |------|                   |
    |                       |   RECOGNIZE              |
    |-------|               |<-----|                   |
    |      STOP                                        |
    |<------|                                          |
    |<-------------------STOP--------------------------|
    |<-------------------DEFINE-GRAMMAR----------------|

Recognizer Methods

   recognizer-method    =  recog-only-method
                        /  enrollment-method
   recog-only-method    =  "DEFINE-GRAMMAR"
                        /  "RECOGNIZE"
                        /  "INTERPRET"
                        /  "GET-RESULT"
                        /  "START-INPUT-TIMERS"
                        /  "STOP"
   enrollment-method    =  "START-PHRASE-ENROLLMENT"
                        /  "ENROLLMENT-ROLLBACK"
                        /  "END-PHRASE-ENROLLMENT"
                        /  "MODIFY-PHRASE"
                        /  "DELETE-PHRASE"

Recognizer Events

   recognizer-event     =  "START-OF-INPUT"
                        /  "RECOGNITION-COMPLETE"
                        /  "INTERPRETATION-COMPLETE"

Recognizer Header Fields

   recognizer-header    =  recog-only-header
                        /  enrollment-header

   recog-only-header    =  confidence-threshold
                        /  sensitivity-level
                        /  speed-vs-accuracy
                        /  n-best-list-length
                        /  no-input-timeout
                        /  input-type
                        /  recognition-timeout
                        /  waveform-uri
                        /  input-waveform-uri
                        /  completion-cause
                        /  completion-reason
                        /  recognizer-context-block
                        /  start-input-timers
                        /  speech-complete-timeout
                        /  speech-incomplete-timeout
                        /  dtmf-interdigit-timeout
                        /  dtmf-term-timeout
                        /  dtmf-term-char
                        /  failed-uri
                        /  failed-uri-cause
                        /  save-waveform
                        /  media-type
                        /  new-audio-channel
                        /  speech-language
                        /  ver-buffer-utterance
                        /  recognition-mode
                        /  cancel-if-queue
                        /  hotword-max-duration
                        /  hotword-min-duration
                        /  interpret-text
                        /  dtmf-buffer-time
                        /  clear-dtmf-buffer
                        /  early-no-match
                        
   enrollment-header    =  num-min-consistent-pronunciations
                        /  consistency-threshold
                        /  clash-threshold
                        /  personal-grammar-uri
                        /  enroll-utterance
                        /  phrase-id
                        /  phrase-nl
                        /  weight
                        /  save-best-waveform
                        /  new-phrase-id
                        /  confusable-phrases-uri
                        /  abort-phrase-enrollment

Example

   C->S:MRCP/2.0 ... RECOGNIZE 543257
   Channel-Identifier:32AECB23433801@speechrecog
           Confidence-Threshold:0.9
   Content-Type:application/srgs+xml
   Content-ID:<request1@form-level.store>
   Content-Length:...

   <?xml version="1.0"?>

   <!-- the default grammar language is US English -->
   <grammar xmlns="http://www.w3.org/2001/06/grammar"
            xml:lang="en-US" version="1.0" root="request">

   <!-- single language attachment to tokens -->
       <rule id="yes">
               <one-of>
                     <item xml:lang="fr-CA">oui</item>
                     <item xml:lang="en-US">yes</item>
               </one-of>
         </rule>

   <!-- single language attachment to a rule expansion -->
         <rule id="request">
               may I speak to
               <one-of xml:lang="fr-CA">
                     <item>Michel Tremblay</item>
                     <item>Andre Roy</item>
               </one-of>
         </rule>

     </grammar>

   S->C: MRCP/2.0 ... 543257 200 IN-PROGRESS
   Channel-Identifier:32AECB23433801@speechrecog

   S->C:MRCP/2.0 ... START-OF-INPUT 543257 IN-PROGRESS
   Channel-Identifier:32AECB23433801@speechrecog

   S->C:MRCP/2.0 ... RECOGNITION-COMPLETE 543257 COMPLETE
   Channel-Identifier:32AECB23433801@speechrecog
   Completion-Cause:000 success
   Waveform-URI:<http://web.media.com/session123/audio.wav>;
                 size=424252;duration=2543
   Content-Type:application/nlsml+xml
   Content-Length:...
   <?xml version="1.0"?>
   <result xmlns="urn:ietf:params:xml:ns:mrcpv2"
           xmlns:ex="http://www.example.com/example"
           grammar="session:request1@form-level.store">
       <interpretation>
           <instance name="Person">
               <ex:Person>
                   <ex:Name> Andre Roy </ex:Name>
               </ex:Person>
           </instance>
               <input>   may I speak to Andre Roy </input>
       </interpretation>
   </result>

Recorder Resource

Stores the received audio/video at the specified URI

Recorder State Machine

   Idle                   Recording
   State                  State
    |                       |
    |---------RECORD------->|
    |                       |
    |<------STOP------------|
    |                       |
    |<--RECORD-COMPLETE-----|
    |                       |
    |              |--------|
    |       START-OF-INPUT  |
    |              |------->|
    |                       |
    |              |--------|
    |    START-INPUT-TIMERS |
    |              |------->|
    |                       |

Recorder Methods

   recorder-method      =  "RECORD"
                        /  "STOP"
                        /  "START-INPUT-TIMERS"

Recorder Events

   recorder-event       =  "START-OF-INPUT"
                        /  "RECORD-COMPLETE"

Recorder Header Fields

   recorder-header      =  sensitivity-level
                        /  no-input-timeout
                        /  completion-cause
                        /  completion-reason
                        /  failed-uri
                        /  failed-uri-cause
                        /  record-uri
                        /  media-type
                        /  max-time
                        /  trim-length
                        /  final-silence
                        /  capture-on-speech
                        /  ver-buffer-utterance
                        /  start-input-timers
                        /  new-audio-channel

Example

   C->S:  MRCP/2.0 ... RECORD 543257
          Channel-Identifier:32AECB23433802@recorder
          Record-URI:<file://mediaserver/recordings/myfile.wav>
          Media-Type:audio/wav
          Capture-On-Speech:true
          Final-Silence:300
          Max-Time:6000

   S->C:  MRCP/2.0 ... 543257 200 IN-PROGRESS
          Channel-Identifier:32AECB23433802@recorder

   S->C:  MRCP/2.0 ... START-OF-INPUT 543257 IN-PROGRESS
          Channel-Identifier:32AECB23433802@recorder

   S->C:  MRCP/2.0 ... RECORD-COMPLETE 543257 COMPLETE
          Channel-Identifier:32AECB23433802@recorder
          Completion-Cause:000 success-silence
          Record-URI:<file://mediaserver/recordings/myfile.wav>;
                     size=242552;duration=25645
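
The Record-URI value in the RECORD-COMPLETE event carries the URI in angle brackets followed by `size` and `duration` parameters. As a rough sketch of how a client might pull those apart (the function name `parse_uri_header` is hypothetical, and the code assumes every parameter value is an integer, as in the example above):

```python
def parse_uri_header(value):
    """Split '<uri>;size=N;duration=M' into the URI and a dict of integers."""
    # Fold the continuation line used in the example into a single line.
    value = " ".join(value.split())
    uri_part, sep, rest = value.partition(">")
    if not sep or not uri_part.startswith("<"):
        raise ValueError(f"URI is not enclosed in <>: {value!r}")
    params = {}
    for item in rest.split(";"):
        item = item.strip()
        if not item:
            continue
        key, _, number = item.partition("=")
        params[key] = int(number)
    return uri_part[1:], params


uri, params = parse_uri_header(
    "<file://mediaserver/recordings/myfile.wav>;\n"
    "           size=242552;duration=25645"
)
print(uri)
print(params)
```

The same shape appears in the Waveform-URI header of the recognizer example, so the sketch applies there as well.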

Speaker Verification and Identification

Verifies or identifies the speaker

Speaker Verification State Machine

     Idle              Session Opened       Verifying/Training
     State             State                State
      |                   |                         |
      |--START-SESSION--->|                         |
      |                   |                         |
      |                   |----------|              |
      |                   |     START-SESSION       |
      |                   |<---------|              |
      |                   |                         |
      |<--END-SESSION-----|                         |
      |                   |                         |
      |                   |---------VERIFY--------->|
      |                   |                         |
      |                   |---VERIFY-FROM-BUFFER--->|
      |                   |                         |
      |                   |----------|              |
      |                   |  VERIFY-ROLLBACK        |
      |                   |<---------|              |
      |                   |                         |
      |                   |                |--------|
      |                   | GET-INTERMEDIATE-RESULT |
      |                   |                |------->|
      |                   |                         |
      |                   |                |--------|
      |                   |     START-INPUT-TIMERS  |
      |                   |                |------->|
      |                   |                         |
      |                   |                |--------|
      |                   |         START-OF-INPUT  |
      |                   |                |------->|
      |                   |                         |
      |                   |<-VERIFICATION-COMPLETE--|
      |                   |                         |
      |                   |<--------STOP------------|
      |                   |                         |
      |                   |----------|              |
      |                   |         STOP            |
      |                   |<---------|              |
      |                   |                         |
      |----------|        |                         |
      |         STOP      |                         |
      |<---------|        |                         |
      |                   |----------|              |
      |                   |    CLEAR-BUFFER         |
      |                   |<---------|              |
      |                   |                         |
      |----------|        |                         |
      |   CLEAR-BUFFER    |                         |
      |<---------|        |                         |
      |                   |                         |
      |                   |----------|              |
      |                   |   QUERY-VOICEPRINT      |
      |                   |<---------|              |
      |                   |                         |
      |----------|        |                         |
      | QUERY-VOICEPRINT  |                         |
      |<---------|        |                         |
      |                   |                         |
      |                   |----------|              |
      |                   |  DELETE-VOICEPRINT      |
      |                   |<---------|              |
      |                   |                         |
      |----------|        |                         |
      | DELETE-VOICEPRINT |                         |
      |<---------|        |                         |

Speaker Verification Methods

   verifier-method          =  "START-SESSION"
                            / "END-SESSION"
                            / "QUERY-VOICEPRINT"
                            / "DELETE-VOICEPRINT"
                            / "VERIFY"
                            / "VERIFY-FROM-BUFFER"
                            / "VERIFY-ROLLBACK"
                            / "STOP"
                            / "CLEAR-BUFFER"
                            / "START-INPUT-TIMERS"
                            / "GET-INTERMEDIATE-RESULT"

Verification Events

   verifier-event       =  "VERIFICATION-COMPLETE"
                        /  "START-OF-INPUT"

Verification Header Fields

   verification-header      =  repository-uri
                            /  voiceprint-identifier
                            /  verification-mode
                            /  adapt-model
                            /  abort-model
                            /  min-verification-score
                            /  num-min-verification-phrases
                            /  num-max-verification-phrases
                            /  no-input-timeout
                            /  save-waveform
                            /  media-type
                            /  waveform-uri
                            /  voiceprint-exists
                            /  ver-buffer-utterance
                            /  input-waveform-uri
                            /  completion-cause
                            /  completion-reason
                            /  speech-complete-timeout
                            /  new-audio-channel
                            /  abort-verification
                            /  start-input-timers

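February 10, 2020

How to Enter Math Formulas and Symbols in Markdown

Use LaTeX syntax to enter math formulas and symbols in Markdown

Math Formulas

1. Inserting formulas

There are two kinds: inline formulas and display formulas

$ inline formula $

$$ display formula $$

ex:

Inline formula \(F=ma\)

Display formula \[F=ma\]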

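2. Superscripts and subscripts

Superscript: use ^, e.g. $x^2$ renders as \(x^2\)

Subscript: use _, e.g. $x_2$ renders as \( x_2 \)

Grouping: use {}, e.g. $x_{12}$ renders as \(x_{12}\)

To attach superscripts and subscripts on both the left and the right, use the \sideset command.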

$$ \sideset{^1_2}{^3_4}\bigotimes $$

\[ \sideset{^1_2}{^3_4}\bigotimes \]

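3. Brackets

( ) [ ] | stand for themselves; use \{ \} to get { }. To display large brackets or delimiters, use the \left and \right commands.

Some special brackets:

Input Output Input Output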
\langle \(\langle\) \rangle \(\rangle\)
\lceil \(\lceil\) \rceil \(\rceil\)
\lfloor \(\lfloor\) \rfloor \(\rfloor\)
\lbrace \(\lbrace\) \rbrace \(\rbrace\)

ex1:

$$ f(x,y,z) = 3y^2z \left( 3+\frac{7x+5}{1+y^2} \right) $$

\[ f(x,y,z) = 3y^2z \left( 3+\frac{7x+5}{1+y^2} \right) \]

ex2:

$$ \left. \frac{{\rm d}u}{{\rm d}x} \right| _{x=0} $$

\[ \left. \frac{{\rm d}u}{{\rm d}x} \right| _{x=0} \]

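4. Fractions

Use \frac {numerator} {denominator} to produce a fraction; fractions can be nested. You can type \frac ab directly to quickly get \(\frac ab\). For a complicated fraction you can also write numerator \over denominator, in which case the fraction has only one level.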

ex:

$$\frac{a-1}{b-1} \quad and \quad {a+1\over b+1}$$

\[\frac{a-1}{b-1} \quad and \quad {a+1\over b+1}\]
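
As a small illustration of nesting (the expression itself is only an example chosen here, not taken from the original post), a continued fraction places one \frac inside the denominator of another:

```latex
$$
x = a_0 + \frac{1}{a_1 + \frac{1}{a_2 + \frac{1}{a_3}}}
$$
```

Each inner \frac is set smaller than the one enclosing it, which is the usual behavior for nested fractions.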

5. Roots

Use \sqrt [index, 2 if omitted] {radicand} to enter a root.

ex:

$$\sqrt{2} \quad and \quad \sqrt[n]{3}$$

\[\sqrt{2} \quad and \quad \sqrt[n]{3}\]

6. Ellipses

Two ellipses are common in math formulas: \ldots is aligned with the text baseline, and \cdots is aligned with the vertical center of the line.

ex:

$$ f(x_1,x_2,\underbrace{\ldots}_{\rm ldots} ,x_n) = x_1^2 + x_2^2 + \underbrace{\cdots}_{\rm cdots} + x_n^2 $$

\[f(x_1,x_2,\underbrace{\ldots}_{\rm ldots} ,x_n) = x_1^2 + x_2^2 + \underbrace{\cdots}_{\rm cdots} + x_n^2\]

7. Vectors

\vec{vector} produces a vector. You can also use \overrightarrow and related commands to draw an arrow above the letters.

ex:

$$\vec{a} \cdot \vec{b}=0$$

\[\vec{a} \cdot \vec{b}=0\]

ex:

$$\overleftarrow{xy} \quad and \quad \overleftrightarrow{xy} \quad and \quad \overrightarrow{xy}$$

\[\overleftarrow{xy} \quad and \quad \overleftrightarrow{xy} \quad and \quad \overrightarrow{xy}\]

8. Calculus

\int_lower limit^upper limit {integrand}

ex:

$$\int_0^1 {x^2} \,{\rm d}x $$

\[\int_0^1 {x^2} \,{\rm d}x\]

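In this example the \,{\rm d} formatting can be left out, but including it is recommended because it makes the formula look better.

\[\int_0^1 {x^2} dx \] Notice that the d here looks slightly different from the one above

\partial for partial derivatives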

ex:

\frac{\partial x}{\partial y}   

\(\frac{\partial x}{\partial y}\)

9. Limits

\lim_{variable \to expression} expression

If needed, the \to symbol can be replaced with any other symbol.

ex:

$$ \lim_{n \to +\infty} \frac{1}{n(n+1)} \quad and \quad \lim_{x\leftarrow{sample}} \frac{1}{n(n+1)} $$

\[ \lim_{n \to +\infty} \frac{1}{n(n+1)} \quad and \quad \lim_{x\leftarrow{sample}} \frac{1}{n(n+1)} \]

10. Series

\sum_{lower expression}^{upper expression} {summand}. Similarly, use \prod, \bigcup, and \bigcap to enter products, unions, and intersections

ex:

$$\sum_{i=1}^n \frac{1}{i^2} \quad and \quad \prod_{i=1}^n \frac{1}{i^2} \quad and \quad \bigcup_{i=1}^{2} R$$

\[\sum_{i=1}^n \frac{1}{i^2} \quad and \quad \prod_{i=1}^n \frac{1}{i^2} \quad and \quad \bigcup_{i=1}^{2} R\]

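11. Greek letters

Type \ followed by the lowercase English name to get a lowercase Greek letter, and \ followed by the capitalized English name to get an uppercase one. When an uppercase Greek letter looks the same as an existing Latin letter, just type that capital letter. You can also type the letter itself directly to keep the formula source simpler.

Input Output Input Output Input Output Input Output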
\alpha \(\alpha\) A \(A\) \beta \(\beta\) B \(B\)
\gamma \(\gamma\) \Gamma \(\Gamma\) \delta \(\delta\) \Delta \(\Delta\)
\epsilon \(\epsilon\) E \(E\) \zeta \(\zeta\) Z \(Z\)
\eta \(\eta\) H \(H\) \theta \(\theta\) \Theta \(\Theta\)
\iota \(\iota\) I \(I\) \kappa \(\kappa\) K \(K\)
\lambda \(\lambda\) \Lambda \(\Lambda\) \mu \(\mu\) M \(M\)
\nu \(\nu\) N \(N\) \xi \(\xi\) \Xi \(\Xi\)
o \(o\) O \(O\) \pi \(\pi\) \Pi \(\Pi\)
\rho \(\rho\) P \(P\) \sigma \(\sigma\) \Sigma \(\Sigma\)
\tau \(\tau\) T \(T\) \upsilon \(\upsilon\) \Upsilon \(\Upsilon\)
\phi \(\phi\) \Phi \(\Phi\) \chi \(\chi\) X \(X\)
\psi \(\psi\) \Psi \(\Psi\) \omega \(\omega\) \Omega \(\Omega\)

Some letters have a variant form, written with the \var prefix.

Lowercase Uppercase Variant Output
\epsilon E \varepsilon \(\epsilon \mid E \mid \varepsilon\)
\theta \Theta \vartheta \(\theta \mid \Theta \mid \vartheta\)
\rho P \varrho \(\rho \mid P \mid \varrho\)
\sigma \Sigma \varsigma \(\sigma \mid \Sigma \mid \varsigma\)
\phi \Phi \varphi \(\phi \mid \Phi \mid \varphi\)

12. Special symbols

You can draw a symbol in Detexify to find its LaTeX command

To display larger or smaller characters, insert \large or \small before the symbol

12.1 Relations and operators

Input Output Input Output Input Output Input Output
\pm \(\pm\) \times \(\times\) \div \(\div\) \mid \(\mid\)
\nmid \(\nmid\) \cdot \(\cdot\) \circ \(\circ\) \ast \(\ast\)
\bigodot \(\bigodot\) \bigotimes \(\bigotimes\) \bigoplus \(\bigoplus\) \leq \(\leq\)
\geq \(\geq\) \neq \(\neq\) \approx \(\approx\) \equiv \(\equiv\)
\sum \(\sum\) \prod \(\prod\) \coprod \(\coprod\) \backslash \(\backslash\)
\ngeq \(\ngeq\) \nleq \(\nleq\) \not\geq \(\not\geq\) \not\leq \(\not\leq\)

12.2 Set operations

Input Output Input Output Input Output
\emptyset \(\emptyset\) \in \(\in\) \notin \(\notin\)
\subset \(\subset\) \supset \(\supset\) \subseteq \(\subseteq\)
\supseteq \(\supseteq\) \bigcap \(\bigcap\) \bigcup \(\bigcup\)
\bigvee \(\bigvee\) \bigwedge \(\bigwedge\) \biguplus \(\biguplus\)
\subsetneq \(\subsetneq\) \supsetneq \(\supsetneq\) \setminus \(\setminus\)
\bigodot \(\bigodot\) \bigotimes \(\bigotimes\) \mathbb{R} \(\mathbb{R}\)
\mathbb{Z} \(\mathbb{Z}\)

12.3 Logarithms

Input Output Input Output Input Output
\log \(\log\) \lg \(\lg\) \ln \(\ln\)

12.4 Trigonometry

Input Output Input Output Input Output
30^\circ \(30^\circ\) \bot \(\bot\) \angle A \(\angle A\)
\sin \(\sin\) \cos \(\cos\) \tan \(\tan\)
\csc \(\csc\) \sec \(\sec\) \cot \(\cot\)

12.5 Calculus

Input Output Input Output Input Output
\int \(\int\) \iint \(\iint\) \iiint \(\iiint\)
\iiiint \(\iiiint\) \oint \(\oint\) \prime \(\prime\)
\lim \(\lim\) \infty \(\infty\) \nabla \(\nabla\)

12.6 Logic

Input Output Input Output Input Output
\because \(\because\) \therefore \(\therefore\)
\forall \(\forall\) \exists \(\exists\) \not\subset \(\not\subset\)
\not< \(\not<\) \not> \(\not>\) \not= \(\not=\)

12.7 hat

Input Output Input Output
\hat{xy} \(\hat{xy}\) \widehat{xyz} \(\widehat{xyz}\)
\tilde{xy} \(\tilde{xy}\) \widetilde{xyz} \(\widetilde{xyz}\)
\check{x} \(\check{x}\) \breve{y} \(\breve{y}\)
\grave{x} \(\grave{x}\) \acute{y} \(\acute{y}\)

12.8 Lines and braces

Input Output
\fbox{a+b+c+d} \(\fbox{a+b+c+d}\)
\overleftarrow{a+b+c+d} \(\overleftarrow{a+b+c+d}\)
\overrightarrow{a+b+c+d} \(\overrightarrow{a+b+c+d}\)
\overleftrightarrow{a+b+c+d} \(\overleftrightarrow{a+b+c+d}\)
\underleftarrow{a+b+c+d} \(\underleftarrow{a+b+c+d}\)
\underrightarrow{a+b+c+d} \(\underrightarrow{a+b+c+d}\)
\underleftrightarrow{a+b+c+d} \(\underleftrightarrow{a+b+c+d}\)
\overline{a+b+c+d} \(\overline{a+b+c+d}\)
\underline{a+b+c+d} \(\underline{a+b+c+d}\)
\overbrace{a+b+c+d}^{Sample} \(\overbrace{a+b+c+d}^{Sample}\)
\underbrace{a+b+c+d}_{Sample} \(\underbrace{a+b+c+d}_{Sample}\)
\overbrace{a+\underbrace{b+c}_{1.0}+d}^{2.0} \(\overbrace{a+\underbrace{b+c}_{1.0}+d}^{2.0}\)
\underbrace{a\cdot a\cdots a}_{b\text{ times}} \(\underbrace{a\cdot a\cdots a}_{b\text{ times}}\)

12.9 Arrows

Input Output Input Output Input Output
\to \(\to\) \mapsto \(\mapsto\)
\implies \(\implies\) \iff \(\iff\) \impliedby \(\impliedby\)
  • Other available symbols:
Input Output Input Output
\uparrow \(\uparrow\) \Uparrow \(\Uparrow\)
\downarrow \(\downarrow\) \Downarrow \(\Downarrow\)
\leftarrow \(\leftarrow\) \Leftarrow \(\Leftarrow\)
\rightarrow \(\rightarrow\) \Rightarrow \(\Rightarrow\)
\leftrightarrow \(\leftrightarrow\) \Leftrightarrow \(\Leftrightarrow\)
\longleftarrow \(\longleftarrow\) \Longleftarrow \(\Longleftarrow\)
\longrightarrow \(\longrightarrow\) \Longrightarrow \(\Longrightarrow\)
\longleftrightarrow \(\longleftrightarrow\) \Longleftrightarrow \(\Longleftrightarrow\)

12.10 Arithmetic

Operation Input Output
Addition x+y \(x+y\)
Subtraction x-y \(x-y\)
Plus-minus x \pm y \(x \pm y\)
Minus-plus x \mp y \(x \mp y\)
Multiplication x \times y \(x \times y\)
Asterisk multiplication x \ast y \(x \ast y\)
Dot multiplication x \cdot y \(x \cdot y\)
Division x \div y \(x \div y\)
Slash division x / y \(x / y\)
Fraction \frac{x}{y} \(\frac{x}{y}\)
Fraction {x}\over{y} \({x}\over{y}\)

12.11 Miscellaneous

Operation Input Output
Infinity \infty \(\infty\)
Imaginary unit \imath \(\imath\)
Imaginary unit \jmath \(\jmath\)
Degree ^{\circ} \(^{\circ}\)

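13. Changing fonts

To change the font of part of a formula, use {\font {characters to convert}}, where \font is chosen from the table below. By default, letters are set in italic \(italic\).

Fonts whose sample is shown in all capitals support uppercase letters only.

Input Description Output Input Description Output
\rm Roman \(\rm{Sample}\) \cal Calligraphic \(\cal{SAMPLE}\)
\it Italic \(\it{Sample}\) \Bbb Blackboard bold \(\Bbb{SAMPLE}\)
\bf Bold \(\bf{Sample}\) \mit Math italic \(\mit{SAMPLE}\)
\sf Sans serif \(\sf{Sample}\) \scr Script \(\scr{SAMPLE}\)
\tt Typewriter \(\tt{Sample}\)
\frak Fraktur \(\frak{Sample}\)

Font changes are used often, for example in integrals: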

\begin{array}{cc}
\mathrm{Bad} & \mathrm{Better} \\
\hline \\
\int_0^1 x^2 dx & \int_0^1 x^2 \,{\rm d}x
\end{array}

\(\begin{array}{cc} \mathrm{Bad} & \mathrm{Better} \\ \hline \\ \int_0^1 x^2 dx & \int_0^1 x^2 \,{\rm d}x \end{array}\)

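14. Large brackets and equation tags

Use \left and \right to produce (parentheses), [square brackets], and {braces} that automatically match the height of their content. Put \tag{label} before the end of a formula to give it a tag.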

$$
f\left(
   \left[
     \frac{
       1+\left\{x,y\right\}
     }{
       \left(
          \frac{x}{y}+\frac{y}{x}
       \right)
       \left(u+1\right)
     }+a
   \right]^{3/2}
\right)
\tag{label}
$$

\[ f\left( \left[ \frac{ 1+\left\{x,y\right\} }{ \left( \frac{x}{y}+\frac{y}{x} \right) \left(u+1\right) }+a \right]^{3/2} \right) \tag{label} \]

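If matching brackets need to appear on different lines, use \left. or \right. at the corresponding place on each line to insert an invisible "shadow" bracket: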

ex:

$$
\begin{aligned}
a=&\left(1+2+3+  \cdots \right. \\
& \cdots+ \left. \infty-2+\infty-1+\infty\right)
\end{aligned}
$$

\[ \begin{aligned} a=&\left(1+2+3+ \cdots \right. \\ & \cdots+ \left. \infty-2+\infty-1+\infty\right) \end{aligned} \]

To enlarge a delimiter that sits in the middle of an expression as well, use \middle

$$
\left\langle
  q
\middle\|
  \frac{\frac{x}{y}}{\frac{u}{v}}
\middle|
   p
\right\rangle
$$

\[ \left\langle q \middle\| \frac{\frac{x}{y}}{\frac{u}{v}} \middle| p \right\rangle \]

15. Other commands

15.1 Defining a new operator name with \operatorname

See the definition of this command and the discussion of this command

ex:

$$ \operatorname{Symbol} A $$

\[\operatorname{Symbol} A\]

15.2 Annotation text with \text

Inside \text {text} you can still use $formula$ to insert another formula.

ex:

$$ f(n)= \begin{cases} n/2, & \text {if $n$ is even} \\ 3n+1, & \text{if $n$ is odd} \end{cases} $$

\[ f(n)= \begin{cases} n/2, & \text {if $n$ is even} \\ 3n+1, & \text{if $n$ is odd} \end{cases} \]

15.3 Adding space between characters

Four widths of space are available: \, \; \quad and \qquad

ex:

$$ a \, b \mid a \; b \mid a \quad b \mid a \qquad b $$

\[ a \, b \mid a \; b \mid a \quad b \mid a \qquad b \]

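\text {n spaces} achieves the same effect.

15.4 Changing text color

Use \color{color}{text} to change the color of specific text. Changing text color requires browser support; if the browser does not recognize the requested color, the text is rendered in black.

Older browsers (HTML4 and CSS2) support the following colors:

Input Output Input Output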
black \(\color{black}{text}\) gray \(\color{gray}{text}\)
silver \(\color{silver}{text}\) white \(\color{white}{text}\)
maroon \(\color{maroon}{text}\) red \(\color{red}{text}\)
yellow \(\color{yellow}{text}\) lime \(\color{lime}{text}\)
olive \(\color{olive}{text}\) green \(\color{green}{text}\)
teal \(\color{teal}{text}\) aqua \(\color{aqua}{text}\)
blue \(\color{blue}{text}\) navy \(\color{navy}{text}\)
purple \(\color{purple}{text}\) fuchsia \(\color{fuchsia}{text}\)

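Newer browsers (HTML5 and CSS3) support an additional 124 colors:

Use \color {#rgb} {text} to define more colors, where each of r, g, and b in #rgb is a hexadecimal digit (0-9, a-f) giving the intensity of red, green, and blue respectively.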

ex:

\begin{array}{|rrrrrrrr|}\hline
\verb+#000+ & \color{#000}{text} & & &
\verb+#00F+ & \color{#00F}{text} & & \\
& & \verb+#0F0+ & \color{#0F0}{text} &
& & \verb+#0FF+ & \color{#0FF}{text}\\
\verb+#F00+ & \color{#F00}{text} & & &
\verb+#F0F+ & \color{#F0F}{text} & & \\
& & \verb+#FF0+ & \color{#FF0}{text} &
& & \verb+#FFF+ & \color{#FFF}{text}\\
\hline
\end{array}

\(\begin{array}{|rrrrrrrr|}\hline \verb+#000+ & \color{#000}{text} & & & \verb+#00F+ & \color{#00F}{text} & & \\ & & \verb+#0F0+ & \color{#0F0}{text} & & & \verb+#0FF+ & \color{#0FF}{text}\\ \verb+#F00+ & \color{#F00}{text} & & & \verb+#F0F+ & \color{#F0F}{text} & & \\ & & \verb+#FF0+ & \color{#FF0}{text} & & & \verb+#FFF+ & \color{#FFF}{text}\\ \hline \end{array}\)

ex:

\begin{array}{|rrrrrrrr|}
\hline
\verb+#000+ & \color{#000}{text} & \verb+#005+ & \color{#005}{text} & \verb+#00A+ & \color{#00A}{text} & \verb+#00F+ & \color{#00F}{text}  \\
\verb+#500+ & \color{#500}{text} & \verb+#505+ & \color{#505}{text} & \verb+#50A+ & \color{#50A}{text} & \verb+#50F+ & \color{#50F}{text}  \\
\verb+#A00+ & \color{#A00}{text} & \verb+#A05+ & \color{#A05}{text} & \verb+#A0A+ & \color{#A0A}{text} & \verb+#A0F+ & \color{#A0F}{text}  \\
\verb+#F00+ & \color{#F00}{text} & \verb+#F05+ & \color{#F05}{text} & \verb+#F0A+ & \color{#F0A}{text} & \verb+#F0F+ & \color{#F0F}{text}  \\
\hline
\verb+#080+ & \color{#080}{text} & \verb+#085+ & \color{#085}{text} & \verb+#08A+ & \color{#08A}{text} & \verb+#08F+ & \color{#08F}{text}  \\
\verb+#580+ & \color{#580}{text} & \verb+#585+ & \color{#585}{text} & \verb+#58A+ & \color{#58A}{text} & \verb+#58F+ & \color{#58F}{text}  \\
\verb+#A80+ & \color{#A80}{text} & \verb+#A85+ & \color{#A85}{text} & \verb+#A8A+ & \color{#A8A}{text} & \verb+#A8F+ & \color{#A8F}{text}  \\
\verb+#F80+ & \color{#F80}{text} & \verb+#F85+ & \color{#F85}{text} & \verb+#F8A+ & \color{#F8A}{text} & \verb+#F8F+ & \color{#F8F}{text}  \\
\hline
\verb+#0F0+ & \color{#0F0}{text} & \verb+#0F5+ & \color{#0F5}{text} & \verb+#0FA+ & \color{#0FA}{text} & \verb+#0FF+ & \color{#0FF}{text}  \\
\verb+#5F0+ & \color{#5F0}{text} & \verb+#5F5+ & \color{#5F5}{text} & \verb+#5FA+ & \color{#5FA}{text} & \verb+#5FF+ & \color{#5FF}{text}  \\
\verb+#AF0+ & \color{#AF0}{text} & \verb+#AF5+ & \color{#AF5}{text} & \verb+#AFA+ & \color{#AFA}{text} & \verb+#AFF+ & \color{#AFF}{text}  \\
\verb+#FF0+ & \color{#FF0}{text} & \verb+#FF5+ & \color{#FF5}{text} & \verb+#FFA+ & \color{#FFA}{text} & \verb+#FFF+ & \color{#FFF}{text}  \\
\hline
\end{array}

\[ \begin{array}{|rrrrrrrr|} \hline \verb+#000+ & \color{#000}{text} & \verb+#005+ & \color{#005}{text} & \verb+#00A+ & \color{#00A}{text} & \verb+#00F+ & \color{#00F}{text} \\ \verb+#500+ & \color{#500}{text} & \verb+#505+ & \color{#505}{text} & \verb+#50A+ & \color{#50A}{text} & \verb+#50F+ & \color{#50F}{text} \\ \verb+#A00+ & \color{#A00}{text} & \verb+#A05+ & \color{#A05}{text} & \verb+#A0A+ & \color{#A0A}{text} & \verb+#A0F+ & \color{#A0F}{text} \\ \verb+#F00+ & \color{#F00}{text} & \verb+#F05+ & \color{#F05}{text} & \verb+#F0A+ & \color{#F0A}{text} & \verb+#F0F+ & \color{#F0F}{text} \\ \hline \verb+#080+ & \color{#080}{text} & \verb+#085+ & \color{#085}{text} & \verb+#08A+ & \color{#08A}{text} & \verb+#08F+ & \color{#08F}{text} \\ \verb+#580+ & \color{#580}{text} & \verb+#585+ & \color{#585}{text} & \verb+#58A+ & \color{#58A}{text} & \verb+#58F+ & \color{#58F}{text} \\ \verb+#A80+ & \color{#A80}{text} & \verb+#A85+ & \color{#A85}{text} & \verb+#A8A+ & \color{#A8A}{text} & \verb+#A8F+ & \color{#A8F}{text} \\ \verb+#F80+ & \color{#F80}{text} & \verb+#F85+ & \color{#F85}{text} & \verb+#F8A+ & \color{#F8A}{text} & \verb+#F8F+ & \color{#F8F}{text} \\ \hline \verb+#0F0+ & \color{#0F0}{text} & \verb+#0F5+ & \color{#0F5}{text} & \verb+#0FA+ & \color{#0FA}{text} & \verb+#0FF+ & \color{#0FF}{text} \\ \verb+#5F0+ & \color{#5F0}{text} & \verb+#5F5+ & \color{#5F5}{text} & \verb+#5FA+ & \color{#5FA}{text} & \verb+#5FF+ & \color{#5FF}{text} \\ \verb+#AF0+ & \color{#AF0}{text} & \verb+#AF5+ & \color{#AF5}{text} & \verb+#AFA+ & \color{#AFA}{text} & \verb+#AFF+ & \color{#AFF}{text} \\ \verb+#FF0+ & \color{#FF0}{text} & \verb+#FF5+ & \color{#FF5}{text} & \verb+#FFA+ & \color{#FFA}{text} & \verb+#FFF+ & \color{#FFF}{text} \\ \hline \end{array} \]

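15.5 Strikethrough

Strikethrough must be written inside a $$ formula.

Use \require{cancel} inside a formula to enable partial strikethrough. Once it is declared, use \cancel{characters}, \bcancel{characters}, \xcancel{characters}, and \cancelto{value}{characters} for the different partial strikethrough effects.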

$$
\require{cancel}\begin{array}{rl}
\verb|y+\cancel{x}| & y+\cancel{x}\\
\verb|\cancel{y+x}| & \cancel{y+x}\\
\verb|y+\bcancel{x}| & y+\bcancel{x}\\
\verb|y+\xcancel{x}| & y+\xcancel{x}\\
\verb|y+\cancelto{0}{x}| & y+\cancelto{0}{x}\\
\verb+\frac{1\cancel9}{\cancel95} = \frac15+& \frac{1\cancel9}{\cancel95} = \frac15 \\
\end{array}
$$

\[ \require{cancel}\begin{array}{rl} \verb|y+\cancel{x}| & y+\cancel{x}\\ \verb|\cancel{y+x}| & \cancel{y+x}\\ \verb|y+\bcancel{x}| & y+\bcancel{x}\\ \verb|y+\xcancel{x}| & y+\xcancel{x}\\ \verb|y+\cancelto{0}{x}| & y+\cancelto{0}{x}\\ \verb+\frac{1\cancel9}{\cancel95} = \frac15+& \frac{1\cancel9}{\cancel95} = \frac15 \\ \end{array} \]
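As a further illustration (a sketch that is not part of the original table), \cancel is often used to show a common factor being removed from a fraction. The expression below is an assumed example and holds for x ≠ -1:

$$
\require{cancel}
\frac{\cancel{(x+1)}(x-2)}{\cancel{(x+1)}(x+3)} = \frac{x-2}{x+3}
$$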

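Use \require{enclose} to enable strikethrough over a whole expression. Once it is declared, use \enclose{effect}{chars} to produce the various whole-expression strikethrough effects. The available effects are horizontalstrike, verticalstrike, updiagonalstrike, and downdiagonalstrike, and they can be combined.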

$$
\require{enclose}\begin{array}{rl}
\verb|\enclose{horizontalstrike}{x+y}| & \enclose{horizontalstrike}{x+y}\\
\verb|\enclose{verticalstrike}{\frac xy}| & \enclose{verticalstrike}{\frac xy}\\
\verb|\enclose{updiagonalstrike}{x+y}| & \enclose{updiagonalstrike}{x+y}\\
\verb|\enclose{downdiagonalstrike}{x+y}| & \enclose{downdiagonalstrike}{x+y}\\
\verb|\enclose{horizontalstrike,updiagonalstrike}{x+y}| & \enclose{horizontalstrike,updiagonalstrike}{x+y}\\
\end{array}
$$

\[ \require{enclose}\begin{array}{rl} \verb|\enclose{horizontalstrike}{x+y}| & \enclose{horizontalstrike}{x+y}\\ \verb|\enclose{verticalstrike}{\frac xy}| & \enclose{verticalstrike}{\frac xy}\\ \verb|\enclose{updiagonalstrike}{x+y}| & \enclose{updiagonalstrike}{x+y}\\ \verb|\enclose{downdiagonalstrike}{x+y}| & \enclose{downdiagonalstrike}{x+y}\\ \verb|\enclose{horizontalstrike,updiagonalstrike}{x+y}| & \enclose{horizontalstrike,updiagonalstrike}{x+y}\\ \end{array} \]

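Matrices

1. Matrices without delimiters

Begin with \begin{matrix} and end with \end{matrix}. Put the matrix elements in between, separating the elements with & and ending each row with \\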

$$
        \begin{matrix}
        1 & x & x^2 \\
        1 & y & y^2 \\
        1 & z & z^2 \\
        \end{matrix}
$$

\[ \begin{matrix} 1 & x & x^2 \\ 1 & y & y^2 \\ 1 & z & z^2 \\ \end{matrix} \]

2. Matrices with delimiters

Replace matrix with pmatrix, bmatrix, Bmatrix, vmatrix, or Vmatrix.

$ \begin{matrix} 1 & 2 \\ 3 & 4 \\ \end{matrix} $
$ \begin{pmatrix} 1 & 2 \\ 3 & 4 \\ \end{pmatrix} $
$ \begin{bmatrix} 1 & 2 \\ 3 & 4 \\ \end{bmatrix} $
$ \begin{Bmatrix} 1 & 2 \\ 3 & 4 \\ \end{Bmatrix} $
$ \begin{vmatrix} 1 & 2 \\ 3 & 4 \\ \end{vmatrix} $
$ \begin{Vmatrix} 1 & 2 \\ 3 & 4 \\ \end{Vmatrix} $
matrix pmatrix bmatrix Bmatrix vmatrix Vmatrix
\( \begin{matrix} 1 & 2 \\ 3 & 4 \\ \end{matrix} \) \( \begin{pmatrix} 1 & 2 \\ 3 & 4 \\ \end{pmatrix} \) \( \begin{bmatrix} 1 & 2 \\ 3 & 4 \\ \end{bmatrix} \) \( \begin{Bmatrix} 1 & 2 \\ 3 & 4 \\ \end{Bmatrix} \) \( \begin{vmatrix} 1 & 2 \\ 3 & 4 \\ \end{vmatrix} \) \( \begin{Vmatrix} 1 & 2 \\ 3 & 4 \\ \end{Vmatrix} \)
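
The vmatrix form is the usual notation for a determinant. A minimal sketch, using the same 2 × 2 entries as above plus assumed symbols a, b, c, d:

$$
\begin{vmatrix} a & b \\ c & d \\ \end{vmatrix} = ad - bc,
\qquad
\begin{vmatrix} 1 & 2 \\ 3 & 4 \\ \end{vmatrix} = 1\cdot 4 - 2\cdot 3 = -2
$$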

3. Matrices with ellipses

Use \cdots \(\cdots\) , \ddots \(\ddots\) , and \vdots \(\vdots\) to insert ellipses.

ex:

$$
        \begin{pmatrix}
        1 & a_1 & a_1^2 & \cdots & a_1^n \\
        1 & a_2 & a_2^2 & \cdots & a_2^n \\
        \vdots & \vdots & \vdots & \ddots & \vdots \\
        1 & a_m & a_m^2 & \cdots & a_m^n \\
        \end{pmatrix}
$$

\[ \begin{pmatrix} 1 & a_1 & a_1^2 & \cdots & a_1^n \\ 1 & a_2 & a_2^2 & \cdots & a_2^n \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & a_m & a_m^2 & \cdots & a_m^n \\ \end{pmatrix} \]

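4. Matrices with a dividing line

The column specification cc|c describes a three-column matrix with a vertical dividing line between the second and third columns.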

$$
\left[
    \begin{array}{cc|c}
      1&2&3\\
      4&5&6
    \end{array}
\right]
$$

\[ \left[ \begin{array}{cc|c} 1&2&3\\ 4&5&6 \end{array} \right] \]
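
A common use of the dividing line is the augmented matrix of a linear system. As a sketch, the matrix above can be read as the following system in two assumed unknowns x and y:

$$
\left\{
\begin{array}{c}
x + 2y = 3 \\
4x + 5y = 6
\end{array}
\right.
\quad\Longleftrightarrow\quad
\left[
    \begin{array}{cc|c}
      1&2&3\\
      4&5&6
    \end{array}
\right]
$$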

5. Inline matrices

\bigl(\begin{smallmatrix} ... \end{smallmatrix}\bigr)

ex:

This is an inline matrix $\bigl( \begin{smallmatrix} a & b \\ c & d \end{smallmatrix} \bigr)$ .

This is an inline matrix \(\bigl( \begin{smallmatrix} a & b \\ c & d \end{smallmatrix} \bigr)\) .

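Equations

1. Sequences of equations

Use \begin{align}…\end{align} to create a column of equations, ending each line with \\

Note that the {align} environment is numbered automatically.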

\begin{align}
\sqrt{37} & = \sqrt{\frac{73^2-1}{12^2}} \\
 & = \sqrt{\frac{73^2}{12^2}\cdot\frac{73^2-1}{73^2}} \\
 & = \sqrt{\frac{73^2}{12^2}}\sqrt{\frac{73^2-1}{73^2}} \\
 & = \frac{73}{12}\sqrt{1 - \frac{1}{73^2}} \\
 & \approx \frac{73}{12}\left(1 - \frac{1}{2\cdot73^2}\right)
\end{align}

\[ \begin{align} \sqrt{37} & = \sqrt{\frac{73^2-1}{12^2}} \\ & = \sqrt{\frac{73^2}{12^2}\cdot\frac{73^2-1}{73^2}} \\ & = \sqrt{\frac{73^2}{12^2}}\sqrt{\frac{73^2-1}{73^2}} \\ & = \frac{73}{12}\sqrt{1 - \frac{1}{73^2}} \\ & \approx \frac{73}{12}\left(1 - \frac{1}{2\cdot73^2}\right) \end{align} \]

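2. Annotating each line of an equation sequence with a reason

Inside {align}, combine \text and \tag as needed. A number given with \tag takes precedence over automatic numbering.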

\begin{align}
   v + w & = 0  &\text{Given} \tag 1\\
   -w & = -w + 0 & \text{additive identity} \tag 2\\
   -w + 0 & = -w + (v + w) & \text{equations $(1)$ and $(2)$}
\end{align}

\[ \begin{align} v + w & = 0 &\text{Given} \tag 1\\ -w & = -w + 0 & \text{additive identity} \tag 2\\ -w + 0 & = -w + (v + w) & \text{equations $(1)$ and $(2)$} \end{align} \]
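
The same pattern works with a non-numeric tag. The following is a sketch with an assumed derivation (expanding a square), where \tag{*} labels the first line and each step carries its reason in \text:

\begin{align}
   (a+b)^2 & = (a+b)(a+b) & \text{definition of the square} \tag{*}\\
    & = a^2 + ab + ba + b^2 & \text{distributive law} \\
    & = a^2 + 2ab + b^2 & \text{since $ab = ba$}
\end{align}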

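Conditional Expressions

1. Conditional expressions

Use \begin{cases} to create a group of conditional expressions. Insert & in each line to mark where the content should align, end each line with \\, and close with \end{cases}.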

$$
        f(n) =
        \begin{cases}
        n/2,  & \text{if $n$ is even} \\
        3n+1, & \text{if $n$ is odd}
        \end{cases}
$$

\[ f(n) = \begin{cases} n/2, & \text{if $n$ is even} \\ 3n+1, & \text{if $n$ is odd} \end{cases} \]

2. Left-aligned conditional expressions

$$
        \left.
        \begin{array}{ll}
        \text{if $n$ is even:}&n/2\\
        \text{if $n$ is odd:}&3n+1
        \end{array}
        \right\}
        =f(n)
$$

\[ \left. \begin{array}{ll} \text{if $n$ is even:}&n/2\\ \text{if $n$ is odd:}&3n+1 \end{array} \right\} =f(n) \]

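3. Adjusting row height in conditional expressions

Sometimes a row of a conditional expression is taller than the standard height. In that case, replace the \\ at the end of that row with \\[2ex] to add extra vertical space below it.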

$$
f(n) =
\begin{cases}
\frac{n}{2},  & \text{if $n$ is even} \\
3n+1, & \text{if $n$ is odd}
\end{cases}
$$

\[ f(n) = \begin{cases} \frac{n}{2}, & \text{if $n$ is even} \\ 3n+1, & \text{if $n$ is odd} \end{cases} \]

Result after adjusting the row height

$$
f(n) =
\begin{cases}
\frac{n}{2},  & \text{if $n$ is even} \\[2ex]
3n+1, & \text{if $n$ is odd}
\end{cases}
$$

\[ f(n) = \begin{cases} \frac{n}{2}, & \text{if $n$ is even} \\[2ex] 3n+1, & \text{if $n$ is odd} \end{cases} \]

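Arrays and Tables

1. How to enter an array or table

A well-formatted table is usually more readable than plain or merely typeset text. Arrays and tables both begin with \begin{array}, followed by a column specification that gives the number of columns and the alignment of each one: c, l, and r mean centered, left-aligned, and right-aligned. To insert a vertical rule, put | in the column specification; to insert a horizontal rule, put \hline before the next row. As with matrices, separate the elements of a row with &, end each row with \\, and close the array with \end{array}.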

\begin{array}{c|lcr}
n & \text{Left} & \text{Center} & \text{Right} \\
\hline
1 & 0.24 & 1 & 125 \\
2 & -1 & 189 & -8 \\
3 & -20 & 2000 & 1+10i
\end{array}

\[ \begin{array}{c|lcr} n & \text{Left} & \text{Center} & \text{Right} \\ \hline 1 & 0.24 & 1 & 125 \\ 2 & -1 & 189 & -8 \\ 3 & -20 & 2000 & 1+10i \end{array} \]
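
As another small sketch of the same rules (an assumed example, not from the original), a truth table for p ∧ q uses | in the column specification for the vertical rule and \hline under the header row:

$$
\begin{array}{cc|c}
p & q & p \land q \\
\hline
T & T & T \\
T & F & F \\
F & T & F \\
F & F & F
\end{array}
$$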

2. Nested arrays or tables

Arrays and tables can be nested inside one another to form a group of arrays or tables. A nested array must be wrapped in $$.

$$
% outer vertical array of arrays
\begin{array}{c}
    % inner horizontal array of arrays
    \begin{array}{cc}
        % inner array of minimum values
        \begin{array}{c|cccc}
        \text{min} & 0 & 1 & 2 & 3\\
        \hline
        0 & 0 & 0 & 0 & 0\\
        1 & 0 & 1 & 1 & 1\\
        2 & 0 & 1 & 2 & 2\\
        3 & 0 & 1 & 2 & 3
        \end{array}
    &
        % inner array of maximum values
        \begin{array}{c|cccc}
        \text{max}&0&1&2&3\\
        \hline
        0 & 0 & 1 & 2 & 3\\
        1 & 1 & 1 & 2 & 3\\
        2 & 2 & 2 & 2 & 3\\
        3 & 3 & 3 & 3 & 3
        \end{array}
    \end{array}
    % end of the first row of inner arrays
    \\
    % inner array of delta values (second row)
        \begin{array}{c|cccc}
        \Delta&0&1&2&3\\
        \hline
        0 & 0 & 1 & 2 & 3\\
        1 & 1 & 0 & 1 & 2\\
        2 & 2 & 1 & 0 & 1\\
        3 & 3 & 2 & 1 & 0
        \end{array}
        % end of the second row of inner arrays
\end{array}
$$

\[ \begin{array}{c} \begin{array}{cc} \begin{array}{c|cccc} \text{min} & 0 & 1 & 2 & 3\\ \hline 0 & 0 & 0 & 0 & 0\\ 1 & 0 & 1 & 1 & 1\\ 2 & 0 & 1 & 2 & 2\\ 3 & 0 & 1 & 2 & 3 \end{array} & \begin{array}{c|cccc} \text{max}&0&1&2&3\\ \hline 0 & 0 & 1 & 2 & 3\\ 1 & 1 & 1 & 2 & 3\\ 2 & 2 & 2 & 2 & 3\\ 3 & 3 & 3 & 3 & 3 \end{array} \end{array} \\ \begin{array}{c|cccc} \Delta&0&1&2&3\\ \hline 0 & 0 & 1 & 2 & 3\\ 1 & 1 & 0 & 1 & 2\\ 2 & 2 & 1 & 0 & 1\\ 3 & 3 & 2 & 1 & 0 \end{array} \end{array} \]

3. Systems of equations

Use \begin{array}…\end{array} together with \left\{…\right.

$$
\left\{
\begin{array}{c}
a_1x+b_1y+c_1z=d_1 \\
a_2x+b_2y+c_2z=d_2 \\
a_3x+b_3y+c_3z=d_3
\end{array}
\right.
$$

\[ \left\{ \begin{array}{c} a_1x+b_1y+c_1z=d_1 \\ a_2x+b_2y+c_2z=d_2 \\ a_3x+b_3y+c_3z=d_3 \end{array} \right. \]

Alternatively, use the conditional expression environment \begin{cases}…\end{cases} to achieve the same effect

\begin{cases}
a_1x+b_1y+c_1z=d_1 \\
a_2x+b_2y+c_2z=d_2 \\
a_3x+b_3y+c_3z=d_3
\end{cases}

\[ \begin{cases} a_1x+b_1y+c_1z=d_1 \\ a_2x+b_2y+c_2z=d_2 \\ a_3x+b_3y+c_3z=d_3 \end{cases} \]
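
If the equals signs should line up, one option is the aligned environment from the AMS math commands that MathJax supports, wrapped in \left\{…\right. This is a sketch with an assumed two-equation system, not something taken from the original:

$$
\left\{
\begin{aligned}
2x + y &= 5 \\
x - 3y &= -1
\end{aligned}
\right.
$$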

Continued Fractions

Use \cfrac to write continued fractions.

$$
x = a_0 + \cfrac{1^2}{a_1
          + \cfrac{2^2}{a_2
          + \cfrac{3^2}{a_3 + \cfrac{4^4}{a_4 + \cdots}}}}
$$

\[ x = a_0 + \cfrac{1^2}{a_1 + \cfrac{2^2}{a_2 + \cfrac{3^2}{a_3 + \cfrac{4^4}{a_4 + \cdots}}}} \]

\frac can be used to write a continued fraction in compact notation

$$
x = a_0 + \frac{1^2}{a_1+}
          \frac{2^2}{a_2+}
          \frac{3^2}{a_3 +} \frac{4^4}{a_4 +} \cdots
$$

\[ x = a_0 + \frac{1^2}{a_1+} \frac{2^2}{a_2+} \frac{3^2}{a_3 +} \frac{4^4}{a_4 +} \cdots \]
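
A concrete instance may help. The sketch below is an assumed example, not part of the original: it writes the golden ratio as the continued fraction whose partial quotients are all 1.

$$
\varphi = \frac{1+\sqrt{5}}{2}
        = 1 + \cfrac{1}{1
            + \cfrac{1}{1
            + \cfrac{1}{1 + \cdots}}}
$$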

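Commutative Diagrams

Use a single $ \require{AMScd} $ statement to enable commutative diagrams. Once it is declared, the syntax resembles that of matrices: begin with \begin{CD} and end with \end{CD}, put the diagram elements in between, connect the elements with arrow commands such as @>>> , and end each row with \\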

$\require{AMScd}$
\begin{CD}
    A @>a>> B\\
    @V b V V\# @VV c V\\
    C @>>d> D
\end{CD}

\[ \require{AMScd} \begin{CD} A @>a>> B\\ @V b V V\# @VV c V\\ C @>>d> D \end{CD} \]

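@>>> is a right arrow, @<<< is a left arrow, @VVV is a down arrow, @AAA is an up arrow, @= is a horizontal double line, @| is a vertical double line, and @. means no arrow. Text inserted between the > characters of @>>> becomes the label of that arrow.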

$\require{AMScd}$
\begin{CD}
    A @>>> B @>{\text{very long label}}>> C \\
    @. @AAA @| \\
    D @= E @<<< F
\end{CD}

\[ \require{AMScd} \begin{CD} A @>>> B @>{\text{very long label}}>> C \\ @. @AAA @| \\ D @= E @<<< F \end{CD} \]
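
A single-row diagram works the same way. The sketch below is an assumed example, with hypothetical objects A, B, C and maps f, g, that lays out a short exact sequence using only right arrows:

$\require{AMScd}$
\begin{CD}
    0 @>>> A @>f>> B @>g>> C @>>> 0
\end{CD}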

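Notes

  • Avoid \frac in exponents (such as powers of e) and in the limits of integrals: it makes the expression look awkward and can be ambiguous, which is why it almost never appears in professional mathematical typesetting. Write such fractions horizontally, with a slash / in place of the fraction bar.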
\begin{array}{cc}
\mathrm{Bad} & \mathrm{Better} \\
\hline \\
e^{i\frac{\pi}2} \quad e^{\frac{i\pi}2}& e^{i\pi/2} \\
\int_{-\frac\pi2}^\frac\pi2 \sin x\,dx & \int_{-\pi/2}^{\pi/2}\sin x\,dx \\
\end{array}

\(\begin{array}{cc} \mathrm{Bad} & \mathrm{Better} \\ \hline \\ e^{i\frac{\pi}2} \quad e^{\frac{i\pi}2}& e^{i\pi/2} \\ \int_{-\frac\pi2}^\frac\pi2 \sin x\,dx & \int_{-\pi/2}^{\pi/2}\sin x\,dx \\ \end{array}\)

  • The | symbol produces incorrect spacing when it is used as a separator, so use \mid instead wherever a separator is needed.
\begin{array}{cc}
\mathrm{Bad} & \mathrm{Better} \\
\hline \\
\{x|x^2\in\Bbb Z\} & \{x\mid x^2\in\Bbb Z\} \\
\end{array}

\(\begin{array}{cc} \mathrm{Bad} & \mathrm{Better} \\ \hline \\ \{x|x^2\in\Bbb Z\} & \{x\mid x^2\in\Bbb Z\} \\ \end{array}\)

  • For multiple integrals, do not repeat \int ; use \iint for a double integral, \iiint for a triple integral, and so on. For an integral of unspecified multiplicity, write \int \cdots \int .
\begin{array}{cc}
\mathrm{Bad} & \mathrm{Better} \\
\hline \\
\int\int_S f(x)\,dy\,dx & \iint_S f(x)\,dy\,dx \\
\int\int\int_V f(x)\,dz\,dy\,dx & \iiint_V f(x)\,dz\,dy\,dx
\end{array}

\(\begin{array}{cc} \mathrm{Bad} & \mathrm{Better} \\ \hline \\ \int\int_S f(x)\,dy\,dx & \iint_S f(x)\,dy\,dx \\ \int\int\int_V f(x)\,dz\,dy\,dx & \iiint_V f(x)\,dz\,dy\,dx \end{array}\)

  • Insert \, before each differential to add a thin space; without \, LaTeX sets the differentials too close together.
\begin{array}{cc}
\mathrm{Bad} & \mathrm{Better} \\
\hline \\
\iiint_V f(x){\rm d}z {\rm d}y {\rm d}x & \iiint_V f(x)\,{\rm d}z\,{\rm d}y\,{\rm d}x
\end{array}

\(\begin{array}{cc} \mathrm{Bad} & \mathrm{Better} \\ \hline \\ \iiint_V f(x){\rm d}z {\rm d}y {\rm d}x & \iiint_V f(x)\,{\rm d}z\,{\rm d}y\,{\rm d}x \end{array}\)
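
The notes above can be combined in one formula. The sketch below is an assumed example (an integral over the unit disk): it uses \iint rather than two \int symbols, \mid as the set-builder separator, a slash instead of \frac in the exponent, and \, before each differential.

$$
\iint_{\{(x,y)\,\mid\, x^2+y^2\le 1\}} e^{-(x^2+y^2)/2}\,dx\,dy
= 2\pi\left(1 - e^{-1/2}\right)
$$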
