Maxkit: Fluentd

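Fluentd decouples data sources from backend systems by providing a Unified Logging Layer between the two. This lets developers and data analysts work with many data sources at once, and it also avoids the slowdowns and parsing errors caused by badly formatted data.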

Fluentd comes in three editions, all released under the Apache License 2.0.

Fluentd

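The community edition. It can only be installed with Ruby gems and has no init scripts. Choose it if you want to modify Fluentd or do more with it.
td-agent

The edition maintained and tested by Treasure Data, Inc. It installs directly from rpm/deb/dmg packages and sets up some default configuration along the way. If this is your first time using Fluentd, td-agent is the recommended choice.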
Fluent Bit

Fluent Bit is a lightweight data forwarder for Fluentd, used to forward data to Fluentd aggregators. It can be installed on embedded systems or embedded into server systems.

Architecture

The Fluentd architecture is as follows.

Because data inputs and outputs are relayed through Fluentd, this Unified Logging Layer is also built as a pluggable architecture: new input and output plugins can keep being added, and more than 500 plugins already exist.

With M kinds of data input and N kinds of data output, the pluggable architecture turns a system whose integration complexity was O(M*N) into one that is O(M+N).
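The difference between O(M*N) and O(M+N) can be made concrete by counting integrations. The sketch below is only an illustration; the input and output names are made-up placeholders, not a list of real plugins.

```python
# Hypothetical source and sink names, used only to count integrations.
inputs = ["apache", "nginx", "syslog", "app"]   # M = 4
outputs = ["elasticsearch", "s3", "mongodb"]    # N = 3

# Without a unified layer, every input needs its own connector to every output.
point_to_point = [(i, o) for i in inputs for o in outputs]

# With a unified layer, each input and each output needs exactly one plugin.
plugins = [("in", i) for i in inputs] + [("out", o) for o in outputs]

print(len(point_to_point))  # M * N connectors
print(len(plugins))         # M + N plugins
```

Adding one more output here costs one extra plugin instead of one extra connector per input.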

Installation

Download Fluentd lists all the available installation methods. Here we follow Installing Fluentd Using rpm Package and install on CentOS 7.

Create a new Docker container that runs sshd

docker run -d \
 -p 10022:22 \
 -p 80:80 \
 -p 8888:8888 \
 --sysctl net.ipv6.conf.all.disable_ipv6=1 \
 -e "container=docker" --privileged=true -v /sys/fs/cgroup:/sys/fs/cgroup \
 --name fluentd centosssh /usr/sbin/init

Before installing, a few system settings have to be taken care of, as described in Before Installing Fluentd.

NTP

Keep the clock synchronized so that log timestamps are correct

On CentOS 7, set the timezone and correct the time
```
timedatectl set-timezone Asia/Taipei
/usr/sbin/ntpdate time.stdtime.gov.tw && /sbin/hwclock -w
```

Max # of File Descriptors

ulimit -n 65535

vi /etc/security/limits.conf

root soft nofile 65535
root hard nofile 65535
* soft nofile 65535
* hard nofile 65535

Network Kernel Parameters

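These settings address the TCP TIME_WAIT problem. (When testing inside Docker the kernel parameters cannot be changed, so just skip this step; see the article on tuning kernel parameters for Docker containers.)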

vi /etc/sysctl.conf
```
net.ipv4.tcp_tw_recycle = 1
net.ipv4.tcp_tw_reuse = 1
net.ipv4.ip_local_port_range = 10240 65535
```
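Apply the changes with sysctl -p, or reboot. Note that net.ipv4.tcp_tw_recycle was removed in Linux 4.12, so that line only applies to older kernels such as the one shipped with CentOS 7.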

Install Fluentd with the script; the daemon is named td-agent

curl -L https://toolbelt.treasuredata.com/sh/install-redhat-td-agent2.sh | sh

The installation adds /etc/yum.repos.d/td.repo and the td-agent service

Start the daemon

systemctl enable td-agent
systemctl status td-agent

systemctl start td-agent

/etc/init.d/td-agent start
/etc/init.d/td-agent stop
/etc/init.d/td-agent restart
/etc/init.d/td-agent status

The configuration file is /etc/td-agent/td-agent.conf. By default it receives logs over HTTP and routes them to stdout (/var/log/td-agent/td-agent.log).

Send a test record

curl -X POST -d 'json={"json":"message"}' http://localhost:8888/debug.test
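The same request can be built from a program. The following is a minimal sketch using only the Python standard library; `build_event_request` and `send` are hypothetical helper names, and it assumes the default HTTP input on localhost:8888 as in the curl command above.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

def build_event_request(tag, record, host="localhost", port=8888):
    """Build a POST request for the HTTP input; the URL path carries the tag."""
    # The record travels as a form field named "json", like curl -d 'json=...'.
    body = urlencode({"json": json.dumps(record)}).encode("ascii")
    return Request(f"http://{host}:{port}/{tag}", data=body, method="POST")

def send(request):
    """Send the request; this needs a running td-agent to answer."""
    with urlopen(request) as response:
        return response.status

req = build_event_request("debug.test", {"json": "message"})
print(req.get_method(), req.full_url)
```

Calling `send(req)` against a running td-agent should behave like the curl command; it is left uncalled here so the sketch runs without a server.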

Use Cases

Centralized App Logging: collect logs from applications written in different languages
Log Management & Search: use Fluentd + Elasticsearch as a replacement for Splunk
Data Analysis: store logs in Hadoop or MongoDB for later analysis
Data Archiving: store logs in Amazon S3/Riak/GlusterFS
Stream Processing
Windows Event Collection: collect Windows Event Logs (the current stable release, v0.12, does not support Windows yet; support arrives in v0.14)
IoT Data Logger

Cloud Data Logger by Raspberry Pi describes how to attach sensors to a Raspberry Pi and collect their data through Fluentd.

Life of a Fluentd event

This section uses a worked example to show how Fluentd processes an event, covering Setup, Inputs, Filters, Matches, and Labels.

The in_http and out_stdout plugins are used to walk through the event cycle. First edit /etc/td-agent/td-agent.conf

# listening for HTTP Requests
<source>
  @type http
  port 8888
  bind 0.0.0.0
</source>

# print the data arrived on each incoming request to standard output
<match test.cycle>
  @type stdout
</match>

Send two test requests with curl

# curl -X POST -d 'json={"json":"message"}' http://localhost:8888/debug.test

# curl -i -X POST -d 'json={"action":"login","user":2}' http://localhost:8888/test.cycle
HTTP/1.1 200 OK
Content-type: text/plain
Connection: Keep-Alive
Content-length: 0

tail -f /var/log/td-agent/td-agent.log

2017-10-31 15:15:40 +0800 [info]: adding match pattern="test.cycle" type="stdout"
2017-10-31 15:15:40 +0800 [info]: adding source type="http"
2017-10-31 15:15:40 +0800 [info]: using configuration file: <ROOT>
  <source>
    @type http
    port 8888
    bind 0.0.0.0
  </source>
  <match test.cycle>
    @type stdout
  </match>
</ROOT>
2017-10-31 15:15:48 +0800 [warn]: no patterns matched tag="debug.test"
2017-10-31 15:15:58 +0800 test.cycle: {"action":"login","user":2}

Event structure

A Fluentd event consists of three parts: tag, time, and record

tag: where the event comes from
time: when the event happened, as Epoch time
record: the log content, as a JSON object

Taking an Apache log as an example, in_tail produces one event from each line of the text log

192.168.0.1 - - [28/Feb/2013:12:00:00 +0900] "GET / HTTP/1.1" 200 777

tag: apache.access # set by configuration
time: 1362020400   # 28/Feb/2013:12:00:00 +0900
record: {"user":"-","method":"GET","code":200,"size":777,"host":"192.168.0.1","path":"/"}
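To make the mapping from a log line to tag, time, and record concrete, here is a rough sketch in Python. It is not how in_tail is implemented; the regular expression and the `to_event` helper are assumptions written for this one log format.

```python
import re
from datetime import datetime

# Simplified pattern for the access-log line shown above (an assumption,
# not the parser Fluentd ships).
LINE_RE = re.compile(
    r'(?P<host>\S+) \S+ (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<code>\d+) (?P<size>\d+)'
)

def to_event(line, tag):
    """Turn one access-log line into a (tag, time, record) triple."""
    m = LINE_RE.match(line)
    if m is None:
        raise ValueError("line does not match the expected format")
    f = m.groupdict()
    # The timestamp carries its own UTC offset, so the result is timezone-safe.
    when = datetime.strptime(f["time"], "%d/%b/%Y:%H:%M:%S %z")
    record = {
        "user": f["user"], "method": f["method"],
        "code": int(f["code"]), "size": int(f["size"]),
        "host": f["host"], "path": f["path"],
    }
    return tag, int(when.timestamp()), record

line = '192.168.0.1 - - [28/Feb/2013:12:00:00 +0900] "GET / HTTP/1.1" 200 777'
tag, epoch, record = to_event(line, "apache.access")
print(tag, epoch, record)
```

The tag is passed in by the caller, mirroring the fact that it is set by configuration rather than read from the log line.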

A tag is a string such as a.b.c, made of parts joined by "."

Configuration file: td-agent.conf

source: input source

There are two standard inputs, http and forward, and both can be used at the same time

http turns Fluentd into an HTTP endpoint that receives event messages over HTTP

forward turns Fluentd into a TCP endpoint that receives TCP packets

ex:

# Receive events from 24224/tcp
# This is used by log forwarding and the fluent-cat command
<source>
  @type forward
  port 24224
</source>

# http://this.host:8888/myapp.access?json={"event":"data"}
<source>
  @type http
  port 8888
</source>

match: output destination

A match compares the tag of each event and handles the events whose tag fits the pattern

Fluentd's standard output plugins are file and forward

ex:

# Match events tagged with "myapp.access" and
# store them to /var/log/fluent/access.%Y-%m-%d
# Of course, you can control how you partition your data
# with the time_slice_format option.
<match myapp.access>
  @type file
  path /var/log/fluent/access
</match>

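The pattern that follows match obeys the rules below, and match sections are tried in the order they appear in the configuration file

* matches a single tag part
ex: a.* matches a.b, but does not match a or a.b.c

** matches zero or more tag parts
ex: a.** matches a, a.b and a.b.c

{X,Y,Z} matches X, Y, or Z, where X, Y, and Z are match patterns
ex: {a,b} matches a and b; it can be combined with * and **, as in a.{b,c}.* and a.{b,c.**}

Several patterns, separated by whitespace, can be listed in one match, which then accepts any of them
ex: <match a b> matches a and b; <match a.** b.*> matches a, a.b, a.b.c, and b.d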
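The matching rules above can be approximated by translating a pattern into a regular expression. This is a sketch for experimenting with patterns, not Fluentd's own matcher; `matches` and `_to_regex` are hypothetical names, and edge cases may differ from the real implementation.

```python
import re

def _to_regex(pattern):
    """Translate one tag pattern into an anchored regular expression."""
    out, i, depth = [], 0, 0
    while i < len(pattern):
        if pattern.startswith(".**", i):      # zero or more trailing parts
            out.append(r"(?:\..*)?"); i += 3
        elif pattern.startswith("**.", i):    # zero or more leading parts
            out.append(r"(?:.*\.)?"); i += 3
        elif pattern.startswith("**", i):     # any number of parts
            out.append(r".*"); i += 2
        elif pattern[i] == "*":               # exactly one part
            out.append(r"[^.]+"); i += 1
        elif pattern[i] == "{":               # start of alternatives
            out.append("(?:"); depth += 1; i += 1
        elif pattern[i] == "}" and depth > 0:
            out.append(")"); depth -= 1; i += 1
        elif pattern[i] == "," and depth > 0:
            out.append("|"); i += 1
        else:
            out.append(re.escape(pattern[i])); i += 1
    return re.compile("^" + "".join(out) + "$")

def matches(patterns, tag):
    """True if the tag fits any of the whitespace-separated patterns."""
    return any(_to_regex(p).match(tag) for p in patterns.split())

for pat, tag in [("a.*", "a.b"), ("a.*", "a.b.c"), ("a.**", "a"),
                 ("a.{b,c.**}", "a.c.d"), ("a.** b.*", "b.d")]:
    print(pat, tag, matches(pat, tag))
```

Because match sections are tried in file order, a broad pattern such as ** placed first would shadow every pattern after it.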

filter: defines the event processing pipeline

Input -> filter 1 -> ... -> filter N -> Output

ex:

# http://this.host:9880/myapp.access?json={"event":"data"}
<source>
  @type http
  port 9880
</source>

<filter myapp.access>
  @type record_transformer
  <record>
    host_param "#{Socket.gethostname}"
  </record>
</filter>

<match myapp.access>
  @type file
  path /var/log/fluent/access
</match>

How the event is processed

{"event":"data"} is received
-> sent to the record_transformer filter
-> the "host_param" field is added
-> {"event":"data","host_param":"webserver1"}
-> sent to the file output
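The Input -> filter 1 -> ... -> filter N -> Output chain can be pictured as a list of functions applied in order. The sketch below imitates the effect of the record_transformer example; `record_transformer` and `run_pipeline` are illustrative Python functions, not Fluentd APIs.

```python
import socket

def record_transformer(record, extra):
    """Return a copy of the record with the extra fields added."""
    enriched = dict(record)
    enriched.update(extra)
    return enriched

def run_pipeline(record, filters):
    """Pass a record through each filter in order before it reaches the output."""
    for apply_filter in filters:
        record = apply_filter(record)
    return record

# One filter that adds host_param, like the <filter myapp.access> section.
filters = [lambda r: record_transformer(r, {"host_param": socket.gethostname()})]
event = run_pipeline({"event": "data"}, filters)
print(event)
```

The printed record contains the original "event" field plus "host_param" set to the local host name.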

system: sets system-wide parameters

<system>
  # equal to -qq option
  log_level error
  # equal to --without-source option
  without_source
  # suppress_repeated_stacktrace
  # emit_error_log_interval
  # suppress_config_dump
  
  # fluentd’s supervisor and worker process names
  process_name fluentd1
</system>

label: groups filters and outputs for internal routing

<label @SYSTEM>
  <filter var.log.middleware.**>
    @type grep
    # ...
  </filter>
  <match **>
    @type s3
    # ...
  </match>
</label>

@include: include other files

# Include config files in the ./config.d directory
@include config.d/*.conf

Processing Events

Once the setup is in place, the Router Engine already holds a few basic rules, and each event goes through several steps internally.

Filters

A filter can define a rule that decides whether an event is accepted

ex: the test.cycle filter below discards logout events. It uses @type grep to check whether the action field contains the string "logout"

<source>
  @type http
  port 8888
  bind 0.0.0.0
</source>

<filter test.cycle>
  @type grep
  exclude1 action logout
</filter>

<match test.cycle>
  @type stdout
</match>

Test

# curl -i -X POST -d 'json={"action":"login","user":2}' http://localhost:8888/test.cycle
HTTP/1.1 200 OK
Content-type: text/plain
Connection: Keep-Alive
Content-length: 0

# curl -i -X POST -d 'json={"action":"logout","user":2}' http://localhost:8888/test.cycle
HTTP/1.1 200 OK
Content-type: text/plain
Connection: Keep-Alive
Content-length: 0

Only the login event shows up in the log

2017-10-31 15:50:55 +0800 test.cycle: {"action":"login","user":2}
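The effect of the exclude rule can be reproduced in a few lines. This is a sketch of the behaviour seen in the test above, assuming the rule is a regular expression applied to one field; `grep_exclude` is a hypothetical helper, not the plugin's code.

```python
import re

def grep_exclude(events, key, pattern):
    """Drop events whose `key` field matches `pattern`; keep everything else."""
    regex = re.compile(pattern)
    return [e for e in events if not regex.search(str(e.get(key, "")))]

events = [
    {"action": "login", "user": 2},
    {"action": "logout", "user": 2},
]
kept = grep_exclude(events, "action", "logout")
print(kept)
```

Both requests still receive HTTP 200, because the filter runs after the input has accepted the event; only the output differs.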

Labels

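Labels define new routing sections that do not follow the top-to-bottom order; they behave like linked references.

ex: the source carries @label, so its events jump to the @STAGING section instead of going through the top-level filter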

<source>
  @type http
  bind 0.0.0.0
  port 8880
  @label @STAGING
</source>

<filter test.cycle>
  @type grep
  exclude1 action login
</filter>

<label @STAGING>
  <filter test.cycle>
    @type grep
    exclude1 action logout
  </filter>

  <match test.cycle>
    @type stdout
  </match>
</label>
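The routing decision in this configuration can be sketched as a lookup: an event with a label goes to that label's section, and only unlabelled events see the top-level filters. The names below (`TOP_LEVEL`, `LABELS`, `route`) are illustrative, and the filters compare by equality rather than by regular expression to keep the sketch short.

```python
def exclude(action):
    """Build a filter that drops records whose "action" equals `action`."""
    return lambda record: None if record.get("action") == action else record

# Top-level filters and labelled sections, mirroring the configuration above.
TOP_LEVEL = [exclude("login")]
LABELS = {"@STAGING": [exclude("logout")]}

def route(record, label=None):
    """Run a record through the labelled section if given, else the top level."""
    chain = LABELS[label] if label else TOP_LEVEL
    for apply_filter in chain:
        record = apply_filter(record)
        if record is None:
            return None  # dropped by a filter
    return record

print(route({"action": "login"}, "@STAGING"))   # passes the @STAGING filter
print(route({"action": "logout"}, "@STAGING"))  # dropped inside @STAGING
print(route({"action": "login"}))               # dropped by the top-level filter
```

A login event from the labelled source survives even though the top-level filter excludes logins, because that filter is never consulted.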

Buffers

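The examples use stdout, which is a non-buffered output. In production, outputs such as forward, mongodb, and s3 need a buffer.

Buffered output plugins store the events they receive in buffers and write them to the destination in one go once a flush condition is met. In other words, the database may not see a new event right away.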
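A toy version of this behaviour is shown below. It is a sketch only: the class name `BufferedOutput` and the `chunk_limit` flush condition are assumptions chosen for illustration, while real plugins flush on conditions such as size or interval.

```python
class BufferedOutput:
    """Collect events and write them in batches once a flush condition is met."""

    def __init__(self, write_batch, chunk_limit=3):
        self.write_batch = write_batch  # callable that writes a list of events
        self.chunk_limit = chunk_limit  # flush condition: buffered event count
        self.buffer = []

    def emit(self, event):
        self.buffer.append(event)
        if len(self.buffer) >= self.chunk_limit:
            self.flush()

    def flush(self):
        if self.buffer:
            self.write_batch(self.buffer)
            self.buffer = []

written = []  # stands in for the destination, e.g. a database
out = BufferedOutput(written.append, chunk_limit=3)
for n in range(4):
    out.emit({"n": n})

print(len(written), "batch written,", len(out.buffer), "event still buffered")
```

The fourth event stays in the buffer until the next flush, which is why a destination can lag behind the input.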

Execution unit

By default, Fluentd events are processed in the input plugin's thread. For example, the in_tail -> filter_grep -> out_stdout pipeline runs entirely in in_tail's thread; filter_grep and out_stdout have no threads of their own.

A buffered output plugin, however, has a separate thread of its own for flushing the buffer.
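The split between the input thread and a flush thread can be imitated with a queue. This is a loose analogy, not Fluentd's threading code; the names `chunks`, `flushed`, and `flush_worker` are made up for the sketch.

```python
import queue
import threading

chunks = queue.Queue()  # buffer chunks waiting to be flushed
flushed = []            # stands in for the remote destination

def flush_worker():
    """Runs in its own thread, like the flush thread of a buffered output."""
    while True:
        chunk = chunks.get()
        if chunk is None:  # sentinel: stop the worker
            break
        flushed.append((threading.current_thread().name, chunk))

worker = threading.Thread(target=flush_worker, name="flush-thread")
worker.start()

# The "input thread" (here the main thread) runs the filter and enqueues inline.
for record in [{"action": "login"}, {"action": "logout"}]:
    if record["action"] != "logout":  # the filter runs in the caller's thread
        chunks.put([record])

chunks.put(None)
worker.join()
print(flushed)
```

Filtering happens on the caller's side of the queue, while the write to the destination is recorded under the flush thread's name.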

Sample

Collecting Tomcat logs using Fluentd and Elasticsearch

fluentd-catch-all-config

Tomcat container log collection with fluentd + elasticsearch + kibana

Install the elasticsearch plugin for Fluentd

td-agent-gem install fluent-plugin-elasticsearch

Define the sources for the Tomcat logs (the access log and catalina.out)

<source>
  @type tail
  format none
  path /var/log/tomcat*/localhost_access_log.%Y-%m-%d.txt
  pos_file /var/lib/google-fluentd/pos/tomcat.pos
  read_from_head true
  tag tomcat-localhost_access_log
</source>

<source>
  @type tail
  format multiline
  # Match the date at the beginning of each entry, which can be in one of two
  # different formats.
  format_firstline /^(\w+\s\d+,\s\d+)|(\d+-\d+-\d+\s)/
  format1 /(?<message>.*)/
  path /var/log/tomcat*/catalina.out,/var/log/tomcat*/localhost.*.log
  pos_file /var/lib/google-fluentd/pos/tomcat-multiline.pos
  read_from_head true
  tag tomcat.logs
</source>

<match tomcat.logs>
  @type elasticsearch
  host localhost
  port 9200
  logstash_format true
  logstash_prefix tomcat.logs
  flush_interval 1s
</match>