2014/06/16

erlang - cowboy (2)

This second article on cowboy covers cookies, static files, REST, Server Push, Websocket, Hooks, and Middlewares.

Using cookies

Setting cookies

%% By default the cookie is a session cookie (no expiration time)
SessionID = generate_session_id(),
Req2 = cowboy_req:set_resp_cookie(<<"sessionid">>, SessionID, [], Req).

%% Set an expiration time in seconds
SessionID = generate_session_id(),
Req2 = cowboy_req:set_resp_cookie(<<"sessionid">>, SessionID, [
    {max_age, 3600}
], Req).

%% Delete a cookie
Req2 = cowboy_req:set_resp_cookie(<<"sessionid">>, <<>>, [
    {max_age, 0}
], Req).

%% Set a cookie for a specific domain and path
Req2 = cowboy_req:set_resp_cookie(<<"inaccount">>, <<"1">>, [
    {domain, "my.example.org"},
    {path, "/account"}
], Req).

%% Restrict the cookie to HTTPS
SessionID = generate_session_id(),
Req2 = cowboy_req:set_resp_cookie(<<"sessionid">>, SessionID, [
    {secure, true}
], Req).

%% Restrict the cookie to client-server communication; client-side scripts cannot access it
SessionID = generate_session_id(),
Req2 = cowboy_req:set_resp_cookie(<<"sessionid">>, SessionID, [
    {http_only, true}
], Req).
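
To make the options above concrete, here is a minimal sketch of the `Set-Cookie` header each combination roughly corresponds to. It uses Python's standard `http.cookies` module rather than cowboy, and `build_set_cookie` is a hypothetical helper written for this illustration, not a cowboy API.

```python
from http.cookies import SimpleCookie


def build_set_cookie(name, value, max_age=None, domain=None, path=None,
                     secure=False, http_only=False):
    """Return the value of a Set-Cookie header for the given options."""
    jar = SimpleCookie()
    jar[name] = value
    morsel = jar[name]
    if max_age is not None:
        morsel["max-age"] = max_age      # lifetime in seconds; 0 deletes the cookie
    if domain is not None:
        morsel["domain"] = domain
    if path is not None:
        morsel["path"] = path
    if secure:
        morsel["secure"] = True          # only sent over HTTPS
    if http_only:
        morsel["httponly"] = True        # hidden from client-side scripts
    return morsel.OutputString()


if __name__ == "__main__":
    print(build_set_cookie("sessionid", "abc123", max_age=3600))
    print(build_set_cookie("inaccount", "1",
                           domain="my.example.org", path="/account"))
    print(build_set_cookie("sessionid", "abc123", secure=True, http_only=True))
```

Leaving every option unset yields a plain `name=value` pair, which browsers treat as a session cookie.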

Reading cookies

%% Read the value of the cookie named lang
{CookieVal, Req2} = cowboy_req:cookie(<<"lang">>, Req).

%% Read the value of the cookie named lang, falling back to the default fr if it is absent
{CookieVal, Req2} = cowboy_req:cookie(<<"lang">>, Req, <<"fr">>).

%% Get all cookies as a list of key/value tuples
{AllCookies, Req2} = cowboy_req:cookies(Req).
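
The same three reads (by name, by name with a default, and all cookies) can be sketched against a raw `Cookie` request header. This is an illustration using Python's standard library; `read_cookie` and `read_all_cookies` are hypothetical names, not cowboy functions.

```python
from http.cookies import SimpleCookie


def read_cookie(cookie_header, name, default=None):
    """Return the value of cookie `name`, or `default` if it is absent."""
    jar = SimpleCookie()
    jar.load(cookie_header)
    morsel = jar.get(name)
    return morsel.value if morsel is not None else default


def read_all_cookies(cookie_header):
    """Return every cookie as a list of (name, value) tuples."""
    jar = SimpleCookie()
    jar.load(cookie_header)
    return [(key, morsel.value) for key, morsel in jar.items()]
```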

Static files

The static file handler is a built-in REST handler. It can serve a single file or every file in a directory. In production, these files can also be served through a Content Distribution Network (CDN).

Serve one file

%% For the path /, serve static/index.html from the private directory of the application my_app
{"/", cowboy_static, {priv_file, my_app, "static/index.html"}}

%% For the path /, serve the file at the absolute path /var/www/index.html
{"/", cowboy_static, {file, "/var/www/index.html"}}

Serve all files from a directory

%% Serve every file in my_app's static/assets directory for all URLs starting with /assets/
{"/assets/[...]", cowboy_static, {priv_dir, my_app, "static/assets"}}

%% Serve a directory given by its absolute path
{"/assets/[...]", cowboy_static, {dir, "/var/www/assets"}}

Customize the mimetype detection

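By default, cowboy uses the file extension to detect a file's mimetype. This callback function can be overridden. cowboy ships with two functions: the default one only handles the file types commonly used by web applications, while the other knows hundreds of mimetypes.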

%% Use the default function
{"/assets/[...]", cowboy_static, {priv_dir, my_app, "static/assets", [{mimetypes, cow_mimetypes, web}]}}

%% Use the function that knows all mimetypes
{"/assets/[...]", cowboy_static, {priv_dir, my_app, "static/assets", [{mimetypes, cow_mimetypes, all}]}}

%% Use a custom callback function
{"/assets/[...]", cowboy_static, {priv_dir, my_app, "static/assets", [{mimetypes, Module, Function}]}}

If Module:Function cannot identify a file's mimetype, it should return {<<"application">>, <<"octet-stream">>, []}, which stands for application/octet-stream.
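
The shape of such a detection function can be sketched as follows: look up the extension, and fall back to application/octet-stream when it is unknown. The table here is deliberately tiny and the names `WEB_TYPES` and `web_mimetype` are assumptions made for this example, not part of cowboy.

```python
import os

# Deliberately tiny table for illustration; a real table would be far larger.
WEB_TYPES = {
    ".css": ("text", "css", []),
    ".html": ("text", "html", []),
    ".js": ("application", "javascript", []),
    ".png": ("image", "png", []),
}

# Returned whenever the extension is not recognised.
FALLBACK = ("application", "octet-stream", [])


def web_mimetype(path):
    """Map a file path to a (type, subtype, params) tuple by extension."""
    _, ext = os.path.splitext(path)
    return WEB_TYPES.get(ext.lower(), FALLBACK)
```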

Generate an etag

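By default, the static handler generates an etag header value from the file's size and modification time.

An etag identifies a particular version of a file. If the etag held by the client matches the server's, the server can reply with 304, telling the client to use the copy in its cache. In practice, Last-Modified and Expires also need to be considered alongside the etag.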

%% Change how the etag is computed
{"/assets/[...]", cowboy_static, {priv_dir, my_app, "static/assets", [{etag, Module, Function}]}}

%% Disable etag handling
{"/assets/[...]", cowboy_static, {priv_dir, my_app, "static/assets", [{etag, false}]}}
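
As a rough illustration of the default behaviour described in this section, the sketch below derives an etag from a file's size and modification time and answers 304 when the client already holds that etag. The etag format and the function names are assumptions for this example; cowboy's actual encoding is not reproduced here.

```python
import os
import tempfile


def make_etag(path):
    """Derive an etag from the file's size and modification time."""
    info = os.stat(path)
    return '"{:x}-{:x}"'.format(info.st_size, int(info.st_mtime))


def respond(path, if_none_match=None):
    """Return (status, body): 304 with no body when the client's etag matches."""
    etag = make_etag(path)
    if if_none_match == etag:
        return 304, b""
    with open(path, "rb") as handle:
        return 200, handle.read()


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"body { color: red }")
    first = respond(tmp.name)                              # client has no etag yet
    cached = respond(tmp.name, if_none_match=make_etag(tmp.name))
    print(first[0], cached[0])
    os.remove(tmp.name)
```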

REST handlers

Like Websocket, REST is a sub-protocol of HTTP, so it requires a protocol upgrade.

init({tcp, http}, _Req, _Opts) ->
    {upgrade, protocol, cowboy_rest}.

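The REST handler currently handles the following HTTP methods: HEAD, GET, POST, PATCH, PUT, DELETE, OPTIONS. The end of Diagram for REST provides four svg flow diagrams of the REST processing; download them and drag them into a browser to view them. The four diagrams cover the following parts of the flow.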

Beginning part, up to resource_exists
From resource_exists, for HEAD and GET requests
From resource_exists, for POST/PATCH/PUT requests
From resource_exists, for DELETE requests

Callbacks

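When a request is processed, rest_init/2 is called first. This function must return {ok, Req, State}, where State is the state shared by all of the handler's callbacks. At the very end, rest_terminate/2 is called. This function cannot send a reply and must return ok.

All other callbacks are resource callbacks. They take two arguments, Req and State, and they all return {Value, Req, State}. If a callback returns {halt, Req, State}, processing of the request stops and rest_terminate/2 is called next.

In the table below, skip means that the step is skipped entirely when the callback is not defined, and processing moves on to the next step. An empty column means the callback has no default value.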

Callback name	Default value
allowed_methods	[<<"GET">>, <<"HEAD">>, <<"OPTIONS">>]
allow_missing_post	true
charsets_provided	skip
content_types_accepted
content_types_provided	[{{<<"text">>, <<"html">>, '*'}, to_html}]
delete_completed	true
delete_resource	false
expires	undefined
forbidden	false
generate_etag	undefined
is_authorized	true
is_conflict	false
known_content_type	true
known_methods	[<<"GET">>, <<"HEAD">>, <<"POST">>, <<"PUT">>, <<"PATCH">>, <<"DELETE">>, <<"OPTIONS">>]
languages_provided	skip
last_modified	undefined
malformed_request	false
moved_permanently	false
moved_temporarily	false
multiple_choices	false
options	ok
previously_existed	false
resource_exists	true
service_available	true
uri_too_long	false
valid_content_headers	true
valid_entity_length	true
variances	[]

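content_types_accepted/2 and content_types_provided/2 can refer to any number of user-defined callbacks. It is recommended to split them into separate functions, for example from_html and to_html, for accepting HTML data and for sending HTML data respectively.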
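
The default-value mechanism in the table above can be sketched as "call the callback if the handler defines it, otherwise use the default". The sketch models only a few callbacks, and `call`, `check_method`, and `ReadWriteHandler` are hypothetical names for this illustration, not cowboy internals.

```python
# Defaults copied from the table above; only a few callbacks are modelled.
DEFAULTS = {
    "allowed_methods": ["GET", "HEAD", "OPTIONS"],
    "service_available": True,
    "resource_exists": True,
    "forbidden": False,
}


def call(handler, callback, req, state):
    """Call handler.callback(req, state) if defined, else use the default."""
    func = getattr(handler, callback, None)
    if func is None:
        return DEFAULTS[callback], req, state
    return func(req, state)


class ReadWriteHandler:
    """Hypothetical handler that overrides only allowed_methods."""

    def allowed_methods(self, req, state):
        return ["GET", "HEAD", "POST"], req, state


def check_method(handler, req, state=None):
    """Return 405 when the request method is not allowed, else 200."""
    methods, req, state = call(handler, "allowed_methods", req, state)
    return 200 if req["method"] in methods else 405
```

A handler that defines nothing still answers GET, HEAD, and OPTIONS, because every undefined callback falls back to its default.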

Meta data

cowboy sets some meta values during processing. They can be retrieved with cowboy_req:meta/{2,3}.

Meta key	Details
media_type	The content-type negotiated for the response entity.
language	The language negotiated for the response entity.
charset	The charset negotiated for the response entity.

Response headers

cowboy automatically sets some headers after REST processing.

Header name	Details
content-language	Language used in the response body
content-type	Media type and charset of the response body
etag	Etag of the resource
expires	Expiration date of the resource
last-modified	Last modification date for the resource
location	Relative or absolute URI to the requested resource
vary	List of headers that may change the representation of the resource

Server Push: using Loop Handlers

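When a response cannot be sent right away, a Loop Handler can be used. It enters a receive loop, waits for a message, and then sends the response. This approach is well suited to long-polling. A Loop Handler can also be used when the response is only partially available and the response body needs to be streamed, which suits server-sent events.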

sample

-module(my_loop_handler).
-behaviour(cowboy_loop_handler).

-export([init/3]).
-export([info/3]).
-export([terminate/3]).

init({tcp, http}, Req, _Opts) ->
    %% If no {reply, Body} message arrives within 60 s, a 204 No Content response is sent
    {loop, Req, undefined_state, 60000, hibernate}.

%% Wait for {reply, Body}, and only then send the response
info({reply, Body}, Req, State) ->
    {ok, Req2} = cowboy_req:reply(200, [], Body, Req),
    {ok, Req2, State};
%% Ignore any other message and keep waiting
info(_Message, Req, State) ->
    {loop, Req, State, hibernate}.

terminate(_Reason, _Req, _State) ->
    ok.
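
The long-polling pattern in the handler above can be sketched without cowboy: block on a mailbox until a reply message arrives, and answer 204 if nothing arrives in time. A `queue.Queue` stands in for the Erlang process mailbox, and `long_poll` is a hypothetical name. In this sketch the timeout restarts after every message, which is a property of the sketch rather than a statement about cowboy.

```python
import queue


def long_poll(mailbox, timeout_seconds):
    """Wait for a ("reply", body) message; give up with 204 after the timeout."""
    while True:
        try:
            message = mailbox.get(timeout=timeout_seconds)
        except queue.Empty:
            return 204, b""              # nothing arrived in time: No Content
        is_reply = (isinstance(message, tuple) and len(message) == 2
                    and message[0] == "reply")
        if is_reply:
            return 200, message[1]       # the awaited reply: send it
        # Any other message is ignored and the loop keeps waiting.
```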

Websocket handlers

Websocket is an extension to HTTP that emulates a plain TCP connection between the browser and the server. cowboy handles it with a Websocket Handler. Both the client and the server can send data asynchronously at any time.

sample

-module(my_ws_handler).
-behaviour(cowboy_websocket_handler).

-export([init/3]).
-export([websocket_init/3]).
-export([websocket_handle/3]).
-export([websocket_info/3]).
-export([websocket_terminate/3]).

%% Upgrade the cowboy connection to the Websocket protocol
init({tcp, http}, _Req, _Opts) ->
    {upgrade, protocol, cowboy_websocket}.

websocket_init(_TransportName, Req, _Opts) ->
    erlang:start_timer(1000, self(), <<"Hello!">>),
    {ok, Req, undefined_state}.

%% websocket_handle receives the data sent by the client
websocket_handle({text, Msg}, Req, State) ->
    {reply, {text, << "That's what she said! ", Msg/binary >>}, Req, State};
websocket_handle(_Data, Req, State) ->
    {ok, Req, State}.

%% websocket_info receives Erlang messages sent to the handler process (here, the timer)
websocket_info({timeout, _Ref, Msg}, Req, State) ->
    erlang:start_timer(1000, self(), <<"How' you doin'?">>),
    {reply, {text, Msg}, Req, State};
websocket_info(_Info, Req, State) ->
    {ok, Req, State}.

%% websocket_terminate is called when the connection is closed
websocket_terminate(_Reason, _Req, _State) ->
    ok.

Hooks

onrequest

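The onrequest hook is called as soon as cowboy has the request headers, before any other request processing, including routing. It can be used to modify the data in the request object before processing continues. If a reply is sent from within onrequest, cowboy stops processing this request and moves on to the next one. If onrequest crashes, no reply is sent at all.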

%% Specify the onrequest callback function when starting the listener
cowboy:start_http(my_http_listener, 100,
    [{port, 8080}],
    [
        {env, [{dispatch, Dispatch}]},
        {onrequest, fun ?MODULE:debug_hook/1}
    ]
).

%% This hook prints every request object, which is handy for debugging
debug_hook(Req) ->
    erlang:display(Req),
    Req.

onresponse

The onresponse hook is called just before cowboy sends a response. It is typically used to log responses or to modify the response headers or body. A common example is providing custom error pages. As with onrequest, if onresponse crashes, no reply is sent.

%% Specify the onresponse callback function when starting the listener
cowboy:start_http(my_http_listener, 100,
    [{port, 8080}],
    [
        {env, [{dispatch, Dispatch}]},
        {onresponse, fun ?MODULE:custom_404_hook/4}
    ]
).

%% Provide a custom 404 error page
custom_404_hook(404, Headers, <<>>, Req) ->
    Body = <<"404 Not Found.">>,
    %% Update the content-length response header
    Headers2 = lists:keyreplace(<<"content-length">>, 1, Headers,
        {<<"content-length">>, integer_to_list(byte_size(Body))}),
    {ok, Req2} = cowboy_req:reply(404, Headers2, Body, Req),
    Req2;
custom_404_hook(_, _, _, Req) ->
    Req.
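
The logic of the hook above, stripped of cowboy specifics, is a function from one response to another: an empty 404 gets a body and a matching content-length, and everything else passes through untouched. This is a sketch under that reading; the Python function is hypothetical and, unlike the Erlang version, appends the new header instead of replacing it in place.

```python
def custom_404_hook(status, headers, body):
    """Return a (status, headers, body) triple, filling in empty 404 bodies."""
    if status == 404 and body == b"":
        body = b"404 Not Found."
        # Drop the old content-length and add one that matches the new body.
        headers = [(k, v) for k, v in headers if k != "content-length"]
        headers.append(("content-length", str(len(body))))
    return status, headers, body
```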

Middlewares

cowboy delegates request processing to middleware components. Two middlewares are provided by default: routing and handler. cowboy executes the middlewares in the order in which they are configured.

Usage

A middleware only needs to implement one callback function, execute/2, which is defined in the cowboy_middleware behaviour.

execute(Req, Env) may return one of four values:

{ok, Req, Env} : continue with the next middleware
{suspend, Module, Function, Args} : hibernate, then resume by calling the given MFA
{halt, Req} : stop processing this request and move on to the next request
{error, StatusCode, Req} : reply with an error and close the socket
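
How a chain of middlewares reacts to these return values can be sketched as a simple loop. The sketch leaves out `suspend`, models Req and Env as plain dictionaries, and uses toy `router` and `handler` functions; all names are assumptions for this illustration rather than cowboy code.

```python
def run_middlewares(middlewares, req, env):
    """Run each middleware in order until one halts or reports an error."""
    for execute in middlewares:
        result = execute(req, env)
        tag = result[0]
        if tag == "ok":
            _, req, env = result         # pass the updated Req and Env onward
        elif tag == "halt":
            return "halted", result[1]   # stop here; result[1] is the Req
        elif tag == "error":
            return "error", result[1]    # result[1] is the status code
        else:
            raise ValueError("unsupported return value: %r" % (result,))
    return "done", env


def router(req, env):
    """Toy router: look up the handler for the path in env['dispatch']."""
    target = env["dispatch"].get(req["path"])
    if target is None:
        return ("error", 404, req)
    return ("ok", req, dict(env, handler=target))


def handler(req, env):
    """Toy handler middleware: call the handler chosen by the router."""
    return ("ok", req, dict(env, result=env["handler"](req)))
```

Because each middleware receives the Env produced by the previous one, the toy handler can rely on the `handler` key that the toy router stored.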

Configuration

Env reserves two values:

listener
The name of the listener.
result
The result of the processing. If the result is not ok, cowboy does not process any further requests on this connection.

cowboy:set_env/3 can be used to set or replace values in Env.

Routing middleware

The routing middleware needs the dispatch value. If routing succeeds, it puts the handler name and options into the handler and handler_opts values of Env.

Handler middleware

The handler middleware needs the handler and handler_opts values, and it puts its result into the result value of Env.

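High concurrency

For cowboy to handle a very large number of connections, some parameters have to be adjusted.

When calling cowboy:start_http, add {max_connections, infinity} to the transport options.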

cowboy:start_http(my_http_listener, 100,
            [{port, 8000}, {max_connections, infinity}],
            [{env, [{dispatch, Dispatch}]}]
        ).

In addition, the Erlang VM itself has an upper limit on the number of processes that can exist. The default is 262144.

1> erlang:system_info(process_limit).
262144

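That is not enough. The limit has to be raised with the +P Number option when starting the VM, where Number is in the range [1024-134217727]. In testing, setting it to the maximum, 134217727, made the VM noticeably slower to start, so the process limit was set to roughly ten million (10240000), which should be enough. The limit actually reported is not exactly the requested value: the VM picks a limit at least as large as requested, here 16777216.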

erl +P 10240000

1> erlang:system_info(process_limit).
16777216
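
The reported value 16777216 is 2^24, the smallest power of two that is not below 10240000, and the default 262144 is 2^18. This suggests that the VM rounds the requested limit up to a power of two. That is an inference from these two numbers only, not a guarantee about every Erlang release. The arithmetic can be checked with a small sketch:

```python
def next_power_of_two(n):
    """Return the smallest power of two that is greater than or equal to n."""
    power = 1
    while power < n:
        power *= 2
    return power


if __name__ == "__main__":
    print(next_power_of_two(10240000))   # value requested with +P above
```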

Reference

cowboy user guide
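100萬並發連接服務器筆記之Erlang完成1M並發連接目標 (Notes on a server with one million concurrent connections: reaching the 1M concurrent connection target with Erlang)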

2014/06/12

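Developing cross-platform GUI applications with wxPython

wxPython is a GUI toolkit for Python. As the name suggests, it wraps the well-known C++ GUI toolkit wxWidgets.

Python has a concise syntax and a rich set of packages, which makes it a good choice for quickly developing cross-platform GUI applications.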

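Installing Python

Download the Python interpreter installer from the official Python website. A large download button is visible as soon as you open the site, but it defaults to the 32-bit version.

I suggest downloading from https://www.python.org/download/ instead, where you can choose between the 32-bit and 64-bit versions.

After installing the Python interpreter, test the python command in a console.

If the python command cannot be run, check that the Python interpreter executable has been added to the PATH environment variable.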

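Installing wxPython

Download wxPython from the official wxPython website.

Note that the Python interpreter and wxPython must both be 32-bit or both be 64-bit.

Binary installers are available for both Windows and Mac OS.

To build wxPython from source yourself, see: http://www.wxpython.org/BUILD.html

After installation, run the python command in a console and enter: import wx

If no error appears, wxPython was installed successfully.

Alternatively, run the following example: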

#!<path_to_python>
# -*- coding:utf-8 -*-
import wx

app = wx.App(False)  # Create a new app, don't redirect stdout/stderr to a window.
frame = wx.Frame(None, wx.ID_ANY, "Hello World") # A Frame is a top-level window.
frame.Show(True)     # Show the frame.
app.MainLoop()

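Installing the wxPython demo

wxPython comes with a large number of demo programs. On Windows and OS X the demo code can be installed with an installer. After installation, the wxPython demo source code can be found under the Python directory. It contains a wealth of examples, most of which can be run on their own. The examples can also be copied to any other platform that has a Python interpreter with wxPython and run there.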

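Building platform-specific executables with pyinstaller

pyinstaller turns a Python program into an executable for a specific platform, which makes the program more convenient to run.

pyinstaller itself supports Windows, Linux, and OS X.

The following describes how to build a platform-specific executable with pyinstaller.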

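Installing pyinstaller on Windows

Installing PyWin32

On Windows, PyWin32 must be installed first. The latest build is available on SourceForge at http://sourceforge.net/projects/pywin32/files/. Choose the installer that matches your platform.

Installing and running pyinstaller with pip-Win

Next, download the pip-Win tool.

After downloading it, simply run it.

Then enter:

venv -c -i pyi-env-name

This opens a command-line window.

The first run automatically installs the required packages. Once that has finished,

enter the following in the command-line window:

pip install pyinstaller

This installs pyinstaller.

From then on, pyinstaller must also be run from this command-line window.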

Installing pyinstaller on Linux

Installing pip

First download get-pip.py from the pip website,

then run:

python get-pip.py

This installs pip.

Next, run:

pip install pyinstaller

This installs pyinstaller.

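Using pyinstaller

Creating a spec file with pyinstaller

(Reminder: on Windows, run the following steps in the command-line window opened by pip-Win.)

Before producing an executable, pyinstaller analyzes the Python code and generates a spec file.

Suppose the Python program is named testWx.py.

Run:

pyinstaller -w -F testWx.py

This generates a file named testWx.spec and then builds the executable from it.

The -w option marks the program as a windowed application, so the resulting executable does not open a command-line window when it runs;

the -F option produces a single executable file.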

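Building the executable with pyinstaller

Once the spec file exists,

simply run:

pyinstaller testWx.spec

to build the executable for the current platform. The -w and -F choices are already recorded in the spec file, so they are not repeated on this command line.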

2014/06/11

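Handling paged LDAP queries in Java

Over the past few days I ran into a problem with our LDAP import. The customer reported that some people were not being imported. The log showed no exception, the missing person could be found by querying with an LDAP client, and the data validation was all correct, so for a while I had no idea where the problem was. After a few hours I noticed something odd in the log: the number of imported people was exactly 1000. Years of programming experience told me that a number this round had to mean something.

The original code

The original code most likely followed the example in Advanced Topics for LDAP - Creation on the official Java site: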

// set properties for our connection and provider
Properties properties = new Properties();
properties.put(Context.INITIAL_CONTEXT_FACTORY,
        "com.sun.jndi.ldap.LdapCtxFactory");
properties.put(Context.PROVIDER_URL, "ldap://192.168.150.129:389");
properties.put(Context.REFERRAL, "ignore");
properties.put(Context.SECURITY_PRINCIPAL,
        "cn=Manager,dc=maxkit,dc=com,dc=tw");
properties.put(Context.SECURITY_CREDENTIALS, "secret");

InitialDirContext context;
context = new InitialDirContext(properties);

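This connects to the LDAP server through JNDI. The code above runs correctly and connects to the LDAP server without problems.

Why people were not imported

If the code above works, why were some people not imported?

The customer uses Microsoft Active Directory. When the customer ran a query for us in an AD client, I noticed that the client also returned exactly one thousand entries at a time, and my own import processed exactly one thousand entries as well. Together, these gave me good reason to suspect that Active Directory returns at most 1000 entries per query and expects the client to page through the rest.

I searched a number of sites without finding official documentation, but a colleague found a related article that explains how to raise the number of entries Active Directory returns per query (Increasing the number of objects returned in a single LDAP query). Judging from what we found online, this limit is what kept some people from being imported.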

Solution

Since the problem is that the query is not paged, paging it should solve the problem.
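
The idea behind paging can be shown with a toy model before turning to the real API: the server hands back one page plus a cookie, and the client keeps asking with that cookie until no cookie comes back. Everything in this sketch is an assumption made for illustration. A list stands in for the directory, an integer offset stands in for the cookie (real LDAP cookies are opaque bytes), and the function names are hypothetical.

```python
def search_page(entries, page_size, cookie=None):
    """Toy server: return one page plus a cookie, or None when no pages remain."""
    start = cookie or 0
    page = entries[start:start + page_size]
    next_start = start + len(page)
    next_cookie = next_start if next_start < len(entries) else None
    return page, next_cookie


def fetch_all(entries, page_size):
    """Client loop: request pages until the server stops sending a cookie."""
    collected = []
    cookie = None
    while True:
        page, cookie = search_page(entries, page_size, cookie)
        collected.extend(page)
        if cookie is None:
            return collected
```

A client that calls `search_page` only once sees exactly one page of entries and no error, which matches the symptom described above.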

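So I looked online for examples of LDAP paging in Java and found several easy-to-follow solutions, such as JNDI, Active Directory, Paging and Range Retrieval. In the process I also found that paging is not possible with the original code, because the InitialDirContext class it uses offers no way to page. Following the articles I found, I switched to InitialLdapContext, which makes paging possible. The example is as follows: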

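// Imports needed by this fragment
import java.util.Hashtable;
import javax.naming.Context;
import javax.naming.NamingEnumeration;
import javax.naming.directory.SearchControls;
import javax.naming.directory.SearchResult;
import javax.naming.ldap.Control;
import javax.naming.ldap.InitialLdapContext;
import javax.naming.ldap.LdapContext;
import javax.naming.ldap.PagedResultsControl;
import javax.naming.ldap.PagedResultsResponseControl;

// LDAP connection settings
Hashtable<String, String> env = new Hashtable<String, String>();
env.put(Context.INITIAL_CONTEXT_FACTORY,
        "com.sun.jndi.ldap.LdapCtxFactory");
env.put(Context.SECURITY_PRINCIPAL, "cn=Manager,dc=maxkit,dc=com,dc=tw");
env.put(Context.SECURITY_CREDENTIALS, "secret");
env.put(Context.REFERRAL, "ignore");
env.put(Context.PROVIDER_URL, LDAP_URL);

LdapContext ctx = new InitialLdapContext(env, null);

// Paging settings
int pageSize = 1000; // number of entries fetched from LDAP per page
byte[] cookie = null;
ctx.setRequestControls(new Control[]{new PagedResultsControl(
    pageSize, Control.CRITICAL)});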

do {
    // Search filter and scope for the people query
    String searchFilter = "(objectClass=organizationalPerson)";
    SearchControls searchCtls = new SearchControls();
    searchCtls.setSearchScope(SearchControls.SUBTREE_SCOPE);

    // Run the LDAP search and process this page of results
    NamingEnumeration<SearchResult> results = ctx.search(
            "dc=maxkit,dc=com,dc=tw", searchFilter, searchCtls);
    while (results != null && results.hasMore()) {
        // next() must be called to advance; without it this loop never ends
        SearchResult entry = results.next();
        // ...
        // Process the person found in LDAP (entry) here
        // ...
    }

    //==================================================
    // Start of page switching
    //==================================================

    // This page has been processed. Fetch the cookie first;
    // a non-null cookie means there is another page of data.
    Control[] controls = ctx.getResponseControls();
    if (controls != null) {
        for (int i = 0; i < controls.length; i++) {
            if (controls[i] instanceof PagedResultsResponseControl) {
                PagedResultsResponseControl prrc =
                    (PagedResultsResponseControl) controls[i];
                cookie = prrc.getCookie();
            }
        }
    }

    // Pass the cookie back to the InitialLdapContext so the next search returns the next page
    ctx.setRequestControls(new Control[]{new PagedResultsControl(
            pageSize, cookie, Control.CRITICAL)});

    //==================================================
    // End of page switching
    //==================================================
} while (cookie != null);

ctx.close();

A few points about the code above deserve attention: