This tutorial covers three parts of the Apache HttpClient API: connection management, state management with cookies, and the Fluent API, which provides a simplified facade over HttpClient.
Connection Management
Establishing a new HTTP connection for every request adds a great deal of unnecessary overhead, so a connection management mechanism is needed to re-use connections across multiple requests. HttpClient supports connection persistence for this purpose.
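As a minimal illustration (assuming a reachable http://www.example.com/), a single CloseableHttpClient can be reused for several requests so that the underlying persistent connection is reused instead of being re-established each time:
CloseableHttpClient client = HttpClients.createDefault();
try {
    for (int i = 0; i < 3; i++) {
        // Each execute() borrows a pooled connection; fully consuming the entity
        // lets the connection return to the pool and be reused.
        try (CloseableHttpResponse response = client.execute(new HttpGet("http://www.example.com/"))) {
            EntityUtils.consume(response.getEntity());
        }
    }
} finally {
    client.close();
}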
HttpClient can open connections to the target host either directly or through routes that involve multiple intermediate hops. Three kinds of routes are distinguished: plain, tunneled and layered.
- plain routes: established by connecting directly to the target host, or through the first and only proxy
- tunneled routes: established by connecting to the first proxy and tunnelling through a chain of proxies to the target host
- layered routes: established by layering a protocol over an existing connection; protocols can only be layered over a tunnel to the target, or over a direct connection without any proxies
Route Computation
The RouteInfo interface holds information about a definite route (one or more steps/hops) to a target host. HttpRoute is an immutable implementation of RouteInfo. RouteTracker is a mutable RouteInfo implementation used internally by HttpClient to track the remaining hops to the ultimate target; it is updated after each successfully executed hop. HttpRouteDirector is a helper class used to compute the next step of a route.
HttpRoutePlanner is the interface for computing a complete route to a target based on the execution context. There are currently two implementations:
- SystemDefaultRoutePlanner is based on java.net.ProxySelector; by default it picks up the JVM's proxy settings (from system properties or from the browser running the application).
- DefaultProxyRoutePlanner does not read system properties or browser settings; it always routes requests through the same default proxy.
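A small sketch (with hypothetical host names) of how these route types show up on an HttpRoute:
HttpHost target = new HttpHost("www.example.com", 443, "https");
HttpHost proxy = new HttpHost("proxy.example.com", 8080);
// plain route: connect directly to the target
HttpRoute direct = new HttpRoute(target);
// tunneled and layered route: tunnel through the proxy, then layer TLS on top
HttpRoute viaProxy = new HttpRoute(target, null, proxy, true);
System.out.println(direct.getHopCount());   // 1
System.out.println(viaProxy.isTunnelled()); // true
System.out.println(viaProxy.isLayered());   // true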
HTTPS is HTTP transport secured by the SSL/TLS protocol; in other words, the HTTP transport is layered over an SSL/TLS connection.
Connection managers
An HTTP connection can only be used by one thread at a time. HttpClient manages connections through an HTTP connection manager, which implements the HttpClientConnectionManager interface. The connection manager acts as a factory for new HTTP connections, manages the life cycle of persistent connections, and makes sure that only one thread can use a connection at a time.
When a connection is released or closed by its consumer, the underlying connection is detached from its proxy wrapper and handed back to the manager. Even if the consumer still holds a reference to the proxy instance, it can no longer perform any I/O operations or change the state of the real connection.
public class Test {
public static void main(String[] args) {
HttpClientContext context = HttpClientContext.create();
HttpClientConnectionManager connMrg = new BasicHttpClientConnectionManager();
HttpRoute route = new HttpRoute(new HttpHost("kokola.maxkit.com.tw", 80));
// Request new connection. This can be a long process
ConnectionRequest connRequest = connMrg.requestConnection(route, null);
try {
// Wait up to 10 seconds for a connection to become available
HttpClientConnection conn = connRequest.get(10, TimeUnit.SECONDS);
try {
if (!conn.isOpen()) {
// If not open yet, establish the connection based on the route info
connMrg.connect(conn, route, 1000, context);
// mark it as route complete
connMrg.routeComplete(conn, route, context);
}
// Do useful things with the connection.
} finally {
connMrg.releaseConnection(conn, null, 1, TimeUnit.MINUTES);
}
} catch (InterruptedException e) {
e.printStackTrace();
} catch (ExecutionException e) {
e.printStackTrace();
} catch (IOException e) {
e.printStackTrace();
}
}
}
BasicHttpClientConnectionManager is a connection manager that maintains only a single connection at a time, and even though it is thread-safe, that connection can only be used by one thread. It will re-use the connection for subsequent requests with the same route; if the route of a new connection request does not match that of the existing connection, it closes the existing connection and opens a new one for the given route, and requesting a connection while the current one is still leased results in java.lang.IllegalStateException.
Pooling connection manager
PoolingHttpClientConnectionManager 針對每個 route 建立有限數量的連線,connection 是根據 route 建立 pool。
PoolingHttpClientConnectionManager cm = new PoolingHttpClientConnectionManager();
// Increase max total connection to 200
cm.setMaxTotal(200);
// Increase default max connection per route to 20
cm.setDefaultMaxPerRoute(20);
// Increase max connections for localhost:80 to 50
HttpHost localhost = new HttpHost("localhost", 80);
cm.setMaxPerRoute(new HttpRoute(localhost), 50);
CloseableHttpClient httpClient = HttpClients.custom()
.setConnectionManager(cm)
.build();
When an HttpClient instance is no longer needed, remember to close its connection manager so that all connection resources are released.
CloseableHttpClient httpClient = <...>
httpClient.close();
Multithreaded request execution
With a PoolingHttpClientConnectionManager, HttpClient can execute multiple requests simultaneously from multiple threads. When the pool has no free connection left for a route, the connection request blocks; setting http.conn-manager.timeout ensures the connection manager does not wait indefinitely but throws a ConnectionPoolTimeoutException instead.
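In HttpClient 4.3 and later this limit is normally set through RequestConfig rather than the raw http.conn-manager.timeout parameter; a minimal sketch, assuming a pooling connection manager named cm:
RequestConfig config = RequestConfig.custom()
        .setConnectionRequestTimeout(5000) // max ms to wait for a connection from the pool
        .build();
CloseableHttpClient httpClient = HttpClients.custom()
        .setConnectionManager(cm)
        .setDefaultRequestConfig(config)
        .build();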
Although HttpClient itself is thread-safe and can be shared between multiple threads, it is recommended that each thread maintain its own HttpContext.
public class Test {
public static void main(String[] args) {
PoolingHttpClientConnectionManager cm=new PoolingHttpClientConnectionManager();
// Set the maximum total number of connections to 200
cm.setMaxTotal(200);
// Set the default maximum number of connections per route to 20
cm.setDefaultMaxPerRoute(20);
CloseableHttpClient httpclient=HttpClients.custom()
.setConnectionManager(cm)
.build();
String[] urisToGet= {
"http://www.domain1.com/",
"http://www.domain2.com/",
"http://www.domain3.com/",
"http://www.domain4.com/"
};
GetThread[] threads=new GetThread[urisToGet.length];
for(int i=0;i<threads.length;i++) {
HttpGet httpGet=new HttpGet(urisToGet[i]);
threads[i]=new GetThread(httpclient,httpGet);
}
// start the threads
for (int j = 0; j < threads.length; j++) {
threads[j].start();
}
// join the threads
try {
for (int j = 0; j < threads.length; j++) {
threads[j].join();
}
} catch (InterruptedException e) {
e.printStackTrace();
}
}
static class GetThread extends Thread {
private final CloseableHttpClient httpClient;
private final HttpContext context;
private final HttpGet httpget;
public GetThread(CloseableHttpClient httpClient, HttpGet httpget) {
this.httpClient = httpClient;
this.context = HttpClientContext.create();
this.httpget = httpget;
}
@Override
public void run() {
try {
CloseableHttpResponse response = httpClient.execute(
httpget, context);
try {
HttpEntity entity = response.getEntity();
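// Process the entity here if needed, then make sure it is fully
// consumed so the pooled connection can be reused.
EntityUtils.consume(entity);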
} finally {
response.close();
}
} catch (ClientProtocolException ex) {
// Handle protocol errors
} catch (IOException ex) {
// Handle I/O errors
}
}
}
}
Eviction policy
One of the major shortcomings of the classic blocking I/O model is that a network socket can react to I/O events only while blocked in an I/O operation. When a connection is released back to the manager it is kept alive, but its socket status cannot be monitored, so if the connection gets closed on the server side, the client side has no way of detecting that the connection state has changed.
HttpClient mitigates this by testing whether a connection is stale before using it, but the stale connection check is not 100% reliable. The alternative is a dedicated monitor thread that evicts connections: it periodically calls ClientConnectionManager#closeExpiredConnections() to close expired connections and evict them from the pool, and optionally calls ClientConnectionManager#closeIdleConnections() to close connections that have been idle longer than a given period.
import org.apache.http.conn.HttpClientConnectionManager;
import java.util.concurrent.TimeUnit;
public class IdleConnectionMonitorThread extends Thread {
private final HttpClientConnectionManager connMgr;
private volatile boolean shutdown;
public IdleConnectionMonitorThread(HttpClientConnectionManager connMgr) {
super();
this.connMgr = connMgr;
}
@Override
public void run() {
try {
while (!shutdown) {
synchronized (this) {
wait(5000);
// Close expired connections
connMgr.closeExpiredConnections();
// Optionally, close connections
// that have been idle longer than 30 sec
connMgr.closeIdleConnections(30, TimeUnit.SECONDS);
}
}
} catch (InterruptedException ex) {
// terminate
}
}
public void shutdown() {
shutdown = true;
synchronized (this) {
notifyAll();
}
}
}
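One possible way to wire the monitor thread up (a sketch; the connection manager and client setup mirror the pooling example above):
PoolingHttpClientConnectionManager cm = new PoolingHttpClientConnectionManager();
CloseableHttpClient httpClient = HttpClients.custom().setConnectionManager(cm).build();
IdleConnectionMonitorThread monitor = new IdleConnectionMonitorThread(cm);
monitor.start();
// ... execute requests with httpClient ...
monitor.shutdown(); // stop the monitor before closing the client
httpClient.close();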
Connection keep alive strategy
The HTTP specification does not say how long a persistent connection should be kept alive. Some HTTP servers use a non-standard Keep-Alive header to tell the client how long they intend to keep the connection open, and most HTTP servers are configured to drop connections after a period of inactivity without notifying the client. HttpClient's behaviour can be customised with a ConnectionKeepAliveStrategy:
public class Test {
public static void main(String[] args) {
ConnectionKeepAliveStrategy myStrategy = new ConnectionKeepAliveStrategy() {
public long getKeepAliveDuration(HttpResponse response, HttpContext context) {
// Honor 'keep-alive' header
HeaderElementIterator it = new BasicHeaderElementIterator(
response.headerIterator(HTTP.CONN_KEEP_ALIVE));
while (it.hasNext()) {
HeaderElement he = it.nextElement();
String param = he.getName();
String value = he.getValue();
if (value != null && param.equalsIgnoreCase("timeout")) {
try {
return Long.parseLong(value) * 1000;
} catch (NumberFormatException ignore) {
}
}
}
HttpHost target = (HttpHost) context.getAttribute(
HttpClientContext.HTTP_TARGET_HOST);
if ("www.server.com".equalsIgnoreCase(target.getHostName())) {
// Keep alive for 5 seconds only
return 5 * 1000;
} else {
// otherwise keep alive for 30 seconds
return 30 * 1000;
}
}
};
CloseableHttpClient client = HttpClients.custom()
.setKeepAliveStrategy(myStrategy)
.build();
}
}
Connection socket factory
HTTP connections internally use a java.net.Socket object to handle data transmission over the wire. Sockets are created, initialised and connected through the ConnectionSocketFactory interface, which lets users of HttpClient supply application-specific socket initialisation code; PlainConnectionSocketFactory is the default factory for plain (unencrypted) sockets. Creating a socket and connecting it are separate operations, so a socket can be closed while it is still blocked in a connect operation.
public class Test {
public static void main(String[] args) {
HttpClientContext clientContext = HttpClientContext.create();
PlainConnectionSocketFactory sf = PlainConnectionSocketFactory.getSocketFactory();
try {
Socket socket = sf.createSocket(clientContext);
int timeout = 1000; //ms
HttpHost target = new HttpHost("localhost");
InetSocketAddress remoteAddress = new InetSocketAddress(
InetAddress.getByAddress(new byte[] {127,0,0,1}), 80);
sf.connectSocket(timeout, socket, target, remoteAddress, null, clientContext);
} catch (IOException e) {
e.printStackTrace();
}
}
}
LayeredConnectionSocketFactory is an extension of ConnectionSocketFactory that can create sockets layered over an existing plain socket. Socket layering is used mainly for creating secure sockets through proxies. HttpClient ships with SSLConnectionSocketFactory, which implements SSL/TLS layering.
Hostname verification: HttpClient can optionally verify that the target hostname matches the hostnames stored in the server's X.509 certificate. Two javax.net.ssl.HostnameVerifier implementations are provided:
- DefaultHostnameVerifier: expected to be compliant with RFC 2818; the hostname must match any of the subject alternative names in the certificate, or, when no subject alternative names are given, the most specific CN of the certificate subject (a wildcard may occur in the CN and in any of the subject-alts)
- NoopHostnameVerifier: turns hostname verification off
public HttpClient httpClient() {
try {
ConnectionSocketFactory plainsf = PlainConnectionSocketFactory.getSocketFactory();
SSLContext sslContext = SSLContexts.custom()
.loadTrustMaterial(null, new TrustSelfSignedStrategy())
.build();
LayeredConnectionSocketFactory sslsf = new SSLConnectionSocketFactory(sslContext, NoopHostnameVerifier.INSTANCE);
Registry<ConnectionSocketFactory> r = RegistryBuilder.<ConnectionSocketFactory>create()
.register("http", plainsf)
.register("https", sslsf)
.build();
HttpClientConnectionManager cm = new PoolingHttpClientConnectionManager(r);
RequestConfig requestConfig = RequestConfig
.custom()
.setSocketTimeout(30000)
.setConnectTimeout(30000).build();
return HttpClients.custom().setConnectionManager(cm)
.setDefaultRequestConfig(requestConfig).build();
} catch (Exception e) {
throw new RuntimeException("Cannot build HttpClient using self signed certificate", e);
}
}
Proxy configuration
A specific proxy host can be set explicitly:
HttpHost proxy = new HttpHost("someproxy", 8080);
DefaultProxyRoutePlanner routePlanner = new DefaultProxyRoutePlanner(proxy);
CloseableHttpClient httpclient = HttpClients.custom()
.setRoutePlanner(routePlanner)
.build();
Alternatively, the standard JRE proxy selector can be used to obtain the proxy settings:
SystemDefaultRoutePlanner routePlanner = new SystemDefaultRoutePlanner(
ProxySelector.getDefault());
CloseableHttpClient httpclient = HttpClients.custom()
.setRoutePlanner(routePlanner)
.build();
It is also possible to implement HttpRoutePlanner and compute the route yourself:
HttpRoutePlanner routePlanner = new HttpRoutePlanner() {
public HttpRoute determineRoute(
HttpHost target,
HttpRequest request,
HttpContext context) throws HttpException {
return new HttpRoute(target, null, new HttpHost("someproxy", 8080),
"https".equalsIgnoreCase(target.getSchemeName()));
}
};
CloseableHttpClient httpclient = HttpClients.custom()
.setRoutePlanner(routePlanner)
.build();
HTTP State Management
HTTP itself is a stateless request/response protocol with no notion of a session. Netscape later introduced the concept of cookies, which was subsequently submitted for standardisation.
HttpClient uses the Cookie interface to represent a cookie token. A cookie is essentially a name/value pair with attributes: a domain that restricts the hosts it is sent to, a path that determines which URLs it applies to, and a maximum period of time for which it is valid.
BasicClientCookie cookie = new BasicClientCookie("name", "value");
// Set effective domain and path attributes
cookie.setDomain(".mycompany.com");
cookie.setPath("/");
// Set attributes exactly as sent by the server
cookie.setAttribute(ClientCookie.PATH_ATTR, "/");
cookie.setAttribute(ClientCookie.DOMAIN_ATTR, ".mycompany.com");
HttpClient supports several cookie policies (CookieSpec); the Standard or Standard strict policy is recommended:
- Standard strict: compliant with the syntax and semantics of RFC 6265, section 4
- Standard: a somewhat more relaxed reading of RFC 6265, section 4, intended for interoperability with most existing servers
- Netscape draft (obsolete)
- RFC 2965 (obsolete)
- RFC 2109 (obsolete)
- Browser compatibility (obsolete)
- Default: a synthetic policy that picks an RFC 2965, RFC 2109 or Netscape draft compliant implementation; it will be deprecated in favour of Standard
- Ignore cookies: all cookies are ignored
The cookie policy can be set at the client level and overridden on the request level if required:
RequestConfig globalConfig = RequestConfig.custom()
.setCookieSpec(CookieSpecs.DEFAULT)
.build();
CloseableHttpClient httpclient = HttpClients.custom()
.setDefaultRequestConfig(globalConfig)
.build();
RequestConfig localConfig = RequestConfig.copy(globalConfig)
.setCookieSpec(CookieSpecs.STANDARD_STRICT)
.build();
HttpGet httpGet = new HttpGet("/");
httpGet.setConfig(localConfig);
If a custom cookie policy is needed, implement the CookieSpec interface, wrap it in a CookieSpecProvider, and register that provider with HttpClient.
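A sketch of what this registration could look like (the "easy" spec name and the lenient validate() override are illustrative assumptions; DefaultCookieSpec is just one possible base class):
PublicSuffixMatcher publicSuffixMatcher = PublicSuffixMatcherLoader.getDefault();
CookieSpecProvider easySpecProvider = new CookieSpecProvider() {
    public CookieSpec create(HttpContext context) {
        return new DefaultCookieSpec() {
            @Override
            public void validate(Cookie cookie, CookieOrigin origin)
                    throws MalformedCookieException {
                // accept every cookie, skipping the usual validation
            }
        };
    }
};
Registry<CookieSpecProvider> r = RegistryBuilder.<CookieSpecProvider>create()
        .register(CookieSpecs.DEFAULT, new DefaultCookieSpecProvider(publicSuffixMatcher))
        .register(CookieSpecs.STANDARD, new RFC6265CookieSpecProvider(publicSuffixMatcher))
        .register("easy", easySpecProvider)
        .build();
RequestConfig requestConfig = RequestConfig.custom()
        .setCookieSpec("easy")
        .build();
CloseableHttpClient httpclient = HttpClients.custom()
        .setDefaultCookieSpecRegistry(r)
        .setDefaultRequestConfig(requestConfig)
        .build();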
HttpClient can work with any persistent cookie store that implements the CookieStore interface. The default implementation, BasicCookieStore, is backed by an ArrayList.
// Create a local instance of cookie store
CookieStore cookieStore = new BasicCookieStore();
// Populate cookies if needed
BasicClientCookie cookie = new BasicClientCookie("name", "value");
cookie.setDomain(".mycompany.com");
cookie.setPath("/");
cookieStore.addCookie(cookie);
// Set the store
CloseableHttpClient httpclient = HttpClients.custom()
.setDefaultCookieStore(cookieStore)
.build();
Fluent API
Since version 4.2, HttpClient offers a fluent-interface API. The Fluent API exposes only the most fundamental HttpClient functions and frees the caller from tedious work such as connection management and resource deallocation.
To use the Fluent API, add the fluent-hc dependency to build.sbt:
"org.apache.httpcomponents" % "httpclient" % "4.5.4",
"org.apache.httpcomponents" % "fluent-hc" % "4.5.4",
import java.io.IOException;
import java.net.URI;
import java.util.LinkedList;
import java.util.Queue;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import org.apache.http.*;
import org.apache.http.client.*;
import org.apache.http.client.config.CookieSpecs;
import org.apache.http.client.config.RequestConfig;
import org.apache.http.client.fluent.*;
import org.apache.http.client.utils.URIBuilder;
import org.apache.http.concurrent.FutureCallback;
import org.apache.http.entity.ContentType;
import org.apache.http.impl.client.BasicCookieStore;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.impl.cookie.BasicClientCookie;
import org.apache.http.util.EntityUtils;
import org.json.JSONObject;
/**
* Easy to use facade API
*
* The Fluent API shares a single HttpClient instance (Executor.CLIENT),
* which uses a PoolingHttpClientConnectionManager with
* MaxPerRoute: 100, MaxTotal: 200.
*/
public class FluentExample {
public static void main(String[] args) throws Exception {
// Use URIBuilder to construct a complex URL
URI uri = new URIBuilder().setScheme("http")
.setHost("www.bing.com")
.setPath("/dict/")
.addParameter("a", "中文test123")
.addParameter("b", "")
.build();
testFluentGet(uri.toString());
testFluentPost("http://www.maxkit.com.tw/temp/test/json.html");
testFluentWithContext();
testFluentJsonResponse();
// Multi-threaded concurrent mode; on average one request is issued every 1.3 ms (10 requests in total)
testFluentConcurrent("http://www.maxkit.com.tw/temp/test/json.html", 10);
}
private static void testFluentGet(String url) {
try {
String result = Request.Get(url)
.userAgent("Test")
.addHeader(HttpHeaders.ACCEPT, "a") // HttpHeaders defines constants for many common HTTP headers
.addHeader("AA", "BB")
.connectTimeout(1000)
.socketTimeout(1000)
.execute().returnContent().asString();
System.out.println(result);
} catch (Exception e) {
e.printStackTrace();
}
}
private static void testFluentPost(String url) {
try {
String result = Request.Post(url)
.version(HttpVersion.HTTP_1_1)
.useExpectContinue()
.addHeader("X-Custom-header", "stuff")
.bodyForm(Form.form().add("a", "abc123")
.add("b", "中文abc123").build(), Consts.UTF_8)
// Alternatively, pass a body of a custom content type
// ContentType defines constants for many common content types
// .bodyString("Important stuff 中文abc123", ContentType.DEFAULT_TEXT)
.execute().returnContent().asString();
System.out.println(result);
} catch (Exception e) {
e.printStackTrace();
}
}
private static void testFluentJsonResponse() {
try {
JSONObject result = Request.Get("http://www.maxkit.com.tw/temp/test/json.html")
.execute().handleResponse(new JsonResponseHandler());
System.out.println(result.toString(4));
} catch (Exception e) {
e.printStackTrace();
}
}
/**
* Because the Fluent API shares a single HttpClient instance by default, HTTP session state is carried across requests automatically.
*
* For more control an Executor can be used,
* e.g. to pre-set a cookie and keep the cookies consistent across several requests, so that the server can recognise them as coming from the same user.
**/
private static void testFluentWithContext() {
// To maintain client-side state (cookies, authentication) between requests,
// the fluent Executor helps by keeping a cookie store and setting up other types of authentication:
CookieStore cookieStore = new BasicCookieStore();
BasicClientCookie cookie = new BasicClientCookie("a", "b");
// The domain must be set; cookies are automatically added to the request header for the domain being accessed (the default browser behaviour)
cookie.setDomain(".bing.com");
cookieStore.addCookie(cookie);
RequestConfig config = RequestConfig.custom().setCookieSpec(CookieSpecs.DEFAULT).build();
// Pre-populate the cookies that the requests should carry
// For further customisation, a fully configured HttpClient can be supplied, e.g.:
// HttpClient httpClient = HttpClientBuilder.create().setMaxConnTotal(20).setMaxConnPerRoute(20).build();
// Executor.newInstance(httpClient);
Executor executor = Executor.newInstance(HttpClients.custom().setDefaultRequestConfig(config).build())
.use(cookieStore);
try {
// Send two identical requests and observe the cookies carried by each of them
Request request1 = Request.Get("http://www.maxkit.com.tw/temp/test/json.html");
Request request2 = Request.Get("http://www.maxkit.com.tw/temp/test/json.html");
String result1 = executor.execute(request1).returnContent().asString();
System.out.println(result1);
// After the first request, the executor automatically adds the Set-Cookie values from the response to the client-side cookie store (just like a browser does)
String result2 = executor.execute(request2).returnContent().asString();
System.out.println(result2);
} catch (Exception e) {
e.printStackTrace();
}
}
private static void testFluentConcurrent(String url, int count) throws InterruptedException {
// Creates a thread pool that creates new threads as needed,
// but will reuse previously constructed threads when they are available.
// If no existing thread is available, a new thread will be created and added to the pool.
// These pools will typically improve the performance of programs that
// execute many short-lived asynchronous tasks.
// Threads that have not been used for sixty seconds are terminated and
// removed from the cache. Thus, a pool that remains idle for long
// enough will not consume any resources.
ExecutorService threadpool = Executors.newCachedThreadPool();
// If no ExecutorService thread pool is passed in, requests are simply executed on separate threads
// Async async = Async.newInstance().use(threadpool);
Async async = Async.newInstance();
// Increase the number of connections to avoid running out of pooled connections
int connMaxTotal = count * 2;
// Customise the HttpClient, mainly to configure the connection pool
// MaxPerRoute: the default maximum number of connections per route (roughly, per URL)
// connMaxTotal: the maximum total number of connections in the pool
HttpClient hc = HttpClients.custom().setMaxConnPerRoute(connMaxTotal).setMaxConnTotal(connMaxTotal).build();
async.use(Executor.newInstance(hc));
Request[] requests = new Request[count];
for (int i = 0; i < count; i++) {
requests[i] = Request.Get(url + "?_=" + i);
}
Queue<Future<Content>> queue = new LinkedList<Future<Content>>();
// Execute requests asynchronously
for (final Request request : requests) {
Future<Content> future = async.execute(request, new FutureCallback<Content>() {
public void failed(final Exception ex) {
System.out.println(ex.getMessage() + ": " + request);
}
public void completed(final Content content) {
System.out.println("Request completed: " + request);
}
public void cancelled() {
}
});
queue.add(future);
}
while (!queue.isEmpty()) {
Future<Content> future = queue.remove();
try {
future.get();
} catch (ExecutionException ex) {
ex.printStackTrace(System.err);
}
}
threadpool.shutdown();
}
}
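JsonResponseHandler, used by testFluentJsonResponse() above, converts the response body into a JSONObject: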
import org.apache.http.HttpEntity;
import org.apache.http.HttpResponse;
import org.apache.http.HttpStatus;
import org.apache.http.StatusLine;
import org.apache.http.client.ClientProtocolException;
import org.apache.http.client.HttpResponseException;
import org.apache.http.client.ResponseHandler;
import org.apache.http.entity.ContentType;
import org.apache.http.util.EntityUtils;
import org.json.JSONObject;
import java.io.IOException;
public class JsonResponseHandler implements ResponseHandler<JSONObject> {
@Override
public JSONObject handleResponse(HttpResponse response) throws ClientProtocolException, IOException {
final StatusLine statusLine = response.getStatusLine();
final HttpEntity entity = response.getEntity();
if (statusLine.getStatusCode() >= HttpStatus.SC_MULTIPLE_CHOICES) {
throw new HttpResponseException(statusLine.getStatusCode(),
statusLine.getReasonPhrase());
}
if (entity != null) {
return new JSONObject(EntityUtils.toString(entity, ContentType.getOrDefault(entity).getCharset()));
}
return null;
}
}