Web Cache in Practice

http://netkiller.github.io/journal/cache.html

Mr. Neo Chen (陳景峯), netkiller, BG7NYT


Xishan Meidi, Minzhi Subdistrict, Longhua New District, Shenzhen, Guangdong, China
518131
+86 13113668890


$Id: 12dfba037a04a025e01285ca3d35a1192d9bb95d

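Copyright notice

Please contact the author before reposting. Any repost must clearly state the article's original source, the author information, and this notice.

Document source: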
http://netkiller.github.io
http://netkiller.sourceforge.net

QQ group: 128659835 (please mention "reader" when joining)

2017-06-16

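Abstract

I wrote this article because many of the articles I found online on this topic simply repeat what others have said without checking the facts, and they mislead readers.

Below I run the experiments one by one, show you the results, and explain why things behave the way they do. Only hands-on testing produces convincing evidence.

2014-03-12 First published

2015-08-27 Revised: added caching of special data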


1. Test environment

CentOS 6.5

Nginx install script https://github.com/oscm/shell/blob/master/nginx/nginx.sh

PHP install script https://github.com/oscm/shell/blob/master/php/5.5.8.sh

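2. File modification date: If-Modified-Since / Last-Modified

If If-Modified-Since is earlier than Last-Modified, the server returns HTTP/1.1 200 OK; otherwise it returns HTTP/1.1 304 Not Modified.

On a repeat request the browser sends an If-Modified-Since header carrying the Last-Modified value it received earlier. The server compares it with the file's current Last-Modified time. If the file has not changed since that time, the server returns 304 Not Modified without opening the file again; otherwise it reads the file again and returns its content.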
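The comparison described above can be sketched in a few lines. This is only an illustration in Python, not nginx's actual code: the function name conditional_status is made up for this example, and the sample dates are borrowed from the curl output in this article.

```python
from email.utils import parsedate_to_datetime


def conditional_status(if_modified_since, last_modified):
    """Return 304 if the resource is unchanged since the client's copy, else 200.

    if_modified_since: header value sent by the client, or None if absent.
    last_modified: the resource's Last-Modified header value.
    """
    if if_modified_since is None:
        return 200  # first visit: nothing to compare against
    try:
        client_time = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        return 200  # unparseable date: ignore the header
    server_time = parsedate_to_datetime(last_modified)
    return 304 if server_time <= client_time else 200


last_modified = "Thu, 27 Feb 2014 07:29:50 GMT"
print(conditional_status(None, last_modified))
print(conditional_status("Thu, 27 Feb 2014 07:29:50 GMT", last_modified))
print(conditional_status("Wed, 26 Feb 2014 00:00:00 GMT", last_modified))
```

The dates are parsed into real timestamps before they are compared. Comparing the raw header strings would only be an alphabetical comparison and can give the wrong answer.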

2.1. Static files

nginx/1.0.15 generates the Last-Modified header for static files automatically

# nginx -v
nginx version: nginx/1.0.15

# curl -I http://192.168.6.9/index.html
HTTP/1.1 200 OK
Server: nginx/1.0.15
Date: Thu, 27 Feb 2014 07:36:03 GMT
Content-Type: text/html
Content-Length: 6
Last-Modified: Thu, 27 Feb 2014 07:29:50 GMT
Connection: keep-alive
Accept-Ranges: bytes
			

Image file

# curl -I http://192.168.6.9/image.png
HTTP/1.1 200 OK
Server: nginx/1.0.15
Date: Thu, 27 Feb 2014 07:37:18 GMT
Content-Type: image/png
Content-Length: 41516
Last-Modified: Thu, 27 Feb 2014 07:36:59 GMT
Connection: keep-alive
Accept-Ranges: bytes
			

Tip

Question: nginx/1.4.5 does not send Last-Modified by default

# nginx -v
nginx version: nginx/1.4.5

# curl -I http://192.168.2.15/index.html
HTTP/1.1 200 OK
Server: nginx/1.4.5
Date: Fri, 28 Feb 2014 02:13:44 GMT
Content-Type: text/html
Connection: keep-alive
				

After some digging I found the answer: when ssi is enabled, Nginx suppresses Last-Modified. With ssi turned off, the output is as follows

# curl -I  http://localhost/index.html
HTTP/1.1 200 OK
Server: nginx/1.4.5
Date: Fri, 28 Feb 2014 05:44:29 GMT
Content-Type: text/html
Content-Length: 6
Last-Modified: Wed, 25 Dec 2013 03:18:16 GMT
Connection: keep-alive
ETag: "52ba4e78-6"
Accept-Ranges: bytes
				

Test once more

# curl -H "If-Modified-Since: Fir, 28 Feb 2014 07:42:55 GMT" -I http://192.168.2.15/test.html
HTTP/1.1 304 Not Modified
Server: nginx/1.4.5
Date: Fri, 28 Feb 2014 02:34:54 GMT
Last-Modified: Fri, 28 Feb 2014 01:55:50 GMT
Connection: keep-alive
ETag: "530feca6-8b"
			

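The test returns HTTP/1.1 304 Not Modified as hoped, but an ETag header has also appeared out of nowhere. This is one of the differences between Nginx versions, and it is quite confusing.

Since an ETag has shown up, let's test it as well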

# curl -H 'If-None-Match: "530feca6-8b"' -I http://192.168.2.15/test.html
HTTP/1.1 304 Not Modified
Server: nginx/1.4.5
Date: Fri, 28 Feb 2014 02:39:18 GMT
Last-Modified: Fri, 28 Feb 2014 01:55:50 GMT
Connection: keep-alive
ETag: "530feca6-8b"
			

This succeeds too
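The ETag values in these responses are not random. As far as I know, nginx (which has emitted an ETag for static files by default since version 1.3.3) builds the value from the file's modification time and its size, both written in hexadecimal. The sketch below decodes one of the values seen above. decode_nginx_etag is a hypothetical helper, and the format is an nginx convention, not something the HTTP standard requires.

```python
from datetime import datetime, timezone


def decode_nginx_etag(etag):
    """Split an nginx static-file ETag into (mtime, size_in_bytes)."""
    if etag.startswith("W/"):  # drop the weak-validator prefix if present
        etag = etag[2:]
    mtime_hex, size_hex = etag.strip('"').split("-")
    mtime = datetime.fromtimestamp(int(mtime_hex, 16), tz=timezone.utc)
    return mtime, int(size_hex, 16)


mtime, size = decode_nginx_etag('"52ba4e78-6"')
print(mtime.strftime("%a, %d %b %Y %H:%M:%S GMT"), size)
```

For "52ba4e78-6" this gives Wed, 25 Dec 2013 03:18:16 GMT and 6 bytes, which matches the Last-Modified and Content-Length headers of the index.html response shown earlier. In other words, for a static file the ETag carries the same information as Last-Modified plus the file size.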

Image test

# curl -I http://localhost/logo.jpg
HTTP/1.1 200 OK
Server: nginx/1.4.5
Date: Fri, 28 Feb 2014 02:59:04 GMT
Content-Type: image/jpeg
Content-Length: 10103
Last-Modified: Fri, 28 Feb 2014 02:56:37 GMT
Connection: keep-alive
ETag: "530ffae5-2777"
Accept-Ranges: bytes


# curl -H 'If-None-Match: "530ffae5-2777"' -I http://localhost/logo.jpg
HTTP/1.1 304 Not Modified
Server: nginx/1.4.5
Date: Fri, 28 Feb 2014 03:03:33 GMT
Last-Modified: Fri, 28 Feb 2014 02:56:37 GMT
Connection: keep-alive
ETag: "530ffae5-2777"

# curl -H "If-Modified-Since: Fri, 28 Feb 2014 12:04:18 GMT" -I http://localhost/logo.jpg
HTTP/1.1 200 OK
Server: nginx/1.4.5
Date: Fri, 28 Feb 2014 03:04:45 GMT
Content-Type: image/jpeg
Content-Length: 10103
Last-Modified: Fri, 28 Feb 2014 02:56:37 GMT
Connection: keep-alive
ETag: "530ffae5-2777"
Accept-Ranges: bytes
			

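Result: the ETag test passes, but with curl If-Modified-Since never produces a 304 here, whereas in the browser every test passes and returns HTTP/1.1 304 Not Modified. At first I suspected that other HTTP headers were required. A more likely explanation is that nginx's if_modified_since directive defaults to exact, so a 304 is returned only when the If-Modified-Since value equals the file's Last-Modified time exactly, which is precisely what a browser sends.

Now switch to browser testing: Chrome and Firefox succeed. A browser does not send If-Modified-Since on its own initiative. Only after it has seen a Last-Modified header does it send If-Modified-Since on the next request, so the page has to be refreshed twice.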

2.1.1. if_modified_since

With ssi enabled, the if_modified_since directive can be used to turn Last-Modified on

server {
    listen       80;
    server_name  192.168.2.15;
    if_modified_since before;
}
				

The test shows no Last-Modified. As far as I can tell, with if_modified_since before; Nginx only sends Last-Modified after it has received an If-Modified-Since header from the browser

# curl -I http://192.168.2.15/test.html
HTTP/1.1 200 OK
Server: nginx/1.4.5
Date: Fri, 28 Feb 2014 02:39:42 GMT
Content-Type: text/html
Connection: keep-alive
				

In the end if_modified_since before; had no effect

Now set the directive to if_modified_since exact;

# curl -I http://192.168.2.15/test.html
HTTP/1.1 200 OK
Server: nginx/1.4.5
Date: Fri, 28 Feb 2014 02:45:40 GMT
Content-Type: text/html
Connection: keep-alive

# curl -H 'If-None-Match: "530feca6-8b"' -I http://192.168.2.15/test.html
HTTP/1.1 304 Not Modified
Server: nginx/1.4.5
Date: Fri, 28 Feb 2014 02:45:44 GMT
Last-Modified: Fri, 28 Feb 2014 01:55:50 GMT
Connection: keep-alive
ETag: "530feca6-8b"

# curl -H "If-Modified-Since: Fir, 28 Feb 2014 07:42:55 GMT" -I http://192.168.2.15/test.html
HTTP/1.1 200 OK
Server: nginx/1.4.5
Date: Fri, 28 Feb 2014 02:45:50 GMT
Content-Type: text/html
Connection: keep-alive
				

The test fails, and a real browser fails as well, yet ETag succeeds
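To make the difference between the directive values easier to see, here is a small model of the three documented modes of if_modified_since (off, exact, before). It is a sketch of the documented behaviour written in Python, not nginx source code, and the function name not_modified is invented for this example.

```python
from email.utils import parsedate_to_datetime


def not_modified(mode, if_modified_since, last_modified):
    """Return True when a 304 would be appropriate under the given mode."""
    if mode == "off" or if_modified_since is None:
        return False  # header ignored or absent: always send the full response
    client = parsedate_to_datetime(if_modified_since)
    server = parsedate_to_datetime(last_modified)
    if mode == "exact":
        return server == client  # the two timestamps must be identical
    if mode == "before":
        return server <= client  # file is not newer than the client's copy
    raise ValueError("unknown mode: " + mode)


last_modified = "Fri, 28 Feb 2014 02:56:37 GMT"
later = "Fri, 28 Feb 2014 12:04:18 GMT"
for mode in ("off", "exact", "before"):
    print(mode, not_modified(mode, later, last_modified), not_modified(mode, last_modified, last_modified))
```

Under exact, a hand-written date that is merely later than the file's modification time does not match, which is consistent with the 200 responses curl received in the tests above. A header that repeats the Last-Modified value verbatim matches in both exact and before modes.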

2.2. Pseudo-static URLs via rewrite

index.php is the same PHP file shown in section 2.3; we have only put a pseudo-static URL in front of it

			
location / {
        root   /www;
        index  index.html index.htm;
		rewrite ^/test.html$ /index.php last;
}
			
			

Now test with curl and with Chrome/Firefox

			
# curl -H "If-Modified-Since: Fri, 28 Feb 2014 08:42:55 GMT" -I  http://192.168.6.9/test.html
HTTP/1.1 200 OK
Server: nginx/1.0.15
Date: Thu, 27 Feb 2014 08:55:19 GMT
Content-Type: text/html
Connection: keep-alive
Last-Modified: Thu, 26 Feb 2014 08:39:35 GMT
			
			

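In testing, neither curl nor Chrome/Firefox ever gets a 304.

Here is my analysis, for reference only. When a user requests index.html, Nginx finds the file, reads its mtime, and compares it with If-Modified-Since. If If-Modified-Since is later than Last-Modified it returns 304, otherwise 200.

Why does the same thing not work for the pseudo-static test.html? My analysis: when the user requests test.html, Nginx first applies the rewrite and then hands the request to index.php. At no point does nginx touch a physical file named test.html, so there is no mtime, and Nginx returns 200.

For Nginx to return 304 as hoped, it would have to read the HTTP headers returned by the program, and Nginx has no such logic.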

2.3. Dynamic files

Dynamic files have no Last-Modified header, but we can fake one

			
# curl -I http://192.168.6.9/index.php
HTTP/1.1 200 OK
Server: nginx/1.0.15
Date: Thu, 27 Feb 2014 07:57:59 GMT
Content-Type: text/html
Connection: keep-alive
			
			

Add code to the program that sends the HTTP header. Here Last-Modified is the 27th and the current date is the 28th; Last-Modified has to be earlier than the current time.

			
# cat index.php
<?php
header('Last-Modified: Thu, 27 Feb 2014 08:39:35 GMT' );
//header('Last-Modified: ' .gmdate('D, d M Y H:i:s') . ' GMT' );
?>
Hello
			
			

Now you will see Last-Modified

			
# curl -I http://localhost/modified.php
HTTP/1.1 200 OK
Server: nginx/1.4.5
Date: Fri, 28 Feb 2014 05:59:28 GMT
Content-Type: text/html
Connection: keep-alive
Last-Modified: Fri, 28 Feb 2014 10:04:18 GMT
			
			

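Note

Although the dynamic program now returns Last-Modified, browsers do not honor it: in my tests neither Chrome nor Firefox treated the .php response as cacheable or cached its content.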

				
# curl -I http://localhost/modified.php
HTTP/1.1 200 OK
Server: nginx/1.4.5
Date: Fri, 28 Feb 2014 05:59:28 GMT
Content-Type: text/html
Connection: keep-alive
Last-Modified: Fri, 28 Feb 2014 10:04:18 GMT

# curl -H "If-Modified-Since: Fri, 28 Feb 2014 08:42:55 GMT" -I  http://localhost/modified.php
HTTP/1.1 200 OK
Server: nginx/1.0.15
Date: Fri, 28 Feb 2014 05:32:30 GMT
Content-Type: text/html
Connection: keep-alive
Last-Modified: Thu, 26 Feb 2014 08:39:35 GMT
				
				

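Last-Modified has no practical effect for a dynamic program

The Last-Modified header is produced by the program and Nginx does not act on it, but letting the program itself decide the status code does work. We modify the program as follows.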

			
# cat modified.php
<?php
$mtime = 'Fri, 28 Feb 2014 12:04:18 GMT';
cache($mtime);

// Answer 304 when the client's copy is at least as new as $mtime.
function cache($mtime)
{
	$http_if_modified_since = null;
	if (array_key_exists('HTTP_IF_MODIFIED_SINCE', $_SERVER)) {
		$http_if_modified_since = $_SERVER['HTTP_IF_MODIFIED_SINCE'];
	}
	// Compare as timestamps; comparing the raw date strings is only alphabetical.
	if ($http_if_modified_since !== null
		&& strtotime($http_if_modified_since) >= strtotime($mtime))
	{
		header('Last-Modified: ' . $mtime, true, 304);
		exit;
	} else {
		header('Last-Modified: ' . $mtime);
	}
}
print_r($_SERVER);
echo date("Y-m-d H:i:s");
?>
			
			

Test result

			
# curl -I http://localhost/modified.php
HTTP/1.1 200 OK
Server: nginx/1.4.5
Date: Fri, 28 Feb 2014 05:22:28 GMT
Content-Type: text/html
Connection: keep-alive
			
			

Fake an If-Modified-Since date earlier than the date we set, and the program returns HTTP/1.1 200 OK

			
# curl -H "If-Modified-Since: Fri, 28 Feb 2014 10:04:18 GMT" -I http://localhost/modified.php
HTTP/1.1 200 OK
Server: nginx/1.4.5
Date: Fri, 28 Feb 2014 05:22:13 GMT
Content-Type: text/html
Connection: keep-alive
			
			

Fake an If-Modified-Since date later than the date we set, and the program returns HTTP/1.1 304 Not Modified

			
# curl -H "If-Modified-Since: Fri, 28 Feb 2014 20:04:18 GMT" -I http://localhost/modified.php
HTTP/1.1 304 Not Modified
Server: nginx/1.4.5
Date: Fri, 28 Feb 2014 05:21:31 GMT
Connection: keep-alive
Last-Modified: Fri, 28 Feb 2014 12:04:18 GMT
			
			

The test succeeds, and the browser also gets HTTP/1.1 304 Not Modified

Now give modified.php a pseudo-static URL

    location / {
        root   /www;
        index  index.html index.htm;
		rewrite ^/modified.html$ /modified.php last;
    }
			

Test

# curl -I http://localhost/modified.html
HTTP/1.1 200 OK
Server: nginx/1.4.5
Date: Fri, 28 Feb 2014 06:21:10 GMT
Content-Type: text/html
Connection: keep-alive
Last-Modified: Fri, 28 Feb 2014 10:04:18 GMT

# curl -H "If-Modified-Since: Fri, 28 Feb 2014 12:04:18 GMT" -I http://localhost/modified.html
HTTP/1.1 304 Not Modified
Server: nginx/1.4.5
Date: Fri, 28 Feb 2014 06:21:22 GMT
Connection: keep-alive
Last-Modified: Fri, 28 Feb 2014 10:04:18 GMT
			

This achieves the expected result

3. ETag / If-None-Match

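The Last-Modified tests above showed that ETag, even when it is not displayed, still works behind the scenes :)

etag on; enables ETag support in Nginx; lighttpd enables it by default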

		
server {
    listen       80;
    server_name phalcon;

    charset utf-8;

    access_log  /var/log/nginx/host.access.log  main;
	etag on;
    location / {
        root   /www/phalcon/public;
        index  index.html index.php;
    }
}
		
		

Check the ETag output

# curl -I http://localhost/index.html
HTTP/1.1 200 OK
Server: nginx/1.4.5
Date: Fri, 28 Feb 2014 03:08:28 GMT
Content-Type: text/html
Connection: keep-alive

# curl -I http://phalcon/img/css.png
HTTP/1.1 200 OK
Server: nginx
Date: Thu, 27 Feb 2014 09:20:49 GMT
Content-Type: image/png
Content-Length: 1133
Last-Modified: Fri, 14 Feb 2014 08:05:03 GMT
Connection: keep-alive
ETag: "52fdce2f-46d"
Accept-Ranges: bytes
		

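Even with ETag enabled, Nginx does nothing for HTML and CSS files. I eventually found an nginx-static-etags module on a foreign website. Try it yourself if you are interested; I will not cover it here.

3.1. Static files

First look up the ETag value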

# curl -I http://phalcon/img/css.png
HTTP/1.1 200 OK
Server: nginx
Date: Thu, 27 Feb 2014 09:25:41 GMT
Content-Type: image/png
Content-Length: 1133
Last-Modified: Fri, 14 Feb 2014 08:05:03 GMT
Connection: keep-alive
ETag: "52fdce2f-46d"
Accept-Ranges: bytes
			

Then send the If-None-Match HTTP header to the server

# curl -H 'If-None-Match: "52fdce2f-46d"' -I http://phalcon/img/css.png
HTTP/1.1 304 Not Modified
Server: nginx
Date: Thu, 27 Feb 2014 09:25:44 GMT
Last-Modified: Fri, 14 Feb 2014 08:05:03 GMT
Connection: keep-alive
ETag: "52fdce2f-46d"
			

This time it goes smoothly and returns HTTP/1.1 304 Not Modified

3.2. Dynamic programs

The default output is as follows

# curl -I http://192.168.6.9/index.php
HTTP/1.1 200 OK
Server: nginx
Date: Thu, 27 Feb 2014 09:29:13 GMT
Content-Type: text/html; charset=utf-8
Connection: keep-alive
			

Test program

			
<?php
header('Last-Modified: Thu, 26 Feb 2014 08:39:35 GMT' );
header('Etag: "abcdefg"');
#header('Last-Modified: ' .gmdate('D, d M Y H:i:s') . ' GMT' );
?>
Hello
			
			

Test result

# curl -I http://192.168.6.9/index.php
HTTP/1.1 200 OK
Server: nginx/1.0.15
Date: Thu, 27 Feb 2014 09:41:06 GMT
Content-Type: text/html
Connection: keep-alive
Last-Modified: Thu, 26 Feb 2014 08:39:35 GMT
Etag: "abcdefg"

[root@centos6 ~]# curl -H 'If-None-Match: "abcdefg"' -I http://192.168.6.9/index.php
HTTP/1.1 200 OK
Server: nginx/1.0.15
Date: Thu, 27 Feb 2014 09:41:42 GMT
Content-Type: text/html
Connection: keep-alive
Last-Modified: Thu, 26 Feb 2014 08:39:35 GMT
Etag: "abcdefg"
			

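The result is the same as in the earlier Last-Modified test

Is an Etag returned by a dynamic program really useless?

The answer is no. There is a way to make an Etag returned by a dynamic program do its job; modify the program as follows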

			
<?php
$etag = md5('http://netkiller.github.io');
cache($etag);
function cache($etag)
{
        $http_if_none_match = null;
        if(array_key_exists ('HTTP_IF_NONE_MATCH',$_SERVER)){
                $http_if_none_match = $_SERVER['HTTP_IF_NONE_MATCH'];
        }

        if ($http_if_none_match == $etag)
        {
                header('Etag: '.$etag, true, 304);
                exit;
        } else {
                header('Etag: '.$etag);
        }

}
print_r($_SERVER);
echo date("Y-m-d H:i:s");
?>
			
			

First check the Etag value

# curl  -I http://192.168.6.9/test.php
HTTP/1.1 200 OK
Server: nginx/1.0.15
Date: Thu, 27 Feb 2014 10:07:19 GMT
Content-Type: text/html
Connection: keep-alive
Etag: 7467675324d0f7a3e01ce5151848fedb
			

Send the If-None-Match header

# curl -H 'If-None-Match: 7467675324d0f7a3e01ce5151848fedb' -I http://192.168.6.9/test.php
HTTP/1.1 304 Not Modified
Server: nginx/1.0.15
Date: Thu, 27 Feb 2014 10:07:39 GMT
Connection: keep-alive
Etag: 7467675324d0f7a3e01ce5151848fedb
			

This achieves the expected result. The same approach also works for Last-Modified, and it works even better behind a pseudo-static URL
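For comparison, here is a slightly fuller sketch of the same idea in Python. It is not a port of the PHP program above: the helper names etag_matches and respond are invented, and the sketch quotes the tag because HTTP defines entity tags as quoted strings. It also accepts a comma-separated list of tags, the * wildcard, and weak (W/) tags, which a client is allowed to send in If-None-Match.

```python
def etag_matches(if_none_match, current_etag):
    """Weak comparison of an If-None-Match header value against the current ETag."""
    if if_none_match is None:
        return False
    if if_none_match.strip() == "*":
        return True  # wildcard: matches any existing representation

    def opaque(tag):
        tag = tag.strip()
        if tag.startswith("W/"):  # weak comparison ignores the W/ prefix
            tag = tag[2:]
        return tag

    candidates = [opaque(tag) for tag in if_none_match.split(",")]
    return opaque(current_etag) in candidates


def respond(request_headers, etag, body):
    """Return (status, headers, body) for a GET on a resource with this ETag."""
    headers = {"ETag": etag}
    if etag_matches(request_headers.get("If-None-Match"), etag):
        return 304, headers, b""  # the client's copy is still valid
    return 200, headers, body


etag = '"7467675324d0f7a3e01ce5151848fedb"'
print(respond({}, etag, b"Hello")[0])
print(respond({"If-None-Match": etag}, etag, b"Hello")[0])
```

Splitting the header on commas is a simplification that assumes the tags themselves contain no commas.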

A trick for computing the Etag value: I usually derive it from the URL, combined with pseudo-static URLs, for example

$etag = $_SERVER['REQUEST_URI'];
			

With a URL like http://www.example.com/news/100/1000.html the page is cached after a single request, which raises the problem of updates, so I added the following

http://www.example.com/news/100/1000.1.html
			

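The .1. is a version number, incremented by 1 after every edit. It has no meaning of its own; the rewrite rule discards it, and it exists only so that changed content always gets a new URL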
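The mapping that the rewrite rule performs can be sketched as follows. The article does not show the actual rewrite rule, so the pattern here is an assumption, and strip_version is a hypothetical helper written in Python purely to illustrate the idea.

```python
import re

# Assumed URL shape: <path>.<version number>.html
VERSIONED = re.compile(r"^(?P<base>/.+)\.(?P<version>\d+)\.html$")


def strip_version(path):
    """Map /news/100/1000.1.html to /news/100/1000.html; leave other paths alone."""
    match = VERSIONED.match(path)
    if match is None:
        return path
    return match.group("base") + ".html"


print(strip_version("/news/100/1000.1.html"))
print(strip_version("/news/100/1000.html"))
```

Every published version gets a distinct URL, so browsers and caches treat it as a new resource, while the server resolves all versions to the same underlying page.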

4. Expires / Cache-Control

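Last-Modified and Etag, covered above, are mainly used to tell whether a file has changed. They cannot control how long a page stays cached in the browser. Expires / Cache-Control control the caching period.

Expires comes from the HTTP/1.0 standard and Cache-Control from HTTP/1.1. Both work. The HTTP/1.1 specification gives max-age precedence over Expires. Some servers set the two together: for example, when you configure Cache-Control an Expires header is generated along with it, purely for compatibility.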
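The precedence rule can be illustrated with a short Python sketch. The function name freshness_lifetime is made up for this example, and the logic is deliberately simplified: it ignores directives such as no-cache, no-store, and s-maxage, which a real cache must also honor.

```python
import re
from email.utils import parsedate_to_datetime


def freshness_lifetime(headers):
    """Seconds a response may be reused without revalidation, or None if unspecified."""
    cache_control = headers.get("Cache-Control", "")
    match = re.search(r"\bmax-age=(\d+)", cache_control)
    if match:
        return int(match.group(1))  # max-age takes precedence over Expires
    if "Expires" in headers and "Date" in headers:
        expires = parsedate_to_datetime(headers["Expires"])
        date = parsedate_to_datetime(headers["Date"])
        return max(0, int((expires - date).total_seconds()))
    return None


response = {
    "Date": "Thu, 27 Feb 2014 10:42:09 GMT",
    "Expires": "Fri, 28 Feb 2014 10:42:09 GMT",
    "Cache-Control": "max-age=3600",
}
print(freshness_lifetime(response))
```

In this sample Expires is one day after Date, but the function returns the max-age value of 3600 seconds because Cache-Control wins. With the Cache-Control entry removed, the result falls back to the difference between Expires and Date.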

4.1. Static files

First configure nginx to cache html and png files for 1 day

location ~ .*\.(html|png)$
{
    expires      1d;
}
			

Current state

# curl -I http://192.168.6.9/index.html
HTTP/1.1 200 OK
Server: nginx/1.0.15
Date: Thu, 27 Feb 2014 10:47:08 GMT
Content-Type: text/html
Content-Length: 6
Last-Modified: Thu, 27 Feb 2014 07:29:50 GMT
Connection: keep-alive
Accept-Ranges: bytes
			

After restarting Nginx, the HTTP response headers additionally contain Expires and Cache-Control

# curl -I http://192.168.6.9/index.html
HTTP/1.1 200 OK
Server: nginx/1.0.15
Date: Thu, 27 Feb 2014 10:42:09 GMT
Content-Type: text/html
Content-Length: 3698
Last-Modified: Fri, 26 Apr 2013 20:36:51 GMT
Connection: keep-alive
Expires: Fri, 28 Feb 2014 10:42:09 GMT
Cache-Control: max-age=86400
Accept-Ranges: bytes
			

4.2. Dynamic files

Default response

# curl -I http://192.168.6.9/index.php
HTTP/1.1 200 OK
Server: nginx/1.0.15
Date: Thu, 27 Feb 2014 11:45:05 GMT
Content-Type: text/html
Connection: keep-alive
			

Add Cache-Control output to index.php

			
header('Cache-Control: max-age=259200');
			
			

Check again

# curl -I http://192.168.6.9/index.php
HTTP/1.1 200 OK
Server: nginx/1.0.15
Date: Thu, 27 Feb 2014 11:53:48 GMT
Content-Type: text/html
Connection: keep-alive
Cache-Control: max-age=259200
			

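Now test with Chrome and Firefox: you will find the response is always 200, and the max-age=259200 value never changes.

The reason is that Cache-Control is emitted by the program and Nginx knows nothing about it, so Nginx will not return a 304 for you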

			
header('Last-Modified: ' .gmdate('D, d M Y H:i:s') . ' GMT' );

$offset = 60 * 60 * 24;
header('Expires: ' . gmdate('D, d M Y H:i:s', time() + $offset) . ' GMT');

$ttl=3600;
header("Cache-Control: max-age=$ttl, must-revalidate");
			
			

This approach does not achieve caching

5. FastCGI and caching

Let's try adding expires 1d; to location ~ \.php$ and see whether that achieves caching.

    location ~ \.php$ {
        root           /www;
        fastcgi_pass   127.0.0.1:9000;
        fastcgi_index  index.php;
        fastcgi_param  SCRIPT_FILENAME  /www$fastcgi_script_name;
        include        fastcgi_params;
		expires      1d;
    }
		

Test program

		
# cat expires.php
<?php
echo date("Y-m-d H:i:s");
?>
		
		

Test result

# curl -I http://localhost/expires.php
HTTP/1.1 200 OK
Server: nginx/1.4.5
Date: Fri, 28 Feb 2014 04:39:57 GMT
Content-Type: text/html
Connection: keep-alive
Expires: Sat, 01 Mar 2014 04:39:57 GMT
Cache-Control: max-age=86400
		

Although Cache-Control: max-age=86400 is sent, IE, Chrome, and Firefox still do not cache the page

6. HTML META and Cache

Create a test file as follows

		
<html>
<head>
	<title>Hello</title>
	<meta http-equiv="Cache-Control" content="max-age=7200" />
	<meta http-equiv="expires" content="Fri, 28 Feb 2014 12:04:18 GMT" />
</head>
<body>
	Helloworld
</body>
</html>
		
		

Test the HTML page

		
# curl -i http://localhost/test.html
HTTP/1.1 200 OK
Server: nginx/1.4.5
Date: Fri, 28 Feb 2014 03:30:45 GMT
Content-Type: text/html
Transfer-Encoding: chunked
Connection: keep-alive

<html>
<head>
	<title>Hello</title>
	<meta http-equiv="Cache-Control" content="max-age=7200" />
	<meta http-equiv="expires" content="Fri, 28 Feb 2014 12:04:18 GMT" />
</head>
<body>
	Helloworld
</body>
</html>
		
		

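We can see that cache settings in HTML meta tags have no effect on Nginx. Many people will say that they do affect the browser!

This time I tested IE11, Chrome, and Firefox and found that none of them cached the page. It may still work on something like IE5, but I have no environment to test that. Ten years ago we used the following all the time in B/S development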

		
<meta http-equiv="cache-control" content="max-age=0" />
<meta http-equiv="cache-control" content="no-cache" />
<meta http-equiv="expires" content="0" />
<meta http-equiv="expires" content="Tue, 01 Jan 1980 1:00:00 GMT" />
<meta http-equiv="pragma" content="no-cache" />
		
		

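At least back then IE honored these meta tags. A lot has changed in the HTML5 era, so one cannot generalize

7. gzip

deflate is the standard in Apache httpd; here we only cover gzip

First create a gzip.html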

# curl -I http://localhost/gzip.html
HTTP/1.1 200 OK
Server: nginx/1.4.5
Date: Mon, 03 Mar 2014 01:49:45 GMT
Content-Type: text/html
Content-Length: 19644
Last-Modified: Mon, 03 Mar 2014 01:49:02 GMT
Connection: keep-alive
ETag: "5313df8e-4cbc"
Accept-Ranges: bytes
		

Enable gzip on;

server {
    listen       80;
    server_name  localhost;

    #charset utf-8;
    #access_log  /var/log/nginx/log/host.access.log  main;
    #etag on;
    #ssi on;
    gzip on;
		

Now look at the result

# curl -I http://localhost/gzip.html
HTTP/1.1 200 OK
Server: nginx/1.4.5
Date: Mon, 03 Mar 2014 01:51:56 GMT
Content-Type: text/html
Content-Length: 19644
Last-Modified: Mon, 03 Mar 2014 01:49:02 GMT
Connection: keep-alive
ETag: "5313df8e-4cbc"
Accept-Ranges: bytes
		

Nothing is different. Now add the HTTP header Accept-Encoding: gzip,deflate and look again

		
# curl -H "Accept-Encoding: gzip,deflate" http://localhost/gzip.html
		
		

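If you see non-text content (garbage characters), it worked. The output is the gzip-compressed binary data, which we can decompress with gunzip

# curl -H "Accept-Encoding: gzip,deflate" http://localhost/gzip.html | gunzip
		

If the HTML is displayed normally, the compression is correct.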
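The negotiation tested above can be mimicked in a few lines of Python. This is a sketch of the general idea, not of nginx's gzip module: encode_body is a hypothetical helper, the sample page is invented, and quality values such as gzip;q=0 are ignored for brevity.

```python
import gzip


def encode_body(body, accept_encoding):
    """Return (headers, payload), compressing only if the client accepts gzip."""
    offered = [token.split(";")[0].strip().lower() for token in accept_encoding.split(",")]
    if "gzip" in offered:
        return {"Content-Encoding": "gzip"}, gzip.compress(body)
    return {}, body  # no Accept-Encoding: gzip, so send the page as-is


html = b"<html><body>" + b"Hello world " * 200 + b"</body></html>"

headers, payload = encode_body(html, "gzip,deflate")
print(headers, len(html), len(payload))
print(gzip.decompress(payload) == html)  # same role as piping curl into gunzip
```

Without the Accept-Encoding header the function returns the body untouched, which is why the plain curl -I request earlier showed no difference after gzip was switched on.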

7.1. gzip summary

With gzip on;, text/html is compressed by default and must not be listed again in gzip_types; otherwise nginx warns about a duplicate MIME type

Starting nginx: nginx: [warn] duplicate MIME type "text/html" in /etc/nginx/conf.d/localhost.conf:16
			

Advanced configuration for reference

    gzip  on;
    gzip_http_version 1.0;
    gzip_types        text/plain text/xml text/css application/xml application/xhtml+xml application/rss+xml application/atom+xml application/javascript application/x-javascript application/json;
    gzip_disable      "MSIE [1-6]\.";
    gzip_disable      "Mozilla/4";
    gzip_comp_level   6;
    gzip_proxied      any;
    gzip_vary         on;
    gzip_buffers      4 8k;
    gzip_min_length   1000;		
			

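8. Reverse proxy and caching

A reverse proxy server can cache in two ways:

Forced caching: set a cache time for specific files, file extensions, or URLs

Caching that follows the standard HTTP headers

The default configuration only proxies and does not cache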

server {
    listen       80;
    server_name  192.168.2.15;
    #access_log  /var/log/nginx/log/host.access.log  main;

	location / {
	  proxy_pass        http://localhost:80;
	  proxy_set_header  X-Real-IP  $remote_addr;
	}
}
 		

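A proxied request produces two log entries (here both access_log directives write to the same file; if they were configured separately, the entries would go to separate logs)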

192.168.2.15 - - [28/Feb/2014:18:09:33 +0800] "HEAD /modified.html HTTP/1.1" 200 0 "-" "curl/7.19.7 (x86_64-redhat-linux-gnu) libcurl/7.19.7 NSS/3.14.0.0 zlib/1.2.3 libidn/1.18 libssh2/1.4.2" "-"
127.0.0.1 - - [28/Feb/2014:18:09:33 +0800] "HEAD /modified.html HTTP/1.0" 200 0 "-" "curl/7.19.7 (x86_64-redhat-linux-gnu) libcurl/7.19.7 NSS/3.14.0.0 zlib/1.2.3 libidn/1.18 libssh2/1.4.2" "-"
		

Last-Modified and ETag are passed through

# curl -H "If-Modified-Since: Fri, 28 Feb 2014 12:04:18 GMT" -I http://192.168.2.15/modified.html
HTTP/1.1 304 Not Modified
Server: nginx/1.4.5
Date: Fri, 28 Feb 2014 10:17:30 GMT
Connection: keep-alive
Last-Modified: Fri, 28 Feb 2014 10:04:18 GMT
		

We can see that both log entries show 304

192.168.2.15 - - [28/Feb/2014:18:17:30 +0800] "HEAD /modified.html HTTP/1.1" 304 0 "-" "curl/7.19.7 (x86_64-redhat-linux-gnu) libcurl/7.19.7 NSS/3.14.0.0 zlib/1.2.3 libidn/1.18 libssh2/1.4.2" "-"
127.0.0.1 - - [28/Feb/2014:18:17:30 +0800] "HEAD /modified.html HTTP/1.0" 304 0 "-" "curl/7.19.7 (x86_64-redhat-linux-gnu) libcurl/7.19.7 NSS/3.14.0.0 zlib/1.2.3 libidn/1.18 libssh2/1.4.2" "-"
		

Now add caching to the reverse proxy

proxy_temp_path   /tmp/proxy_temp_dir;
proxy_cache_path  /tmp/proxy_cache_dir  levels=1:2   keys_zone=nginx_cache:200m inactive=3d max_size=30g;

server {
    listen       80;
    server_name  192.168.2.15;

	location / {
		proxy_cache nginx_cache;
		proxy_cache_key $host$uri$is_args$args;
		proxy_set_header  X-Real-IP  $remote_addr;
		proxy_set_header  X-Forwarded-For  $proxy_add_x_forwarded_for;
		proxy_cache_valid 200 10m;
		proxy_pass        http://localhost;
	}

	location ~ .*\.(php|jsp|cgi)?$
	{
	     proxy_set_header Host  $host;
	     proxy_set_header X-Forwarded-For  $remote_addr;
	     proxy_pass http://backend_server;
	}
}
		

# curl  -I http://192.168.2.15/index.html
HTTP/1.1 200 OK
Server: nginx/1.4.5
Date: Fri, 28 Feb 2014 10:57:35 GMT
Content-Type: text/html
Content-Length: 12
Connection: keep-alive
Last-Modified: Fri, 28 Feb 2014 06:54:45 GMT
ETag: "531032b5-c"
Expires: Sat, 01 Mar 2014 10:57:35 GMT
Cache-Control: max-age=86400
Accept-Ranges: bytes

# curl  -I http://192.168.2.15/index.html
HTTP/1.1 200 OK
Server: nginx/1.4.5
Date: Fri, 28 Feb 2014 10:57:41 GMT
Content-Type: text/html
Content-Length: 12
Connection: keep-alive
Last-Modified: Fri, 28 Feb 2014 06:54:45 GMT
ETag: "531032b5-c"
Expires: Sat, 01 Mar 2014 10:57:35 GMT
Cache-Control: max-age=86400
Accept-Ranges: bytes

# curl  -I http://192.168.2.15/index.html
HTTP/1.1 200 OK
Server: nginx/1.4.5
Date: Fri, 28 Feb 2014 10:57:46 GMT
Content-Type: text/html
Content-Length: 12
Connection: keep-alive
Last-Modified: Fri, 28 Feb 2014 06:54:45 GMT
ETag: "531032b5-c"
Expires: Sat, 01 Mar 2014 10:57:35 GMT
Cache-Control: max-age=86400
Accept-Ranges: bytes
		

The server was requested three times above

192.168.2.15 - - [28/Feb/2014:18:57:35 +0800] "HEAD /index.html HTTP/1.1" 200 0 "-" "curl/7.19.7 (x86_64-redhat-linux-gnu) libcurl/7.19.7 NSS/3.14.0.0 zlib/1.2.3 libidn/1.18 libssh2/1.4.2" "-"
127.0.0.1 - - [28/Feb/2014:18:57:35 +0800] "GET /index.html HTTP/1.0" 200 12 "-" "curl/7.19.7 (x86_64-redhat-linux-gnu) libcurl/7.19.7 NSS/3.14.0.0 zlib/1.2.3 libidn/1.18 libssh2/1.4.2" "192.168.2.15"
192.168.2.15 - - [28/Feb/2014:18:57:41 +0800] "HEAD /index.html HTTP/1.1" 200 0 "-" "curl/7.19.7 (x86_64-redhat-linux-gnu) libcurl/7.19.7 NSS/3.14.0.0 zlib/1.2.3 libidn/1.18 libssh2/1.4.2" "-"
192.168.2.15 - - [28/Feb/2014:18:57:46 +0800] "HEAD /index.html HTTP/1.1" 200 0 "-" "curl/7.19.7 (x86_64-redhat-linux-gnu) libcurl/7.19.7 NSS/3.14.0.0 zlib/1.2.3 libidn/1.18 libssh2/1.4.2" "-"
		

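The first request reaches 192.168.2.15 and is forwarded to 127.0.0.1, which returns HTTP/1.1 200 OK

The next two requests reach 192.168.2.15 but are not forwarded to 127.0.0.1; HTTP/1.1 200 OK is returned directly from the cache

Looking in the cache directory, we can see the cache file that was generated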

# find /tmp/proxy_*
/tmp/proxy_cache_dir
/tmp/proxy_cache_dir/1
/tmp/proxy_cache_dir/1/79
/tmp/proxy_cache_dir/1/79/b47a0009c531900de2a15ba80c0e3791
/tmp/proxy_temp_dir
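The directory layout in this listing follows from the levels=1:2 setting. nginx names each cache file after the MD5 hash of the cache key and takes the directory names from the end of that hash: the last character first, then the two characters before it. The Python sketch below reproduces the path; cache_file_path and hash_cache_key are hypothetical helper names, and I have not verified which exact key string produced the hash shown above.

```python
import hashlib
import posixpath


def cache_file_path(cache_root, key_hash, levels=(1, 2)):
    """Build the on-disk path for a cache entry from its hex MD5 name."""
    parts = []
    end = len(key_hash)
    for width in levels:
        parts.append(key_hash[end - width:end])  # take characters from the tail
        end -= width
    return posixpath.join(cache_root, *parts, key_hash)


def hash_cache_key(cache_key):
    """Hex MD5 of a cache key string (the value of proxy_cache_key)."""
    return hashlib.md5(cache_key.encode("utf-8")).hexdigest()


print(cache_file_path("/tmp/proxy_cache_dir", "b47a0009c531900de2a15ba80c0e3791"))
```

For the hash in the listing this prints /tmp/proxy_cache_dir/1/79/b47a0009c531900de2a15ba80c0e3791, the same path that find reported.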
		

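8.1. gzip handling

http://localhost/gzip.html supports compression, and 192.168.2.15 is configured with proxy_pass http://localhost

# curl -H "Accept-Encoding: gzip,deflate" http://localhost/gzip.html
			

Running this prints garbage characters

# curl -H "Accept-Encoding: gzip,deflate" http://192.168.2.15/gzip.html
			

Now try the request through the reverse proxy: you will find that gzip compression has no effect and plain HTML is output. Why is that? My understanding is that the reverse proxy does not know whether the server behind it supports gzip, so it always makes an ordinary HTML request. Now we enable gzip_vary on; so that every response carries a Vary: Accept-Encoding header.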

	gzip  on;
	gzip_vary on;
			

After reloading nginx, check for Vary: Accept-Encoding in the output

# curl -I http://localhost/gzip.html
HTTP/1.1 200 OK
Server: nginx/1.4.5
Date: Mon, 03 Mar 2014 02:09:16 GMT
Content-Type: text/html
Content-Length: 19644
Last-Modified: Mon, 03 Mar 2014 01:49:02 GMT
Connection: keep-alive
Vary: Accept-Encoding
ETag: "5313df8e-4cbc"
Accept-Ranges: bytes
			

The Vary: Accept-Encoding header is there; now test once more

			
# curl -H "Accept-Encoding: gzip" http://192.168.2.15/gzip.html
<html>
<head>
	<title>Hello</title>
			
			

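The test fails and the expected result does not appear. I searched the web and went through both Chinese and English material without finding a fix. (A likely cause, consistent with the HTTP/1.0 backend requests in the logs above: nginx proxies over HTTP/1.0 by default, while gzip_http_version defaults to 1.1, so the backend does not compress responses to proxied requests.)

In the end the only option was to let the reverse proxy compress the data again after fetching it, by enabling gzip on; in its configuration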

proxy_temp_path   /tmp/proxy_temp_dir;
proxy_cache_path  /tmp/proxy_cache_dir  levels=1:2   keys_zone=nginx_cache:200m inactive=3d max_size=30g;

server {
    listen       80;
    server_name  192.168.2.15;

	gzip on;
	
	location / {
		proxy_set_header X-Real-IP  $remote_addr;
		proxy_set_header X-Forwarded-For  $proxy_add_x_forwarded_for; 
		# proxy_set_header Accept-Encoding "gzip"; has no effect
		proxy_pass       http://localhost;
	}
}
			

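Nginx is more than adequate as a reverse proxy, but for a cache server you are better off with squid or varnish.

9. Caching special data

Caching is not limited to static content. Data other than HTML, CSS, JS, and images can be cached as well.

All it takes is handling the HTTP headers properly, for example to cache dynamic Ajax content or JSON data.

9.1. json

When a user requests a JSON URL, we attach HTTP headers (Cache-Control, Expires, ETag) to the JSON data and return it to the user. The user's device then follows what the HTTP headers declare and caches accordingly.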

curl -I http://api.example.com/article/json/2/20/0.html
HTTP/1.1 200 OK
Expires: Wed, 26 Aug 2015 05:40:57 GMT
Date: Wed, 26 Aug 2015 05:39:57 GMT
Server: nginx
Content-Type: application/json; charset=utf-8
Transfer-Encoding: chunked
Cache-Control: max-age=60
ETag: 4238111283
Age: 69475
X-Via: 1.1 kaifeng45:3 (Cdn Cache Server V2.0)
Connection: keep-alive
			

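Note that a pseudo-static URL, /article/json/2/20/0.html, is used here. The pseudo-static URL has nothing to do with caching; what actually does the work is the HTTP headers.

We can see the declaration Content-Type: application/json; charset=utf-8, which indicates that this is JSON data, not HTML.

Now let's demonstrate JSON being cached. First, note that http://api.example.com/article/json/2/20/0.html is not a file named 0.html but a program built with the phalcon framework: article is the controller class name, json is the jsonAction method, and 2/20/0 are the arguments passed to jsonAction.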

$ curl -I http://api.example.com/article/json/2/20/0.html
HTTP/1.1 200 OK
Expires: Thu, 27 Aug 2015 05:24:21 GMT
Date: Thu, 27 Aug 2015 05:23:21 GMT
Server: nginx/1.5.7
Content-Type: application/json; charset=utf-8
Transfer-Encoding: chunked
Cache-Control: max-age=60
ETag: 558918903
Age: 1
X-Via: 1.1 kaifeng45:3 (Cdn Cache Server V2.0)
Connection: keep-alive
			

The data from the first request above is cached. On the second request we send the If-None-Match HTTP header.

$ curl -H 'If-None-Match: 558918903' -I http://api.example.com/article/json/2/20/0.html
HTTP/1.0 304 Not Modified
Date: Thu, 27 Aug 2015 05:23:22 GMT
Content-Type: application/json; charset=utf-8
Expires: Thu, 27 Aug 2015 05:24:22 GMT
ETag: 558918903
Cache-Control: max-age=60
Age: 15
X-Via: 1.0 kaifeng45:3 (Cdn Cache Server V2.0)
Connection: keep-alive			
			

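The data is cached and the response is HTTP/1.0 304 Not Modified. The 304 code tells the client that the page or data has not changed and does not need to be downloaded again.

9.2. XML

This means dynamically generated XML. It is handled the same way as JSON: attach the HTTP headers (Cache-Control, Expires, ETag) to the XML data and return it to the user.

10. Summary

After detailed testing we found that different browsers, different web servers, and even different versions of the same software all behave differently.

Test summary: Apache HTTPD is the most complete, with Lighttpd second. Nginx is still developing rapidly; its versions differ a great deal and its implementation of the HTTP standard is not very strict. Because Nginx is the trend in mainland China, the examples in this article all use nginx

I think highly of Lighttpd. For FastCGI I generally use php-fpm in place of Lighttpd's spawn-fcgi

When using Nginx, be sure to watch for the subtle changes in each version, otherwise an upgrade will affect you. I usually install nginx with yum and upgrade with yum update at any time.

FastCGI and mod_php also differ from each other

Further reading: Netkiller Web 手札 (Netkiller Web Notes) http://netkiller.github.io/www/index.html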