
Chapter 8. Elasticsearch

Contents

8.1. Installing Elasticsearch
8.1.1. Single-node mode (for development environments)
8.1.2. Elasticsearch Cluster
8.1.3. Load balancing
8.1.4. Installing a specific version of Elasticsearch
8.1.5. Plugin
8.1.5.1. elasticsearch-analysis-ik
8.1.5.2. elasticsearch-analysis-pinyin
8.2. Management
8.2.1. Listing indices
8.2.2. Node health status
8.2.3. Node HTTP status
8.2.4. Viewing the master node
8.2.5. Viewing how indices are distributed across nodes
8.2.6. Opening and closing indices
8.2.6.1. _open
8.2.6.2. _close
8.3. Document API
8.3.1. Quick start
8.3.2. Writing: PUT/POST
8.3.3. Retrieving: GET
8.3.3.1. _source
8.3.4. Checking whether a record exists
8.3.5. Delete
8.3.6. Parameters
8.3.6.1. pretty: formatting JSON
8.4. Search
8.4.1. URL search
8.4.2. Pagination
8.5. Query DSL
8.5.1. match
8.5.2. multi_match: matching multiple fields
8.5.3. Query bool: boolean conditions
8.5.3.1. must
8.5.3.2. should
8.5.3.3. must_not
8.5.4. filter
8.5.5. sort
8.5.6. _source
8.5.7. highlight
8.6. Managing Chinese tokenizer plugins
8.6.1. Installing a tokenizer plugin with the elasticsearch-plugin command
8.6.2. Installing a plugin manually
8.6.3. Creating an index
8.6.4. Deleting an index
8.6.5. Configuring the tokenizer plugin for an index
8.6.5.1. Testing the tokenizer
8.7. Mapping
8.7.1. Viewing _mapping
8.7.2. Deleting _mapping
8.7.3. Creating _mapping
8.7.4. Updating mapping
8.7.5. Modifying _mapping
8.7.6. Data types
8.7.6.1. date
8.8. Alias management
8.8.1. Viewing index aliases
8.8.2. Creating an index alias
8.8.3. Modifying an alias
8.8.4. Deleting an alias
8.9. Example
8.9.1. News application example
8.9.2. Article search example
8.9.2.1.
8.10. Migrating MySQL Data into Elasticsearch using logstash
8.10.1. Installing logstash
8.10.2. Configuring logstash
8.10.3. Starting Logstash
8.10.4. Verification
8.10.5. Configuration templates
8.10.5.1. Full import
8.10.5.2. Multi-table import
8.10.5.3. Incremental replication by ID primary key field
8.10.5.4. Incremental replication by date field
8.10.5.5. Specifying an SQL file
8.10.5.6. Passing parameters
8.10.5.7. Limiting the amount of data returned over JDBC
8.10.5.8. Output to different Elasticsearch instances
8.10.5.9. Date format conversion
8.10.5.10. example
8.10.6. Resolving data mismatch problems
8.10.7. Modifying the Mapping
8.11. Installing Elasticsearch 2.3
8.11.1. RPM installation
8.11.2. YUM installation
8.11.3. Testing the installation
8.11.4. Plugin management
8.11.4.1. Installing a plugin manually
8.11.4.2. The plugin command
8.11.4.3. Testing the plugin
8.12. FAQ
8.12.1. Plugin [analysis-ik] is incompatible with Elasticsearch [2.3.5]. Was designed for version [2.3.4]
8.12.2. plugin [analysis-ik] is incompatible with version [5.6.1]; was designed for version [5.5.2]
8.12.3. mapper_parsing_exception: failed to parse [ctime]
8.12.4. Configuring JAVA_HOME

http://www.elasticsearch.org/

8.1. Installing Elasticsearch

8.1.1. Single-node mode (for development environments)

Install Elasticsearch 5.6.0 with the Netkiller OSCM one-step install scripts.

# Java
curl -s https://raw.githubusercontent.com/oscm/shell/master/lang/java/openjdk/java-1.8.0-openjdk.sh | bash

# Install
curl -s https://raw.githubusercontent.com/oscm/shell/master/search/elasticsearch/elasticsearch-5.x.sh | bash

# Bind 0.0.0.0
curl -s https://raw.githubusercontent.com/oscm/shell/master/search/elasticsearch/network.bind_host.sh | bash

# Auto create index
curl -s https://raw.githubusercontent.com/oscm/shell/master/search/elasticsearch/action.auto_create_index.sh | bash

# elasticsearch-analysis-ik

curl -s https://raw.githubusercontent.com/oscm/shell/master/search/elasticsearch/5.5/elasticsearch-analysis-ik-5.6.0.sh | bash
			

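Releases of elasticsearch-analysis-ik usually lag one version behind Elasticsearch, so use the command below to check whether the versions match. If they do not, you can edit the plugin-descriptor.properties file so that they do.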

root@netkiller /usr/share/elasticsearch/plugins/ik % grep ^version plugin-descriptor.properties
version=5.5.1
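
The version line above is the plugin's own version. In 5.x plugin descriptors the value compared against the running Elasticsearch version is, as far as we can tell, the separate elasticsearch.version property in the same file. A minimal sketch for checking it, assuming that property name; plugin_es_version and plugin_matches_es are hypothetical helper names, not part of Elasticsearch:

```shell
# Hypothetical helpers. Assumption: the descriptor stores the target
# Elasticsearch version in a line of the form elasticsearch.version=X.Y.Z

# Print the elasticsearch.version value from a plugin descriptor file ($1).
plugin_es_version() {
    grep '^elasticsearch\.version=' "$1" | cut -d= -f2 | tr -d '[:space:]'
}

# Succeed only when the descriptor ($1) targets the given Elasticsearch version ($2).
plugin_matches_es() {
    [ "$(plugin_es_version "$1")" = "$2" ]
}
```

For example, plugin_matches_es /usr/share/elasticsearch/plugins/ik/plugin-descriptor.properties 5.6.0 || echo "version mismatch" reports a descriptor that targets a different release.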
			

After starting the service, use the jps command to check that the process is running.

root@netkiller /var/log/elasticsearch % jps | grep Elasticsearch
9706 Elasticsearch

root@netkiller /var/log/elasticsearch % ss -lnt | grep 9200
LISTEN     0      128    127.0.0.1:9200                     *:*
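
In the output above the listener is bound to 127.0.0.1, which means the node accepts connections from the local machine only. A small sketch for checking the bound address in a script; listen_addrs is a hypothetical helper that filters ss -lnt output read from standard input:

```shell
# Hypothetical helper: read `ss -lnt` output on stdin and print every
# local address that is listening on the given port ($1).
listen_addrs() {
    awk -v port="$1" '
        $1 == "LISTEN" {
            # The 4th column is Local Address:Port; the port follows the last colon
            n = split($4, parts, ":")
            if (parts[n] == port) print $4
        }'
}
```

Usage: ss -lnt | listen_addrs 9200. If only 127.0.0.1:9200 is printed, remote clients cannot reach the node.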
			

8.1.2. Elasticsearch Cluster

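Cluster mode needs two or more nodes, typically one master node and several data nodes.

First install Elasticsearch on every node, then edit the configuration file on each node. With 5.5.1 there is no need to configure which nodes are master nodes and which are data nodes.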

curl -s https://raw.githubusercontent.com/oscm/shell/master/search/elasticsearch/elasticsearch-5.x.sh | bash			
			

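Configuration file

cluster.name: elasticsearch-cluster # Cluster name; must be identical on every server

node.name: node-1 # Unique identifier for each node; this is the only value to change per node, incrementing in turn: node-1, node-2, node-3 ...

network.host: 0.0.0.0

discovery.zen.ping.unicast.hosts: ["172.16.0.20", "172.16.0.21","172.16.0.22"]  # IP addresses of all nodes

discovery.zen.minimum_master_nodes: 2 # Quorum of master-eligible nodes, (N / 2) + 1; for 3 nodes this is 2, so the cluster can still elect a master after losing one node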

http.cors.enabled: true
http.cors.allow-origin: "*"
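
The Elasticsearch documentation for zen discovery recommends setting discovery.zen.minimum_master_nodes to a quorum of the master-eligible nodes, (N / 2) + 1, rather than to N: a quorum guards against split brain while still allowing a master to be elected after a node is lost. A minimal sketch of that arithmetic; quorum is a hypothetical helper name:

```shell
# Hypothetical helper: quorum of master-eligible nodes, floor(N / 2) + 1.
# $1 = number of master-eligible nodes in the cluster
quorum() {
    echo $(( $1 / 2 + 1 ))
}
```

For the three hosts listed in discovery.zen.ping.unicast.hosts, quorum 3 prints 2.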
			

To view the node status, use curl: curl 'http://localhost:9200/_nodes/process?pretty'

root@netkiller /var/log/elasticsearch % curl 'http://localhost:9200/_nodes/process?pretty'
{
  "_nodes" : {
    "total" : 2,
    "successful" : 2,
    "failed" : 0
  },
  "cluster_name" : "my-application",
  "nodes" : {
    "-lnKCmBXRpiwExLns0jc9g" : {
      "name" : "node-1",
      "transport_address" : "10.104.3.2:9300",
      "host" : "10.104.3.2",
      "ip" : "10.104.3.2",
      "version" : "5.5.1",
      "build_hash" : "19c13d0",
      "roles" : [
        "master",
        "data",
        "ingest"
      ],
      "process" : {
        "refresh_interval_in_millis" : 1000,
        "id" : 23669,
        "mlockall" : false
      }
    },
    "WVsgYi2HT8GWnZU1kUwFwA" : {
      "name" : "node-2",
      "transport_address" : "10.186.7.221:9300",
      "host" : "10.186.7.221",
      "ip" : "10.186.7.221",
      "version" : "5.5.1",
      "build_hash" : "19c13d0",
      "roles" : [
        "master",
        "data",
        "ingest"
      ],
      "process" : {
        "refresh_interval_in_millis" : 1000,
        "id" : 12641,
        "mlockall" : false
      }
    }
  }
}
			

After a node starts, it writes a log file named after cluster.name.

Whichever node starts first becomes the master.

[2017-08-11T17:42:46,018][INFO ][o.e.c.s.ClusterService   ] [node-1] new_master {node-1}{-lnKCmBXRpiwExLns0jc9g}{rZcJDIynSzq2Td3yP2kN5A}{10.104.3.2}{10.104.3.2:9300}, added {{node-2}{WVsgYi2HT8GWnZU1kUwFwA}{X13ShUpAQa2zA1Mgcsm3bQ}{10.186.7.221}{10.186.7.221:9300},}, reason: zen-disco-elected-as-master ([1] nodes joined)[{node-2}{WVsgYi2HT8GWnZU1kUwFwA}{X13ShUpAQa2zA1Mgcsm3bQ}{10.186.7.221}{10.186.7.221:9300}]			
			

If the master fails, another node takes over.

[2017-08-11T17:44:52,797][INFO ][o.e.c.s.ClusterService   ] [node-2] master {new {node-2}{WVsgYi2HT8GWnZU1kUwFwA}{vl8kQx8sQdGVVohrNQnZOQ}{10.186.7.221}{10.186.7.221:9300}}, removed {{node-1}{-lnKCmBXRpiwExLns0jc9g}{rZcJDIynSzq2Td3yP2kN5A}{10.104.3.2}{10.104.3.2:9300},}, added {{node-1}{-lnKCmBXRpiwExLns0jc9g}{odnoG9kpQpeX1ltx5KYTSw}{10.104.3.2}{10.104.3.2:9300},}, reason: zen-disco-elected-as-master ([1] nodes joined)[{node-1}{-lnKCmBXRpiwExLns0jc9g}{odnoG9kpQpeX1ltx5KYTSw}{10.104.3.2}{10.104.3.2:9300}]
[2017-08-11T17:44:53,184][INFO ][o.e.c.r.DelayedAllocationService] [node-2] scheduling reroute for delayed shards in [59.5s] (11 delayed shards)
[2017-08-11T17:44:53,929][INFO ][o.e.c.r.a.AllocationService] [node-2] Cluster health status changed from [RED] to [YELLOW] (reason: [shards started [[information][0]] ...]).		
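
The last log line shows the cluster health moving from RED to YELLOW after the failover. Rather than watching the log, a script can ask the cluster health API to block until a given status is reached. A sketch under stated assumptions: health_url and wait_for_health are hypothetical helper names, and the default host, status, and timeout are illustrative only.

```shell
# Hypothetical helper: build the cluster-health URL that blocks until the
# wanted status is reached or the timeout expires.
# $1 = base URL, $2 = status (green|yellow|red), $3 = timeout such as 30s
health_url() {
    printf '%s/_cluster/health?wait_for_status=%s&timeout=%s\n' "$1" "$2" "$3"
}

# Hypothetical helper: call the URL with curl. The defaults are assumptions
# chosen for illustration, not values taken from the text above.
wait_for_health() {
    curl -s "$(health_url "${1:-http://localhost:9200}" "${2:-yellow}" "${3:-30s}")"
}
```

Usage: wait_for_health http://localhost:9200 yellow 60s. The JSON response carries a timed_out field that tells whether the status was reached in time.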
			

When the former master comes back online, it logs the following:

[2017-08-11T17:44:52,855][INFO ][o.e.c.s.ClusterService   ] [node-1] detected_master {node-2}{WVsgYi2HT8GWnZU1kUwFwA}{vl8kQx8sQdGVVohrNQnZOQ}{10.186.7.221}{10.186.7.221:9300}, added {{node-2}{WVsgYi2HT8GWnZU1kUwFwA}{vl8kQx8sQdGVVohrNQnZOQ}{10.186.7.221}{10.186.7.221:9300},}, reason: zen-disco-receive(from master [master {node-2}{WVsgYi2HT8GWnZU1kUwFwA}{vl8kQx8sQdGVVohrNQnZOQ}{10.186.7.221}{10.186.7.221:9300} committed version [44]])
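
The election events in the log excerpts above can be pulled out of a log file with a short filter. A sketch that assumes the 5.x log line layout shown above ([timestamp][level][logger] [node] message); master_events is a hypothetical helper name:

```shell
# Hypothetical helper: read an Elasticsearch log on stdin and print
# "timestamp node event" for each master change reported by ClusterService.
# Assumes the line layout [timestamp][level][logger] [node] message.
master_events() {
    sed -n -E 's/^\[([^]]*)\]\[[^]]*\]\[[^]]*\] \[([^]]*)\] (new_master|detected_master|master) .*/\1 \2 \3/p'
}
```

Usage: master_events < /var/log/elasticsearch/elasticsearch-cluster.log, where the file name follows the cluster.name value.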
			

8.1.3. Load balancing

First install nginx; here this is done with the Netkiller OSCM one-step install script.

# curl -s https://raw.githubusercontent.com/oscm/shell/master/web/nginx/stable/nginx.sh | bash
			

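Elasticsearch has no built-in user authentication, so it is normally accessed only from the internal network. If you expose it to external clients, add user authentication.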

			
# printf "neo:$(openssl passwd -crypt s3cr3t)\n" > /etc/nginx/passwords
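
The same kind of password file entry can also be generated with the Apache MD5 (apr1) scheme, which nginx accepts in auth_basic_user_file and which may be the better choice where the installed OpenSSL no longer offers -crypt (some newer releases have dropped it). A minimal sketch; htpasswd_line is a hypothetical helper name:

```shell
# Hypothetical helper: print one nginx password-file entry for user $1
# with password $2, hashed with the apr1 scheme via openssl passwd -apr1.
htpasswd_line() {
    printf '%s:%s\n' "$1" "$(openssl passwd -apr1 "$2")"
}
```

Usage: htpasswd_line neo s3cr3t > /etc/nginx/passwords. The salt is random, so the hash differs on every run.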
			
			

Create the nginx configuration file /etc/nginx/conf.d/elasticsearch.conf

upstream elasticsearch {
	server 172.16.0.10:9200;
	server 172.16.0.20:9200;
	server 172.16.0.30:9200;

	keepalive 15;
}

server {
	listen 9200;
	server_name so.netkiller.cn;

	charset utf-8;
	access_log /var/log/nginx/so.netkiller.cn.access.log;
	error_log /var/log/nginx/so.netkiller.cn.error.log;

	auth_basic "Protected Elasticsearch";
	auth_basic_user_file passwords;

	location ~* ^(/_cluster|/_nodes) {
		return 403;
	}
	location ~* _(open|close) {
		return 403;
	}
	location / {

		if ($request_filename ~ _shutdown) {
			return 403;
		}

		if ($request_method !~ ^(GET|HEAD|POST)$) {
			return 403;
		}

		proxy_pass http://elasticsearch;
		proxy_http_version 1.1;
		proxy_set_header Connection "Keep-Alive";
		proxy_set_header Proxy-Connection "Keep-Alive";
	}

}
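
The method rule in the configuration above can be checked from a client once nginx is running. A sketch under stated assumptions: method_allowed restates the GET|HEAD|POST rule locally, probe_method asks the proxy itself, both names are hypothetical, and the URL and credentials are whatever the reader has configured.

```shell
# Hypothetical helper: local restatement of the proxy's method rule,
# under which only GET, HEAD, and POST are forwarded to Elasticsearch.
method_allowed() {
    case "$1" in
        GET|HEAD|POST) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical helper: print the HTTP status the proxy returns.
# $1 = method, $2 = URL, $3 = user:password for basic auth
# (HEAD is better probed with curl -I, because -X HEAD makes curl wait for a body.)
probe_method() {
    curl -s -o /dev/null --max-time 10 -w '%{http_code}\n' -X "$1" -u "$3" "$2"
}
```

Usage: for m in GET POST PUT DELETE; do probe_method "$m" http://so.netkiller.cn:9200/ neo:s3cr3t; done. If the rule is active, the methods for which method_allowed fails should come back as 403.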
			

Send the request below repeatedly; total_opened will eventually reach the number of keepalive connections configured in nginx.

$ curl 'http://test:test@localhost:9200/_nodes/stats/http?pretty' | grep total_opened
# "total_opened" : 15			
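
To compare the counter with the keepalive setting in a script, the value can be extracted from the response instead of being read by eye. A small sketch; total_opened is a hypothetical helper that reads the pretty-printed _nodes/stats/http response on standard input:

```shell
# Hypothetical helper: print the total_opened value of every node found
# in a _nodes/stats/http JSON response read from stdin, one per line.
total_opened() {
    grep -o '"total_opened" *: *[0-9][0-9]*' | grep -o '[0-9][0-9]*$'
}
```

Usage: curl -s 'http://test:test@localhost:9200/_nodes/stats/http?pretty' | total_opened.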
			

The example above suits the vast majority of scenarios.

Example 8.1. Elasticsearch master / slave

				
upstream elasticsearch {
	server 172.16.0.10:9200;
	server 172.16.0.20:9200 backup;

	keepalive 15;
}

server {
	listen 9200;
	server_name so.netkiller.cn;
	
	auth_basic "Protected Elasticsearch";
	auth_basic_user_file passwords;

	location ~* ^(/_cluster|/_nodes) {
		return 403;
		break;
	} 

	location / {

		if ($request_filename ~ _shutdown) {
			return 403;
		}
		# Forward read-style methods only; everything else, including DELETE, is rejected
		if ($request_method !~ ^(GET|HEAD|POST)$) {
			return 403;
		}

		proxy_pass http://elasticsearch;
		proxy_http_version 1.1;
		proxy_set_header Connection "Keep-Alive";
		proxy_set_header Proxy-Connection "Keep-Alive";
	}

}
				
				

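limit_except restricts every HTTP method other than the ones it lists, so it can be used to control access to operations such as PUT and DELETE. It is valid only inside a location block, and a location can contain only one limit_except.

			
# GET, HEAD, and POST stay open to everyone;
# all other methods (PUT, DELETE, ...) are allowed only from 192.168.1.1
limit_except GET HEAD POST {
	allow 192.168.1.1;
	deny all;
}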
			
			

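8.1.4. Installing a specific version of Elasticsearch

yum installs the latest version by default, and a common problem is that elasticsearch-analysis-ik releases lag behind Elasticsearch. If you install Elasticsearch with yum, the elasticsearch-analysis-ik plugin may not support that version yet. For some plugin releases you can edit the version number in the plugin's configuration file to match the Elasticsearch version, which tricks Elasticsearch into skipping the version-mismatch error.

The better solution is to look up a compatible version on the elasticsearch-analysis-ik GitHub page and install the Elasticsearch version that the plugin requires.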

Versions

IK version	ES version
master	5.x -> master
5.6.0	5.6.0
5.5.3	5.5.3
5.4.3	5.4.3
5.3.3	5.3.3
5.2.2	5.2.2
5.1.2	5.1.2
1.10.1	2.4.1
1.9.5	2.3.5
1.8.1	2.2.1
1.7.0	2.1.1
1.5.0	2.0.0
1.2.6	1.0.0
1.2.5	0.90.x
1.1.3	0.20.x
1.0.0	0.16.2 -> 0.19.0			
			

The latest Elasticsearch release is 5.6.1, but the elasticsearch-analysis-ik tokenizer plugin supports Elasticsearch only up to 5.6.0.

root@netkiller /var/log % yum --showduplicates list elasticsearch | expand | tail
Repository epel is listed more than once in the configuration  
elasticsearch.noarch                 5.5.3-1                  elasticsearch-5.x     
elasticsearch.noarch                 5.6.0-1                  elasticsearch-5.x   
elasticsearch.noarch                 5.6.1-1                  elasticsearch-5.x 
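
Picking the version to install can be scripted: from the versions yum offers, take the newest one that does not exceed the newest release the plugin supports. A sketch, assuming a sort that understands -V (version ordering, as in GNU coreutils); newest_supported is a hypothetical helper name:

```shell
# Hypothetical helper: read candidate versions on stdin (one per line) and
# print the newest one that is not newer than the supported maximum ($1).
# Assumes `sort -V` (version sort) is available.
newest_supported() {
    max="$1"
    while read -r v; do
        # v <= max exactly when max sorts last (or ties) in version order
        if [ "$(printf '%s\n%s\n' "$v" "$max" | sort -V | tail -n1)" = "$max" ]; then
            echo "$v"
        fi
    done | sort -V | tail -n1
}
```

Usage: printf '5.5.3\n5.6.0\n5.6.1\n' | newest_supported 5.6.0 prints 5.6.0, which is the version installed in the next step.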
			

Install 5.6.0

# yum install elasticsearch-5.6.0-1

Loaded plugins: fastestmirror, langpacks
Repository epel is listed more than once in the configuration
Loading mirror speeds from cached hostfile
Resolving Dependencies
--> Running transaction check
---> Package elasticsearch.noarch 0:5.6.0-1 will be installed
--> Finished Dependency Resolution

Dependencies Resolved

==========================================================================================================================================================================================================
 Package                                            Arch                                        Version                                      Repository                                              Size
==========================================================================================================================================================================================================
Installing:
 elasticsearch                                      noarch                                      5.6.0-1                                      elasticsearch-5.x                                       32 M

Transaction Summary
==========================================================================================================================================================================================================
Install  1 Package

Total download size: 32 M
Installed size: 36 M
Is this ok [y/d/N]: y
			

8.1.5. Plugin

Elasticsearch provides the elasticsearch-plugin command for managing plugins.

root@netkiller ~ % /usr/share/elasticsearch/bin/elasticsearch-plugin -h
A tool for managing installed elasticsearch plugins

Commands
--------
list - Lists installed elasticsearch plugins
install - Install a plugin
remove - removes a plugin from Elasticsearch

Non-option arguments:
command              

Option         Description        
------         -----------        
-h, --help     show help          
-s, --silent   show minimal output
-v, --verbose  show verbose output			
			

8.1.5.1. elasticsearch-analysis-ik

Install the plugin

root@netkiller ~ % /usr/share/elasticsearch/bin/elasticsearch-plugin install https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v5.5.1/elasticsearch-analysis-ik-5.5.1.zip
-> Downloading https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v5.5.1/elasticsearch-analysis-ik-5.5.1.zip
[=================================================] 100%   
-> Installed analysis-ik
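
The download URL in the transcript above contains the plugin version twice. A small sketch that builds it from a version number, assuming other releases follow the same naming pattern; ik_url is a hypothetical helper name:

```shell
# Hypothetical helper: build the release URL for a given
# elasticsearch-analysis-ik version ($1), following the pattern of the
# 5.5.1 URL used above. Other releases are assumed to use the same pattern.
ik_url() {
    printf 'https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v%s/elasticsearch-analysis-ik-%s.zip\n' "$1" "$1"
}
```

Usage: /usr/share/elasticsearch/bin/elasticsearch-plugin install "$(ik_url 5.6.0)".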
				
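Then set the ik_max_word analyzer on the content field in the mapping:

curl -XPOST http://localhost:9200/index/fulltext/_mapping -H 'Content-Type: application/json' -d'
{
    "properties": {
        "content": {
            "type": "text",
            "analyzer": "ik_max_word",
            "search_analyzer": "ik_max_word"
        }
    }
}'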
				

8.1.5.2. elasticsearch-analysis-pinyin

https://github.com/medcl/elasticsearch-analysis-pinyin