ElasticSearch7.4.2安装、使用以及与SpringBoot…

2020-05-27 16:07:33来源:博客园 阅读 ()

新老客户大回馈,云服务器低至5折

ElasticSearch7.4.2安装、使用以及与SpringBoot的整合

目录

  • ElasticSearch
    • 安装
    • 运行
    • 初步检索
      • 新增文档
      • 查询文档
      • 更新文档
      • 删除文档/索引
      • 批量操作
    • 进阶检索
      • Search API
      • Query DSL
        • match
        • bool
        • filter
        • term
        • 字段.keyword以及match区分
    • Aggregations
    • Mapping
      • 创建映射关系
      • 查看映射信息
      • 修改映射信息
    • 分词
      • 自定义词库
        • 安装nginx
        • 创建自定义词库
    • 整合SpringBoot
      • 创建索引
      • 获取
      • 删除
      • 检索

ElasticSearch

安装

ElasticSearch

docker pull elasticsearch:7.4.2

Kibana用于可视化

docker pull kibana:

运行

准备es

 mkdir -p /mydata/elasticsearch/config #存放ES配置
 
 mkdir -p /mydata/elasticsearch/data #存放ES数据
 
 chmod -R 777 /mydata/elasticsearch #修改权限

#进入config目录
 echo "http.host: 0.0.0.0">>/mydata/elasticsearch/config/elasticsearch.yml #将可远程访问写入配置文件

运行es

# 9200是我们向es发请求的http端口,9300是es集群节点之间的通信端口
docker run --name es -p 9200:9200 -p 9300:9300 \
-e "discovery.type=single-node" \ #单节点设置
-e ES_JAVA_OPTS="-Xms64m -Xmx512m" \ #内存占用,此配置为开发测试
#### 挂载###
-v /mydata/elasticsearch/config/elasticsearch.yml:/usr/share/elasticsearch/config/elasticsearch.yml \	#配置
-v /mydata/elasticsearch/data:/usr/share/elasticsearch/data \ #数据
-v /mydata/elasticsearch/plugins:/usr/share/elasticsearch/plugins \ #插件
-d elasticsearch:7.4.2

运行kibana

docker run --name kibana -e ELASTICSEARCH_HOSTS=http://123.56.16.54:9200 -p 5601:5601 -d kibana:7.4.2

初步检索

  • GET请求:_cat/nodes
  • GET请求:_cat/health
  • GET请求:_cat/master 查看主节点
  • GET请求:_cat/indices 查看索引

image-20200526111343564

新增文档

  • PUT请求:index/type/id (不存在该id)新增数据,(存在该id)更新数据

    image-20200525215839777

    image-20200525220001992

  • POST请求:index/type/id (不带id,随机指定)新增数据,(带且存在该id)更新数据

查询文档

  • GET请求:index/type/id 查找

    image-20200525221108136

    乐观锁,插叙的时候带上?if_sq_no=1&if_primary_term=1

更新文档

  • POST请求:index/type/id/_update

    方法1:若是此次更新数据与上次一样,则不会增加版本号,提示noop无操作

    image-20200525222155384

image-20200525222230355

? 方法2:不带_update,同样的更新也能增加版本号,注意json数据写法不一样

image-20200525222357273

  • PUT请求:index/type/id
    • _update只能是post请求
    • 不会检查是否与上次更新操作一样

删除文档/索引

  • DELETE请求:index/type/id
  • DELETE请求:index

批量操作

两行一个数据,删除没有请求体,只有一行

image-20200525224545232

image-20200525232238066

进阶检索

Search API

#方法1 uri+检索参数
GET index01/_search?q=*&sort=_id:asc
#方法2 uri+请求体
GET index01/_search
{
  "query": {
    "match_all": {}
  },
  "sort": [
   { 
       "_id": {
       	   "order":  "asc" #按id升序 desc降序
       }
   },
   ...
  ]
}
#默认只显示10个信息

Query DSL

match
# 1. match_all 查询所有
GET /bank/_search
{
  "query": {
    "match_all": {}
  },
  "sort": [
   { 
       "balance": {
       	   "order":  "asc" #按id升序 desc降序
       }
   }
  ],
  "from": 5,  # 指定从哪里开始
  "size": 5,  # 指定一次查询多少个 
  "_source": ["balance","firstname"] # 指定返回哪些字段
}
# 2. match用于查找字段xxx,值为xxx的数据,
GET /bank/_search
{
  "query": {
    "match": {
    	"account_number": 20
    }
  }
}
# 也可以是模糊匹配,单词用空格隔开,按匹配度(匹配单词数量)降序 ”max_score“匹配最高的得分
GET /bank/_search
{
  "query": {
    "match": {
    	"address": "mill lane"
    }
  }
}

image-20200526113814433

# 3.match_phrase 短语完全匹配
GET /bank/_search
{
  "query": {
    "match_phrase": {
    	"address": "Lane mill"
    }
  }
}
# 4.multi_match多字段匹配 指定字段内包含query值越多越匹配
GET /bank/_search
{
  "query": {
    "multi_match": {
    	"query": "Lane Brogan",
    	"fields": ["address","city"]
    }
  }
}
bool
  • must
  • must_not
  • should:并非必要,只是匹配的话得分更高
# 以下条件筛选"gender"为"M","address"包含 "mill",但是age不为28,且最好last有Holland的数据
GET /bank/_search
{
  "query":{
    "bool": {
      "must": [
        {"match": {
          "gender": "M"
        }},
        {
          "match": {
            "address": "mill"
          }
        }
      ],
      "must_not": [
        {"match": {
          "age": "28"
        }}
      ],
      "should": [
        {"match": {
          "lastname": "Holland"
        }}
      ]
    }
  }
}
filter
GET /bank/_search
{
  "query":{
    "bool": {
      "filter": {  #不影响得分
        "range": {
          "age": {
            "gte": 10,
            "lte": 40
          }
        }
      }
    }
  }
}

image-20200526141148036

term

非text字段检索,match适合全文检索字段

GET /bank/_search
{
  "query": {
    "term": {
    	"balance": "32838"
    }
  }
}
字段.keyword以及match区分

必须完全匹配,空格也要,大小写也要

GET /bank/_search
{
  "query": {
    "match": {
    	"address.keyword": "789 Madison Street"
    }
  }
}
#下例中  match_phrase,只要存在该完整短语(完整单词A 完整单词B)即可(大小写无关,但是单词顺序不能乱)
GET /bank/_search
{
  "query": {
    "match_phrase": {
    	"address": "789 madison"
    }
  }
}
#下例中  match  只要存在当中任意一个单词即可,与单词顺序、拼写大小写无关,匹配越多单词得分越高
GET /bank/_search
{
  "query": {
    "match": {
    	"address": "789 madison"
    }
  }
}

# 注意:上述匹配均是需要匹配完整单词,每个单词长度一样才可匹配

Aggregations

GET bank/_search
{
  "query": {
    "match_all": {}
  },
 "aggs": {
   "myagg1": {
      "terms": {
      "field": "age"	#查看年龄分布
      }
   },
   "myaggAVG":{
      "avg": {
        "field": "balance" #查看平均余额
      }
   }
 },
 "size": 0
}

image-20200526151052472

GET bank/_search
{
  "query": {
    "match_all": {}
  },
 "aggs": {
   "myagg1": {
      "terms": {
        "field": "age"
      },
      "aggs": {
        "myaggAVG": {	# 子聚合 即查看每个年龄的分布清空以及每个年龄的平均余额
          "avg": {
            "field": "balance"
          }
        }
      }
   }
 },
 "size": 0 #不显示查出的结果,只显示聚合结果
}

image-20200526151215041

GET bank/_search
{
  "query": {
    "match_all": {}
  },
  "aggs": {
    "myagg": {
      "terms": {
        "field": "age",
        "size": 10
      },
      "aggs": {
        "myaggGender": {
          "terms": {
            "field": "gender.keyword",
            "size": 10
          },
          "aggs": {
            "myaggGenderAVG": {
              "avg": {
                "field": "balance"
              }
            }
          }
        },
        "myaggAllAVG":{
          "avg": {
            "field": "balance"
          }
        }
      }
    }
  }, 
 "size": 0
}
# 查看各个年龄段的男和女分别的平均余额以及这个年龄段的平均余额

image-20200526153051127

Mapping

创建映射关系

PUT /my_index
{
  "mappings": {
    "properties": {
      "age": {"type": "integer"},
      "email":{"type": "keyword"},
      "name": {"type": "text","index":true} #index默认true,默认可以被索引
    }
  }
}

image-20200526154811728

查看映射信息

GET请求:index/_mapping

image-20200526154846561

修改映射信息

添加新字段

PUT /my_index/_mapping
{
  "properties":{
    "address":{
      "type":"keyword",
      "index":false
    }
  }
}

注意:以及存在的映射信息不可修改

如果要修改,只能指定正确的映射关系,然后数据迁移

PUT /new_bank  #创建新的映射
{
  "mappings": {
    "properties": {
      "account_number": {
        "type": "long"
      },
      "address": {
        "type": "text"
      },
      "age": {
        "type": "integer"
      },
      "balance": {
        "type": "long"
      },
      "city": {
        "type": "keyword"
      },
      "email": {
        "type": "keyword"
      },
      "employer": {
        "type": "keyword"
      },
      "firstname": {
        "type": "text"
      },
      "gender": {
        "type": "keyword"
      },
      "lastname": {
        "type": "text",
        "fields": {
          "keyword": {
            "type": "keyword",
            "ignore_above": 256
          }
        }
      },
      "state": {
        "type": "keyword"
      }
    }
  }
}


POST _reindex	#数据迁移,不再用type
{
  "source": {
    "index": "bank",
    "type": "account" #原始数据拥有类型,6.0以后不用type
  },
  "dest": {
    "index": "new_bank" #不用type
  }
}

原始type是自己创建的适合定义的

image-20200526160831083

不用type之后所有的type为_doc

image-20200526160859214

分词

安装插件

下载elasticsearch-analysis-ik-7.4.2

解压后放到挂载的plugins目录,或者进入容器放进去

检查插件是否安装成功

docker exec -it es /bin/bash

cd bin

elasticsearch-plugin list # 列出所有插件
POST _analyze
{
  "analyzer": "ik_max_word",
  "text": ["我是中国人"]
}

image-20200526203324141

POST _analyze
{
  "analyzer": "ik_smart",
  "text": ["我是中国人"]
}

image-20200526203415739

自定义词库

安装nginx
# 随便启动一个nginx实例
docker run -p 80:80 --name nginx -d nginx:1.10

# 把容器内配置文件复制出来,(后面有个 空格+. )
docker container cp nginx:/etc/nginx .

# 停止、删除这个容器

# 把配置放到conf目录再转移到nginx目录下
mv nginx conf
mkdir nginx
mv conf nginx/

# 运行
docker run -p 80:80 --name nginx \
-v /mydata/nginx/html:/usr/share/nginx/html \
-v /mydata/nginx/logs:/var/log/nginx \
-v /mydata/nginx/conf:/etc/nginx \
-d nginx:1.10

#测试是否启动成功
cd html
vi index.html
#写入
<h1>hahahha~~~~~</h1>

#保存退出,访问80端口

创建自定义词库
  1. 在nginx的html目录下新建es文件夹新建txt文件,录入单词,保存访问ip:80/es/xxx.txt

  2. 进入plugins/elasticsearch-analysis-ik-7.4.2/config目录修改IKAnalyzer.cfg.xml文件

    在扩展字典那里写上各个的访问路径

    <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
    <properties>
            <comment>IK Analyzer 扩展配置</comment>
            <!--用户可以在这里配置自己的扩展字典 -->
            <entry key="ext_dict"></entry>
             <!--用户可以在这里配置自己的扩展停止词字典-->
            <entry key="ext_stopwords"></entry>
            <!--用户可以在这里配置远程扩展字典 -->
            <entry key="remote_ext_dict">http://你的ip/es/participle.txt</entry>
            <!--用户可以在这里配置远程扩展停止词字典-->
            <!-- <entry key="remote_ext_stopwords">words_location</entry> -->
    </properties>
    
  3. 重启

整合SpringBoot

Rest Client文档

  • 创建springboot项目,选择web不选择springboot整合的elasticsearch,因其版本没有更新到7.0之后
  • 导入依赖
<dependency>
    <groupId>org.elasticsearch.client</groupId>
    <artifactId>elasticsearch-rest-high-level-client</artifactId>
    <version>7.4.2</version>
</dependency>
  • 调整版本,springboot 2.2.7内置6.8.8,指定版本即可
<properties>
    <java.version>1.8</java.version>
    <elasticsearch.version>7.4.2</elasticsearch.version>
</properties>
  • 配置
@Configuration
public class ElasticSearchConfig {

    public  static final RequestOptions COMMON_OPTIONS;
    static {
        RequestOptions.Builder builder = RequestOptions.DEFAULT.toBuilder();
        /*builder.addHeader("Authorization", "Bearer " + TOKEN);
        builder.setHttpAsyncResponseConsumerFactory(
                new HttpAsyncResponseConsumerFactory
                        .HeapBufferedResponseConsumerFactory(30 * 1024 * 1024 * 1024));*/
        COMMON_OPTIONS = builder.build();
    }

    @Bean
    public RestHighLevelClient esRestCilent(){
        RestHighLevelClient client = new RestHighLevelClient(
                RestClient.builder(
                        new HttpHost("ip...", 9200, "http")));
        return client;
    }
}

创建索引

@Test
void index() throws IOException{
    //新建一个索引
    IndexRequest indexRequest = new IndexRequest("users");
    //指定id
    indexRequest.id("1");

    User user = new User();
    user.setUserName("小明");
    user.setGender("M");
    user.setAge(18);
    // 转为json字符
    String u = JSON.toJSONString(user);
    // 保存json
    indexRequest.source(u, XContentType.JSON);

    // 执行操作
    IndexResponse index = client.index(indexRequest, ElasticSearchConfig.COMMON_OPTIONS);

    System.out.println(index);
}

输出结果

IndexResponse[index=users,type=_doc,id=1,version=1,result=created,seqNo=0,primaryTerm=1,shards={"total":2,"successful":1,"failed":0}]

获取

@Test
	public void get() throws IOException{
		GetRequest getRequest = new GetRequest(
				"users",
				"1");
		GetResponse fields = client.get(getRequest, ElasticSearchConfig.COMMON_OPTIONS);
		System.out.println(fields);
	}

输出

{"_index":"users","_type":"_doc","_id":"1","_version":1,"_seq_no":0,"_primary_term":1,"found":true,"_source":{"age":18,"gender":"M","userName":"小明"}}

删除

@Test
	public void delete() throws IOException{
		DeleteRequest request = new DeleteRequest(
				"users",
				"1");
		DeleteResponse delete = client.delete(request, ElasticSearchConfig.COMMON_OPTIONS);
		System.out.println(delete);
	}

检索

json工具网站

点击此处

	@Test
	public void search() throws IOException{
		//创建请求
		SearchRequest searchRequest = new SearchRequest("new_bank");
		//封装条件
		SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
		//筛选address有mill单词的数据
		searchSourceBuilder.query(QueryBuilders.matchQuery("address","mill"));
		//按年龄分布聚合,
		TermsAggregationBuilder ageTerm = AggregationBuilders.terms("ageTerm").field("age");
		//求余额平均值
		AvgAggregationBuilder balanceAvg = AggregationBuilders.avg("balanceAvg").field("balance");

		//添加聚合
		searchSourceBuilder.aggregation(ageTerm);
		searchSourceBuilder.aggregation(balanceAvg);

		System.out.println("请求"+searchSourceBuilder);

		//保存
		searchRequest.source(searchSourceBuilder);

		//执行
		SearchResponse response = client.search(searchRequest, ElasticSearchConfig.COMMON_OPTIONS);

		System.out.println("检索返回信息"+response);

		//分析返回信息
		SearchHits hits = response.getHits();
		System.out.println("检索结果有几个:"+hits.getTotalHits());

		SearchHit[] searchHits = hits.getHits();
		for(SearchHit hit :searchHits){
			String string = hit.getSourceAsString();
			Account account = JSON.parseObject(string,Account.class);
			System.out.println("检索结果"+account);
		}

		//分析聚合结果
		Aggregations aggregations = response.getAggregations();

		Terms term = aggregations.get("ageTerm");
		for (Terms.Bucket bucket : term.getBuckets()) {
			String keyAsString = bucket.getKeyAsString();
			System.out.println("年龄为:"+keyAsString+"有:"+bucket.getDocCount()+"个");
		}

		Avg avg = aggregations.get("balanceAvg");
		System.out.println("求得余额平均值为:"+avg.getValue());

	}

输出的结果

请求
{
	"query": {
		"match": {
			"address": {
				"query": "mill",
				"operator": "OR",
				"prefix_length": 0,
				"max_expansions": 50,
				"fuzzy_transpositions": true,
				"lenient": false,
				"zero_terms_query": "NONE",
				"auto_generate_synonyms_phrase_query": true,
				"boost": 1.0
			}
		}
	},
	"aggregations": {
		"ageTerm": {
			"terms": {
				"field": "age",
				"size": 10,
				"min_doc_count": 1,
				"shard_min_doc_count": 0,
				"show_term_doc_count_error": false,
				"order": [{
					"_count": "desc"
				}, {
					"_key": "asc"
				}]
			}
		},
		"balanceAvg": {
			"avg": {
				"field": "balance"
			}
		}
	}
}
检索返回信息
{
	"took": 2,
	"timed_out": false,
	"_shards": {
		"total": 1,
		"successful": 1,
		"skipped": 0,
		"failed": 0
	},
	"hits": {
		"total": {
			"value": 4,
			"relation": "eq"
		},
		"max_score": 5.4032025,
		"hits": [{
			"_index": "new_bank",
			"_type": "_doc",
			"_id": "970",
			"_score": 5.4032025,
			"_source": {
				"account_number": 970,
				"balance": 19648,
				"firstname": "Forbes",
				"lastname": "Wallace",
				"age": 28,
				"gender": "M",
				"address": "990 Mill Road",
				"employer": "Pheast",
				"email": "forbeswallace@pheast.com",
				"city": "Lopezo",
				"state": "AK"
			}
		}, ...]
	},
	"aggregations": {
		"avg#balanceAvg": {
			"value": 25208.0
		},
		"lterms#ageTerm": {
			"doc_count_error_upper_bound": 0,
			"sum_other_doc_count": 0,
			"buckets": [{
				"key": 38,
				"doc_count": 2
			}, {
				"key": 28,
				"doc_count": 1
			}, {
				"key": 32,
				"doc_count": 1
			}]
		}
	}
}
检索结果有几个:4 hits
检索结果Account(account_number=970, balance=19648, firstname=Forbes, lastname=Wallace, age=28, gender=M, address=990 Mill Road, employer=Pheast, email=forbeswallace@pheast.com, city=Lopezo, state=AK)
检索结果Account(account_number=136, balance=45801, firstname=Winnie, lastname=Holland, age=38, gender=M, address=198 Mill Lane, employer=Neteria, email=winnieholland@neteria.com, city=Urie, state=IL)
检索结果Account(account_number=345, balance=9812, firstname=Parker, lastname=Hines, age=38, gender=M, address=715 Mill Avenue, employer=Baluba, email=parkerhines@baluba.com, city=Blackgum, state=KY)
检索结果Account(account_number=472, balance=25571, firstname=Lee, lastname=Long, age=32, gender=F, address=288 Mill Street, employer=Comverges, email=leelong@comverges.com, city=Movico, state=MT)
年龄为:38有:2个
年龄为:28有:1个
年龄为:32有:1个
求得余额平均值为:25208.0

用kibana测试

image-20200527164414965


原文链接:https://www.cnblogs.com/jklixin/p/12974270.html
如有疑问请与原作者联系

标签:

版权申明:本站文章部分自网络,如有侵权,请联系:west999com@outlook.com
特别注意:本站所有转载文章言论不代表本站观点,本站所提供的摄影照片,插画,设计作品,如需使用,请与原作者联系,版权归原作者所有

上一篇:在阿里一位新员工是怎么一步步培养起来的

下一篇:Dubbo 的心跳设计,值得学习!