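One-sentence definition
Elasticsearch is built on the open-source search library Lucene, is written in Java, and exposes a simple, easy-to-use RESTful API.
It scales out horizontally with little effort and supports applications that work with PB-scale data.
What it can be used for: data warehouse, data analytics engine, full-text search engine, and so on.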
Versions
1.X -> 2.X -> 5.X -> 6.X
Differences
Installation
Single-node installation
Download the archive and extract it; nothing else is needed.
With Elasticsearch running, open 127.0.0.1:9200, which returns:
```json
{
  "name" : "JPM5EYQ",
  "cluster_name" : "elasticsearch",
  "cluster_uuid" : "JG6IvmbgTmicNDFFBDDMCQ",
  "version" : {
    "number" : "6.3.1",
    "build_flavor" : "default",
    "build_type" : "tar",
    "build_hash" : "eb782d0",
    "build_date" : "2018-06-29T21:59:26.107521Z",
    "build_snapshot" : false,
    "lucene_version" : "7.3.1",
    "minimum_wire_compatibility_version" : "5.6.0",
    "minimum_index_compatibility_version" : "5.0.0"
  },
  "tagline" : "You Know, for Search"
}
```
Installing the head plugin
It provides a web interface for viewing results and performing operations.
head
```shell
npm install
npm run start
open http://localhost:9100/
```
This requires the Elasticsearch service to be running and cross-origin (CORS) access to be enabled.
config/elasticsearch.yml
```yaml
http.cors.enabled: true
http.cors.allow-origin: "*"
```
Cluster installation
Master node configuration: config/elasticsearch.yml
```yaml
cluster.name: jimo
node.name: master
node.master: true

network.host: 127.0.0.1
```
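For a slave node, open a new terminal and use a separate folder. The configuration is as follows; the main change is the port:
```yaml
cluster.name: jimo
node.name: slave1

network.host: 127.0.0.1

http.port: 8200
discovery.zen.ping.unicast.hosts: ["127.0.0.1"]
```
Note: the elasticsearch directories of the nodes must not be copied from one another.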
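Basic concepts
Index: a collection of documents that share the same properties (a book index, a vehicle index, etc.)
Type: an index can define one or more types, and every document must belong to one type (popular-science or literature books; trucks or cars)
Document: the basic unit of data that can be indexed (each book, each car)
Shards: every index is made up of multiple shards, and each shard is a Lucene index
Replica: a copy of a shard, which serves as that shard's backup
Cluster: a collection of nodes; each cluster has a name, and clusters are told apart by name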
Basic usage
RESTful API format
http://<ip>:<port>/<index>/<type>/<document id>
Operations: PUT/GET/POST/DELETE
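Basic operations
Creating an index
1. Create it with elasticsearch-head
The shard distribution is visible there. Reading down each column, the thin-bordered shards are replicas of the thick-bordered ones.
In the index info, look at the mappings entry: if it is empty ("mappings": { }) the index holds unstructured data; otherwise a structured mapping can be defined.
For example, the book index can be given a novel type with a title field.
2. Create it with an HTTP request. Postman is recommended because it makes writing JSON easier.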
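A request along the following lines would create the book index over HTTP. This is a minimal sketch that only builds and prints the request; the shard and replica counts, the field list and types, and the `json` helper are illustrative assumptions, and the mapping layout (a type name under `mappings`) is the Elasticsearch 6.x form.

```java
public class CreateBookIndex {
    // Hypothetical helper: joins the lines and turns ' into " so the JSON stays readable.
    static String json(String... lines) {
        return String.join("\n", lines).replace('\'', '"');
    }

    // Body for PUT /book: index settings plus a structured mapping for the type "novel".
    static String body() {
        return json(
            "{",
            "  'settings': { 'number_of_shards': 3, 'number_of_replicas': 1 },",
            "  'mappings': {",
            "    'novel': {",
            "      'properties': {",
            "        'title': { 'type': 'text' },",
            "        'author': { 'type': 'keyword' },",
            "        'word_count': { 'type': 'integer' },",
            "        'publish_date': { 'type': 'date' }",
            "      }",
            "    }",
            "  }",
            "}");
    }

    public static void main(String[] args) {
        // Assumed address: the single node from the installation section.
        System.out.println("PUT http://127.0.0.1:9200/book");
        System.out.println(body());
    }
}
```

The printed body can be pasted into Postman as the raw JSON payload of the PUT request.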
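Inserting
Inserting documents:
Insert with a specified id,
or insert with an auto-generated id (note the POST method and the URL without an id).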
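The two variants differ only in the HTTP method and the URL. The sketch below prints both requests for the book index and novel type; the document values and the `json` helper are illustrative assumptions, not data from the original post.

```java
public class IndexBookDocument {
    // Hypothetical helper: joins the lines and turns ' into " so the JSON stays readable.
    static String json(String... lines) {
        return String.join("\n", lines).replace('\'', '"');
    }

    // With an explicit id, PUT stores the document under that id.
    static String requestLineWithId(String id) {
        return "PUT /book/novel/" + id;
    }

    // Without an id, POST lets Elasticsearch generate one.
    static String requestLineAutoId() {
        return "POST /book/novel/";
    }

    // Illustrative document using the fields of the novel type.
    static String document() {
        return json(
            "{",
            "  'title': 'java',",
            "  'author': 'wangsan',",
            "  'word_count': 2000,",
            "  'publish_date': '2017-08-20'",
            "}");
    }

    public static void main(String[] args) {
        System.out.println(requestLineWithId("1"));
        System.out.println(document());
        System.out.println(requestLineAutoId());
        System.out.println(document());
    }
}
```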
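The result can then be viewed in head.
Updating
Update a document with a given id directly through the URL.
Update through a script. Supported script languages: the built-in one (Painless), plus JavaScript and Python in versions before 6.0, where they were available as plugins.
The built-in language is the one to use here; scripts can use parameters flexibly.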
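Both kinds of update go to the `_update` endpoint of a single document. The sketch below builds a partial-document update and a scripted update; it assumes the Elasticsearch 6.x request format (`source` as the script key, `painless` as the built-in language), and the field values and the `json` helper are illustrative.

```java
public class UpdateBookDocument {
    // Hypothetical helper: joins the lines and turns ' into " so the JSON stays readable.
    static String json(String... lines) {
        return String.join("\n", lines).replace('\'', '"');
    }

    // Both variants are sent to the _update endpoint of one document.
    static String requestLine(String id) {
        return "POST /book/novel/" + id + "/_update";
    }

    // Partial update: only the fields listed under "doc" are changed.
    static String docUpdate() {
        return json(
            "{",
            "  'doc': { 'title': 'java advanced' }",
            "}");
    }

    // Scripted update: the amount to add is passed in as a parameter.
    static String scriptUpdate(int extraWords) {
        return json(
            "{",
            "  'script': {",
            "    'lang': 'painless',",
            "    'source': 'ctx._source.word_count += params.count',",
            "    'params': { 'count': " + extraWords + " }",
            "  }",
            "}");
    }

    public static void main(String[] args) {
        System.out.println(requestLine("1"));
        System.out.println(docUpdate());
        System.out.println(requestLine("1"));
        System.out.println(scriptUpdate(100));
    }
}
```

Passing the value through `params` rather than writing it into the script text keeps the script itself unchanged between calls.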
Deleting
Deleting a document
Deleting an index
With head
Or from the command line.
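Deleting needs no request body; the URL decides whether one document or the whole index is removed. A small sketch that builds both request lines (the method names are hypothetical):

```java
public class DeleteRequests {
    // Deletes a single document, identified by index, type and id.
    static String deleteDocument(String index, String type, String id) {
        return "DELETE /" + index + "/" + type + "/" + id;
    }

    // Deletes the whole index, including every document in it.
    static String deleteIndex(String index) {
        return "DELETE /" + index;
    }

    public static void main(String[] args) {
        System.out.println(deleteDocument("book", "novel", "1"));
        System.out.println(deleteIndex("book"));
    }
}
```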
Querying
First insert some data:
```json
{ "title": "python之父", "author": "王麻子", "word_count": 1000, "publish_date": "2002-10-01" },
{ "title": "java", "author": "王三", "word_count": 2000, "publish_date": "2017-08-20" },
{ "title": "java入门", "author": "王四", "word_count": 5000, "publish_date": "2017-08-15" },
{ "title": "C++入门", "author": "王五", "word_count": 10000, "publish_date": "2000-09-20" },
{ "title": "java精通", "author": "李四", "word_count": 8000, "publish_date": "2010-09-20" },
{ "title": "java大法好", "author": "张三", "word_count": 3000, "publish_date": "2017-08-01" },
{ "title": "代码整洁之道", "author": "寂寞哥", "word_count": 5000, "publish_date": "1997-01-20" },
{ "title": "太极拳", "author": "赵牛", "word_count": 1000, "publish_date": "2005-08-20" }
```
- Simple queries
Query with GET.
Query all data with POST.
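As a sketch of these two simple queries: a GET on the document URL fetches one document by id, and a POST to `_search` with `match_all` returns everything. The id `1` and the `json` helper are assumptions for illustration.

```java
public class SimpleQueries {
    // Hypothetical helper: joins the lines and turns ' into " so the JSON stays readable.
    static String json(String... lines) {
        return String.join("\n", lines).replace('\'', '"');
    }

    // GET by id needs no body: the URL alone identifies the document.
    static String getById(String id) {
        return "GET /book/novel/" + id;
    }

    // match_all selects every document in the index.
    static String matchAllBody() {
        return json(
            "{",
            "  'query': { 'match_all': {} }",
            "}");
    }

    public static void main(String[] args) {
        System.out.println(getById("1"));
        System.out.println("POST /book/_search");
        System.out.println(matchAllBody());
    }
}
```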
- Conditional queries
Limit the number of results.
Query by a condition and sort by date in descending order.
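Both conditional queries are bodies for `POST /book/_search`. The sketch below assumes `from`/`size` for limiting results and a `match` on title with a `sort` on publish_date; the keyword and the helper names are illustrative.

```java
public class ConditionalQueries {
    // Hypothetical helper: joins the lines and turns ' into " so the JSON stays readable.
    static String json(String... lines) {
        return String.join("\n", lines).replace('\'', '"');
    }

    // Skip the first `from` hits and return at most `size` hits.
    static String paged(int from, int size) {
        return json(
            "{",
            "  'query': { 'match_all': {} },",
            "  'from': " + from + ",",
            "  'size': " + size,
            "}");
    }

    // Match on the title and return the newest publish_date first.
    // The keyword is inserted without escaping, so keep it to simple words.
    static String byTitleNewestFirst(String keyword) {
        return json(
            "{",
            "  'query': { 'match': { 'title': '" + keyword + "' } },",
            "  'sort': [ { 'publish_date': { 'order': 'desc' } } ]",
            "}");
    }

    public static void main(String[] args) {
        System.out.println("POST /book/_search");
        System.out.println(paged(1, 1));
        System.out.println("POST /book/_search");
        System.out.println(byTitleNewestFirst("java"));
    }
}
```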
- Aggregation queries
Aggregate by date and word count.
Compute statistics.
Or specify a single function directly.
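Aggregations go under the `aggs` key of a `POST /book/_search` body. In this sketch, `terms` is assumed for grouping, `stats` for the combined statistics, and `min` as an example of a single function; the aggregation names are made up for illustration.

```java
public class AggregationQueries {
    // Hypothetical helper: joins the lines and turns ' into " so the JSON stays readable.
    static String json(String... lines) {
        return String.join("\n", lines).replace('\'', '"');
    }

    // Two terms aggregations: bucket the documents by word_count and by publish_date.
    static String groupBy() {
        return json(
            "{",
            "  'aggs': {",
            "    'group_by_word_count': { 'terms': { 'field': 'word_count' } },",
            "    'group_by_publish_date': { 'terms': { 'field': 'publish_date' } }",
            "  }",
            "}");
    }

    // One metric aggregation on word_count: "stats" returns count, min, max, avg and sum,
    // while a single function such as "min" returns only that value.
    static String metric(String function) {
        return json(
            "{",
            "  'aggs': {",
            "    'word_count_" + function + "': { '" + function + "': { 'field': 'word_count' } }",
            "  }",
            "}");
    }

    public static void main(String[] args) {
        System.out.println(groupBy());
        System.out.println(metric("stats"));
        System.out.println(metric("min"));
    }
}
```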
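Advanced queries
query
query context:
Besides checking whether a document satisfies the query conditions, a query also computes a _score field that indicates how well the document matches. The score is a positive number, and a higher score means a better match.

Common queries:
1. Full-text queries: for text data
2. Field-level queries: for structured data such as dates and numbers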
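1. Full-text queries
Fuzzy query
Phrase query: with match, "java入门" would be split into separate terms (such as java and 入门, depending on the analyzer) that are matched independently
Query with multiple keywords
Syntax query: if fields is omitted, all fields are searched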
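The four full-text queries can be sketched as follows. The mapping from each description to `match`, `match_phrase`, `multi_match` and `query_string` is an assumption based on the Elasticsearch query DSL, and the `search` and `json` helpers are hypothetical.

```java
public class FullTextQueries {
    // Hypothetical helper: joins the lines and turns ' into " so the JSON stays readable.
    static String json(String... lines) {
        return String.join("\n", lines).replace('\'', '"');
    }

    // Wraps one query clause in a search body. Inputs are not escaped, so keep them simple.
    static String search(String clause) {
        return json("{", "  'query': " + clause, "}");
    }

    // match: the text is analyzed into terms and any of them may match.
    static String match(String field, String text) {
        return search("{ 'match': { '" + field + "': '" + text + "' } }");
    }

    // match_phrase: the terms must appear together, in order.
    static String matchPhrase(String field, String text) {
        return search("{ 'match_phrase': { '" + field + "': '" + text + "' } }");
    }

    // multi_match: the same text is searched in several fields.
    static String multiMatch(String text, String... fields) {
        return search("{ 'multi_match': { 'query': '" + text + "', 'fields': ['"
                + String.join("', '", fields) + "'] } }");
    }

    // query_string: a query expression with AND/OR; without "fields" all fields are searched.
    static String queryString(String expression) {
        return search("{ 'query_string': { 'query': '" + expression + "' } }");
    }

    public static void main(String[] args) {
        System.out.println(match("title", "java"));
        System.out.println(matchPhrase("title", "java"));
        System.out.println(multiMatch("java", "author", "title"));
        System.out.println(queryString("java OR python"));
    }
}
```

Calling `matchPhrase("title", "java入门")` would give the phrase query for the example title mentioned above.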
2. Field-level queries
Exact value
Range query: for numbers, dates, and the like
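A sketch of the field-level queries, assuming `term` for an exact value and `range` with `gte`/`lte` bounds; the concrete bounds below are illustrative, and `now` is Elasticsearch date math for the current time.

```java
public class FieldLevelQueries {
    // Hypothetical helper: joins the lines and turns ' into " so the JSON stays readable.
    static String json(String... lines) {
        return String.join("\n", lines).replace('\'', '"');
    }

    // term: the field must hold exactly this value.
    static String wordCountEquals(int wordCount) {
        return json(
            "{",
            "  'query': { 'term': { 'word_count': " + wordCount + " } }",
            "}");
    }

    // range on a number: both bounds are inclusive (gte / lte).
    static String wordCountBetween(int min, int max) {
        return json(
            "{",
            "  'query': { 'range': { 'word_count': { 'gte': " + min + ", 'lte': " + max + " } } }",
            "}");
    }

    // range on a date: bounds are date strings; "now" stands for the current time.
    static String publishedBetween(String from, String to) {
        return json(
            "{",
            "  'query': { 'range': { 'publish_date': { 'gte': '" + from + "', 'lte': '" + to + "' } } }",
            "}");
    }

    public static void main(String[] args) {
        System.out.println(wordCountEquals(1000));
        System.out.println(wordCountBetween(1000, 2000));
        System.out.println(publishedBetween("2017-01-01", "now"));
    }
}
```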
filter
filter context:
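In filter context a clause only decides whether a document matches; no _score is computed for it, and Elasticsearch can cache the result. A minimal sketch, assuming a `bool` query whose `filter` holds a `term` clause on word_count (the value and the helper are illustrative):

```java
public class FilterContextQuery {
    // Hypothetical helper: joins the lines and turns ' into " so the JSON stays readable.
    static String json(String... lines) {
        return String.join("\n", lines).replace('\'', '"');
    }

    // bool.filter runs its clause in filter context: a yes/no check without scoring.
    static String filterByWordCount(int wordCount) {
        return json(
            "{",
            "  'query': {",
            "    'bool': {",
            "      'filter': { 'term': { 'word_count': " + wordCount + " } }",
            "    }",
            "  }",
            "}");
    }

    public static void main(String[] args) {
        System.out.println("POST /book/_search");
        System.out.println(filterByWordCount(1000));
    }
}
```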
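Compound queries
These combine queries and filters.
Constant score query: the score is set through boost, and every result that passes the filter gets exactly that score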
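A constant score query could look like the sketch below, where every book whose title matches gets the score given by `boost`. The keyword, the boost value and the `json` helper are illustrative assumptions.

```java
public class ConstantScoreQuery {
    // Hypothetical helper: joins the lines and turns ' into " so the JSON stays readable.
    static String json(String... lines) {
        return String.join("\n", lines).replace('\'', '"');
    }

    // Every document that passes the filter receives the same score, equal to boost.
    static String body(String titleKeyword, int boost) {
        return json(
            "{",
            "  'query': {",
            "    'constant_score': {",
            "      'filter': { 'match': { 'title': '" + titleKeyword + "' } },",
            "      'boost': " + boost,
            "    }",
            "  }",
            "}");
    }

    public static void main(String[] args) {
        System.out.println("POST /book/_search");
        System.out.println(body("java", 2));
    }
}
```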
Bool query
must:
must_not:
should:
Adding a filter as well:
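The bool clauses can be combined in one request. In this sketch, `must` clauses have to match, `should` clauses only raise the score when they match, `must_not` clauses exclude documents, and `filter` narrows the result without affecting the score. The concrete field values are illustrative assumptions.

```java
public class BoolQuery {
    // Hypothetical helper: joins the lines and turns ' into " so the JSON stays readable.
    static String json(String... lines) {
        return String.join("\n", lines).replace('\'', '"');
    }

    // One bool query using all four clause types.
    static String body() {
        return json(
            "{",
            "  'query': {",
            "    'bool': {",
            "      'must': [ { 'match': { 'title': 'java' } } ],",
            "      'should': [ { 'match': { 'author': 'wangsan' } } ],",
            "      'must_not': [ { 'term': { 'word_count': 1000 } } ],",
            "      'filter': [ { 'range': { 'word_count': { 'gte': 2000 } } } ]",
            "    }",
            "  }",
            "}");
    }

    public static void main(String[] args) {
        System.out.println("POST /book/_search");
        System.out.println(body());
    }
}
```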
In practice
Integrating with spring-boot. Code repository
Environment: IntelliJ IDEA, JDK 1.8
- Create a spring-boot project
- pom.xml
```xml
<dependencies>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-web</artifactId>
    </dependency>

    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-devtools</artifactId>
        <scope>runtime</scope>
    </dependency>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-test</artifactId>
        <scope>test</scope>
    </dependency>

    <!-- https://mvnrepository.com/artifact/org.elasticsearch.client/transport -->
    <dependency>
        <groupId>org.elasticsearch.client</groupId>
        <artifactId>transport</artifactId>
        <version>6.3.1</version>
    </dependency>
    <dependency>
        <groupId>org.apache.logging.log4j</groupId>
        <artifactId>log4j-api</artifactId>
        <version>2.7</version>
    </dependency>
    <dependency>
        <groupId>org.apache.logging.log4j</groupId>
        <artifactId>log4j-core</artifactId>
        <version>2.7</version>
    </dependency>
</dependencies>
```
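- ESConfig configures the TransportClient:
```java
import java.net.InetAddress;
import java.net.UnknownHostException;

import org.elasticsearch.client.transport.TransportClient;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.common.transport.TransportAddress;
import org.elasticsearch.transport.client.PreBuiltTransportClient;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class ESConfig {

    @Bean
    public TransportClient client() throws UnknownHostException {
        // 9300 is the transport port. The 6.x client uses TransportAddress;
        // InetSocketTransportAddress belongs to the 5.x client.
        final TransportAddress nodeAddress =
                new TransportAddress(InetAddress.getByName("localhost"), 9300);

        // Must match cluster.name in config/elasticsearch.yml.
        final Settings settings = Settings.builder().put("cluster.name", "jimo").build();

        final PreBuiltTransportClient client = new PreBuiltTransportClient(settings);
        client.addTransportAddress(nodeAddress);

        return client;
    }
}
```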
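- CRUD operations
```java
import org.elasticsearch.client.transport.TransportClient;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
@RequestMapping("/book/novel")
public class BookNovelController {

    private final TransportClient client;

    @Autowired
    public BookNovelController(TransportClient client) {
        this.client = client;
    }
}
```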
Query:
Add:
```java
@PostMapping("/new")
public ResponseEntity<?> addBook(
        @RequestParam("title") String title,
        @RequestParam("author") String author,
        @RequestParam("word_count") int wordCount,
        @RequestParam("publish_date") @DateTimeFormat(pattern = "yyyy-MM-dd HH:mm:ss") Date publishDate) {
    try {
        // Note: publishDate is parsed from the request but not written to the document here.
        final XContentBuilder content = XContentFactory.jsonBuilder()
                .startObject()
                .field("title", title)
                .field("author", author)
                .field("word_count", wordCount)
                .endObject();
        // No id is given, so Elasticsearch generates one; it is returned to the caller.
        final IndexResponse result = client.prepareIndex("book", "novel")
                .setSource(content).get();
        return new ResponseEntity<>(result.getId(), HttpStatus.OK);
    } catch (IOException e) {
        e.printStackTrace();
        return new ResponseEntity<>(HttpStatus.INTERNAL_SERVER_ERROR);
    }
}
```
Delete:
```java
@DeleteMapping
public ResponseEntity<?> deleteBook(@RequestParam("id") String id) {
    // Delete the document with this id from index "book", type "novel".
    final DeleteResponse response = client.prepareDelete("book", "novel", id).get();
    return new ResponseEntity<>(response.getResult(), HttpStatus.OK);
}
```
Update:
```java
@PostMapping("/update")
public ResponseEntity<?> updateBook(
        @RequestParam("id") String id,
        @RequestParam(name = "title", required = false) String title,
        @RequestParam(name = "author", required = false) String author) {
    final UpdateRequest updateRequest = new UpdateRequest("book", "novel", id);
    try {
        // Build a partial document containing only the fields that were supplied.
        final XContentBuilder builder = XContentFactory.jsonBuilder().startObject();
        if (title != null) {
            builder.field("title", title);
        }
        if (author != null) {
            builder.field("author", author);
        }
        builder.endObject();
        updateRequest.doc(builder);
    } catch (IOException e) {
        e.printStackTrace();
        return new ResponseEntity<>(HttpStatus.INTERNAL_SERVER_ERROR);
    }
    try {
        final UpdateResponse updateResponse = client.update(updateRequest).get();
        return new ResponseEntity<>(updateResponse.getResult(), HttpStatus.OK);
    } catch (InterruptedException | ExecutionException e) {
        e.printStackTrace();
        return new ResponseEntity<>(HttpStatus.INTERNAL_SERVER_ERROR);
    }
}
```
Query:
```java
@PostMapping("/query")
public ResponseEntity<?> queryBook(
        @RequestParam(name = "author", required = false) String author,
        @RequestParam(name = "title", required = false) String title,
        @RequestParam(name = "gt_word_count", defaultValue = "0") Integer gtWordCount,
        @RequestParam(name = "lt_word_count", required = false) Integer ltWordCount) {
    // Optional match clauses on author and title.
    final BoolQueryBuilder boolQuery = QueryBuilders.boolQuery();
    if (author != null) {
        boolQuery.must(QueryBuilders.matchQuery("author", author));
    }
    if (title != null) {
        boolQuery.must(QueryBuilders.matchQuery("title", title));
    }
    // The word_count range is applied as a filter; the upper bound is optional.
    final RangeQueryBuilder rangeQuery = QueryBuilders.rangeQuery("word_count").from(gtWordCount);
    if (ltWordCount != null && ltWordCount >= gtWordCount) {
        rangeQuery.to(ltWordCount);
    }
    boolQuery.filter(rangeQuery);
    final SearchRequestBuilder builder = client.prepareSearch("book")
            .setTypes("novel")
            .setSearchType(SearchType.DFS_QUERY_THEN_FETCH)
            .setQuery(boolQuery)
            .setFrom(0)
            .setSize(10);

    // Print the generated request for debugging.
    System.out.println(builder);

    final SearchResponse response = builder.get();
    List<Map<String, Object>> result = new ArrayList<>();

    for (SearchHit hit : response.getHits()) {
        // The 6.x client exposes the source as getSourceAsMap(); getSource() is the 5.x name.
        result.add(hit.getSourceAsMap());
    }
    return new ResponseEntity<>(result, HttpStatus.OK);
}
```
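Problems encountered
Reference: blog
java.lang.ClassNotFoundException: org.elasticsearch.transport.Netty3Plugin
failed to parse [publish_date]
The format of the date field needs to be changed to dateOptionalTime. Note: a field's data type cannot be changed afterwards, so think it through when creating the index.
Summary
The official Java client connection example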