Querying with elasticsearch-dsl

Z时代
2024-01-10
Category: General

coding

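Continuing from the previous post, this post uses Python's elasticsearch-dsl library to query Elasticsearch.

7. Queries

Elasticsearch is a powerful search engine, and the whole point of using it is to find the data you need quickly.

Query categories:

Basic queries: use the query types built into Elasticsearch

Compound queries: combine several queries into one

Filtering: narrow down the results with a filter clause while querying, without affecting the relevance score

7.1 Basic queries

Before querying, create an index with the following mapping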

PUT chaxun
{
  "mappings": {
    "job": {
      "properties": {
        "title": {
          "store": true,
          "type": "text",
          "analyzer": "ik_max_word"
        },
        "company_name": {
          "store": true,
          "type": "keyword"
        },
        "desc": {
          "type": "text"
        },
        "comments": {
          "type": "integer"
        },
        "add_time": {
          "type": "date",
          "format": "yyyy-MM-dd"
        }
      }
    }
  }
}


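match query

GET chaxun/job/_search
{
  "query": {
    "match": {
      "title": "python"
    }
  }
}

from elasticsearch_dsl import Search

# Assumes a default connection was already configured (see the previous post)
s = Search(index='chaxun').query('match', title='python')
response = s.execute()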

term query

A term query does not analyze (tokenize) the search value; the value is compared against the indexed tokens exactly as given.

GET chaxun/job/_search
{
  "query": {
    "term": {
      "title": "python爬虫"
    }
  }
}

s = Search(index='chaxun').query('term', title='python爬虫')
response = s.execute()
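
To make the difference between term and match concrete, here is a small in-memory sketch. It does not talk to Elasticsearch: the `analyze` function is a lowercase whitespace tokenizer standing in for `ik_max_word`, and the documents and helper names are hypothetical.

```python
def analyze(text):
    # Stand-in analyzer (assumption): lowercase, split on whitespace
    return text.lower().split()

def build_index(docs):
    # Inverted index: token -> set of document ids
    index = {}
    for doc_id, title in docs.items():
        for token in analyze(title):
            index.setdefault(token, set()).add(doc_id)
    return index

def term_query(index, value):
    # term: look the value up exactly as given, with no analysis
    return sorted(index.get(value, set()))

def match_query(index, text):
    # match: analyze the query text first, then OR the tokens together
    hits = set()
    for token in analyze(text):
        hits |= index.get(token, set())
    return sorted(hits)

docs = {1: "Python crawler engineer", 2: "Django system developer"}
index = build_index(docs)

print(term_query(index, "Python crawler"))   # no token equals the whole string
print(term_query(index, "python"))           # matches the indexed token
print(match_query(index, "Python crawler"))  # query text is analyzed first
```

Under these assumptions the first call finds nothing, because the index only holds individual lowercase tokens, while the other two calls find document 1.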

terms query

GET chaxun/job/_search
{
  "query": {
    "terms": {
      "title": ["工程师", "django", "系统"]
    }
  }
}

s = Search(index='chaxun').query('terms', title=['django', '工程师', '系统'])
response = s.execute()

Limiting the number of results returned

GET chaxun/job/_search
{
  "query": {
    "term": {
      "title": "python"
    }
  },
  "from": 1,
  "size": 2
}

# Slicing [1:3] corresponds to "from": 1, "size": 2
s = Search(index='chaxun').query('term', title='python')[1:3]
response = s.execute()
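
The relationship between Python slice notation and the from/size parameters can be sketched as follows. The helper `slice_to_from_size` is hypothetical and only illustrates the arithmetic; it is not part of elasticsearch-dsl.

```python
def slice_to_from_size(s):
    """Translate a Python slice into Elasticsearch-style paging parameters."""
    start = s.start or 0
    if s.stop is None:
        raise ValueError("an open-ended slice has no fixed size")
    # size is the number of hits between start and stop
    return {"from": start, "size": max(0, s.stop - start)}

print(slice_to_from_size(slice(1, 3)))   # paging used in the request above
print(slice_to_from_size(slice(0, 10)))  # first ten hits
```

So a slice of `[1:3]` skips one hit and returns the next two.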

match_all: return all documents

GET chaxun/job/_search
{
  "query": {
    "match_all": {}
  }
}

s = Search(index='chaxun').query('match_all')
response = s.execute()

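match_phrase: phrase query
```
GET chaxun/job/_search
{
  "query": {
    "match_phrase": {
      "title": {
        "query": "python系统",
        "slop": 3
      }
    }
  }
}
```
```
s = Search(index='chaxun').query('match_phrase', title={"query": "python系统", "slop": 3})
response = s.execute()
```
Note: the query text "python系统" is analyzed into ["python", "系统"], and a matching document must contain both tokens. "slop" sets the maximum distance, in token positions, allowed between them; a document matches only if the distance does not exceed slop. For example, "python打造推荐引擎系统" cannot match if slop is less than 6 (the exact threshold depends on the token positions the analyzer produces).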
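
As a rough illustration of how slop relates to token positions, the sketch below computes the smallest slop that lets two terms match in order. It is a simplification: it handles only two terms appearing in query order, and the token list is a hypothetical analyzer output, not what `ik_max_word` actually produces.

```python
def min_slop(tokens, first, second):
    """Smallest slop for `first` followed by `second`; None if they never occur in order."""
    best = None
    for i, tok in enumerate(tokens):
        if tok != first:
            continue
        for j in range(i + 1, len(tokens)):
            if tokens[j] == second:
                gap = j - i - 1  # number of tokens sitting between the two terms
                if best is None or gap < best:
                    best = gap
    return best

# Hypothetical token stream for a document title
tokens = ["python", "builds", "a", "recommendation", "system"]
needed = min_slop(tokens, "python", "system")
print(needed)                # tokens between "python" and "system"
print(needed <= 3)           # would a query with slop=3 match?
```

With three tokens between the two terms, a slop of 3 is just enough for this hypothetical title, and a slop of 2 would not be.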

multi_match query

GET chaxun/job/_search
{
  "query": {
    "multi_match": {
      "query": "python",
      "fields": ["title^3", "desc"]
    }
  }
}

from elasticsearch_dsl import Q, Search

# "title^3" carries the same boost as the REST request above
q = Q('multi_match', query="python", fields=["title^3", "desc"])
s = Search(index='chaxun').query(q)
response = s.execute()

Note: this searches several fields at once; "^3" gives "title" three times the weight of "desc".
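
The effect of a field boost can be sketched with plain numbers. The example assumes the default behaviour where the best-scoring field decides the document score, and the per-field scores are made up for illustration; the helper names are hypothetical.

```python
def parse_field(spec):
    """Split a spec such as "title^3" into (field name, boost)."""
    name, sep, boost = spec.partition("^")
    return name, float(boost) if sep else 1.0

def best_field_score(field_scores, field_specs):
    # Multiply each field's raw score by its boost and keep the best one
    boosted = []
    for spec in field_specs:
        name, boost = parse_field(spec)
        boosted.append(field_scores.get(name, 0.0) * boost)
    return max(boosted)

# Made-up raw scores for one document
raw = {"title": 1.5, "desc": 2.0}
print(best_field_score(raw, ["title", "desc"]))    # desc wins without a boost
print(best_field_score(raw, ["title^3", "desc"]))  # boosted title wins
```

Without the boost the "desc" score dominates; with "title^3" the title match outweighs it.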

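Choosing which fields are returned

GET chaxun/job/_search
{
  "stored_fields": ["title", "company_name"],
  "query": {
    "match": {
      "title": "python"
    }
  }
}

s = Search(index='chaxun').query('match', title='python').source(['title', 'company_name'])
response = s.execute()

Note: the REST request reads the fields marked "store": true through stored_fields, while .source() in elasticsearch-dsl filters _source instead. These are two different mechanisms; s.extra(stored_fields=['title', 'company_name']) should produce the same request body as the REST example.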

Sorting results with sort

GET chaxun/job/_search
{
  "query": {
    "match_all": {}
  },
  "sort": [
    {
      "comments": {
        "order": "desc"
      }
    }
  ]
}

s = Search(index='chaxun').query('match_all').sort({"comments": {"order": "desc"}})
response = s.execute()

range query

Here "boost" sets the weight of the clause.

GET chaxun/job/_search
{
  "query": {
    "range": {
      "comments": {
        "gte": 10,
        "lte": 50,
        "boost": 2.0
      }
    }
  }
}

s = Search(index='chaxun').query('range', comments={"gte": 10, "lte": 50, "boost": 2.0})
response = s.execute()

wildcard query

GET chaxun/job/_search
{
  "query": {
    "wildcard": {
      "title": {
        "value": "pyth*n",
        "boost": 2
      }
    }
  }
}

s = Search(index='chaxun').query('wildcard', title={"value": "pyth*n", "boost": 2})
response = s.execute()

7.2 Compound queries

- Create a new index to query

- bool query

The general format is

bool: {
  "filter": [],
  "must": [],
  "should": [],
  "must_not": []
}

The simplest filter query

select * from testdb where salary = 20

GET bool/testdb/_search
{
  "query": {
    "bool": {
      "must": {
        "match_all": {}
      },
      "filter": {
        "term": {
          "salary": 20
        }
      }
    }
  }
}

s = Search(index='bool').query('bool', filter=[Q('term', salary=20)])
response = s.execute()

Inspecting how an analyzer tokenizes text
```
GET _analyze
{
  "analyzer": "ik_max_word",
  "text": "成都电子科技大学"
}
```
Note: "ik_max_word" performs fine-grained tokenization; "ik_smart" performs coarse-grained tokenization.

Combined bool filter query

select * from testdb where (salary = 20 or title = python) and (salary != 30)

GET bool/testdb/_search
{
  "query": {
    "bool": {
      "should": [
        {"term": {"salary": 20}},
        {"term": {"title": "python"}}
      ],
      "must_not": [
        {"term": {"salary": 30}}
      ]
    }
  }
}

q = Q('bool', should=[Q('term', salary=20), Q('term', title='python')], must_not=[Q('term', salary=30)])
s = Search(index='bool').query(q)
response = s.execute()

Nested bool query

select * from testdb where title = python or (title = django and salary = 30)

GET bool/testdb/_search
{
  "query": {
    "bool": {
      "should": [
        {"term": {"title": "python"}},
        {"bool": {
          "must": [
            {"term": {"title": "django"}},
            {"term": {"salary": 30}}
          ]
        }}
      ]
    }
  }
}

q = Q('bool', should=[Q('term', title='python'), Q('bool', must=[Q('term', title='django'), Q('term', salary=30)])])
s = Search(index='bool').query(q)
response = s.execute()
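
To check how the bool clauses combine, the following sketch evaluates a query body against in-memory documents. It is only a model of the matching logic: scoring is ignored, only term and bool are supported, and the sample documents are hypothetical.

```python
def as_list(clauses):
    # bool clauses may be given as a single dict or as a list of dicts
    return clauses if isinstance(clauses, list) else [clauses]

def matches(query, doc):
    (kind, body), = query.items()
    if kind == "term":
        (field, value), = body.items()
        stored = doc.get(field)
        values = stored if isinstance(stored, list) else [stored]
        return value in values
    if kind == "bool":
        must = as_list(body.get("must", [])) + as_list(body.get("filter", []))
        should = as_list(body.get("should", []))
        must_not = as_list(body.get("must_not", []))
        if not all(matches(q, doc) for q in must):
            return False
        if any(matches(q, doc) for q in must_not):
            return False
        # With no must/filter clauses, at least one should clause has to match
        if should and not must:
            return any(matches(q, doc) for q in should)
        return True
    raise ValueError(f"unsupported query type: {kind}")

docs = [
    {"id": 1, "title": "python", "salary": 10},
    {"id": 2, "title": "django", "salary": 30},
    {"id": 3, "title": "django", "salary": 20},
]

# title = python or (title = django and salary = 30)
nested = {"bool": {"should": [
    {"term": {"title": "python"}},
    {"bool": {"must": [{"term": {"title": "django"}}, {"term": {"salary": 30}}]}},
]}}

# (salary = 20 or title = python) and salary != 30
combined = {"bool": {
    "should": [{"term": {"salary": 20}}, {"term": {"title": "python"}}],
    "must_not": [{"term": {"salary": 30}}],
}}

nested_ids = [d["id"] for d in docs if matches(nested, d)]
combined_ids = [d["id"] for d in docs if matches(combined, d)]
print(nested_ids, combined_ids)
```

The query bodies are the same dictionaries as in the REST requests above, so the sketch can be used to reason about which clause excludes a given document.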

Filtering null and non-null values

Create some test data

POST null/testdb2/_bulk
{"index":{"_id":1}}
{"tags":["search"]}
{"index":{"_id":2}}
{"tags":["search", "python"]}
{"index":{"_id":3}}
{"other_field":["some data"]}
{"index":{"_id":4}}
{"tags":null}
{"index":{"_id":5}}
{"tags":["search", null]}

Handling null values

select tags from testdb2 where tags is not NULL

GET null/testdb2/_search
{
  "query": {
    "bool": {
      "filter": {
        "exists": {
          "field": "tags"
        }
      }
    }
  }
}

s = Search(index='null').query('bool', filter={"exists": {"field": "tags"}})
response = s.execute()
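
The sketch below models which of the five test documents an exists filter on "tags" would keep. It runs in memory only, and `field_exists` is a hypothetical helper: a field counts as existing when it holds at least one non-null value.

```python
def field_exists(doc, field):
    value = doc.get(field)
    values = value if isinstance(value, list) else [value]
    # A missing field, an explicit null, or an empty list all count as "no value"
    return any(v is not None for v in values)

# Same data as the bulk request above (JSON null becomes Python None)
docs = {
    1: {"tags": ["search"]},
    2: {"tags": ["search", "python"]},
    3: {"other_field": ["some data"]},
    4: {"tags": None},
    5: {"tags": ["search", None]},
}

not_null_ids = [i for i, d in docs.items() if field_exists(d, "tags")]
null_ids = [i for i, d in docs.items() if not field_exists(d, "tags")]
print(not_null_ids, null_ids)
```

The second list corresponds to the "is NULL" case, which can be expressed by placing the same exists clause under must_not instead of filter.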

7.3 Aggregation queries

To be continued...

The above is the full content of "Querying with elasticsearch-dsl". Source link: utcz.com/z/509936.html
