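Elasticsearch metrics aggregation: counting elements in an array

I want to run a fairly complex query/aggregation. I can't see how to do it, since I've only just started using ES. My documents look like this: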

{
  "keyword": "some keyword",
  "items": [
    {
      "name":"my first item",
      "item_property_1":"A",
      ( other properties here )
    },
    {
      "name":"my second item",
      "item_property_1":"B",
      ( other properties here )
    },
    {
      "name":"my third item",
      "item_property_1":"A",
      ( other properties here )
    }
  ]
  ( other properties... )
},
{
  "keyword": "different keyword",
  "items": [
    {
      "name":"cool item",
      "item_property_1":"A",
      ( other properties here )
    },
    {
      "name":"awesome item",
      "item_property_1":"C",
      ( other properties here )
    },
  ]
  ( other properties... )
},
( other documents... )

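Now, for each keyword, I want to count how many items have each of the possible values of item_property_1. In other words, I need a bucket aggregation that returns a response like this: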

{
  "keyword": "some keyword",
  "item_property_1_aggretation": [
    {
      "key":"A",
      "count": 2,
    },
    {
      "key":"B",
      "count": 1,
    }
  ]
},
{
  "keyword": "different keyword",
  "item_property_1_aggretation": [
    {
      "key":"A",
      "count": 1,
    },
    {
      "key":"C",
      "count": 1,
    }
  ]
},
( other keywords... )

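If a mapping is required, could you also specify which one? I don't have any non-default mappings; I just dumped everything in there.

Edit: To save you the trouble, here is a bulk PUT of the example above: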

PUT /test/test/_bulk
{ "index": {}}
{ "keyword": "some keyword", "items": [ { "name":"my first item", "item_property_1":"A" }, { "name":"my second item", "item_property_1":"B" }, { "name":"my third item", "item_property_1":"A" } ]}
{ "index": {}}
{ "keyword": "different keyword", "items": [ { "name":"cool item", "item_property_1":"A" }, { "name":"awesome item", "item_property_1":"C" } ]}

Edit 2:

I just tried this:

POST /test/test/_search
{
  "size":2,
  "aggregations": {
    "property_1_count": {
      "terms":{
        "field":"item_property_1"
      }
    }
  }
}

and got this:

"aggregations": {
  "property_1_count": {
    "doc_count_error_upper_bound": 0,
    "sum_other_doc_count": 0,
    "buckets": [
      {
        "key": "a",
        "doc_count": 2
      },
      {
        "key": "b",
        "doc_count": 1
      },
      {
        "key": "c",
        "doc_count": 1
      }
    ]
  }
}

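Close, but no cigar. You can see what's happening: it buckets by item_property_1 regardless of which keyword the items belong to. I'm sure the solution involves adding the right mapping, but I can't quite put my finger on it. Any suggestions?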
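To see why the flat terms aggregation returns these numbers, here is a minimal Python sketch that mimics it under two assumptions: with the default (non-nested) mapping, the array of item objects is flattened into one multi-valued field per document, and the default analyzer lowercases each value. The function name flat_terms_doc_counts is made up for illustration; this is not Elasticsearch code.

```python
from collections import Counter

# The two sample documents from the question.
docs = [
    {"keyword": "some keyword", "items": [
        {"name": "my first item", "item_property_1": "A"},
        {"name": "my second item", "item_property_1": "B"},
        {"name": "my third item", "item_property_1": "A"},
    ]},
    {"keyword": "different keyword", "items": [
        {"name": "cool item", "item_property_1": "A"},
        {"name": "awesome item", "item_property_1": "C"},
    ]},
]


def flat_terms_doc_counts(documents):
    """Count, for each term, how many top-level documents contain it."""
    counts = Counter()
    for doc in documents:
        # Flattening: all item values collapse into one set of terms per document.
        # lower() stands in for the default analyzer (an approximation).
        terms = {item["item_property_1"].lower() for item in doc["items"]}
        counts.update(terms)
    return dict(counts)


flat_counts = flat_terms_doc_counts(docs)
print(flat_counts)
```

Each bucket counts whole documents rather than individual items, and nothing ties a value back to its keyword, which matches the a/b/c buckets shown in the response.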

Edit 3: Based on this: https://www.elastic.co/guide/zh-cn/elasticsearch/reference/current/mapping-nested-type.html

I wanted to try adding a nested type to the items property. To do that, I tried:

PUT /test/_mapping/test
{
  "test":{
    "properties": {
      "items": {
        "type": "nested",
        "properties": {
          "item_property_1":{"type":"string"}
        }
      }
    }
  }
}

However, this returns an error:

{
  "error": "MergeMappingException[Merge failed with failures {[object mapping [items] can't be changed from non-nested to nested]}]",
  "status": 400
}

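This is probably related to the warning at that URL: "changing an object type to nested type requires reindexing."

So, how should I go about this?

Answer:

Nice try, you were almost there! Here's what I came up with. Based on your mapping proposal, the mapping I'm using is as follows: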

curl -XPUT localhost:9200/test/_mapping/test -d '{
  "test": {
    "properties": {
      "keyword": {
        "type": "string",
        "index": "not_analyzed"
      },
      "items": {
        "type": "nested",
        "properties": {
          "name": {
            "type": "string"
          },
          "item_property_1": {
            "type": "string",
            "index": "not_analyzed"
          }
        }
      }
    }
  }
}'

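Note: You need to wipe your data and reindex, because a field's type cannot be changed from non-nested to nested.

Then I created some data using the bulk request you shared: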

curl -XPOST localhost:9200/test/test/_bulk -d '
{ "index": {}}
{ "keyword": "some keyword", "items": [ { "name":"my first item", "item_property_1":"A" }, { "name":"my second item", "item_property_1":"B" }, { "name":"my third item", "item_property_1":"A" } ]}
{ "index": {}}
{ "keyword": "different keyword", "items": [ { "name":"cool item", "item_property_1":"A" }, { "name":"awesome item", "item_property_1":"C" } ]}
'
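If you would rather generate the bulk body than write it by hand, a small Python sketch along these lines could do it. The bulk format is newline-delimited JSON: one action line followed by one source line per document, with a trailing newline at the end. The helper name build_bulk_body is hypothetical, and the sketch only builds the request body; it does not send anything to Elasticsearch.

```python
import json


def build_bulk_body(documents):
    """Return an NDJSON bulk body: an action line plus a source line per document."""
    lines = []
    for doc in documents:
        lines.append(json.dumps({"index": {}}))  # action line, no explicit _id
        lines.append(json.dumps(doc))            # document source on a single line
    # The body must be terminated by a newline.
    return "\n".join(lines) + "\n"


sample_docs = [
    {"keyword": "some keyword", "items": [
        {"name": "my first item", "item_property_1": "A"},
        {"name": "my second item", "item_property_1": "B"},
        {"name": "my third item", "item_property_1": "A"},
    ]},
    {"keyword": "different keyword", "items": [
        {"name": "cool item", "item_property_1": "A"},
        {"name": "awesome item", "item_property_1": "C"},
    ]},
]

bulk_body = build_bulk_body(sample_docs)
print(bulk_body)
```

The resulting string can be used as the request body of the _bulk call shown above.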

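Finally, here is the aggregation query that produces the expected result. We first bucket by keyword using a terms aggregation, and then, for each keyword, bucket by the nested item_property_1 field. Since items is now of type nested, the key is to use a nested aggregation on items, followed by a terms sub-aggregation on the items.item_property_1 field.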

{
  "size": 0,
  "aggregations": {
    "by_keyword": {
      "terms": {
        "field": "keyword"
      },
      "aggs": {
        "prop_1_count": {
          "nested": {
            "path": "items"
          },
          "aggs": {
            "prop_1": {
              "terms": {
                "field": "items.item_property_1"
              }
            }
          }
        }
      }
    }
  }
}
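The bucketing logic of the query above can be mimicked in plain Python. This is only a sketch of the idea, not how Elasticsearch executes it: the outer loop plays the role of the terms aggregation on keyword, and the inner loop plays the role of the nested aggregation, visiting each item as its own unit. The function name nested_terms_by_keyword is made up for illustration.

```python
from collections import Counter, defaultdict

docs = [
    {"keyword": "some keyword", "items": [
        {"name": "my first item", "item_property_1": "A"},
        {"name": "my second item", "item_property_1": "B"},
        {"name": "my third item", "item_property_1": "A"},
    ]},
    {"keyword": "different keyword", "items": [
        {"name": "cool item", "item_property_1": "A"},
        {"name": "awesome item", "item_property_1": "C"},
    ]},
]


def nested_terms_by_keyword(documents):
    """Group by keyword, then count item_property_1 values per individual item."""
    buckets = defaultdict(Counter)
    for doc in documents:                 # outer terms aggregation on "keyword"
        for item in doc["items"]:         # nested aggregation over "items"
            buckets[doc["keyword"]][item["item_property_1"]] += 1
    return {keyword: dict(counter) for keyword, counter in buckets.items()}


per_keyword = nested_terms_by_keyword(docs)
print(per_keyword)
```

Because each item is counted separately and values keep their original case, "A" is counted twice under "some keyword", which the flat aggregation could not express.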

Running that aggregation query against your dataset produces the following result:

{
  ...
  "aggregations" : {
    "by_keyword" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [ {
        "key" : "different keyword",      <---- keyword 1
        "doc_count" : 1,
        "prop_1_count" : {
          "doc_count" : 2,
          "prop_1" : {
            "doc_count_error_upper_bound" : 0,
            "sum_other_doc_count" : 0,
            "buckets" : [ {               <---- buckets for item_property_1
              "key" : "A",
              "doc_count" : 1
            }, {
              "key" : "C",
              "doc_count" : 1
            } ]
          }
        }
      }, {
        "key" : "some keyword",           <---- keyword 2
        "doc_count" : 1,
        "prop_1_count" : {
          "doc_count" : 3,
          "prop_1" : {
            "doc_count_error_upper_bound" : 0,
            "sum_other_doc_count" : 0,
            "buckets" : [ {               <---- buckets for item_property_1
              "key" : "A",
              "doc_count" : 2
            }, {
              "key" : "B",
              "doc_count" : 1
            } ]
          }
        }
      } ]
    }
  }
}
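The response is more deeply nested than the shape requested in the question. If you want something closer to that shape on the client side, a small post-processing step could look like the sketch below. The function name reshape_buckets and the output key "item_property_1_counts" are hypothetical; the aggregation names by_keyword, prop_1_count and prop_1 are the ones used in the query above, and the sample response is trimmed to the fields the function reads.

```python
def reshape_buckets(response):
    """Flatten the nested aggregation response into one entry per keyword."""
    result = []
    for keyword_bucket in response["aggregations"]["by_keyword"]["buckets"]:
        inner_buckets = keyword_bucket["prop_1_count"]["prop_1"]["buckets"]
        result.append({
            "keyword": keyword_bucket["key"],
            "item_property_1_counts": [
                {"key": b["key"], "count": b["doc_count"]} for b in inner_buckets
            ],
        })
    return result


# Trimmed copy of the response shown above.
sample_response = {
    "aggregations": {
        "by_keyword": {
            "buckets": [
                {"key": "different keyword", "doc_count": 1,
                 "prop_1_count": {"doc_count": 2, "prop_1": {"buckets": [
                     {"key": "A", "doc_count": 1},
                     {"key": "C", "doc_count": 1},
                 ]}}},
                {"key": "some keyword", "doc_count": 1,
                 "prop_1_count": {"doc_count": 3, "prop_1": {"buckets": [
                     {"key": "A", "doc_count": 2},
                     {"key": "B", "doc_count": 1},
                 ]}}},
            ]
        }
    }
}

reshaped = reshape_buckets(sample_response)
print(reshaped)
```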
