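Finding distinct inner objects in Elasticsearch

We are trying to find distinct inner objects in Elasticsearch. Below is a minimal example of our case. We are stuck with a mapping like the following (changing types or index settings, or adding new fields, would not be a problem, but the structure should stay as it is):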

{
  "building": {
    "properties": {
      "street": {
        "type": "string",
        "store": "yes",
        "index": "not_analyzed"
      },
      "house number": {
        "type": "string",
        "store": "yes",
        "index": "not_analyzed"
      },
      "city": {
        "type": "string",
        "store": "yes",
        "index": "not_analyzed"
      },
      "people": {
        "type": "object",
        "store": "yes",
        "index": "not_analyzed",
        "properties": {
          "firstName": {
            "type": "string",
            "store": "yes",
            "index": "not_analyzed"
          },
          "lastName": {
            "type": "string",
            "store": "yes",
            "index": "not_analyzed"
          }
        }
      }
    }
  }
}

Suppose we have the following sample data:

{
  "buildings": [
    {
      "street": "Baker Street",
      "house number": "221 B",
      "city": "London",
      "people": [
        {
          "firstName": "John",
          "lastName": "Doe"
        },
        {
          "firstName": "Jane",
          "lastName": "Doe"
        }
      ]
    },
    {
      "street": "Baker Street",
      "house number": "5",
      "city": "London",
      "people": [
        {
          "firstName": "John",
          "lastName": "Doe"
        }
      ]
    },
    {
      "street": "Garden Street",
      "house number": "1",
      "city": "London",
      "people": [
        {
          "firstName": "Jane",
          "lastName": "Smith"
        }
      ]
    }
  ]
}

When querying for the street "Baker Street" (plus whatever additional options are needed), we expect to get the following list:

[
  {
    "firstName": "John",
    "lastName": "Doe"
  },
  {
    "firstName": "Jane",
    "lastName": "Doe"
  }
]

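The format does not matter much, but we should be able to parse the first and last names. Because our actual data set is much larger, we need the entries to be distinct.

We are using Elasticsearch 1.7.

Answer:

We finally solved our problem.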

Our solution is (as we expected) a precomputed people_all field. But instead of using copy_to or transform, we write the extra field ourselves when importing the data. The field looks like this:

"people": {
  "type": "nested",
  ..
  "properties": {
    "firstName": {
      "type": "string",
      "store": "yes",
      "index": "not_analyzed"
    },
    "lastName": {
      "type": "string",
      "store": "yes",
      "index": "not_analyzed"
    },
    "people_all": {
      "type": "string",
      "index": "not_analyzed"
    }
  }
}

Note the "index": "not_analyzed" on the people_all field. It is needed to get complete buckets. Without it, the value is analyzed into separate terms and our example returns 3 buckets: "john", "jane" and "doe".
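Writing the field at import time might look roughly like the Python sketch below. The helper name add_people_all and the single-space separator are assumptions made for illustration; the answer only says that the field is written during import and that its values look like "John Doe".

```python
import copy


def add_people_all(building, separator=" "):
    """Return a copy of `building` in which every person has a people_all value.

    Hypothetical helper: the name and the separator are illustrative assumptions.
    """
    doc = copy.deepcopy(building)  # leave the caller's document untouched
    for person in doc.get("people", []):
        # Concatenate the two name parts into one value that is indexed as-is
        person["people_all"] = separator.join(
            [person["firstName"], person["lastName"]]
        )
    return doc


building = {
    "street": "Baker Street",
    "house number": "221 B",
    "city": "London",
    "people": [
        {"firstName": "John", "lastName": "Doe"},
        {"firstName": "Jane", "lastName": "Doe"},
    ],
}

doc = add_people_all(building)
print([p["people_all"] for p in doc["people"]])
```

The resulting document is what would be sent to the index in place of the original one.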

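Once this new field is written, we can run the following aggregation (in Elasticsearch 1.x, "size": 0 on a terms aggregation returns all buckets):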

{
  "size": 0,
  "query": {
    "term": {
      "street": "Baker Street"
    }
  },
  "aggs": {
    "people_distinct": {
      "nested": {
        "path": "people"
      },
      "aggs": {
        "people_all_distinct": {
          "terms": {
            "field": "people.people_all",
            "size": 0
          }
        }
      }
    }
  }
}

We get back the following response:

{
  "took": 2,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 0.0,
    "hits": []
  },
  "aggregations": {
    "people_distinct": {
      "doc_count": 3,
      "people_all_distinct": {
        "doc_count_error_upper_bound": 0,
        "sum_other_doc_count": 0,
        "buckets": [
          {
            "key": "John Doe",
            "doc_count": 2
          },
          {
            "key": "Jane Doe",
            "doc_count": 1
          }
        ]
      }
    }
  }
}

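From the buckets in the response we can now build the distinct people objects.

Parsing the buckets is not the ideal solution; it would be nicer if each bucket contained the firstName and lastName fields.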
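A minimal Python sketch of that parsing step follows. It assumes people_all was written as first name, one space, last name, so it splits on the last space; that guess breaks for last names containing spaces, and a separator that cannot occur in a name would be safer. The function name people_from_buckets is hypothetical.

```python
def people_from_buckets(buckets):
    """Turn terms-aggregation buckets into a list of person objects.

    Assumes each bucket key has the form "<firstName> <lastName>".
    """
    people = []
    for bucket in buckets:
        # rpartition splits on the LAST space, so "Mary Ann Doe" becomes
        # firstName "Mary Ann" and lastName "Doe"
        first_name, _, last_name = bucket["key"].rpartition(" ")
        people.append({"firstName": first_name, "lastName": last_name})
    return people


# The buckets from the response shown in the answer
buckets = [
    {"key": "John Doe", "doc_count": 2},
    {"key": "Jane Doe", "doc_count": 1},
]

people = people_from_buckets(buckets)
print(people)
```

Each bucket key appears once in the response, so the resulting list is already distinct and needs no further deduplication.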

Source: utcz.com/qa/401235.html
