Weighted random sampling in Elasticsearch

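I need to obtain a random sample from an Elasticsearch index, i.e. to issue a query that retrieves some documents from a given index with weighted probability Wj/ΣWi (where Wj is the weight of row j, and ΣWi is the sum of the weights of all documents matched by this query).

Currently, I have the following query: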

GET products/_search?pretty=true
{
  "size": 5,
  "query": {
    "function_score": {
      "query": {
        "bool": {
          "must": {
            "term": {"category_id": "5df3ab90-6e93-0133-7197-04383561729e"}
          }
        }
      },
      "functions": [{"random_score": {}}]
    }
  },
  "sort": [{"_score": {"order": "desc"}}]
}

It returns 5 random items from the selected category. Each item has a field weight. So, I probably have to use

"script_score": {
  "script": "weight = data['weight'].value / SUM; if (_score.doubleValue() > weight) {return 1;} else {return 0;}"
}

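as described here.

I have the following questions:

  • What is the correct way to do this?
  • Do I need to enable dynamic scripting?
  • How do I calculate the total sum for the query?

Thank you very much for your help!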
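On the third question, one possible route is a `sum` aggregation over the same filter, with `size` set to 0 so that no hits are returned. The sketch below only builds the request body; the helper name `build_weight_sum_request` and the aggregation name `total_weight` are made up for illustration, and the field names `category_id` and `weight` are taken from the question.

```python
import json


def build_weight_sum_request(category_id, weight_field="weight"):
    """Build a search body that asks only for the sum of `weight_field`
    over the documents of one category (hypothetical helper)."""
    return {
        "size": 0,  # no hits needed, only the aggregation result
        "query": {"term": {"category_id": category_id}},
        "aggs": {"total_weight": {"sum": {"field": weight_field}}},
    }


body = build_weight_sum_request("5df3ab90-6e93-0133-7197-04383561729e")
print(json.dumps(body, indent=2))
```

Sent to `products/_search`, a body like this should return the total under `aggregations.total_weight.value`, which could then stand in for SUM in a follow-up query.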

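Answer:

In case it helps anyone, here is how I recently implemented a weighted shuffle.

In this example, we shuffle companies. Each company has a "company_score" between 0 and 100. With this simple weighted shuffle, a company with a score of 100 is 5 times more likely to appear on the first page than a company with a score of 20.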

json_body = {
    "sort": ["_score"],
    "query": {
        "function_score": {
            "query": main_query,  # put your main query here
            "functions": [
                {
                    "random_score": {},
                },
                {
                    "field_value_factor": {
                        "field": "company_score",
                        "modifier": "none",
                        "missing": 0,
                    }
                }
            ],
            # How to combine the result of the two functions 'random_score' and 'field_value_factor'.
            # This way, on average the combined _score of a company having score 100 will be 5 times as much
            # as the combined _score of a company having score 20, and thus will be 5 times more likely
            # to appear on first page.
            "score_mode": "multiply",
            # How to combine the result of function_score with the original _score from the query.
            # We overwrite it as our combined _score (random x company_score) is all we need.
            "boost_mode": "replace",
        }
    }
}
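The "5 times" figure is a statement about average scores. How often a company actually lands first depends on the other candidates. A small client-side simulation can make this concrete, assuming `random_score` draws uniformly from [0, 1) and the two functions are multiplied as above. For comparison it also tries a different key, u ** (1 / w), which is the Efraimidis-Spirakis weighted sampling key and not part of the answer above; all function names here are hypothetical, and weights are assumed to be strictly positive.

```python
import random


def first_place_share(weights, key_fn, trials=20000, seed=0):
    """Estimate how often each item gets the highest key (i.e. ranks first)."""
    rng = random.Random(seed)
    wins = [0] * len(weights)
    for _ in range(trials):
        keys = [key_fn(rng.random(), w) for w in weights]
        wins[keys.index(max(keys))] += 1
    return [n / trials for n in wins]


def multiply_key(u, w):
    # Models score_mode "multiply": uniform random value times the weight.
    return u * w


def power_key(u, w):
    # Efraimidis-Spirakis key (swapped-in technique): the item with the
    # largest key is picked with probability w / sum(weights).
    return u ** (1.0 / w)


weights = [100, 20]  # two companies, scores taken from the example above
multiply_share = first_place_share(weights, multiply_key)
power_share = first_place_share(weights, power_key)
print("multiply:", multiply_share)
print("power:   ", power_share)
```

With only these two companies, the multiply key should put the score-100 company first about 90% of the time, while exact proportionality would give 100/120, roughly 83%, which is what the power key approaches. Whether that difference matters depends on the use case; for a loosely weighted shuffle the multiply approach may be good enough.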
