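Elasticsearch: disable term frequency scoring
I want to change the scoring in Elasticsearch so that multiple occurrences of a term are not counted. For example, I want:
"texas texas texas"
and
"texas"
to come out with the same score. I found this mapping, which the Elasticsearch documentation says disables term frequency counting, but my searches still do not return the same score: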
{
  "mappings": {
    "business": {
      "properties": {
        "name": {
          "type": "string",
          "index_options": "docs",
          "norms": { "enabled": false }
        }
      }
    }
  }
}
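Any help would be appreciated; I have not been able to find much information on this.
Edit:
I am adding my search code and what comes back when I use explain.
My search code: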
Settings settings = ImmutableSettings.settingsBuilder().put("cluster.name", "escluster").build();
Client client = new TransportClient(settings)
    .addTransportAddress(new InetSocketTransportAddress("127.0.0.1", 9300));
SearchRequest request = Requests.searchRequest("businesses")
    .source(SearchSourceBuilder.searchSource().query(QueryBuilders.boolQuery()
        .should(QueryBuilders.matchQuery("name", "Texas")
        .minimumShouldMatch("1")))).searchType(SearchType.DFS_QUERY_THEN_FETCH);
ExplainRequest request2 = client.prepareIndex("businesses", "business")
When I run my search with explain, I get:
"took" : 14,
    "timed_out" : false, 
    "_shards" : { 
    "total" : 3, 
    "successful" : 3, 
    "failed" : 0 
    }, 
    "hits" : { 
    "total" : 2, 
    "max_score" : 1.0, 
    "hits" : [ { 
     "_shard" : 1, 
     "_node" : "BTqBPVDET5Kr83r-CYPqfA", 
     "_index" : "businesses", 
     "_type" : "business", 
     "_id" : "AU9U5KBks4zEorv9YI4n", 
     "_score" : 1.0, 
     "_source":{ 
"name" : "texas" 
} 
, 
     "_explanation" : { 
     "value" : 1.0, 
     "description" : "weight(_all:texas in 0) [PerFieldSimilarity], result of:", 
     "details" : [ { 
      "value" : 1.0, 
      "description" : "fieldWeight in 0, product of:", 
      "details" : [ { 
      "value" : 1.0, 
      "description" : "tf(freq=1.0), with freq of:", 
      "details" : [ { 
       "value" : 1.0, 
       "description" : "termFreq=1.0" 
      } ] 
      }, { 
      "value" : 1.0, 
      "description" : "idf(docFreq=2, maxDocs=3)" 
      }, { 
      "value" : 1.0, 
      "description" : "fieldNorm(doc=0)" 
      } ] 
     } ] 
     } 
    }, { 
     "_shard" : 1, 
     "_node" : "BTqBPVDET5Kr83r-CYPqfA", 
     "_index" : "businesses", 
     "_type" : "business", 
     "_id" : "AU9U5K6Ks4zEorv9YI4o", 
     "_score" : 0.8660254, 
     "_source":{ 
"name" : "texas texas texas" 
} 
, 
     "_explanation" : { 
     "value" : 0.8660254, 
     "description" : "weight(_all:texas in 0) [PerFieldSimilarity], result of:", 
     "details" : [ { 
      "value" : 0.8660254, 
      "description" : "fieldWeight in 0, product of:", 
      "details" : [ { 
      "value" : 1.7320508, 
      "description" : "tf(freq=3.0), with freq of:", 
      "details" : [ { 
       "value" : 3.0, 
       "description" : "termFreq=3.0" 
      } ] 
      }, { 
      "value" : 1.0, 
      "description" : "idf(docFreq=2, maxDocs=3)" 
      }, { 
      "value" : 0.5, 
      "description" : "fieldNorm(doc=0)" 
      } ] 
     } ] 
     } 
    } ] 
    } 
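It looks like it is still taking term frequency and document frequency into account. Any ideas? Sorry for the bad formatting, I don't know why it looks so odd.
Edit 2:
When I search from the browser at http://localhost:9200/businesses/business/_search?pretty=true&qname=texas I get:
{
    "took" : 2, 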
    "timed_out" : false, 
    "_shards" : { 
    "total" : 3, 
    "successful" : 3, 
    "failed" : 0 
    }, 
    "hits" : { 
    "total" : 4, 
    "max_score" : 1.0, 
    "hits" : [ { 
     "_index" : "businesses", 
     "_type" : "business", 
     "_id" : "AU9YcCKjKvtg8NgyozGK", 
     "_score" : 1.0, 
     "_source":{"business" : { 
"name" : "texas texas texas texas" } 
} 
    }, { 
     "_index" : "businesses", 
     "_type" : "business", 
     "_id" : "AU9YateBKvtg8Ngyoy-p", 
     "_score" : 1.0, 
     "_source":{ 
"name" : "texas" } 
    }, { 
     "_index" : "businesses", 
     "_type" : "business", 
     "_id" : "AU9YavVnKvtg8Ngyoy-4", 
     "_score" : 1.0, 
     "_source":{ 
"name" : "texas texas texas" } 
    }, { 
     "_index" : "businesses", 
     "_type" : "business", 
     "_id" : "AU9Yb7NgKvtg8NgyozFf", 
     "_score" : 1.0, 
     "_source":{"business" : { 
"name" : "texas texas texas" } 
} 
    } ] 
    } 
} 
It finds all 4 documents I have in there, and they all have the same score. When I run my Java API search with explain, I get:
{
    "took" : 2, 
    "timed_out" : false, 
    "_shards" : { 
    "total" : 3, 
    "successful" : 3, 
    "failed" : 0 
    }, 
    "hits" : { 
    "total" : 2, 
    "max_score" : 1.287682, 
    "hits" : [ { 
     "_shard" : 1, 
     "_node" : "BTqBPVDET5Kr83r-CYPqfA", 
     "_index" : "businesses", 
     "_type" : "business", 
     "_id" : "AU9YateBKvtg8Ngyoy-p", 
     "_score" : 1.287682, 
     "_source":{ 
"name" : "texas" } 
, 
     "_explanation" : { 
     "value" : 1.287682, 
     "description" : "weight(name:texas in 0) [PerFieldSimilarity], result of:", 
     "details" : [ { 
      "value" : 1.287682, 
      "description" : "fieldWeight in 0, product of:", 
      "details" : [ { 
      "value" : 1.0, 
      "description" : "tf(freq=1.0), with freq of:", 
      "details" : [ { 
       "value" : 1.0, 
       "description" : "termFreq=1.0" 
      } ] 
      }, { 
      "value" : 1.287682, 
      "description" : "idf(docFreq=2, maxDocs=4)" 
      }, { 
      "value" : 1.0, 
      "description" : "fieldNorm(doc=0)" 
      } ] 
     } ] 
     } 
    }, { 
     "_shard" : 1, 
     "_node" : "BTqBPVDET5Kr83r-CYPqfA", 
     "_index" : "businesses", 
     "_type" : "business", 
     "_id" : "AU9YavVnKvtg8Ngyoy-4", 
     "_score" : 1.1151654, 
     "_source":{ 
"name" : "texas texas texas" } 
, 
     "_explanation" : { 
     "value" : 1.1151654, 
     "description" : "weight(name:texas in 0) [PerFieldSimilarity], result of:", 
     "details" : [ { 
      "value" : 1.1151654, 
      "description" : "fieldWeight in 0, product of:", 
      "details" : [ { 
      "value" : 1.7320508, 
      "description" : "tf(freq=3.0), with freq of:", 
      "details" : [ { 
       "value" : 3.0, 
       "description" : "termFreq=3.0" 
      } ] 
      }, { 
      "value" : 1.287682, 
      "description" : "idf(docFreq=2, maxDocs=4)" 
      }, { 
      "value" : 0.5, 
      "description" : "fieldNorm(doc=0)" 
      } ] 
     } ] 
     } 
    } ] 
    } 
} 
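The numbers in the explain output can be checked by hand. The following is a minimal sketch, assuming the scores come from Lucene's classic TF-IDF similarity, where tf is the square root of the term frequency and idf is 1 + ln(maxDocs / (docFreq + 1)). The class and method names are made up for illustration, and the fieldNorm values are simply copied from the explain output rather than computed.

```java
// Sketch only: reproduces the "fieldWeight" values from the explain output,
// assuming Lucene's classic TF-IDF formulas. Names here are hypothetical.
public class ExplainArithmetic {

    // Classic term-frequency factor: square root of the raw term count.
    public static double tf(double freq) {
        return Math.sqrt(freq);
    }

    // Classic inverse document frequency: 1 + ln(maxDocs / (docFreq + 1)).
    public static double idf(long docFreq, long maxDocs) {
        return 1.0 + Math.log((double) maxDocs / (double) (docFreq + 1));
    }

    // fieldWeight is the product of tf, idf and fieldNorm.
    // fieldNorm is taken as-is from the explain output (1.0 or 0.5 above).
    public static double fieldWeight(double freq, long docFreq, long maxDocs, double fieldNorm) {
        return tf(freq) * idf(docFreq, maxDocs) * fieldNorm;
    }

    public static void main(String[] args) {
        // "texas": freq=1, docFreq=2, maxDocs=4, fieldNorm=1.0
        System.out.println(fieldWeight(1.0, 2, 4, 1.0));
        // "texas texas texas": freq=3, docFreq=2, maxDocs=4, fieldNorm=0.5
        System.out.println(fieldWeight(3.0, 2, 4, 0.5));
        // First explain output: freq=3, docFreq=2, maxDocs=3, fieldNorm=0.5
        System.out.println(fieldWeight(3.0, 2, 3, 0.5));
    }
}
```

The three results should be roughly 1.2877, 1.1152 and 0.8660, matching the reported scores. Two things stand out: tf(freq=3.0) is still about 1.732, so term frequencies are still stored in the index, and fieldNorm is 0.5 for the three-word document, so norms are still enabled. Both suggest that the "index_options": "docs" and "norms" settings never took effect on the field being searched.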
Answer:
It looks like index_options cannot be overridden for a field once it has been set in the initial mapping.
Example:
PUT test
PUT test/business/_mapping
{ 
     "properties": { 
     "name": { 
      "type": "string", 
      "index_options": "freqs", 
      "norms": { 
       "enabled": false 
      } 
     } 
     } 
} 
PUT test/business/_mapping
{ 
     "properties": { 
     "name": { 
      "type": "string", 
      "index_options": "docs", 
      "norms": { 
       "enabled": false 
      } 
     } 
     } 
} 
GET test/business/_mapping
    { 
    "test": { 
     "mappings": { 
     "business": { 
      "properties": { 
       "name": { 
        "type": "string", 
        "norms": { 
        "enabled": false 
        }, 
        "index_options": "freqs" 
       } 
      } 
     } 
     } 
    } 
} 
You will have to recreate the index for the new mapping to take effect.
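Recreating the index could look roughly like the sketch below. It uses only the JDK's HttpURLConnection against the REST API instead of the Elasticsearch Java client, and it assumes a node on localhost:9200 and the index name "businesses" from the question. The class name and helper methods are hypothetical. Note that deleting the index drops all documents, so they have to be indexed again afterwards.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

// Sketch only: delete the index and create it again with the desired mapping.
// Host, port and index name are assumptions taken from the question.
public class RecreateIndex {

    // Builds a create-index body that sets index_options and norms up front,
    // before any document has been indexed into the field.
    public static String createIndexBody(String type, String field) {
        return "{\"mappings\":{\"" + type + "\":{\"properties\":{\"" + field + "\":{"
             + "\"type\":\"string\","
             + "\"index_options\":\"docs\","
             + "\"norms\":{\"enabled\":false}"
             + "}}}}}";
    }

    // Sends one HTTP request and returns the status code.
    public static int send(String method, String url, String body) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        conn.setRequestMethod(method);
        if (body != null) {
            conn.setDoOutput(true);
            conn.setRequestProperty("Content-Type", "application/json");
            try (OutputStream out = conn.getOutputStream()) {
                out.write(body.getBytes(StandardCharsets.UTF_8));
            }
        }
        int status = conn.getResponseCode();
        conn.disconnect();
        return status;
    }

    public static void main(String[] args) throws IOException {
        String index = "http://localhost:9200/businesses"; // assumed address
        // Drop the old index together with its old field mapping.
        System.out.println("DELETE -> " + send("DELETE", index, null));
        // Create it again with the mapping in place from the start.
        System.out.println("PUT    -> " + send("PUT", index, createIndexBody("business", "name")));
        // Documents must be reindexed after this point.
    }
}
```

After reindexing, running GET businesses/business/_mapping should show "index_options": "docs" on the name field if the new mapping was accepted.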