Fetching all records from Elasticsearch using the Java API

I am trying to fetch all records from Elasticsearch using the Java API, but I get the following error:

n[[Wild Thing][localhost:9300][indices:data/read/search[phase/dfs]]]; nested: QueryPhaseExecutionException[Result window is too large, from + size must be less than or equal to: [10000] but was [10101].

My code is as follows:

Client client;
try {
    client = TransportClient.builder().build()
            .addTransportAddress(new InetSocketTransportAddress(InetAddress.getByName("localhost"), 9300));
    int from = 1;
    int to = 100;
    while (from <= 131881) {
        SearchResponse response = client
                .prepareSearch("demo_risk_data")
                .setSearchType(SearchType.DFS_QUERY_THEN_FETCH).setFrom(from)
                .setQuery(QueryBuilders.boolQuery().mustNot(QueryBuilders.termQuery("user_agent", "")))
                .setSize(to).setExplain(true).execute().actionGet();
        if (response.getHits().getHits().length > 0) {
            for (SearchHit searchData : response.getHits().getHits()) {
                JSONObject value = new JSONObject(searchData.getSource());
                System.out.println(value.toString());
            }
        }
    }
}

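The index currently holds 131881 records, so I start with from = 1 and to = 100 and fetch 100 records at a time for as long as from <= 131881. Is there a way to fetch records in batches of, say, 100 until there are no more records left in Elasticsearch?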
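For context, the error above is a plain arithmetic limit: a from/size request is rejected as soon as from + size is larger than the result window (10000 in the message, which is the default of the index.max_result_window setting). The following is a minimal sketch of that check in plain Java. It makes no Elasticsearch calls; the class and method names are hypothetical, and it assumes that from advances by the page size after each request, which the posted code does not show.

```java
// Sketch only: reproduces the from + size arithmetic behind the error.
// It makes no Elasticsearch calls; all names here are hypothetical.
class ResultWindowCheck {

    // Limit quoted in the error message (default of index.max_result_window).
    static final int MAX_RESULT_WINDOW = 10000;

    // True when a from/size page would reach past the result window.
    static boolean exceedsWindow(int from, int size) {
        return from + size > MAX_RESULT_WINDOW;
    }

    // Assumption: `from` advances by `size` after each page, starting at `start`.
    // Returns the first `from` value that would be rejected, or -1 if none is.
    static int firstRejectedFrom(int start, int size, int totalRecords) {
        for (int from = start; from <= totalRecords; from += size) {
            if (exceedsWindow(from, size)) {
                return from;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // 10001 + 100 = 10101, the value reported in the error.
        System.out.println(exceedsWindow(10001, 100));
        // Paging from 1 in steps of 100 through 131881 records.
        System.out.println(firstRejectedFrom(1, 100, 131881));
    }
}
```

Under these assumptions, only the first 10000 or so of the 131881 records are reachable with from/size paging, so a different retrieval mechanism is needed for the rest.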

Answer:

Yes, you can do this with the scroll API, which the Java client also supports.

You can do it like this:

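import java.net.InetAddress;
import java.net.UnknownHostException;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.Client;
import org.elasticsearch.client.transport.TransportClient;
import org.elasticsearch.common.transport.InetSocketTransportAddress;
import org.elasticsearch.common.unit.TimeValue;
import org.elasticsearch.index.query.QueryBuilder;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.SearchHit;
import org.elasticsearch.search.sort.SortOrder;
import org.elasticsearch.search.sort.SortParseElement;
import org.json.JSONObject;

Client client;
try {
    client = TransportClient.builder().build()
            .addTransportAddress(new InetSocketTransportAddress(InetAddress.getByName("localhost"), 9300));

    QueryBuilder qb = QueryBuilders.boolQuery().mustNot(QueryBuilders.termQuery("user_agent", ""));

    // Start the scroll; the context is kept alive for 60 seconds between requests
    SearchResponse scrollResp = client.prepareSearch("demo_risk_data")
            .addSort(SortParseElement.DOC_FIELD_NAME, SortOrder.ASC)
            .setScroll(new TimeValue(60000))
            .setQuery(qb)
            .setSize(100).execute().actionGet();

    // Scroll until no hits are returned
    while (true) {
        // Break condition: no hits are returned
        if (scrollResp.getHits().getHits().length == 0) {
            break;
        }

        // Otherwise read the results of the current batch
        for (SearchHit hit : scrollResp.getHits().getHits()) {
            JSONObject value = new JSONObject(hit.getSource());
            System.out.println(value.toString());
        }

        // Fetch the next batch using the scroll id from the previous response
        scrollResp = client.prepareSearchScroll(scrollResp.getScrollId())
                .setScroll(new TimeValue(60000)).execute().actionGet();
    }
} catch (UnknownHostException e) {
    // InetAddress.getByName throws this when the host cannot be resolved
    e.printStackTrace();
}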
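The control flow of the loop above (request a batch, stop on the first empty one, otherwise process it and request the next) can be tried out without a cluster. The sketch below is a plain-Java stand-in, not the Elasticsearch API: FakeScroll, nextBatch and drain are hypothetical names, and the in-memory cursor only imitates a scroll that hands out 131881 records in batches of 100.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Sketch only: an in-memory stand-in for a scroll cursor.
// It makes no Elasticsearch calls; all names here are hypothetical.
class ScrollLoopSketch {

    // Each call to nextBatch() returns up to batchSize ids,
    // and an empty list once everything has been handed out.
    static class FakeScroll {
        private final int total;
        private final int batchSize;
        private int position = 0;

        FakeScroll(int total, int batchSize) {
            this.total = total;
            this.batchSize = batchSize;
        }

        List<Integer> nextBatch() {
            if (position >= total) {
                return Collections.emptyList();
            }
            int end = Math.min(position + batchSize, total);
            List<Integer> batch = new ArrayList<>(end - position);
            for (int id = position; id < end; id++) {
                batch.add(id);
            }
            position = end;
            return batch;
        }
    }

    // Drains the cursor and stops on the first empty batch.
    // Returns {records seen, non-empty batches fetched}.
    static int[] drain(FakeScroll scroll) {
        int records = 0;
        int batches = 0;
        List<Integer> batch = scroll.nextBatch();
        while (!batch.isEmpty()) {
            records += batch.size();
            batches++;
            batch = scroll.nextBatch();
        }
        return new int[] {records, batches};
    }

    public static void main(String[] args) {
        int[] result = drain(new FakeScroll(131881, 100));
        System.out.println(result[0] + " records in " + result[1] + " batches");
    }
}
```

The point of the pattern is that the caller never needs the total record count up front: the empty batch is the only termination signal, and the last non-empty batch may simply be shorter than the rest.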

