ElasticSearch search results include fields that are not in the mapping

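I want to index PDF attachments using the Tire gem as the ElasticSearch client. In my mapping, I exclude the attachment field from _source so that the attachment is not stored in the index: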

mapping :_source => { :excludes => ['attachment_original'] } do
  indexes :id, :type => 'integer'
  indexes :folder_id, :type => 'integer'
  indexes :attachment_file_name
  indexes :attachment_updated_at, :type => 'date'
  indexes :attachment_original, :type => 'attachment'
end

When I run the following curl command, I can still see the attachment content in the search results:

curl -X POST "http://localhost:9200/user_files/user_file/_search?pretty=true" -d '{
  "query": {
    "query_string": {
      "query": "rspec"
    }
  }
}'

I have already posted my question in this thread:

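But I have just noticed that not only the attachment is included in the search results: all the other fields, including the ones that are not mapped, are included as well, as you can see here: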

{
  "took": 20,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 0.025427073,
    "hits": [
      {
        "_index": "user_files",
        "_type": "user_file",
        "_id": "5",
        "_score": 0.025427073,
        "_source": {
          "user_file": {
            "id": 5,
            "folder_id": 1,
            "updated_at": "2012-08-16T11:32:41Z",
            "attachment_file_size": 179895,
            "attachment_updated_at": "2012-08-16T11:32:41Z",
            "attachment_file_name": "hw4.pdf",
            "attachment_content_type": "application/pdf",
            "created_at": "2012-08-16T11:32:41Z",
            "attachment_original": "JVBERi0xLjQKJeLjz9MKNyA"
          }
        }
      }
    ]
  }
}

attachment_file_size and attachment_content_type are not defined in the mapping, but they are returned in the search results:

{
  "id": 5,
  "folder_id": 1,
  "updated_at": "2012-08-16T11:32:41Z",
  "attachment_file_size": 179895, <---------------------
  "attachment_updated_at": "2012-08-16T11:32:41Z",
  "attachment_file_name": "hw4.pdf",
  "attachment_content_type": "application/pdf", <------------------
  "created_at": "2012-08-16T11:32:41Z",
  "attachment_original": "JVBERi0xLjQKJeLjz9MKNyA"
}
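
One way to list exactly which keys of the returned _source have no explicit mapping is to diff the two sets of names. This is a minimal sketch in plain Ruby, assuming the "user_file" object from the hit is pasted in as a literal and the mapped names are copied by hand from the mapping block; MAPPED_FIELDS, SOURCE and unmapped_fields are illustrative names, not part of Tire or ElasticSearch:

```ruby
require 'json'

# Field names declared in the Tire mapping block (copied from the question).
MAPPED_FIELDS = %w[
  id folder_id attachment_file_name attachment_updated_at attachment_original
].freeze

# The "user_file" object taken from the _source of the hit shown earlier.
SOURCE = JSON.parse(<<~JSON)
  {
    "id": 5,
    "folder_id": 1,
    "updated_at": "2012-08-16T11:32:41Z",
    "attachment_file_size": 179895,
    "attachment_updated_at": "2012-08-16T11:32:41Z",
    "attachment_file_name": "hw4.pdf",
    "attachment_content_type": "application/pdf",
    "created_at": "2012-08-16T11:32:41Z",
    "attachment_original": "JVBERi0xLjQKJeLjz9MKNyA"
  }
JSON

# Returns the keys present in the document that were never declared in the mapping.
def unmapped_fields(source, mapped)
  source.keys - mapped
end

puts unmapped_fields(SOURCE, MAPPED_FIELDS)
```

For this hit the diff also contains updated_at and created_at, so the timestamps are unmapped as well, not only the two attachment fields.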

Here is my complete implementation:

include Tire::Model::Search
include Tire::Model::Callbacks

def self.search(folder, params)
  tire.search() do
    query { string params[:query], default_operator: "AND"} if params[:query].present?
    #filter :term, folder_id: folder.id
    #highlight :attachment_original, :options => {:tag => "<em>"}
    # Debugging aid: aborts the search and shows the equivalent curl command.
    raise to_curl
  end
end

mapping :_source => { :excludes => ['attachment_original'] } do
  indexes :id, :type => 'integer'
  indexes :folder_id, :type => 'integer'
  indexes :attachment_file_name
  indexes :attachment_updated_at, :type => 'date'
  indexes :attachment_original, :type => 'attachment'
end

# Serializes every model attribute plus the result of attachment_original.
def to_indexed_json
  to_json(:methods => [:attachment_original])
end

# Returns the attached file as a Base64 string, or nil when there is no attachment.
def attachment_original
  if attachment_file_name.present?
    path_to_original = attachment.path
    Base64.encode64(open(path_to_original) { |f| f.read })
  end
end

Can someone help me figure out why all the fields are included in _source?

This is the output of running localhost:9200/user_files/_mapping:

{
  "user_files": {
    "user_file": {
      "_source": {
        "excludes": [
          "attachment_original"
        ]
      },
      "properties": {
        "attachment_content_type": {
          "type": "string"
        },
        "attachment_file_name": {
          "type": "string"
        },
        "attachment_file_size": {
          "type": "long"
        },
        "attachment_original": {
          "type": "attachment",
          "path": "full",
          "fields": {
            "attachment_original": {
              "type": "string"
            },
            "author": {
              "type": "string"
            },
            "title": {
              "type": "string"
            },
            "name": {
              "type": "string"
            },
            "date": {
              "type": "date",
              "format": "dateOptionalTime"
            },
            "keywords": {
              "type": "string"
            },
            "content_type": {
              "type": "string"
            }
          }
        },
        "attachment_updated_at": {
          "type": "date",
          "format": "dateOptionalTime"
        },
        "created_at": {
          "type": "date",
          "format": "dateOptionalTime"
        },
        "folder_id": {
          "type": "integer"
        },
        "id": {
          "type": "integer"
        },
        "updated_at": {
          "type": "date",
          "format": "dateOptionalTime"
        }
      }
    }
  }
}

As you can see, for some reason all the fields are included in the mapping!

Answer:

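In your to_indexed_json, you include the attachment_original method, so it is sent to elasticsearch. That is also why all the other properties are included in the mapping, and therefore in the source.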
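
To make this concrete: to_json without an :only option serializes every model attribute, and ElasticSearch's dynamic mapping then adds a mapping entry for each field it has not seen before. Below is a rough sketch of restricting the indexed document to the mapped fields, written in plain Ruby so that it runs without Rails or Tire. The UserFileDoc struct, INDEXED_FIELDS, SAMPLE_DOC and this hash-based to_indexed_json are hypothetical stand-ins, not the asker's model; in the real model the analogous change would presumably be passing :only => [...] to to_json, which is an assumption rather than something the answer states.

```ruby
require 'json'

# Fields declared in the mapping, besides attachment_original (added separately below).
INDEXED_FIELDS = %i[id folder_id attachment_file_name attachment_updated_at].freeze

# Hypothetical stand-in for the ActiveRecord model; attribute names come from the question.
UserFileDoc = Struct.new(
  :id, :folder_id, :attachment_file_name, :attachment_file_size,
  :attachment_content_type, :attachment_updated_at, :created_at, :updated_at,
  keyword_init: true
) do
  # Placeholder for the Base64-encoded file content (value copied from the question).
  def attachment_original
    'JVBERi0xLjQKJeLjz9MKNyA'
  end

  # Builds the document from the whitelisted attributes only,
  # instead of serializing every attribute of the record.
  def to_indexed_json
    doc = to_h.select { |name, _value| INDEXED_FIELDS.include?(name) }
    doc[:attachment_original] = attachment_original
    doc.to_json
  end
end

SAMPLE_DOC = UserFileDoc.new(
  id: 5,
  folder_id: 1,
  attachment_file_name: 'hw4.pdf',
  attachment_file_size: 179_895,
  attachment_content_type: 'application/pdf',
  attachment_updated_at: '2012-08-16T11:32:41Z',
  created_at: '2012-08-16T11:32:41Z',
  updated_at: '2012-08-16T11:32:41Z'
)

puts SAMPLE_DOC.to_indexed_json
```

This keeps unmapped attributes such as attachment_file_size out of the document that gets indexed. attachment_original is still sent, because the attachment type needs the Base64 content in order to index it.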

For more information on the topic, see the question "ElasticSearch & Tire: Using Mapping and to_indexed_json".

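It seems that Tire is indeed sending the proper mapping JSON to elasticsearch. My advice is to use Tire.configure { logger STDERR, level: "debug" } to inspect what is happening, and to try to pinpoint the problem at the raw level.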
