Symbols in query strings for Elasticsearch

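I have "documents" (ActiveRecord models) with an attribute called "deviation". The attribute has values such as "Bin X", "Bin $", "Bin q", "Bin %", and so on.

I am trying to search on this attribute using tire / elasticsearch, and I am indexing the deviation attribute with the whitespace analyzer. Here is the code I use to create the index: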

settings :analysis => {
  :filter => {
    :ngram_filter => {
      :type => "nGram",
      :min_gram => 2,
      :max_gram => 255
    },
    :deviation_filter => {
      :type => "word_delimiter",
      :type_table => ['$ => ALPHA']
    }
  },
  :analyzer => {
    :ngram_analyzer => {
      :type => "custom",
      :tokenizer => "standard",
      :filter => ["lowercase", "ngram_filter"]
    },
    :deviation_analyzer => {
      :type => "custom",
      :tokenizer => "whitespace",
      :filter => ["lowercase"]
    }
  }
} do
  mapping do
    indexes :id, :type => 'integer'
    [:equipment, :step, :recipe, :details, :description].each do |attribute|
      indexes attribute, :type => 'string', :analyzer => 'ngram_analyzer'
    end
    indexes :deviation, :analyzer => 'whitespace'
  end
end
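To make the indexing side concrete, here is a rough pure-Ruby imitation of how the sample values would be split into tokens. It is only a sketch: `whitespace_tokens` and `deviation_analyzer_tokens` are hypothetical helpers, not part of tire or Elasticsearch, and they assume that the built-in whitespace analyzer only splits on whitespace while the custom deviation_analyzer additionally lowercases. Note that the mapping above references 'whitespace', not 'deviation_analyzer'.

```ruby
# Hypothetical helpers that imitate the analyzers; this is not the
# Elasticsearch implementation, only an illustration of the expected tokens.

# Built-in "whitespace" analyzer: split on whitespace, keep case and symbols.
def whitespace_tokens(text)
  text.split
end

# Custom "deviation_analyzer" from the settings:
# whitespace tokenizer followed by the lowercase filter.
def deviation_analyzer_tokens(text)
  whitespace_tokens(text).map(&:downcase)
end

["Bin X", "Bin $", "Bin %"].each do |value|
  puts "#{value.inspect}: whitespace=#{whitespace_tokens(value).inspect} " \
       "deviation_analyzer=#{deviation_analyzer_tokens(value).inspect}"
end
```

Under these assumptions the symbols survive as their own tokens at index time, so a symbol that is ignored during search would have to be lost on the query side.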

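The search seems to work fine when the query string does not contain special characters. For example, Bin X returns only those records that contain both the words Bin AND X. However, searching for something like Bin $ or Bin % returns every result that contains the word Bin, almost ignoring the symbol (results that contain the symbol do rank higher than results that do not).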

Here is the search method I created:

def self.search(params)
  tire.search(load: true) do
    query { string "#{params[:term].downcase}:#{params[:query]}", default_operator: "AND" }
    size 1000
  end
end
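The string handed to the query parser can be inspected without tire. The sketch below mirrors only the interpolation in the search method; `build_query_string` is a hypothetical helper introduced for illustration and does not exist in tire.

```ruby
# Hypothetical helper: reproduces the string interpolation used in the
# search method so the resulting query string can be examined in isolation.
def build_query_string(params)
  "#{params[:term].downcase}:#{params[:query]}"
end

puts build_query_string({ term: "Deviation", query: "Bin X" }) # deviation:Bin X
puts build_query_string({ term: "Deviation", query: "Bin $" }) # deviation:Bin $
```

Because `params[:query]` is interpolated verbatim, whatever the user types is interpreted as query string syntax rather than as literal text.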

And here is how I build the search form:

<div>
  <%= form_tag issues_path, :class => "formtastic issue", method: :get do %>
    <fieldset class="inputs">
      <ol>
        <li class="string input medium search query optional stringish inline">
          <% opts = ["Description", "Detail", "Deviation", "Equipment", "Recipe", "Step"] %>
          <%= select_tag :term, options_for_select(opts, params[:term]) %>
          <%= text_field_tag :query, params[:query] %>
          <%= submit_tag "Search", name: nil, class: "btn" %>
        </li>
      </ol>
    </fieldset>
  <% end %>
</div>

Answer:

You can sanitize your query string. Here is a sanitizer that has worked for everything I have tried throwing at it:

def sanitize_string_for_elasticsearch_string_query(str)
  # Escape special characters
  # http://lucene.apache.org/core/old_versioned_docs/versions/2_9_1/queryparsersyntax.html#Escaping Special Characters
  escaped_characters = Regexp.escape('\\/+-&|!(){}[]^~*?:')
  str = str.gsub(/([#{escaped_characters}])/, '\\\\\1')

  # AND, OR and NOT are used by lucene as logical operators. We need
  # to escape them
  ['AND', 'OR', 'NOT'].each do |word|
    escaped_word = word.split('').map {|char| "\\#{char}" }.join('')
    str = str.gsub(/\s*\b(#{word.upcase})\b\s*/, " #{escaped_word} ")
  end

  # Escape odd quotes: if the number of double quotes is odd, escape the
  # last one. The replacement uses \2 (the text after the quote); the regex
  # has only two groups, so \3 would silently drop that text.
  quote_count = str.count '"'
  str = str.gsub(/(.*)"(.*)/, '\1\"\2') if quote_count % 2 == 1

  str
end

params[:query] = sanitize_string_for_elasticsearch_string_query(params[:query])
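A few sample inputs show what the sanitizer does and does not change. The function is repeated here in compact form only so that the snippet runs on its own; this copy uses `\2` in the quote-escaping replacement so that text after an unbalanced quote is kept. The sample inputs are made up for illustration.

```ruby
# Compact copy of the sanitizer so this example is self-contained.
def sanitize_string_for_elasticsearch_string_query(str)
  # Backslash-escape the reserved query string characters.
  escaped_characters = Regexp.escape('\\/+-&|!(){}[]^~*?:')
  str = str.gsub(/([#{escaped_characters}])/, '\\\\\1')

  # Escape the boolean operators letter by letter.
  ['AND', 'OR', 'NOT'].each do |word|
    escaped_word = word.split('').map { |char| "\\#{char}" }.join('')
    str = str.gsub(/\s*\b(#{word})\b\s*/, " #{escaped_word} ")
  end

  # Escape the last double quote when the quotes are unbalanced.
  str = str.gsub(/(.*)"(.*)/, '\1\"\2') if str.count('"').odd?
  str
end

puts sanitize_string_for_elasticsearch_string_query('Bin (X)')       # Bin \(X\)
puts sanitize_string_for_elasticsearch_string_query('cats AND dogs') # cats \A\N\D dogs
puts sanitize_string_for_elasticsearch_string_query('say "hi')       # say \"hi
puts sanitize_string_for_elasticsearch_string_query('Bin $')         # Bin $
```

Characters outside the reserved list, such as `$` and `%`, pass through unchanged; the sanitizer only neutralizes characters and words that the query parser treats as syntax.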

