错误字符启发式
不良字符启发式方法是Boyer Moore算法的方法之一。另一种方法是良好的后缀启发式。在这种方法中,我们将尝试找到一个错误的字符,这意味着主字符串的一个字符与模式不匹配。发生不匹配时,我们将转移整个模式,直到不匹配成为匹配为止,否则,模式将移至错误字符之后。
在此情况下,最佳情况下的时间复杂度为O(m / n),最坏情况下的时间复杂度为O(mn),其中n是文本的长度,m是模式的长度。
输入输出
Input:Main String: “ABAAABCDBBABCDDEBCABC”, Pattern “ABC”
Output:
Pattern found at position: 4
Pattern found at position: 10
Pattern found at position: 18
算法
badCharacterHeuristic(pattern,badCharacterArray)
输入- 模式(将被搜索),不良字符数组以存储位置
输出:填充不良字符数组以备将来使用
Beginn := pattern length
for all entries of badCharacterArray, do
set all entries to -1
done
for all characters of the pattern, do
set last position of each character in badCharacterArray.
done
End
searchPattern(样式,文字)
输入-模式(将被搜索)和主要文本
输出-找到图案的位置
BeginpatLen := length of pattern
strLen := length of text.
call badCharacterHeuristic(pattern, badCharacterArray)
shift := 0
while shift <= (strLen - patLen), do
j := patLen -1
while j >= 0 and pattern[j] = text[shift + j], do
decrease j by 1
done
if j < 0, then
print the shift as, there is a match
if shift + patLen < strLen, then
shift:= shift + patLen – badCharacterArray[text[shift + patLen]]
else
increment shift by 1
else
shift := shift + max(1, j-badCharacterArray[text[shift+j]])
done
End
示例
#include<iostream>#define MAXCHAR 256
using namespace std;
int maximum(int data1, int data2) {
if(data1 > data2)
return data1;
return data2;
}
void badCharacterHeuristic(string pattern, int badCharacter[MAXCHAR]) {
int n = pattern.size(); //find length of pattern
for(int i = 0; i<MAXCHAR; i++)
badCharacter[i] = -1; //set all character distance as -1
for(int i = 0; i < n; i++) {
badCharacter[(int)pattern[i]] = i; //set position of character in the array.
}
}
void searchPattern(string mainString, string pattern, int *array, int *index) {
int patLen = pattern.size();
int strLen = mainString.size();
int badCharacter[MAXCHAR]; //make array for bad character position
badCharacterHeuristic(pattern, badCharacter); //fill bad character array
int shift = 0;
while(shift <= (strLen - patLen)) {
int j = patLen - 1;
while(j >= 0 && pattern[j] == mainString[shift+j]) {
j--; //reduce j when pattern and main string character is matching
}
if(j < 0) {
(*index)++;
array[(*index)] = shift;
if((shift + patLen) < strLen) {
shift += patLen - badCharacter[mainString[shift + patLen]];
}else {
shift += 1;
}
}else {
shift += maximum(1, j - badCharacter[mainString[shift+j]]);
}
}
}
int main() {
string mainString = "ABAAABCDBBABCDDEBCABC";
string pattern = "ABC";
int locArray[mainString.size()];
int index = -1;
searchPattern(mainString, pattern, locArray, &index);
for(int i = 0; i <= index; i++) {
cout << "Pattern found at position: " << locArray[i]<<endl;
}
}
输出结果
Pattern found at position: 4Pattern found at position: 10
Pattern found at position: 18
以上是 错误字符启发式 的全部内容, 来源链接: utcz.com/z/335062.html