Regular expression to match a sentence

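How can I match sentences of the form "Hello world" or "Hello World"? The sentence may also contain "-", "/", and the digits 0-9. Any information would be very helpful. Thanks.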

Answer:

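This will do quite nicely. My definition of a sentence: it begins with a non-whitespace character and ends with a period, exclamation mark, or question mark (or at the end of the string). A closing quote may follow the ending punctuation.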

[^.!?\s][^.!?]*(?:[.!?](?!['"]?\s|$)[^.!?]*)*[.!?]?['"]?(?=\s|$)
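The part of the pattern that is easiest to overlook is the negative lookahead: a period, exclamation mark, or question mark only ends the sentence when it is followed by whitespace (optionally after a closing quote) or by the end of the string. The following is a minimal sketch of that behavior, not part of the original answer; the class name InnerPunctuationDemo and the helper firstSentence are hypothetical names chosen for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the name is not from the original answer.
class InnerPunctuationDemo {
    // The one-line pattern from above, escaped for a Java string literal.
    static final Pattern SENTENCE = Pattern.compile(
        "[^.!?\\s][^.!?]*(?:[.!?](?!['\"]?\\s|$)[^.!?]*)*[.!?]?['\"]?(?=\\s|$)");

    // Returns the first sentence found in text, or null if there is none.
    static String firstSentence(String text) {
        Matcher m = SENTENCE.matcher(text);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        // The dot in "3.14" is followed by a digit, not whitespace,
        // so it is treated as inner punctuation and stays in the sentence.
        System.out.println(firstSentence("Pi is about 3.14 today. Next sentence."));
        // prints: Pi is about 3.14 today.
    }
}
```

The same rule is what keeps 'stackoverflow.com!' in one piece in the full program.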

import java.util.regex.*;

public class TEST {
    public static void main(String[] args) {
        String subjectString =
            "This is a sentence. " +
            "So is \"this\"! And is \"this?\" " +
            "This is 'stackoverflow.com!' " +
            "Hello World";
        // COMMENTS mode lets the pattern carry whitespace and # comments.
        Pattern re = Pattern.compile(
            "# Match a sentence ending in punctuation or EOS.\n" +
            "[^.!?\\s] # First char is non-punct, non-ws\n" +
            "[^.!?]* # Greedily consume up to punctuation.\n" +
            "(?: # Group for unrolling the loop.\n" +
            " [.!?] # (special) inner punctuation ok if\n" +
            " (?!['\"]?\\s|$) # not followed by ws or EOS.\n" +
            " [^.!?]* # Greedily consume up to punctuation.\n" +
            ")* # Zero or more (special normal*)\n" +
            "[.!?]? # Optional ending punctuation.\n" +
            "['\"]? # Optional closing quote.\n" +
            "(?=\\s|$)",
            Pattern.MULTILINE | Pattern.COMMENTS);
        Matcher reMatcher = re.matcher(subjectString);
        // Print each sentence on its own line.
        while (reMatcher.find()) {
            System.out.println(reMatcher.group());
        }
    }
}

This is the output:

This is a sentence.
So is "this"!
And is "this?"
This is 'stackoverflow.com!'
Hello World

Matching all of these correctly (the last sentence has no ending punctuation) turns out to be not as easy as it seems!
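The question also asked about sentences containing "-", "/", and digits. Because the body of the pattern accepts any character other than ".", "!", and "?", those characters need no special handling. Below is a rough sketch that wraps the same pattern in a reusable helper; the class name SentenceSplitter and the method split are assumptions made for this example, not part of the original answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; the name is not from the original answer.
class SentenceSplitter {
    // Same pattern as in the answer, written on one line without COMMENTS mode.
    private static final Pattern SENTENCE = Pattern.compile(
        "[^.!?\\s][^.!?]*(?:[.!?](?!['\"]?\\s|$)[^.!?]*)*[.!?]?['\"]?(?=\\s|$)",
        Pattern.MULTILINE);

    // Collects every sentence matched in text, in order of appearance.
    static List<String> split(String text) {
        List<String> sentences = new ArrayList<>();
        Matcher m = SENTENCE.matcher(text);
        while (m.find()) {
            sentences.add(m.group());
        }
        return sentences;
    }

    public static void main(String[] args) {
        // Hyphens, slashes, and digits are ordinary sentence characters here.
        for (String s : split("Call 555-1234 before 10/12. Hello World")) {
            System.out.println(s);
        }
        // prints:
        // Call 555-1234 before 10/12.
        // Hello World
    }
}
```

Returning a list rather than printing inside the loop makes the result easier to reuse or test.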

Source link: utcz.com/qa/433151.html
