Java: splitting on commas outside of quotes

My program reads a line from a file. The line contains comma-separated text, for example:

123,test,444,"don't split, this",more test,1

I want the result of the split to be:

123

test

444

"don't split, this"

more test

1

If I use String.split(","), I get this instead:

123

test

444

"don't split

this"

more test

1

In other words: the comma inside the substring "don't split, this" is not a separator. How should I handle this?

Answer:

You can try the following regular expression:

str.split(",(?=(?:[^\"]*\"[^\"]*\")*[^\"]*$)");

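This splits the string on every , that is followed by an even number of double quotes. In other words, it splits on commas that lie outside double quotes. It works as long as the double quotes in your string are balanced.

Explanation: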
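As a quick check of the behaviour described above, here is a minimal runnable sketch applied to the sample line from the question. The class name QuotedCsvSplit and the helper splitOutsideQuotes are made up for this illustration; only String.split and the regex come from the answer.

```java
public class QuotedCsvSplit {
    // The regex from the answer: a comma followed by an even number
    // of double quotes between it and the end of the input.
    private static final String OUTSIDE_QUOTES =
            ",(?=(?:[^\"]*\"[^\"]*\")*[^\"]*$)";

    // Hypothetical helper name, used only for this sketch.
    static String[] splitOutsideQuotes(String line) {
        return line.split(OUTSIDE_QUOTES);
    }

    public static void main(String[] args) {
        String line = "123,test,444,\"don't split, this\",more test,1";
        // Prints one field per line; the quoted field stays in one piece.
        for (String field : splitOutsideQuotes(line)) {
            System.out.println(field);
        }
    }
}
```

Note that the surrounding quotes are kept in the resulting field. If the quotes are not balanced, for example a,"b,c, the comma before the stray quote is treated as quoted and only the last comma splits.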

,                  // Split on comma
(?=                // Followed by
    (?:            // Start a non-capture group
        [^"]*      // 0 or more non-quote characters
        "          // 1 quote
        [^"]*      // 0 or more non-quote characters
        "          // 1 quote
    )*             // 0 or more repetition of non-capture group (multiple of 2 quotes will be even)
    [^"]*          // Finally 0 or more non-quotes
    $              // Till the end (This is necessary, else every comma will satisfy the condition)
)                  // End look-ahead

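You can even write it this way in your code by using the (?x) modifier in your regex. The modifier ignores any whitespace in the regex, so a regex broken into multiple lines becomes easier to read, like so: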

String[] arr = str.split("(?x)   " +
        ",          " +   // Split on comma
        "(?=        " +   // Followed by
        "  (?:      " +   // Start a non-capture group
        "    [^\"]* " +   // 0 or more non-quote characters
        "    \"     " +   // 1 quote
        "    [^\"]* " +   // 0 or more non-quote characters
        "    \"     " +   // 1 quote
        "  )*       " +   // 0 or more repetition of non-capture group (multiple of 2 quotes will be even)
        "  [^\"]*   " +   // Finally 0 or more non-quotes
        "  $        " +   // Till the end (This is necessary, else every comma will satisfy the condition)
        ")          "     // End look-ahead
);
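If the same split runs on many lines, one possible variation is to compile the pattern once with the Pattern.COMMENTS flag, which is the flag equivalent of the embedded (?x) modifier and also allows # comments inside the regex itself. This is only a sketch: the class name QuotedCsvPattern and the method splitOutsideQuotes are invented for the example and are not part of the original answer.

```java
import java.util.regex.Pattern;

public class QuotedCsvPattern {
    // Same regex as above, compiled once. With Pattern.COMMENTS,
    // whitespace is ignored and '#' starts a comment that runs to
    // the end of the line, hence the explicit \n after each comment.
    private static final Pattern OUTSIDE_QUOTES = Pattern.compile(
            ",              # split on comma\n" +
            "(?=            # followed by\n" +
            "  (?:          # start a non-capturing group\n" +
            "    [^\"]* \"  # non-quote characters, then 1 quote\n" +
            "    [^\"]* \"  # non-quote characters, then 1 quote\n" +
            "  )*           # 0 or more such pairs (even quote count)\n" +
            "  [^\"]* $     # then only non-quote characters to the end\n" +
            ")",
            Pattern.COMMENTS);

    // Hypothetical helper name, used only for this sketch.
    static String[] splitOutsideQuotes(String line) {
        return OUTSIDE_QUOTES.split(line);
    }

    public static void main(String[] args) {
        String line = "123,test,444,\"don't split, this\",more test,1";
        for (String field : splitOutsideQuotes(line)) {
            System.out.println(field);
        }
    }
}
```

Compiling once avoids re-parsing the regex on every call, which String.split does for a non-trivial pattern like this one.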

Source: utcz.com/qa/408584.html
