Jsoup: how to get all the HTML between two heading tags

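I am trying to get all the HTML between two h1 tags. The actual task is to split the HTML into frames based on the h1 (heading 1) tags.

Any help is appreciated.

Thanks, Sunil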

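Answer:

If you want to get and process all the elements between two consecutive h1 tags, you can work with the siblings of the first h1. Here is some example code: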

import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public static void h1s() {
    String html = "<html>" +
            "<head></head>" +
            "<body>" +
            "  <h1>title 1</h1>" +
            "  <p>hello 1</p>" +
            "  <table>" +
            "    <tr>" +
            "      <td>hello</td>" +
            "      <td>world</td>" +
            "      <td>1</td>" +
            "    </tr>" +
            "  </table>" +
            "  <h1>title 2</h1>" +
            "  <p>hello 2</p>" +
            "  <table>" +
            "    <tr>" +
            "      <td>hello</td>" +
            "      <td>world</td>" +
            "      <td>2</td>" +
            "    </tr>" +
            "  </table>" +
            "  <h1>title 3</h1>" +
            "  <p>hello 3</p>" +
            "  <table>" +
            "    <tr>" +
            "      <td>hello</td>" +
            "      <td>world</td>" +
            "      <td>3</td>" +
            "    </tr>" +
            "  </table>" +
            "</body>" +
            "</html>";

    Document doc = Jsoup.parse(html);
    Element firstH1 = doc.select("h1").first();
    if (firstH1 == null) {
        return; // no h1 in the document, so there is nothing to split
    }

    // Walk the element siblings that follow the first h1, one at a time.
    // nextElementSibling() returns null after the last sibling.
    List<Element> elementsBetween = new ArrayList<Element>();
    for (Element sibling = firstH1.nextElementSibling(); sibling != null;
            sibling = sibling.nextElementSibling()) {
        if (!"h1".equals(sibling.tagName())) {
            elementsBetween.add(sibling);
        } else {
            // Reached the next h1: process the finished group, then start a new one.
            processElementsBetween(elementsBetween);
            elementsBetween.clear();
        }
    }

    // Process the elements that follow the last h1.
    if (!elementsBetween.isEmpty()) {
        processElementsBetween(elementsBetween);
    }
}

// Prints one group: a separator line, then the outer HTML of each element.
private static void processElementsBetween(List<Element> elementsBetween) {
    System.out.println("---");
    for (Element element : elementsBetween) {
        System.out.println(element);
    }
}
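
The grouping step in the code above does not depend on jsoup itself, so it can be pulled out into a small reusable helper. The following is only a sketch under that assumption: the class name SectionSplitter and the method splitAtHeadings are made up for illustration and are not part of jsoup. Plain tag-name strings stand in for elements so that the example runs without any library. Unlike the code above, each section here keeps its heading as the first item, and anything before the first heading is dropped.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical helper, not part of jsoup.
class SectionSplitter {

    /**
     * Splits a flat list into sections. A new section starts at every item
     * for which isHeading returns true, and that heading is the first item
     * of its section. Items before the first heading are dropped.
     */
    static <T> List<List<T>> splitAtHeadings(List<T> items, Predicate<T> isHeading) {
        List<List<T>> sections = new ArrayList<>();
        List<T> current = null; // stays null until the first heading is seen
        for (T item : items) {
            if (isHeading.test(item)) {
                current = new ArrayList<>();
                sections.add(current);
            }
            if (current != null) {
                current.add(item);
            }
        }
        return sections;
    }

    public static void main(String[] args) {
        // Tag names stand in for the children of <body> in the sample HTML.
        List<String> tags = Arrays.asList(
                "h1", "p", "table", "h1", "p", "table", "h1", "p", "table");
        List<List<String>> sections = splitAtHeadings(tags, "h1"::equals);
        for (List<String> section : sections) {
            System.out.println(section);
        }
    }
}
```

With jsoup, the same helper could be called as splitAtHeadings(doc.body().children(), e -> "h1".equals(e.tagName())), assuming the h1 tags are direct children of body as in the sample HTML. Each inner list then holds one h1 followed by the elements up to the next h1, and calling outerHtml() on each element gives the HTML of that frame.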

Source: utcz.com/qa/424030.html
