Jsoup:如何获取2个标题标签之间的所有html
我正在尝试获取2 h1标签之间的所有html。实际的任务是根据h1(heading 1)标签将html分成几帧。
感谢任何帮助。
谢谢苏尼尔
回答:
如果要获取和处理两个连续h1
标签之间的所有元素,则可以处理同级对象。这是一些示例代码:
public static void h1s() { String html = "<html>" +
"<head></head>" +
"<body>" +
" <h1>title 1</h1>" +
" <p>hello 1</p>" +
" <table>" +
" <tr>" +
" <td>hello</td>" +
" <td>world</td>" +
" <td>1</td>" +
" </tr>" +
" </table>" +
" <h1>title 2</h1>" +
" <p>hello 2</p>" +
" <table>" +
" <tr>" +
" <td>hello</td>" +
" <td>world</td>" +
" <td>2</td>" +
" </tr>" +
" </table>" +
" <h1>title 3</h1>" +
" <p>hello 3</p>" +
" <table>" +
" <tr>" +
" <td>hello</td>" +
" <td>world</td>" +
" <td>3</td>" +
" </tr>" +
" </table>" +
"</body>" +
"</html>";
Document doc = Jsoup.parse(html);
Element firstH1 = doc.select("h1").first();
Elements siblings = firstH1.siblingElements();
List<Element> elementsBetween = new ArrayList<Element>();
for (int i = 1; i < siblings.size(); i++) {
Element sibling = siblings.get(i);
if (! "h1".equals(sibling.tagName()))
elementsBetween.add(sibling);
else {
processElementsBetween(elementsBetween);
elementsBetween.clear();
}
}
if (! elementsBetween.isEmpty())
processElementsBetween(elementsBetween);
}
private static void processElementsBetween(
List<Element> elementsBetween) {
System.out.println("---");
for (Element element : elementsBetween) {
System.out.println(element);
}
}
以上是 Jsoup:如何获取2个标题标签之间的所有html 的全部内容, 来源链接: utcz.com/qa/424030.html