Trying to extract data under a specific div and sub-div

Z时代
2024-01-10
Category: Q&A

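I'm trying to get this to print the title of a book and its chapters, but only for one book at a time.

So basically "The Book of Jacob", Chapters 1-7, rather than it looping through all of the books.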

Here is the page layout (the URL is included in the Python code):

<dl> 
    <dt>Title</dt> 
    <dd> 
    <dl> 
     <dt>Sub Title</dt> 
    </dl> 
    </dd> 
    <dt>Title 2</dt> 
    <dd> 
    <dl> 
     <dt>Sub Title 2</dt> 
    </dl> 
    </dd> 
</dl> 
#this continues for Title 3, Sub title 3, etc etc

Here is the Python code:

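import requests
import bs4

scripture_url = 'http://scriptures.nephi.org/docbook/bom/'
response = requests.get(scripture_url)
# Name the parser explicitly so the result does not depend on what is installed.
soup = bs4.BeautifulSoup(response.text, 'html.parser')
# 'dl dd dt' matches every nested <dt>, so the chapters of all books come back.
links = soup.select('dl dd dt')
for item in links:
    # Drop everything up to the first space and keep the rest of the text.
    title = item.get_text().split(' ', 1)[1]
    print(title)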

Here is the output:

Chapter 1 Chapter 2 Chapter 3 Chapter 4 Chapter 5 Chapter 6 Chapter 7 Chapter 8 Chapter 9 Chapter 10 Chapter 11 Chapter 12 Chapter 13 Chapter 14 Chapter 15 Chapter 16 Chapter 17 Chapter 18 Chapter 19 Chapter 20 Chapter 21 Chapter 22 Chapter 1 Chapter 2 Chapter 3 Chapter 4 Chapter 5 Chapter 6 Chapter 7 Chapter 8 Chapter 9 Chapter 10 Chapter 11 Chapter 12 Chapter 13 Chapter 14 Chapter 15 Chapter 16 Chapter 17 Chapter 18 Chapter 19 Chapter 20 Chapter 21 Chapter 22 Chapter 23 Chapter 24 Chapter 25 Chapter 26 Chapter 27 Chapter 28 Chapter 29 Chapter 30 Chapter 31 Chapter 32 Chapter 33 Chapter 1 Chapter 2 Chapter 3 Chapter 4 Chapter 5 Chapter 6 Chapter 7 Chapter 1 Chapter 1

Answer:

You could try something like this. First, find the book, for example the one titled "The Book of Jacob":

book_title = 'The Book of Jacob'
# Find the <a> element whose text is exactly the book title.
book = soup.find('a', text=book_title)
print(book.text)

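Then select the <dd> that is the immediate sibling of the book title, and find all the corresponding chapters inside that <dd> element: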

# '+ dd' is the <dd> right after the title's parent; '> dl > dt' are its chapters.
links = book.parent.select('+ dd > dl > dt')
for item in links:
    title = item.get_text().split(' ', 1)[1]
    print(title)

Output (what you need):

The Book of Jacob Chapter 1 Chapter 2 Chapter 3 Chapter 4 Chapter 5 Chapter 6 Chapter 7
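
Newer Beautiful Soup releases hand select() to a different CSS engine, and a selector that starts with a combinator such as '+ dd' may be rejected there. Below is a minimal sketch of the same idea using find_next_sibling instead. The sample HTML and the helper name chapters_for_book are assumptions made for illustration. The sample only imitates the layout shown in the question and is not taken from the real page, so the split(' ', 1) step is left out.

```python
import bs4

# Hypothetical sample shaped like the layout in the question; not the real page.
SAMPLE_HTML = """
<dl>
  <dt><a href="#jacob">The Book of Jacob</a></dt>
  <dd><dl><dt>Chapter 1</dt><dt>Chapter 2</dt></dl></dd>
  <dt><a href="#enos">The Book of Enos</a></dt>
  <dd><dl><dt>Chapter 1</dt></dl></dd>
</dl>
"""


def chapters_for_book(soup, book_title):
    """Return the chapter texts listed under the book with this exact title."""
    link = soup.find('a', string=book_title)
    if link is None:
        return []
    dt = link.find_parent('dt')
    if dt is None:
        return []
    # Assumption: the chapter list sits in the <dd> right after the book's <dt>.
    sibling = dt.find_next_sibling(True)  # True matches the next tag of any name
    if sibling is None or sibling.name != 'dd':
        return []
    return [item.get_text(strip=True) for item in sibling.select('dl > dt')]


soup = bs4.BeautifulSoup(SAMPLE_HTML, 'html.parser')
for chapter in chapters_for_book(soup, 'The Book of Jacob'):
    print(chapter)
```

Returning an empty list for an unknown title keeps the caller's loop simple. Against the real page, the selector and the text clean-up would likely need adjusting to the actual markup.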

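Answer:

Just slice off the last two items of the list. This is not fine-grained control, because the HTML markup has no ids or names to hook into.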

links = soup.select('dl dd dt')
# links[:-2] keeps everything except the last two matches.
for item in links[:-2]:
    title = item.get_text().split(' ', 1)[1]
    print(title)
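
Slicing by position stops working as soon as the page gains or loses an entry. One alternative, sketched here under the assumption that every top-level <dt> is immediately followed by a <dd> holding its nested list (as in the layout from the question), is to build a mapping from each title to its sub-entries and look up the one you want. The function name group_chapters is hypothetical, and the sample is the simplified layout from the question rather than the real page.

```python
import bs4

# The simplified layout from the question, used here as stand-in data.
SAMPLE_HTML = """
<dl>
  <dt>Title</dt>
  <dd><dl><dt>Sub Title</dt></dl></dd>
  <dt>Title 2</dt>
  <dd><dl><dt>Sub Title 2</dt></dl></dd>
</dl>
"""


def group_chapters(soup):
    """Map each top-level <dt> text to the <dt> texts nested in the <dd> after it."""
    books = {}
    outer = soup.find('dl')  # first <dl> in document order, i.e. the outer list
    if outer is None:
        return books
    # recursive=False restricts the search to direct children of the outer <dl>.
    for dt in outer.find_all('dt', recursive=False):
        title = dt.get_text(strip=True)
        sibling = dt.find_next_sibling(True)  # next tag of any name
        if sibling is None or sibling.name != 'dd':
            books[title] = []
            continue
        books[title] = [c.get_text(strip=True) for c in sibling.select('dl > dt')]
    return books


books = group_chapters(bs4.BeautifulSoup(SAMPLE_HTML, 'html.parser'))
print(books['Title 2'])
```

Keying by title means the lookup does not depend on how many books come before or after the one you need.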

Answer:

Assuming you know they are always the first and second values, you can index into the list:

title = links[0]
subtitle = links[1]

Source link: utcz.com/qa/265384.html
