Why am I getting the field itself instead of the field's value?

I am using BeautifulSoup and Python to do web scraping for the first time. The page I am trying to scrape is: http://vesselregister.dnvgl.com/VesselRegister/vesseldetails.html?vesselid=34172

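from urllib.request import urlopen as request
from bs4 import BeautifulSoup as soup

# Download the raw HTML exactly as the server sends it
client = request('http://vesselregister.dnvgl.com/VesselRegister/vesseldetails.html?vesselid=34172')
page_html = client.read()
client.close()

page_soup = soup(page_html, 'html.parser')
identification = page_soup.find('div', {'data-bind':'text: name'})
print(identification.text)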

When I do this, I just get an empty string. If I print the identification variable itself, I get:

<div class="col-xs-7" data-bind="text: name"></div> 

This is the line of HTML whose value I am trying to get. On the page itself, that tag shows the value A LEBLANC.

Answer:

You can try this code:

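from selenium import webdriver
from selenium.webdriver.common.by import By

# A real browser runs the page's JavaScript, so the field gets filled in
driver = webdriver.Chrome()
driver.get('http://vesselregister.dnvgl.com/VesselRegister/vesseldetails.html?vesselid=34172')

# find_element(By.XPATH, ...) replaces find_element_by_xpath, which newer Selenium 4 releases removed
find = driver.find_element(By.XPATH, '//*[@id="identificationCollapse"]/div/div/div/div[1]/div[1]/div[2]')
print(find.text)
driver.quit()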

Output:

A LEBLANC 
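For context on why the original attempt printed an empty string: the data-bind attribute suggests the value is written into the div by JavaScript after the page loads (it looks like a Knockout.js text binding, but that is an assumption), so the HTML that urllib downloads still has an empty div. A minimal sketch of the difference, where rendered_html is a hypothetical stand-in for what the browser ends up with and read_name is a helper made up for illustration:

```python
from bs4 import BeautifulSoup

# What a plain HTTP download contains: the div exists, but nothing has filled it in.
raw_html = '<div class="col-xs-7" data-bind="text: name"></div>'

# Hypothetical markup after a browser has run the page's scripts.
rendered_html = '<div class="col-xs-7" data-bind="text: name">A LEBLANC</div>'


def read_name(html):
    """Return the text of the div bound to 'name' in the given HTML."""
    soup = BeautifulSoup(html, "html.parser")
    return soup.find("div", {"data-bind": "text: name"}).text


print(repr(read_name(raw_html)))       # ''
print(repr(read_name(rendered_html)))  # 'A LEBLANC'
```

The lookup is identical in both calls; only the HTML being parsed differs, which is why switching to a browser-driven fetch fixes it.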

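Answer:

There are several ways to reach the same goal. I used a CSS selector in the script below; it is easy to follow, and it is unlikely to break unless the site's HTML structure changes significantly. Try this.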

from selenium import webdriver
from bs4 import BeautifulSoup

driver = webdriver.Chrome()
driver.get('http://vesselregister.dnvgl.com/VesselRegister/vesseldetails.html?vesselid=34172')

# page_source is the DOM after the browser has run the page's scripts
soup = BeautifulSoup(driver.page_source,"lxml")
driver.quit()

# Take the first element whose data-bind attribute ends with "name"
item_name = soup.select("[data-bind$='name']")[0].text
print(item_name)

Result:

A LEBLANC 
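One thing worth knowing about the [data-bind$='name'] selector: $= means "attribute value ends with", so it matches every binding whose value ends in name, and the [0] simply takes whichever comes first in the document. The sketch below uses a made-up fragment (the ownername binding and its text are invented for illustration, not taken from the real page) to compare it with an exact-match selector:

```python
from bs4 import BeautifulSoup

# Made-up fragment: two bindings whose attribute values both end in "name".
html = """
<div data-bind="text: name">A LEBLANC</div>
<div data-bind="text: ownername">EXAMPLE OWNER</div>
"""
soup = BeautifulSoup(html, "html.parser")

# "$=" matches any attribute value that ends with the given string.
ends_with = [tag.text for tag in soup.select("[data-bind$='name']")]

# "=" matches only the exact attribute value.
exact = [tag.text for tag in soup.select("[data-bind='text: name']")]

print(ends_with)  # ['A LEBLANC', 'EXAMPLE OWNER']
print(exact)      # ['A LEBLANC']
```

If the real page only has one binding ending in name before the others, both selectors give the same result; the exact form is just less sensitive to element order.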

By the way, the lookup you started with also works once the page source comes from the browser:

from selenium import webdriver
from bs4 import BeautifulSoup

driver = webdriver.Chrome()
driver.get('http://vesselregister.dnvgl.com/VesselRegister/vesseldetails.html?vesselid=34172')

# page_source is the DOM after the browser has run the page's scripts
soup = BeautifulSoup(driver.page_source,"lxml")
driver.quit()

# Same find() call as in the question, now against the rendered HTML
item_name = soup.find('div', {'data-bind':'text: name'}).text
print(item_name)

