Python Selenium: getting all "href" attributes

How can I get the "href" attribute of every "h2" title like this one on the page?

<h2 class="entry-title">

<a href="http://www.allitebooks.com/deep-learning-with-python-2/" rel="bookmark">Deep Learning with Python</a>

</h2>

Here is what I tried, which does not return the href:

title = driver.find_elements_by_class_name('entry-title')

title[0].get_attribute('href')

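This does not get the link from the "a" tag. If I find all elements by the "a" tag instead, I get every href on the page, which is not what I want. I only want the titles shown above, together with the URL in each one's "href" attribute.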

Answer:

Here is code that collects the links to all books from all result pages:

from selenium import webdriver

from selenium.webdriver.common.by import By

from selenium.webdriver.support.ui import WebDriverWait

from selenium.webdriver.support import expected_conditions as EC

driver = webdriver.Chrome()

baseUrl = "http://www.allitebooks.com/page/1/?s=python"

driver.get(baseUrl)

# wait = WebDriverWait(driver, 5)

# wait.until(EC.visibility_of_element_located((By.CSS_SELECTOR, ".search-result-list li")))

# Get last page number

lastPage = int(driver.find_element(By.CSS_SELECTOR, ".pagination a:last-child").text)

# Get all HREFs for the first page and save them in hrefs list

js = 'return [...document.querySelectorAll(".entry-title a")].map(e=>e.href)'

hrefs = driver.execute_script(js)

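# Iterate through the remaining pages and collect the HREFs of the books

# (range() excludes its stop value, so lastPage + 1 is needed to include the last page)

for i in range(2, lastPage + 1):

    driver.get("http://www.allitebooks.com/page/" + str(i) + "/?s=python")

    hrefs.extend(driver.execute_script(js))

for href in hrefs:

    print(href)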
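
If you would rather not run JavaScript, the same ".entry-title a" selector can probably be used with Selenium's own locators. The attempt in the question fails because the "href" sits on the "a" element inside the "h2", not on the "h2" itself. The following is only a sketch: the helper name entry_title_hrefs is hypothetical, and it assumes a Selenium 4 driver whose find_elements accepts the locator strategy as a plain string ("css selector" is the value of By.CSS_SELECTOR).

```python
def entry_title_hrefs(driver, selector="h2.entry-title a"):
    """Return the href of every link matched by `selector` on the current page.

    `driver` is assumed to be a Selenium WebDriver that has already loaded
    the page. The function name and default selector are illustrative only.
    """
    # "css selector" is the string value of selenium's By.CSS_SELECTOR
    links = driver.find_elements("css selector", selector)
    # Read the attribute from the <a> elements, not from the <h2> wrapper
    hrefs = [link.get_attribute("href") for link in links]
    # Skip links that have no href attribute at all
    return [href for href in hrefs if href]
```

After driver.get(baseUrl), calling hrefs = entry_title_hrefs(driver) should give the same kind of list as the execute_script call above, restricted to links inside the titles.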

