How do I get the innerHTML of the whole page in a Selenium driver?

I'm using Selenium to click through to the web page I want, and then parsing that page with Beautiful Soup.

Someone has shown how to get the inner HTML of an element in Selenium WebDriver. Is there a way to get the HTML of the whole page? Thanks.

Sample code in Python (based on the post above; the language doesn't seem to matter much):

from selenium import webdriver

from selenium.webdriver.support.ui import Select

from bs4 import BeautifulSoup

url = 'http://www.google.com'

driver = webdriver.Firefox()

driver.get(url)

the_html = driver---somehow----.get_attribute('innerHTML')

bs = BeautifulSoup(the_html, 'html.parser')

Answer:

To get the HTML of the whole page:

from selenium import webdriver

driver = webdriver.Firefox()

driver.get("http://stackoverflow.com")

html = driver.page_source
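To connect this back to the question, here is a minimal sketch of handing `page_source` to a parser. The names `fetch_page_html`, `page_title`, and `TitleParser` are hypothetical helpers, not part of Selenium, and the standard-library `html.parser` is used here only as a dependency-free stand-in for Beautiful Soup.

```python
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collects the text found inside <title> elements."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def fetch_page_html(driver, url):
    """Navigate to `url` and return the page HTML as a string.

    `driver` is assumed to be a Selenium WebDriver (anything with
    `get()` and `page_source` works).
    """
    driver.get(url)
    return driver.page_source


def page_title(html):
    """Return the stripped <title> text of an HTML string."""
    parser = TitleParser()
    parser.feed(html)
    return parser.title.strip()
```

With a real browser this would be called as `html = fetch_page_html(webdriver.Firefox(), url)`, and the same string can be passed to `BeautifulSoup(html, 'html.parser')` exactly as in the question.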

To get the outer HTML (including the element's own tag):

# HTML from `<html>`

html = driver.execute_script("return document.documentElement.outerHTML;")

# HTML from `<body>`

html = driver.execute_script("return document.body.outerHTML;")

# HTML from element with some JavaScript

element = driver.find_element("css selector", "#hireme")  # Selenium 4.3+ removed find_element_by_css_selector

html = driver.execute_script("return arguments[0].outerHTML;", element)

# HTML from element with `get_attribute`

element = driver.find_element("css selector", "#hireme")  # Selenium 4.3+ removed find_element_by_css_selector

html = element.get_attribute('outerHTML')

To get the inner HTML (excluding the element's own tag):

# HTML from `<html>`

html = driver.execute_script("return document.documentElement.innerHTML;")

# HTML from `<body>`

html = driver.execute_script("return document.body.innerHTML;")

# HTML from element with some JavaScript

element = driver.find_element("css selector", "#hireme")  # Selenium 4.3+ removed find_element_by_css_selector

html = driver.execute_script("return arguments[0].innerHTML;", element)

# HTML from element with `get_attribute`

element = driver.find_element("css selector", "#hireme")  # Selenium 4.3+ removed find_element_by_css_selector

html = element.get_attribute('innerHTML')
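The outer and inner variants differ only in which DOM property the script reads, so they can be folded into two small helpers. This is a rough sketch, not a Selenium API: `page_html` and `element_html` are hypothetical names, and `driver` is assumed to be a WebDriver whose `execute_script` passes extra arguments to the script as `arguments[0]`, as in the snippets above.

```python
def page_html(driver, outer=True):
    """Return the HTML of the root <html> element.

    outer=True includes the <html> tag itself (outerHTML);
    outer=False returns only its contents (innerHTML).
    """
    prop = "outerHTML" if outer else "innerHTML"
    return driver.execute_script(f"return document.documentElement.{prop};")


def element_html(driver, element, outer=True):
    """Return the HTML of a previously located element.

    The element is handed to the script as arguments[0].
    """
    prop = "outerHTML" if outer else "innerHTML"
    return driver.execute_script(f"return arguments[0].{prop};", element)
```

For example, `page_html(driver, outer=False)` would correspond to the `document.documentElement.innerHTML` line above, which is the closest match to what the question asks for.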

