Scraping a website with infinite scroll using Python

Z时代
2024-01-10
Category: Q&A

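I have been doing some research, and so far I have found the Scrapy package for Python, which I plan to use. Now I am trying to work out a good way to build a scraper with Scrapy for a site that uses infinite scrolling. After digging further, I found a package called Selenium, which has a Python module. I have a feeling that someone has already used Scrapy and Selenium together to scrape a site with infinite scroll. It would be great if someone could point me to an example.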

Answer:

Here is a short snippet that worked for me:

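import time

from selenium.webdriver.common.by import By

# `driver` is assumed to be a Selenium WebDriver that has already loaded the target page.
SCROLL_PAUSE_TIME = 20  # seconds to wait for new content after each scroll

# Get the initial scroll height
last_height = driver.execute_script("return document.body.scrollHeight")
while True:
    # Scroll down to the bottom
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
    # Wait for the page to load more content
    time.sleep(SCROLL_PAUSE_TIME)
    # Compare the new scroll height with the previous one
    new_height = driver.execute_script("return document.body.scrollHeight")
    if new_height == last_height:
        break  # height unchanged, so nothing more was loaded
    last_height = new_height

# find_elements_by_class_name was removed in recent Selenium 4 releases;
# find_elements(By.CLASS_NAME, ...) is the current equivalent.
posts = driver.find_elements(By.CLASS_NAME, "post-text")
for block in posts:
    print(block.text)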
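
The same scroll-until-the-height-stops-changing loop can be wrapped in a reusable function. The following is only a sketch, not part of the original answer: the function name `scroll_to_bottom` and the parameters `pause` and `max_rounds` are hypothetical, and the cap on rounds is an added assumption to keep a feed that never ends from looping forever. It relies only on the driver having an `execute_script` method, as Selenium WebDriver objects do.

```python
import time


def scroll_to_bottom(driver, pause=2.0, max_rounds=50):
    """Scroll until the page height stops growing or max_rounds is reached.

    `driver` is any object with an execute_script(script) method, such as a
    Selenium WebDriver. The names `pause` and `max_rounds` are hypothetical
    and chosen for this sketch. Returns the number of scrolls that loaded
    new content.
    """
    last_height = driver.execute_script("return document.body.scrollHeight")
    loaded = 0
    for _ in range(max_rounds):
        # Jump to the current bottom of the page to trigger lazy loading
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        # Give the page time to fetch and render the next batch
        time.sleep(pause)
        new_height = driver.execute_script("return document.body.scrollHeight")
        if new_height == last_height:
            break  # height unchanged, so assume the end was reached
        last_height = new_height
        loaded += 1
    return loaded
```

Once the target page is open in the driver, calling `scroll_to_bottom(driver)` before collecting the `post-text` elements should leave the fully loaded page ready for extraction. A fixed pause is the simplest choice but can stop early on a slow connection, so `pause` may need tuning per site.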

Source: utcz.com/qa/419720.html
