Python web scraper errors out during the crawl loop

I wrote a scraper script that crawls a target site in a loop of 200 iterations. I then went out to buy something, and when I came back I found it had failed after the 7th item. The error messages are as follows (essentially identical each time):

Traceback (most recent call last):
  File "D:\PythonLearn\venv\lib\site-packages\urllib3\contrib\pyopenssl.py", line 472, in wrap_socket
    cnx.do_handshake()
  File "D:\PythonLearn\venv\lib\site-packages\OpenSSL\SSL.py", line 1915, in do_handshake
    self._raise_ssl_error(self._ssl, result)
  File "D:\PythonLearn\venv\lib\site-packages\OpenSSL\SSL.py", line 1639, in _raise_ssl_error
    raise SysCallError(errno, errorcode.get(errno))
OpenSSL.SSL.SysCallError: (10054, 'WSAECONNRESET')

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "D:\PythonLearn\venv\lib\site-packages\urllib3\connectionpool.py", line 603, in urlopen
    chunked=chunked)
  File "D:\PythonLearn\venv\lib\site-packages\urllib3\connectionpool.py", line 344, in _make_request
    self._validate_conn(conn)
  File "D:\PythonLearn\venv\lib\site-packages\urllib3\connectionpool.py", line 843, in _validate_conn
    conn.connect()
  File "D:\PythonLearn\venv\lib\site-packages\urllib3\connection.py", line 370, in connect
    ssl_context=context)
  File "D:\PythonLearn\venv\lib\site-packages\urllib3\util\ssl_.py", line 355, in ssl_wrap_socket
    return context.wrap_socket(sock, server_hostname=server_hostname)
  File "D:\PythonLearn\venv\lib\site-packages\urllib3\contrib\pyopenssl.py", line 478, in wrap_socket
    raise ssl.SSLError('bad handshake: %r' % e)
ssl.SSLError: ("bad handshake: SysCallError(10054, 'WSAECONNRESET')",)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "D:\PythonLearn\venv\lib\site-packages\requests\adapters.py", line 449, in send
    timeout=timeout
  File "D:\PythonLearn\venv\lib\site-packages\urllib3\connectionpool.py", line 641, in urlopen
    _stacktrace=sys.exc_info()[2])
  File "D:\PythonLearn\venv\lib\site-packages\urllib3\util\retry.py", line 399, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='www.ttbcdn.com', port=443): Max retries exceeded with url: /d/file/p/2017-01-30/urxmy3sppo1378.jpg (Caused by SSLError(SSLError("bad handshake: SysCallError(10054, 'WSAECONNRESET')")))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "D:/PythonLearn/practise/img_download.py", line 99, in <module>
    imgdownloader(img_info['imgs'], img_info['titles'])
  File "D:/PythonLearn/practise/img_download.py", line 84, in imgdownloader
    ig_content = requests.get(ig.attr('src')).content  # get the binary data of each image
  File "D:\PythonLearn\venv\lib\site-packages\requests\api.py", line 75, in get
    return request('get', url, params=params, **kwargs)
  File "D:\PythonLearn\venv\lib\site-packages\requests\api.py", line 60, in request
    return session.request(method=method, url=url, **kwargs)
  File "D:\PythonLearn\venv\lib\site-packages\requests\sessions.py", line 533, in request
    resp = self.send(prep, **send_kwargs)
  File "D:\PythonLearn\venv\lib\site-packages\requests\sessions.py", line 646, in send
    r = adapter.send(request, **kwargs)
  File "D:\PythonLearn\venv\lib\site-packages\requests\adapters.py", line 514, in send
    raise SSLError(e, request=request)
requests.exceptions.SSLError: HTTPSConnectionPool(host='www.ttbcdn.com', port=443): Max retries exceeded with url: /d/file/p/2017-01-30/urxmy3sppo1378.jpg (Caused by SSLError(SSLError("bad handshake: SysCallError(10054, 'WSAECONNRESET')")))

Posts online say that adding code to skip SSL verification avoids this: https://www.zhihu.com/questio...

But I'm not sure how effective that is.
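As far as I understand it, "skipping SSL verification" with requests means passing verify=False. A minimal sketch of that idea (the image URL is copied from the traceback above; I haven't confirmed this actually fixes the connection resets):

import requests
import urllib3

# Silence the InsecureRequestWarning that is printed when verification is off
urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)

url = 'https://www.ttbcdn.com/d/file/p/2017-01-30/urxmy3sppo1378.jpg'
resp = requests.get(url, verify=False, timeout=10)  # verify=False skips the certificate check
img_content = resp.content  # binary image data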

So, what is causing the crawl to break off partway through?


Answer:

I ran into this same problem today, and disabling SSL verification fixed it. Turn off verification and test it to see whether it works for you.
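A sketch of how that could look inside the download loop, with a simple retry so a single connection reset no longer kills the remaining iterations (download_image, the retry count, and the delay are placeholder names/values, not the original script's code):

import time
import requests
import urllib3

urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)

def download_image(url, retries=3, delay=2):
    """Fetch one image, retrying on SSL/connection errors instead of crashing."""
    for attempt in range(retries):
        try:
            resp = requests.get(url, verify=False, timeout=10)  # skip certificate verification
            resp.raise_for_status()
            return resp.content
        except (requests.exceptions.SSLError,
                requests.exceptions.ConnectionError) as e:
            print('attempt %d failed for %s: %s' % (attempt + 1, url, e))
            time.sleep(delay)  # pause briefly before retrying
    return None  # give up on this image so the loop can move on to the next one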

