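Error during a loop in a Python crawler
I wrote a crawler script that scrapes a given site in a loop of 200 iterations, then went out shopping. When I came back, it had failed after the 7th item. The error output is below (the errors were essentially the same each time):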
Traceback (most recent call last):
File "D:\PythonLearn\venv\lib\site-packages\urllib3\contrib\pyopenssl.py", line 472, in wrap_socket
cnx.do_handshake()
File "D:\PythonLearn\venv\lib\site-packages\OpenSSL\SSL.py", line 1915, in do_handshake
self._raise_ssl_error(self._ssl, result)
File "D:\PythonLearn\venv\lib\site-packages\OpenSSL\SSL.py", line 1639, in _raise_ssl_error
raise SysCallError(errno, errorcode.get(errno))
OpenSSL.SSL.SysCallError: (10054, 'WSAECONNRESET')
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "D:\PythonLearn\venv\lib\site-packages\urllib3\connectionpool.py", line 603, in urlopen
chunked=chunked)
File "D:\PythonLearn\venv\lib\site-packages\urllib3\connectionpool.py", line 344, in _make_request
self._validate_conn(conn)
File "D:\PythonLearn\venv\lib\site-packages\urllib3\connectionpool.py", line 843, in _validate_conn
conn.connect()
File "D:\PythonLearn\venv\lib\site-packages\urllib3\connection.py", line 370, in connect
ssl_context=context)
File "D:\PythonLearn\venv\lib\site-packages\urllib3\util\ssl_.py", line 355, in ssl_wrap_socket
return context.wrap_socket(sock, server_hostname=server_hostname)
File "D:\PythonLearn\venv\lib\site-packages\urllib3\contrib\pyopenssl.py", line 478, in wrap_socket
raise ssl.SSLError('bad handshake: %r' % e)
ssl.SSLError: ("bad handshake: SysCallError(10054, 'WSAECONNRESET')",)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "D:\PythonLearn\venv\lib\site-packages\requests\adapters.py", line 449, in send
timeout=timeout
File "D:\PythonLearn\venv\lib\site-packages\urllib3\connectionpool.py", line 641, in urlopen
_stacktrace=sys.exc_info()[2])
File "D:\PythonLearn\venv\lib\site-packages\urllib3\util\retry.py", line 399, in increment
raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='www.ttbcdn.com', port=443): Max retries exceeded with url: /d/file/p/2017-01-30/urxmy3sppo1378.jpg (Caused by SSLError(SSLError("bad handshake: SysCallError(10054, 'WSAECONNRESET')")))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "D:/PythonLearn/practise/img_download.py", line 99, in <module>
imgdownloader(img_info['imgs'], img_info['titles'])
File "D:/PythonLearn/practise/img_download.py", line 84, in imgdownloader
ig_content = requests.get(ig.attr('src')).content # fetch the binary data of each image
File "D:\PythonLearn\venv\lib\site-packages\requests\api.py", line 75, in get
return request('get', url, params=params, **kwargs)
File "D:\PythonLearn\venv\lib\site-packages\requests\api.py", line 60, in request
return session.request(method=method, url=url, **kwargs)
File "D:\PythonLearn\venv\lib\site-packages\requests\sessions.py", line 533, in request
resp = self.send(prep, **send_kwargs)
File "D:\PythonLearn\venv\lib\site-packages\requests\sessions.py", line 646, in send
r = adapter.send(request, **kwargs)
File "D:\PythonLearn\venv\lib\site-packages\requests\adapters.py", line 514, in send
raise SSLError(e, request=request)
requests.exceptions.SSLError: HTTPSConnectionPool(host='www.ttbcdn.com', port=443): Max retries exceeded with url: /d/file/p/2017-01-30/urxmy3sppo1378.jpg (Caused by SSLError(SSLError("bad handshake: SysCallError(10054, 'WSAECONNRESET')")))
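Posts online say that adding the following code to skip SSL verification fixes this: https://www.zhihu.com/questio...
I'm not sure it actually works, though.
So what is causing the crawl to stop?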
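Answer:
I ran into this problem today, and turning off SSL verification fixed it. Try turning off verification and test whether it helps in your case.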
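
For what it's worth, the innermost error in the traceback is WSAECONNRESET (10054), the Windows socket code for a connection that was forcibly closed by the remote side, and it is raised inside do_handshake(). Because the connection drops during the TLS handshake, turning off verification may or may not be enough. Either way, the loop dies because the exception from requests.get is never caught. Below is a minimal sketch, not the asker's actual code, of a retry wrapper that lets the loop skip a failing image and keep going. The names fetch_with_retry and download_all, the retry count, and the backoff values are all made up for illustration. It catches OSError because requests.exceptions.RequestException (which covers SSLError and ConnectionError) subclasses IOError.

```python
import time


def fetch_with_retry(fetch, url, retries=3, backoff=1.0, sleep=time.sleep):
    """Call fetch(url); on a network-level error, wait and try again.

    Returns whatever fetch returns, or None if every attempt failed,
    so the caller's loop can skip this item instead of crashing.
    """
    for attempt in range(1, retries + 1):
        try:
            return fetch(url)
        except OSError as exc:
            # requests.exceptions.RequestException (including SSLError and
            # ConnectionError) subclasses IOError, i.e. OSError in Python 3.
            print(f"attempt {attempt}/{retries} failed for {url}: {exc!r}")
            if attempt < retries:
                sleep(backoff * attempt)  # wait a little longer each time
    return None


def download_all(fetch, urls, **retry_kwargs):
    """Fetch every URL, collecting failures instead of stopping at the first."""
    results, failed = {}, []
    for url in urls:
        data = fetch_with_retry(fetch, url, **retry_kwargs)
        if data is None:
            failed.append(url)
        else:
            results[url] = data
    return results, failed


# Hypothetical wiring with requests (assumed usage, not run here):
#   import requests
#   fetch = lambda u: requests.get(u, timeout=10).content
#   # with verification disabled, as the answer suggests:
#   fetch = lambda u: requests.get(u, timeout=10, verify=False).content
#   results, failed = download_all(fetch, list_of_image_urls)
```

Passing the fetch function in as an argument keeps the retry logic independent of how the request is made, so the same loop can be tried with and without verify=False to see which setting actually makes a difference. Adding a timeout and a short pause between requests is also worth trying, since a server that resets connections may be throttling rapid repeated downloads.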
Source: utcz.com/a/162975.html