How do I download a file over HTTP with authorization in Python 3.0, working around bugs?

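I have a script that I'd like to keep using, but it looks like I either have to find a workaround for a bug in Python 3 or downgrade to 2.6, which would mean downgrading my other scripts as well.

Hopefully someone here has already found a workaround.

The problem is that, following the bytes and strings changes in Python 3.0, apparently not all of the library code has been tested.

I have a script that downloads a page from a web server. In Python 2.6 the script passed the username and password as part of the URL, but in Python 3.0 this no longer works.

For example, this: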

import urllib.request
url = "http://username:password@server/file"
urllib.request.urlretrieve(url, "temp.dat")

fails with the following exception:

Traceback (most recent call last):
  File "C:\Temp\test.py", line 5, in <module>
    urllib.request.urlretrieve(url, "test.html");
  File "C:\Python30\lib\urllib\request.py", line 134, in urlretrieve
    return _urlopener.retrieve(url, filename, reporthook, data)
  File "C:\Python30\lib\urllib\request.py", line 1476, in retrieve
    fp = self.open(url, data)
  File "C:\Python30\lib\urllib\request.py", line 1444, in open
    return getattr(self, name)(url)
  File "C:\Python30\lib\urllib\request.py", line 1618, in open_http
    return self._open_generic_http(http.client.HTTPConnection, url, data)
  File "C:\Python30\lib\urllib\request.py", line 1576, in _open_generic_http
    auth = base64.b64encode(user_passwd).strip()
  File "C:\Python30\lib\base64.py", line 56, in b64encode
    raise TypeError("expected bytes, not %s" % s.__class__.__name__)
TypeError: expected bytes, not str

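Apparently, base64.b64encode now requires bytes as input (and returns bytes), so the code inside urlretrieve that builds a username:password string and tries to base64-encode it for basic authorization fails.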
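To make the bytes requirement concrete, here is a minimal sketch of building the Basic authorization header by hand and attaching it to a request for a URL that no longer contains the credentials. The helper names `basic_auth_header` and `authorized_request` are hypothetical, and UTF-8 is an assumed encoding for the credentials; the host, username and password are the placeholders from the question.

```python
import base64
import urllib.request


def basic_auth_header(user, password):
    """Return the value of an HTTP Basic Authorization header."""
    # b64encode() accepts only bytes in Python 3, so encode the text first.
    # UTF-8 is an assumption here; a server may expect another encoding.
    credentials = "{}:{}".format(user, password).encode("utf-8")
    # The result is bytes as well; decode it to get a str for the header.
    token = base64.b64encode(credentials).decode("ascii")
    return "Basic " + token


def authorized_request(url, user, password):
    """Build a Request for a credential-free URL with the header attached."""
    request = urllib.request.Request(url)
    request.add_header("Authorization", basic_auth_header(user, password))
    return request


# Usage (placeholder host, not executed here):
# request = authorized_request("http://server/file", "username", "password")
# with urllib.request.urlopen(request) as response:
#     contents = response.read()
```

This sends the credentials on the first request instead of waiting for a 401 challenge, which may or may not suit a given server.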

If I try urlopen instead, like this:

import urllib.request
url = "http://username:password@server/file"
f = urllib.request.urlopen(url)
contents = f.read()

then it fails with the following exception:

Traceback (most recent call last):
  File "C:\Temp\test.py", line 5, in <module>
    f = urllib.request.urlopen(url);
  File "C:\Python30\lib\urllib\request.py", line 122, in urlopen
    return _opener.open(url, data, timeout)
  File "C:\Python30\lib\urllib\request.py", line 359, in open
    response = self._open(req, data)
  File "C:\Python30\lib\urllib\request.py", line 377, in _open
    '_open', req)
  File "C:\Python30\lib\urllib\request.py", line 337, in _call_chain
    result = func(*args)
  File "C:\Python30\lib\urllib\request.py", line 1082, in http_open
    return self.do_open(http.client.HTTPConnection, req)
  File "C:\Python30\lib\urllib\request.py", line 1051, in do_open
    h = http_class(host, timeout=req.timeout) # will parse host:port
  File "C:\Python30\lib\http\client.py", line 620, in __init__
    self._set_hostport(host, port)
  File "C:\Python30\lib\http\client.py", line 632, in _set_hostport
    raise InvalidURL("nonnumeric port: '%s'" % host[i+1:])
http.client.InvalidURL: nonnumeric port: 'password@server'

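Apparently the URL parsing in this "next-generation URL retrieval library" doesn't know what to do with a username and password in the URL.

What other options do I have?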
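Since the second traceback shows the whole `username:password@server` part being treated as host and port, one way around it is to strip the credentials out of the URL before opening it. The sketch below does that with `urllib.parse.urlsplit`; the function name `split_credentials` is hypothetical, and the returned username and password are not percent-decoded.

```python
from urllib.parse import urlsplit, urlunsplit


def split_credentials(url):
    """Return (url_without_credentials, username, password)."""
    parts = urlsplit(url)
    # Everything after the last "@" in the network location is host[:port].
    host = parts.netloc.rpartition("@")[2]
    clean_url = urlunsplit(parts._replace(netloc=host))
    # username and password are None when the URL carries no credentials.
    return clean_url, parts.username, parts.password
```

The cleaned URL can then be opened normally, with the credentials supplied separately, for example through an authentication handler.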

Answer:

Straight from the Py3k docs: http://docs.python.org/dev/py3k/library/urllib.request.html#examples

import urllib.request
# Create an OpenerDirector with support for Basic HTTP Authentication...
auth_handler = urllib.request.HTTPBasicAuthHandler()
auth_handler.add_password(realm='PDQ Application',
                          uri='https://mahler:8092/site-updates.py',
                          user='klem',
                          passwd='kadidd!ehopper')
opener = urllib.request.build_opener(auth_handler)
# ...and install it globally so it can be used with urlopen.
urllib.request.install_opener(opener)
urllib.request.urlopen('http://www.example.com/login.html')
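Applied to the question's case of saving a protected file to disk, the same handler approach might look like the sketch below. It assumes the server uses Basic authentication and that the realm is not known in advance, so it swaps in `HTTPPasswordMgrWithDefaultRealm` with a realm of `None`. The function names `build_auth_opener` and `download` are hypothetical.

```python
import shutil
import urllib.request


def build_auth_opener(url, user, password):
    """Return an opener that answers Basic auth challenges for url."""
    password_mgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
    # A realm of None makes these credentials the default for any realm.
    password_mgr.add_password(None, url, user, password)
    handler = urllib.request.HTTPBasicAuthHandler(password_mgr)
    return urllib.request.build_opener(handler)


def download(url, user, password, destination):
    """Fetch url with the given credentials and write it to destination."""
    opener = build_auth_opener(url, user, password)
    with opener.open(url) as response, open(destination, "wb") as out:
        # Stream the body to disk instead of reading it all into memory.
        shutil.copyfileobj(response, out)


# Usage (placeholder host from the question, not executed here):
# download("http://server/file", "username", "password", "temp.dat")
```

Unlike `install_opener`, this keeps the opener local, so other `urlopen` calls in the same program are unaffected.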
