(2) Use the requests library's get() function to access the website below 20 times. Print the returned status code and the text content, and compute the lengths of the page content returned by the text attribute and the content attribute. (Different student IDs work on different pages; completing the required part earns a passing grade.)
import requests

url = 'https://www.baidu.com/'
headers = {'user-agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.25 Safari/537.36 Core/1.70.3823.400 QQBrowser/10.7.4307.400'}

req = requests.get(url=url, headers=headers)
req.encoding = 'utf-8'
a = req.text     # decoded page content (str)
b = req.content  # raw page content (bytes)
print(req.text)
print(req.status_code)
print(len(a))  # length of .text, in characters
print(len(b))  # length of .content, in bytes
for i in range(20):
    req = requests.get(url=url, headers=headers)
    print(req.status_code)
This is a simple HTML page; keep it as a string and complete the calculations required afterwards. (Good grade)
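The exercise's HTML page and its exact calculation requirements are not shown here, so the following is only a hypothetical sketch of the pattern: hold the page as a Python string and compute over it (here, extracting and summing the numbers inside the `<p>` tags of a made-up snippet):

```python
import re

# A made-up HTML page kept as a plain string (stand-in for the exercise's page).
html_page = '<html><body><p>1</p><p>2</p><p>3</p></body></html>'

# Pull the numeric text out of each <p> tag and sum it.
values = [int(v) for v in re.findall(r'<p>(\d+)</p>', html_page)]
print(values)       # [1, 2, 3]
print(sum(values))  # 6
```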
Crawl the Chinese university rankings site: http://www.zuihaodaxue.com/zuihaodaxuepaiming2018.html
import requests
import csv
from lxml import etree

url = 'https://www.shanghairanking.cn/rankings/bcur/201911'
headers = {'user-agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.25 Safari/537.36 Core/1.70.3823.400 QQBrowser/10.7.4307.400'}

req = requests.get(url=url, headers=headers)
req.encoding = 'utf-8'
# print(req.text)
html = etree.HTML(req.text)
# University names sit in the left-aligned table cells.
names = html.xpath("//td[@class='align-left']/a/text()")
r = 1
with open(r'E:\python\test.csv', 'w', newline='') as f:
    csv_write = csv.writer(f, dialect='excel')
    csv_write.writerow(['rank', 'name'])
    for name in names:
        item = [r, name]
        r = r + 1
        print(item)
        csv_write.writerow(item)
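To check that the CSV came out with the expected two-column layout, it can be read back with csv.reader. A self-contained sketch (sample rows and a temporary-file path are stand-ins for the scraped data and E:\python\test.csv):

```python
import csv
import os
import tempfile

# Sample rank/name rows in the same layout the scraper writes.
rows = [['rank', 'name'], [1, 'Tsinghua University'], [2, 'Peking University']]
path = os.path.join(tempfile.gettempdir(), 'rank_check.csv')

with open(path, 'w', newline='', encoding='utf-8') as f:
    csv.writer(f, dialect='excel').writerows(rows)

with open(path, newline='', encoding='utf-8') as f:
    back = list(csv.reader(f))

print(back[0])  # ['rank', 'name']
print(back[1])  # ['1', 'Tsinghua University'] -- numbers come back as strings
```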