python3 requests: garbled text when scraping a web page

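1. When I scrape a page with python3, the page itself declares its encoding as utf-8, and I also set the encoding to utf-8 in my code, but as soon as I print the result the Chinese text comes out garbled.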

2. Here is my code:
import requests

url = 'http://wxqqyy.info/thread-922...'
headers = {
'Accept':'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8',
'Accept-Encoding':'gzip, deflate',
'Accept-Language':'zh-CN,zh;q=0.9',
'Cache-Control':'max-age=0',
'Connection':'keep-alive',
'Cookie':'__cfduid=d5694104d3cee5343587fe14079ab330e1521898937; A8tI_2132_pro=152091208; A8tI_2132_pro_x=152091208; _ga=GA1.2.110055740.1521898942; A8tI_2132_saltkey=ThWghFhJ; A8tI_2132_lastvisit=1521895434; A8tI_2132_auth=8af0QYqMLvY%2F9nw1wLTXXvp2eFiZ%2FuRip2oEU%2FqRxn8ha4%2FuQhtmxvCdI3Wz97wwkKj0o%2FdzZ1vqh4SQ08z1XbgCrzGG; A8tI_2132_lastcheckfeed=7589318%7C1521899102; A8tI_2132_atarget=1; A8tI_2132_smile=2D1; A8tI_2132_adv_gid=18; A8tI_2132_ulastactivity=1522066642%7C0; A8tI_2132_noticeTitle=1; A8tI_2132_self_unique_code=259956aa-0f37-ada1-8eb6-c33e24d3354d; cus_cookie=7; A8tI_2132_ignore_notice=1; _gid=GA1.2.1230674868.1522066645; _gat=1; _gat_gtag_UA_113290385_1=1; _gat_gtag_UA_115157189_1=1; A8tI_2132_notification_unread_tips=1522066642; __insp_wid=1484672786; __insp_nv=true; __insp_targlpu=aHR0cDovL3d4cXF5eS5pbmZvL2ZvcnVtLTE4MS0xLmh0bWw%3D; __insp_targlpt=44CQ5paw5o_Q6YaS44CR5Y2O5Lq6572R5Y_L6Ieq5ouN5Yy6fFNlbGYtU2hvb3RpbmcgVmlkZW8t5p2P5ZCnfOadj_S5i_W9seWQp3zmiJDkurrlnKjnur%2Fop4bpopHljLp8QWR1bHQgT25saW5lIFZpZGVvLeadj_WQp1%2FmgKflkKdfc2V4OF%2FmnY%2FlkKfmnInkvaDmmKXmmpboirHlvIAt5oCn5ZCn5pyJ5L2gLOaYpeaaluiKseW8gA%3D%3D; A8tI_2132_sign_close=1; __insp_norec_sess=true; A8tI_2132_notification_readed_ids=50512723; A8tI_2132_sendmail=1; A8tI_2132_st_t=7589318%7C1522066681%7Cb8db58d9dfa07894e071d0d58abeadb5; A8tI_2132_forum_lastvisit=D_181_1522066642D_155_1522066681; A8tI_2132_visitedfid=155D181D64D45; A8tI_2132_checkpm=1; A8tI_2132_self_uid=7589318; A8tI_2132_self_fid=155; A8tI_2132_st_p=7589318%7C1522066690%7C8bb3ead7576481add2dbbca9cd90760b; A8tI_2132_viewid=tid_9227343; A8tI_2132_self_tid=9227343; __insp_slim=1522066692596; A8tI_2132_lastact=1522066692%09plugin.php%09',
'Host':'wxqqyy.info',
'Referer':'http://wxqqyy.info/forum-155-1.html',
'Upgrade-Insecure-Requests':'1',
'User-Agent':'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.140 Safari/537.36'
}
response = requests.get(url,headers=headers)
response.encoding = "utf-8"
print(response.text)
3. The printed output is garbled: the Chinese text comes out as unreadable characters.
4. I have searched online for a long time and browsed the forums as well, but I still cannot find a solution. I am out of ideas and hope someone can help. Thank you.

Answer:

import gzip
import requests

resp = requests.get(url)
# Decompress the raw bytes manually, then decode them as UTF-8
html = gzip.decompress(resp.content).decode('utf-8')
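The snippet above works only if the bytes in `resp.content` are still gzip-compressed. That would be the case if the server compresses the page without declaring it in a `Content-Encoding` header, which is an assumption here because the thread does not show the response headers. If the body has already been decompressed, `gzip.decompress` raises an error. A minimal sketch of a more defensive variant follows; the helper name `decode_body` is hypothetical, and it checks for the two-byte gzip signature before decompressing:

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # every gzip stream starts with these two bytes


def decode_body(raw: bytes, encoding: str = "utf-8") -> str:
    """Return text from raw body bytes, gunzipping first if still compressed.

    Hypothetical helper for illustration; not part of requests.
    """
    if raw[:2] == GZIP_MAGIC:
        raw = gzip.decompress(raw)
    # errors="replace" keeps the call from raising on stray invalid bytes
    return raw.decode(encoding, errors="replace")


# Offline demonstration with simulated response bodies
sample = "中文测试".encode("utf-8")
print(decode_body(gzip.compress(sample)))  # compressed body
print(decode_body(sample))                 # already plain body
```

With requests, this would be called as `html = decode_body(resp.content)` in place of `response.text`.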

Answer:

response = requests.get("http://wxqqyy.info/thread-9227343-1-1.html", headers={"Accept-Encoding" : ""})
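This answer sends an empty `Accept-Encoding` header, which asks the server not to compress the response at all. The headers in the question explicitly request `gzip, deflate`, so to keep the other browser headers (cookie, referer and so on) that one entry has to be overridden. A small sketch of how that might be done; the helper name `without_compression` is an assumption for illustration, and the empty default value mirrors the answer (the standard token `identity` also means "no encoding"):

```python
def without_compression(headers: dict, value: str = "") -> dict:
    """Return a copy of headers with Accept-Encoding replaced by `value`.

    Hypothetical helper; header names are matched case-insensitively
    and the input dict is left unmodified.
    """
    cleaned = {k: v for k, v in headers.items() if k.lower() != "accept-encoding"}
    cleaned["Accept-Encoding"] = value
    return cleaned


# Shortened stand-in for the headers dict from the question
browser_headers = {
    "Accept-Encoding": "gzip, deflate",
    "Accept-Language": "zh-CN,zh;q=0.9",
}
print(without_compression(browser_headers))
```

The result can then be passed along as `requests.get(url, headers=without_compression(headers))`.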

Source: utcz.com/a/157637.html
