Python Crawler Example: Scraping Songs from All Kugou Music Charts


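Preface

The text and images in this article come from the internet and are for learning and discussion only. They serve no commercial purpose, and copyright belongs to the original authors. If there is any problem, please contact us promptly so we can handle it.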

Goal

Scrape the songs from every chart on the Kugou Music site.

Target URL

https://www.kugou.com/yy/html/rank.html?from=homepage

Environment

Python 3.6.5

PyCharm

Crawler code

Import the libraries

import requests
import re
import parsel

Request the site

headers = {
    "authority": "wwwapi.kugou.com",
    "cookie": "kg_mid=ac3836df72c523f46a85d8a5fd90fe59; kg_dfid=3ve7aQ2XyGmN0yE3uv3WcaHs; Hm_lvt_aedee6983d4cfc62f509129360d6bb3d=1600260110,1602312707; kg_dfid_collect=d41d8cd98f00b204e9800998ecf8427e; kg_mid_temp=ac3836df72c523f46a85d8a5fd90fe59; Hm_lpvt_aedee6983d4cfc62f509129360d6bb3d=1602312738",
    "referer": "https://www.kugou.com/song/",
    "user-agent": "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.138 Safari/537.36",
}
url = "https://www.kugou.com/yy/html/rank.html"
response = requests.get(url=url, headers=headers)
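
The loop further down iterates over `html_data` as `(page_url, name)` pairs, but the step that builds it from the rank page is not shown. Below is a minimal sketch of one way to do it with `re`, assuming the chart links are plain `<a title="..." href="...">` tags. The helper name `extract_chart_links`, the sample markup, and the `example.com` URLs are placeholders rather than the real Kugou page structure, so the pattern would need to be checked against the actual HTML.

```python
import re

# Hypothetical fragment standing in for the chart list on the rank page;
# the real markup may differ and should be inspected in the browser.
SAMPLE_HTML = '''
<ul>
  <li><a title="Chart A" href="https://example.com/rank/1.html">Chart A</a></li>
  <li><a title="Chart B" href="https://example.com/rank/2.html">Chart B</a></li>
</ul>
'''


def extract_chart_links(html):
    """Return (page_url, name) pairs, the shape the loop over html_data expects."""
    pattern = r'<a\s+title="(.*?)"\s+href="(.*?)"'
    return [(href, title) for title, href in re.findall(pattern, html, re.S)]


html_data = extract_chart_links(SAMPLE_HTML)
for page_url, name in html_data:
    print(name, page_url)
```

In the real script the argument would be `response.text` from the request above instead of `SAMPLE_HTML`.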

Parse the site data

def func(url):
    # Fetch one chart page and pull out the fields needed for each song.
    response = requests.get(url=url, headers=headers)
    response.encoding = response.apparent_encoding
    hashs = re.findall('"Hash":"(.*?)"', response.text, re.S)
    album_ids = re.findall('"album_id":(.*?),"', response.text, re.S)
    FileNames = re.findall('"FileName":"(.*?)"', response.text, re.S)
    data = zip(hashs, album_ids, FileNames)
    for i in data:
        song_hash = i[0]
        album_id = i[1]
        # Decode \uXXXX escape sequences in the name into real characters.
        FileName = i[2].encode("utf-8").decode("unicode_escape")
        # print(song_hash, album_id, FileName)
        download_url = "https://wwwapi.kugou.com/yy/index.php"
        # Query parameters for the play/getdata request of this song.
        params = {
            "r": "play/getdata",
            "callback": "jQuery19107150201841602037_1602314563329",
            "hash": song_hash,
            "album_id": album_id,
            "dfid": "3ve7aQ2XyGmN0yE3uv3WcaHs",
            "mid": "ac3836df72c523f46a85d8a5fd90fe59",
            "platid": "4",
            "_": "1602312793005",
        }
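
The parsing step in `func` can be tried offline before pointing it at the live site. The sketch below applies the same three patterns and the same `unicode_escape` decoding to a made-up string. `SAMPLE_PAGE` and `parse_songs` are hypothetical names, and the sample values are invented; only the key names come from the patterns above.

```python
import re

# Hypothetical fragment imitating the JSON embedded in a chart page.
# The keys match the ones func searches for; the values are made up.
SAMPLE_PAGE = (
    '[{"Hash":"ABC123","album_id":111,"FileName":"\\u6b4c\\u624b - \\u6b4c\\u66f2"},'
    '{"Hash":"DEF456","album_id":222,"FileName":"Artist - Title"}]'
)


def parse_songs(text):
    """Return (hash, album_id, file_name) tuples found in the page text."""
    hashes = re.findall(r'"Hash":"(.*?)"', text)
    album_ids = re.findall(r'"album_id":(.*?),"', text)
    raw_names = re.findall(r'"FileName":"(.*?)"', text)
    # Turn literal \uXXXX sequences into the characters they stand for.
    names = [n.encode("utf-8").decode("unicode_escape") for n in raw_names]
    return list(zip(hashes, album_ids, names))


songs = parse_songs(SAMPLE_PAGE)
for song in songs:
    print(song)
```

One caveat: `unicode_escape` treats the bytes as Latin-1, so this decoding is only safe when the captured name is plain ASCII with `\uXXXX` escapes. A name that already contains non-ASCII characters would come out garbled.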

for i in html_data:
    page_url = i[0]
    name = i[1]
    print(page_url)
    print("==========================Crawling songs from {}========================".format(name))
    func(page_url)
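
`func` stops after building `params`, and what happens to the `play/getdata` response is not shown. Because the parameters include a `callback` value, the endpoint presumably answers with JSONP, that is, JSON wrapped in a call to that callback. The sketch below shows how such a wrapper could be stripped. The helper `unwrap_jsonp` is hypothetical, and the `data` and `play_url` fields in the sample body are assumptions made for illustration, not a documented response format.

```python
import json
import re


def unwrap_jsonp(text):
    """Strip a callback(...) wrapper and parse the JSON payload inside it."""
    match = re.search(r'^[^(]*\((.*)\)\s*;?\s*$', text.strip(), re.S)
    if match is None:
        raise ValueError("not a JSONP response")
    return json.loads(match.group(1))


# Hypothetical response body: the callback name is the one from params,
# while the "data" / "play_url" fields are assumed for illustration only.
SAMPLE_BODY = (
    'jQuery19107150201841602037_1602314563329('
    '{"status":1,"data":{"play_url":"https://example.com/audio/song.mp3"}});'
)

payload = unwrap_jsonp(SAMPLE_BODY)
play_url = payload["data"]["play_url"]
print(play_url)
```

In the real script the body would come from something like `requests.get(download_url, params=params, headers=headers).text`, and the extracted URL could then be passed to `download` together with `FileName`.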

Save the data

def download(url, title):
    # Replace "save_path/" with the directory where the files should be stored.
    filename = "save_path/" + title + ".mp3"
    response = requests.get(url=url, headers=headers)
    with open(filename, mode="wb") as f:
        f.write(response.content)
    print(title)
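
`download` builds the file path by concatenating `title` directly, so a song name containing characters such as `/` or `?` would make `open` fail or write to an unexpected location. A minimal sketch of a cleanup step follows; `safe_filename` is a hypothetical helper, not part of the original script, and the character set is the one Windows rejects in file names.

```python
import re


def safe_filename(title, replacement="_"):
    """Replace characters that are not allowed in Windows file names."""
    cleaned = re.sub(r'[\\/:*?"<>|]', replacement, title).strip()
    # Fall back to a fixed name if nothing usable is left.
    return cleaned or "untitled"


print(safe_filename('AC/DC - Who Made Who?'))  # AC_DC - Who Made Who_
```

Calling `download(url, safe_filename(title))` would then keep the rest of the function unchanged.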

Run the code; the result is shown in the figure below.

That is all for "Python Crawler Example: Scraping Songs from All Kugou Music Charts". Source link: utcz.com/z/529875.html
