Pandas table scrape

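I am trying to find the best way to convert a table into JSON records. I currently get the output I asked for, but the layout of the table is confusing me. The example below should explain: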

ID    Product      Item_Material  Owner         Interest %
123   Test Item 1  Electric       Elctrotech    60%
null  null         null           Spark inc     40%
124   Test Item 2  Wood           TY Toys       100%
125   Test Item 3  Plastic        NA Materials  100%

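The newline-delimited JSON I get is what I asked for, but I was hoping that rows belonging to a parent row could somehow be nested, so that nested table rows become nested JSON.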

{"ID":"Test Item 1", "Item_Material":"Electric", "Owner":"Elctrotech","Interest %":"60%"} 

{"ID":null, "Item_Material":null, "Owner":"Spark inc","Interest %":"40%"}

{"ID":"Test Item 2", "Item_Material":"Wood", "Owner":"TY Toys","Interest %":"100%"}

{"ID":"Test Item 3","Item_Material":"Plastic","Owner":"NA Materials","Interest %":"100%"}

The aim is for the first JSON record to look like this:

{"ID":"Test Item 1", "Item_Material":"Electric", "Owners": [{"Owner": "Elctrotech", "Interest %":"60%", "Owner":"Spark inc","Interest %":"40%"}]} 

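The data comes from a table scraped with Beautiful Soup, so when it is loaded into a pandas DataFrame it looks like the table above, with each row coming from its own <tr> tag. I don't know whether pandas has a function for merging a row into the row above it, so that I end up with one JSON record per 'Product'. Sometimes an item has more than 2 'Owners'.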

Answer:

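The dictionaries this prints are not exactly the shape you expected, but the dictionary syntax in your target output is wrong: a single dictionary cannot repeat the keys "Owner" and "Interest %". Try this instead, using only pandas.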

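import pandas as pd

# Sample rows; a (Product, Item_Material) pair repeats when an item has several owners.
p = [[123, "Test Item 1", "Electric", "Elctrotech", "60%"],
     [124, "Test Item 2", "Wood", "TY Toys", "100%"],
     [125, "Test Item 1", "Plastic", "NA Materials", "100%"],
     [123, "Test Item 1", "Foo", "Bar", "80%"],
     [123, "Test Item 1", "Electric", "TRY TRY TRY", "70%"]]
x = pd.DataFrame(p, columns=["ID", "Product", "Item_Material", "Owner", "Interest %"])

# Template for one output record; it is refilled on every loop iteration.
d = dict(ID="", Item_Material="", Owners={"Owner": [], "Interest %": []})

# Collect every owner and interest value per (Product, Item_Material) group.
x_gb = x.groupby(["Product", "Item_Material"])
grouped_Series_Owner = x_gb["Owner"].apply(list).to_dict()
grouped_Series_Interest = x_gb["Interest %"].apply(list).to_dict()

# `out` was never defined in the original post. Assumption: one row per
# (Product, Item_Material) pair, keyed by row index.
out = x.drop_duplicates(["Product", "Item_Material"]).to_dict(orient="index")

for k in out.keys():
    key = (out[k]["Product"], out[k]["Item_Material"])
    d["Item_Material"] = out[k]["Item_Material"]
    d["ID"] = out[k]["Product"]
    d["Owners"]["Owner"] = grouped_Series_Owner[key]
    d["Owners"]["Interest %"] = grouped_Series_Interest[key]
    print(d)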
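
If you want the list-of-owners shape from the question instead, something along the following lines might work. It is only a sketch under a few assumptions: the scraped frame uses the question's column names, a null ID/Product/Item_Material always means "continuation of the row above", and `nest_owners` is a hypothetical helper name, not a pandas function. It forward-fills the null cells so each continuation row inherits its parent's values, then collapses each group into one record with a nested "Owners" list.

```python
import json
import pandas as pd

def nest_owners(df):
    """Collapse continuation rows into one record per product (hypothetical helper)."""
    parent_cols = ["ID", "Product", "Item_Material"]
    filled = df.copy()
    # Assumption: a null parent cell means "same as the row above".
    filled[parent_cols] = filled[parent_cols].ffill()
    records = []
    # sort=False keeps the products in their original table order.
    for (id_, product, material), grp in filled.groupby(parent_cols, sort=False):
        records.append({
            "ID": int(id_),
            "Product": product,
            "Item_Material": material,
            # One small dict per owner row, so any number of owners is supported.
            "Owners": grp[["Owner", "Interest %"]].to_dict(orient="records"),
        })
    return records

# The table from the question, with None where the scrape produced null cells.
rows = [
    [123, "Test Item 1", "Electric", "Elctrotech", "60%"],
    [None, None, None, "Spark inc", "40%"],
    [124, "Test Item 2", "Wood", "TY Toys", "100%"],
    [125, "Test Item 3", "Plastic", "NA Materials", "100%"],
]
df = pd.DataFrame(rows, columns=["ID", "Product", "Item_Material", "Owner", "Interest %"])

records = nest_owners(df)
for record in records:
    print(json.dumps(record))  # one JSON record per line
```

Each owner gets its own dictionary inside the "Owners" list, which avoids the repeated-key problem and handles items with more than 2 owners.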

Source: utcz.com/qa/264155.html
