如何根据熊猫数据框中的多列获取百分比数？

Z时代
2024-01-10
分类：问答

我在数据框中有20列。我列出其中4这里作为例子：如何根据熊猫数据框中的多列获取百分比数？

is_guarantee：0或1
hotel_star：0，1，2，3，4，5
ORDER_STATUS：40，60，80
旅程（标签）： 0，1，2

is_guarantee hotel_star order_status journey 0 0 5 60 0 1 1 5 60 0 2 1 5 60 0 3 0 5 60 1 4 0 4 40 0 5 0 4 40 1 6 0 4 40 1 7 0 3 60 0 8 0 2 60 0 9 1 5 60 0 10 0 2 60 0 11 0 2 60 0

Click to View Image

但该系统需要输入发生矩阵像以下格式函数：

Click to View Image

身体能帮助吗？

df1 = pd.DataFrame(index=range(0,20)) 
df1['is_guarantee'] = np.random.choice([0,1], df1.shape[0]) 
df1['hotel_star'] = np.random.choice([0,1,2,3,4,5], df1.shape[0]) 
df1['order_status'] = np.random.choice([40,60,80], df1.shape[0]) 
df1['journey '] = np.random.choice([0,1,2], df1.shape[0])

回答：

我想你需要：

重塑通过melt和groupby与size得到计数，通过unstack

重塑然后划分和每行和参加MultiIndex到index：

df = (df.melt('journey') 
     .astype(str) 
     .groupby(['variable', 'journey','value']) 
     .size() 
     .unstack(1, fill_value=0)) 
df = (df.div(df.sum(1), axis=0) 
     .mul(100) 
     .add_prefix('journey_') 
     .set_index(df.index.map(' = '.join)) 
     .rename_axis(None, 1)) 
print (df) 
        journey_0 journey_1 
hotel_star = 2  100.000000 0.000000 
hotel_star = 3  100.000000 0.000000 
hotel_star = 4  33.333333 66.666667 
hotel_star = 5  80.000000 20.000000 
is_guarantee = 0 66.666667 33.333333 
is_guarantee = 1 100.000000 0.000000 
order_status = 40 33.333333 66.666667 
order_status = 60 88.888889 11.111111

以上是如何根据熊猫数据框中的多列获取百分比数？的全部内容，来源链接： utcz.com/qa/262810.html

如何根据熊猫数据框中的多列获取百分比数？

回答：

其他人也看了：