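Python - Remove Duplicate Values from a Pandas DataFrame
To remove duplicate rows from a Pandas DataFrame, use the drop_duplicates() method. By default it compares all columns and keeps the first occurrence of each row. First, create a DataFrame with 3 columns: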
dataFrame = pd.DataFrame({'Car': ['BMW', 'Mercedes', 'Lamborghini', 'BMW', 'Mercedes', 'Porsche'],'Place': ['Delhi', 'Hyderabad', 'Chandigarh', 'Delhi', 'Hyderabad', 'Mumbai'],'UnitsSold': [95, 70, 80, 95, 70, 90]})
Remove the duplicate rows. drop_duplicates() returns a new DataFrame, so assign the result back:
dataFrame = dataFrame.drop_duplicates()
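Before dropping anything, it can help to see which rows pandas treats as duplicates. The sketch below is illustrative rather than part of the original example: it uses duplicated() to flag repeated rows, then shows the keep and subset parameters of drop_duplicates(). The variable names mask, no_repeats and by_car are made up for this illustration.

```python
import pandas as pd

dataFrame = pd.DataFrame({
    'Car': ['BMW', 'Mercedes', 'Lamborghini', 'BMW', 'Mercedes', 'Porsche'],
    'Place': ['Delhi', 'Hyderabad', 'Chandigarh', 'Delhi', 'Hyderabad', 'Mumbai'],
    'UnitsSold': [95, 70, 80, 95, 70, 90]
})

# duplicated() returns True for each row that repeats an earlier row
mask = dataFrame.duplicated()
print(mask.tolist())

# keep=False drops every row that has a duplicate, including the first occurrence
no_repeats = dataFrame.drop_duplicates(keep=False)
print(no_repeats)

# subset compares only the listed columns; keep='last' retains the last occurrence
by_car = dataFrame.drop_duplicates(subset=['Car'], keep='last')
print(by_car)
```

With this data, rows 3 and 4 are the ones flagged as duplicates, which are the rows the default drop_duplicates() call removes.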
Example
The complete code is as follows:
import pandas as pd

# Create DataFrame
dataFrame = pd.DataFrame({'Car': ['BMW', 'Mercedes', 'Lamborghini', 'BMW', 'Mercedes', 'Porsche'],'Place': ['Delhi', 'Hyderabad', 'Chandigarh', 'Delhi', 'Hyderabad', 'Mumbai'], 'UnitsSold': [95, 70, 80, 95, 70, 90]})
print(f"Dataframe...\n{dataFrame}")

# Count how often each value occurs in column Car
count = dataFrame['Car'].value_counts()
print("\nCount in column Car")
print(count)

# Remove duplicate rows (rows where every column matches an earlier row)
dataFrame = dataFrame.drop_duplicates()
print(f"\nUpdated DataFrame after removing duplicates...\n{dataFrame}")

# Count the values in column Car again after removing duplicates
count = dataFrame['Car'].value_counts()
print("\nCount in column Car")
print(count)
This produces the following output (pandas 2.0 and later label the counts Series `count` instead of `Car`):
Dataframe...
           Car       Place  UnitsSold
0          BMW       Delhi         95
1     Mercedes   Hyderabad         70
2  Lamborghini  Chandigarh         80
3          BMW       Delhi         95
4     Mercedes   Hyderabad         70
5      Porsche      Mumbai         90

Count in column Car
BMW            2
Mercedes       2
Porsche        1
Lamborghini    1
Name: Car, dtype: int64

Updated DataFrame after removing duplicates...
           Car       Place  UnitsSold
0          BMW       Delhi         95
1     Mercedes   Hyderabad         70
2  Lamborghini  Chandigarh         80
5      Porsche      Mumbai         90

Count in column Car
BMW            1
Porsche        1
Lamborghini    1
Mercedes       1
Name: Car, dtype: int64
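Note that the surviving rows keep their original index labels (0, 1, 2, 5). If a continuous index is preferred, one option is the ignore_index parameter, sketched below under the assumption of pandas 1.0 or later; on older versions, chaining reset_index(drop=True) gives the same result. The name renumbered is chosen only for this illustration.

```python
import pandas as pd

dataFrame = pd.DataFrame({
    'Car': ['BMW', 'Mercedes', 'Lamborghini', 'BMW', 'Mercedes', 'Porsche'],
    'Place': ['Delhi', 'Hyderabad', 'Chandigarh', 'Delhi', 'Hyderabad', 'Mumbai'],
    'UnitsSold': [95, 70, 80, 95, 70, 90]
})

# ignore_index=True relabels the remaining rows 0..n-1 (assumes pandas >= 1.0)
renumbered = dataFrame.drop_duplicates(ignore_index=True)
print(renumbered)

# Equivalent approach that also works on older pandas versions
same_result = dataFrame.drop_duplicates().reset_index(drop=True)
print(renumbered.equals(same_result))
```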