Comparing values by date in a Python list

I am rewriting a CSV file and want to create a function that compares items in a list. To make it clearer, here is an example.

My CSV converted to a list:

import csv

with open('test.csv', 'rb') as csvfile:
    spamreader = csv.reader(csvfile, delimiter=';', quotechar='|')
    lista = list(spamreader)

print lista

>>>[['"Fecha"', '"Cliente"', '"Subastas"', '"Impresiones_exchange"', '"Fill_rate"', '"Importe_a_pagar_a_medio"', '"ECPM_medio"'],['20/12/2017', 'Martin', '165.665', '3.777', '2,28%', '1,58', '0,42'], ['21/12/2017', 'Martin', '229.620', '18.508', '8,06%', '14,56', '0,79'], ['22/12/2017', 'Martin', '204.042', '48.526', '23,78%', '43,98', '0,91'], ['20/12/2017', 'Tom', '102.613', '20.223', '19,71%', '17,86', '0,88'], ['21/12/2017', 'Tom', '90.962', '19.186', '21,09%', '14,26', '0,74'], ['22/12/2017', 'Tom', '60.189', '12.654', '21,02%', '11,58', '0,92']]
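For readers on Python 3, here is a minimal sketch of the same read. The `sample` text is a made-up two-line excerpt that is only assumed to mirror the layout of test.csv, and `io.StringIO` stands in for the real file. It also shows why the labels in the list above come out as '"Fecha"': with quotechar='|', the double quotes are treated as ordinary characters.

```python
import csv
import io

# Hypothetical excerpt, assumed to mirror the layout of test.csv
sample = '"Fecha";"Cliente";"Subastas"\n20/12/2017;Martin;165.665\n'

# In Python 3 a CSV file is opened in text mode with newline='';
# io.StringIO stands in for open('test.csv', newline='') here
with io.StringIO(sample, newline='') as csvfile:
    reader = csv.reader(csvfile, delimiter=';', quotechar='|')
    lista = list(reader)

print(lista[0])  # header labels keep their double quotes
print(lista[1])

# With the default quotechar ('"') the quotes are stripped from the labels
clean = list(csv.reader(io.StringIO(sample, newline=''), delimiter=';'))
print(clean[0])
```

Whether to keep quotechar='|' depends on how the real file is quoted, which the question does not show.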

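So first, I need to compare all the values for Martin and for Tom. I mean item[2] of 20/12/2017 against item[2] of 21/12/2017, then item[2] of 21/12/2017 against item[2] of 22/12/2017. I need this for all the items in my list (item[2], item[3], item[4], item[5] and item[6]). The date is the most important value, because the idea is to compare one day with the next.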

The result I am hoping for looks like this:

21/12/2017 Martin
item[2]: smaller
item[3]: smaller
item[4]: bigger
item[5]: smaller
item[6]: smaller

22/12/2017 Martin
item[2]: smaller
item[3]: bigger
item[4]: bigger
item[5]: bigger
item[6]: bigger

21/12/2017 Tom
item[2]: smaller
item[3]: bigger
item[4]: bigger
item[5]: bigger
item[6]: bigger

22/12/2017 Tom
item[2]: smaller
item[3]: smaller
item[4]: smaller
item[5]: smaller
item[6]: bigger

And if I want to display the name "Subastas" instead of item[2], and likewise for all the other names, how can I do that?

Answer:

Let's start by noticing that your data is keyed by (date, name). A fairly obvious approach is to store the data in a dictionary keyed by (date, name).

So, take your posted data as mylist:

mylist = [['"Fecha"', '"Cliente"', '"Subastas"', '"Impresiones_exchange"', '"Fill_rate"', '"Importe_a_pagar_a_medio"', '"ECPM_medio"'],['20/12/2017', 'Martin', '165.665', '3.777', '2,28%', '1,58', '0,42'], ['21/12/2017', 'Martin', '229.620', '18.508', '8,06%', '14,56', '0,79'], ['22/12/2017', 'Martin', '204.042', '48.526', '23,78%', '43,98', '0,91'], ['20/12/2017', 'Tom', '102.613', '20.223', '19,71%', '17,86', '0,88'], ['21/12/2017', 'Tom', '90.962', '19.186', '21,09%', '14,26', '0,74'], ['22/12/2017', 'Tom', '60.189', '12.654', '21,02%', '11,58', '0,92']] 

and convert it (excluding the first row, which holds the column labels) into a dictionary like this:

import datetime

mydict = {}
for row in mylist[1:]:
    date = datetime.datetime.strptime(row[0],'%d/%m/%Y')
    name = row[1]
    mydict[(date,name)] = row[2:]

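The tricky bit here is that your dates are strings of the form dd/mm/yyyy, but later you want to compare one day with the next. That is no surprise, since you made it the subject of your question. So you need to convert the string dates into something you can compare properly. That is what strptime() does.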
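To see why the conversion matters, here is a small illustration written for Python 3. The two dates are made up for the example and are not taken from the posted data.

```python
import datetime

a, b = '01/01/2018', '31/12/2017'

# As dd/mm/yyyy strings the comparison goes character by character,
# so the later date sorts first as text
text_says_a_earlier = a < b
print(text_says_a_earlier)

# Parsed into datetime objects, the comparison is chronological
da = datetime.datetime.strptime(a, '%d/%m/%Y')
db = datetime.datetime.strptime(b, '%d/%m/%Y')
print(da > db)

# Date arithmetic also works, which is how a previous day can be looked up
print(da - datetime.timedelta(days=1) == db)
```

All three lines print True: the string comparison gets the order wrong, while the datetime comparison and the one-day subtraction behave chronologically.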

Your data now looks like this:

>>> mydict
{(datetime.datetime(2017, 12, 20, 0, 0), 'Martin'): ['165.665', '3.777', '2,28%', '1,58', '0,42'],
 (datetime.datetime(2017, 12, 22, 0, 0), 'Tom'): ['60.189', '12.654', '21,02%', '11,58', '0,92'],
 (datetime.datetime(2017, 12, 21, 0, 0), 'Martin'): ['229.620', '18.508', '8,06%', '14,56', '0,79'],
 (datetime.datetime(2017, 12, 21, 0, 0), 'Tom'): ['90.962', '19.186', '21,09%', '14,26', '0,74'],
 (datetime.datetime(2017, 12, 20, 0, 0), 'Tom'): ['102.613', '20.223', '19,71%', '17,86', '0,88'],
 (datetime.datetime(2017, 12, 22, 0, 0), 'Martin'): ['204.042', '48.526', '23,78%', '43,98', '0,91']}

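The next thing to observe is that your data consists of floating-point numbers and percentages, but represented as strings. That complicates things, because you want to do comparisons. Take the first 2 data points for Martin:

['165.665', '3.777', ...
['229.620', '18.508', ...

If you compare '165.665' with '229.620', the first is smaller, which is what you expect. But if you compare '3.777' with '18.508', the first is bigger: not what you expect. This is because strings are compared alphabetically, and 3 comes after 1.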

Worse still, your data sometimes uses a comma as the decimal mark and sometimes does not.
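Both problems are easy to reproduce in isolation. The following Python 3 snippet is only an illustration, using values from the posted data:

```python
# Strings are compared character by character, so '3...' sorts after '1...'
as_text = '3.777' < '18.508'
print(as_text)

# Converted to float, the same pair compares numerically
as_number = float('3.777') < float('18.508')
print(as_number)

# float() does not accept a decimal comma, so it has to be replaced first
try:
    float('1,58')
    comma_ok = True
except ValueError:
    comma_ok = False
print(comma_ok, float('1,58'.replace(',', '.')))
```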

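So you need a function to convert the strings to numbers. Here is a naive one that works for your data, but it would probably need to be more robust in real life: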

def convert(n):
    n = n.replace(",",".").replace("%","")
    try:
        return float(n)
    except ValueError:
        return 0e0
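One possible hardening, as a sketch only. It assumes, which the question does not confirm, that the dot in values such as '165.665' is a thousands separator and the comma is the decimal mark, as in the usual Spanish number format. The name `parse_number` is hypothetical. It returns None instead of 0 for unparseable input, so that bad data is not silently compared as zero.

```python
def parse_number(text):
    """Parse strings like '165.665', '2,28%' or '1,58' into a float.

    Assumption: '.' groups thousands and ',' is the decimal mark.
    Returns None when the text cannot be read as a number.
    """
    # Drop surrounding whitespace and a trailing percent sign,
    # remove the thousands separators, then switch to a decimal point
    cleaned = text.strip().rstrip('%').replace('.', '').replace(',', '.')
    try:
        return float(cleaned)
    except ValueError:
        return None


print(parse_number('165.665'))
print(parse_number('2,28%'))
print(parse_number('999') < parse_number('1.000'))
```

Under this assumption '999' compares as less than '1.000', whereas the naive convert() reads '1.000' as 1.0. If the dots in the real data are decimal points after all, the naive version is the right one.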

Now you are in a position to do the comparisons:

for (day, name) in mydict:
    previous_day = day - datetime.timedelta(days=1)
    # Only compare when the previous day exists for the same name
    if (previous_day,name) in mydict:
        print datetime.datetime.strftime(day,"%d/%m/%Y"), name
        day2_values = mydict[(day, name)]
        day1_values = mydict[(previous_day, name)]
        # Each pair is (current day's value, previous day's value)
        comparer = zip(day2_values, day1_values)
        for n,value in enumerate(comparer):
            print "item[%d]:" % (n+2,),
            if convert(value[1]) < convert(value[0]):
                print value[1], "smaller than", value[0]
            else:
                print value[1], "bigger than", value[0]
        print

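I have made the messages more explicit, for example item[2]: 165.665 smaller than 229.620. That way you can easily verify that the program is correct without having to go back to the data, which is error-prone and tedious. You can always make the messages less explicit if you want.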

22/12/2017 Tom
item[2]: 90.962 bigger than 60.189
item[3]: 19.186 bigger than 12.654
item[4]: 21,09% bigger than 21,02%
item[5]: 14,26 bigger than 11,58
item[6]: 0,74 smaller than 0,92

21/12/2017 Martin
item[2]: 165.665 smaller than 229.620
item[3]: 3.777 smaller than 18.508
item[4]: 2,28% smaller than 8,06%
item[5]: 1,58 smaller than 14,56
item[6]: 0,42 smaller than 0,79

21/12/2017 Tom
item[2]: 102.613 bigger than 90.962
item[3]: 20.223 bigger than 19.186
item[4]: 19,71% smaller than 21,09%
item[5]: 17,86 bigger than 14,26
item[6]: 0,88 bigger than 0,74

22/12/2017 Martin
item[2]: 229.620 bigger than 204.042
item[3]: 18.508 smaller than 48.526
item[4]: 8,06% smaller than 23,78%
item[5]: 14,56 smaller than 43,98
item[6]: 0,79 smaller than 0,91
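The groups above come out in no particular order because they follow the dictionary's iteration order. If a stable report is wanted, one option is to sort the keys first. The sketch below is a Python 3 variant under a few assumptions of its own: the function name `compare_days` is hypothetical, the key is (name, day) rather than (day, name) so that sorting groups by person, only four of the posted rows are used, and each verdict describes the current day relative to the previous one (closer to the short form asked for in the question).

```python
import datetime

rows = [
    ['20/12/2017', 'Martin', '165.665', '3.777', '2,28%', '1,58', '0,42'],
    ['21/12/2017', 'Martin', '229.620', '18.508', '8,06%', '14,56', '0,79'],
    ['20/12/2017', 'Tom', '102.613', '20.223', '19,71%', '17,86', '0,88'],
    ['21/12/2017', 'Tom', '90.962', '19.186', '21,09%', '14,26', '0,74'],
]


def convert(n):
    # Same naive conversion as in the answer
    n = n.replace(',', '.').replace('%', '')
    try:
        return float(n)
    except ValueError:
        return 0.0


def compare_days(rows):
    """Return report lines comparing each day with the day before it."""
    data = {}
    for row in rows:
        day = datetime.datetime.strptime(row[0], '%d/%m/%Y')
        data[(row[1], day)] = row[2:]
    lines = []
    # Sorting by (name, day) makes the order of the report deterministic
    for name, day in sorted(data):
        previous = data.get((name, day - datetime.timedelta(days=1)))
        if previous is None:
            continue  # no earlier day to compare with
        lines.append('%s %s' % (day.strftime('%d/%m/%Y'), name))
        pairs = zip(previous, data[(name, day)])
        for n, (old, new) in enumerate(pairs, start=2):
            if convert(new) > convert(old):
                verdict = 'bigger'
            elif convert(new) < convert(old):
                verdict = 'smaller'
            else:
                verdict = 'equal'
            lines.append('item[%d]: %s' % (n, verdict))
    return lines


for line in compare_days(rows):
    print(line)
```

Returning a list of lines instead of printing inside the loop keeps the comparison separate from the output, and the explicit 'equal' case avoids reporting two identical values as "bigger".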

To display "Subastas" instead of item[2], remember that the column labels are in the first element of mylist:

>>> mylist[0]
['"Fecha"', '"Cliente"', '"Subastas"', '"Impresiones_exchange"', '"Fill_rate"', '"Importe_a_pagar_a_medio"', '"ECPM_medio"']

So to include them in the output, you need to change this line:

print "item[%d]:" % (n+2,),

to

print mylist[0][n+2] + ":",
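Note that the labels in mylist[0] still carry literal double quotes, so the line above prints "Subastas": with the quotes included. If plain names are preferred, they can be stripped once up front. This is a small Python 3 sketch; the names `labels` and `values` are illustrative, not from the original code.

```python
header = ['"Fecha"', '"Cliente"', '"Subastas"', '"Impresiones_exchange"',
          '"Fill_rate"', '"Importe_a_pagar_a_medio"', '"ECPM_medio"']

# Remove the surrounding double quotes from every label
labels = [h.strip('"') for h in header]

values = ['229.620', '18.508', '8,06%', '14,56', '0,79']  # row[2:] of one data row
for n, value in enumerate(values):
    # n counts from 0 over row[2:], so the matching label is at index n + 2
    print('%s: %s' % (labels[n + 2], value))
```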

Answer:

You can load lista into a DataFrame and then perform the comparisons from there:

import pandas as pd
import numpy as np

headers = lista.pop(0)
df = pd.DataFrame(lista, columns = headers)
martin = df[df['"Cliente"'] == 'Martin']
tom = df[df['"Cliente"'] == 'Tom']
# Pair Martin's and Tom's rows by date; shared columns get the
# suffixes _x (Martin) and _y (Tom)
merge = pd.merge(martin, tom, on = '"Fecha"')
stats = headers[2:]
compare = ['"Fecha"']

def to_number(col):
    # The values are strings such as '2,28%' or '165.665'; make them floats
    return col.str.replace(',', '.').str.rstrip('%').astype(float)

for x in stats:
    # Compare whole columns so that every date gets its own result
    merge[x+'_compare'] = np.where(to_number(merge[x+'_x']) > to_number(merge[x+'_y']), 'Martin', 'Tom')
    compare.append(x+'_compare')

print(merge[compare])

#output
"Fecha" "Subastas"_compare "Impresiones_exchange"_compare "Fill_rate"_compare "Importe_a_pagar_a_medio"_compare "ECPM_medio"_compare
20/12/2017 Martin Tom Tom Tom Tom
21/12/2017 Martin Tom Tom Martin Martin
22/12/2017 Martin Martin Martin Martin Tom

Source link: utcz.com/qa/265131.html
