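Beginner question: how do I extract certain data from a log file?
I have a log file, test.log, containing many lines. Every line has the same three fields: source IP, the interface that was accessed, and the access time, separated by commas. How do I find the 5 most frequently accessed interfaces in this log file?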
Answer:
On Linux, a single command is enough:
awk -F"," '{print $2}' test.log | sort | uniq -c | sort -nr | head -5
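For anyone unsure what each stage of that pipeline does, the stages can be mirrored one by one in Python. This is only an illustrative sketch: the function name `top_interfaces` and the sample lines are made up for the example and are not taken from the real log. Unlike awk, it skips lines that have no second field.

```python
from itertools import groupby


def top_interfaces(lines, n=5):
    """Mirror the awk | sort | uniq -c | sort -nr | head pipeline."""
    fields = []
    for line in lines:
        parts = line.rstrip("\n").split(",")
        # awk -F"," '{print $2}': keep the second comma-separated field
        # (assumption: lines without a second field are simply skipped)
        if len(parts) > 1:
            fields.append(parts[1])
    # sort: bring identical values next to each other
    fields.sort()
    # uniq -c: collapse each run of identical values into (count, value)
    counted = [(len(list(group)), value) for value, group in groupby(fields)]
    # sort -nr: order by count, largest first
    counted.sort(key=lambda pair: pair[0], reverse=True)
    # head -5: keep the first n entries
    return counted[:n]


# Hypothetical sample lines in the "ip,interface,time" layout from the question
sample = [
    "10.0.0.1,/api/login,2023-01-01 10:00:00",
    "10.0.0.2,/api/list,2023-01-01 10:00:01",
    "10.0.0.1,/api/login,2023-01-01 10:00:02",
]
for count, interface in top_interfaces(sample):
    print(count, interface)
```

The key point is that `uniq -c` only counts adjacent duplicates, which is why the first `sort` has to come before it.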
Answer:
import csv

def extract_data(file):
    # Count how many times each interface (the second field) appears
    interface_count = {}
    with open(file, 'r', newline='') as f:
        reader = csv.reader(f)
        for row in reader:
            if len(row) < 2:
                # Skip blank or incomplete lines instead of raising IndexError
                continue
            if row[1] in interface_count:
                interface_count[row[1]] += 1
            else:
                interface_count[row[1]] = 1
    return interface_count

def get_top_five(interface_count):
    # Sort by count, largest first, and keep the names of the first five
    sorted_interface_count = sorted(interface_count.items(), key=lambda x: x[1], reverse=True)
    return [i[0] for i in sorted_interface_count[:5]]

file = "test.log"
interface_count = extract_data(file)
top_five = get_top_five(interface_count)
print(top_five)
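If the counts are wanted alongside the interface names, one possible variation is to pick the top entries with `heapq.nlargest` instead of sorting the whole dictionary. This is a rough sketch, not a replacement for the answer above; the function name `top_interfaces_with_counts` is made up here, and it assumes rows with fewer than two fields can be ignored.

```python
import csv
import heapq
from collections import Counter


def top_interfaces_with_counts(path, n=5):
    """Return up to n (interface, count) pairs, most requested first."""
    counts = Counter()
    # newline="" is what the csv module documentation recommends for file objects
    with open(path, newline="") as f:
        for row in csv.reader(f):
            # Assumption: rows with fewer than two fields are noise and are skipped
            if len(row) < 2:
                continue
            counts[row[1]] += 1
    # nlargest selects the top n entries without sorting every entry
    return heapq.nlargest(n, counts.items(), key=lambda item: item[1])
```

Calling `top_interfaces_with_counts("test.log")` would then give a list of pairs that can be printed one per line. For a log with only a handful of distinct interfaces the difference from a full sort is negligible, so this mostly matters when there are very many distinct values.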
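Answer:
import sys
from collections import Counter

# The path of the log file is passed as the first command-line argument
with open(sys.argv[1]) as log:
    lines = log.readlines()

apis = Counter()
for line in lines:
    line = line.strip()
    if line == '':
        continue
    # Each line is "ip,api,time"; only the api field is needed
    _, api, _ = line.split(',')
    apis[api] += 1

# most_common(5) returns the five (api, count) pairs with the highest counts
top_5_apis = apis.most_common(5)
for api, count in top_5_apis:
    print(api, count)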
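One thing to watch with the script above: `line.split(',')` must yield exactly three parts, so a time field that itself contains a comma would raise a `ValueError`. A hedged sketch of a more tolerant variant follows. The helper names `count_apis` and `top_apis_from_file` are invented for this example, and it assumes that lines with fewer than three fields can be dropped silently. It also iterates over the file directly rather than loading every line with `readlines()`.

```python
from collections import Counter


def count_apis(lines):
    """Count the second field of each 'ip,api,time' line."""
    apis = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # maxsplit=2 keeps any commas inside the time field in the third part
        parts = line.split(",", 2)
        if len(parts) < 3:
            # Assumption: lines without three fields are malformed and ignored
            continue
        apis[parts[1]] += 1
    return apis


def top_apis_from_file(path, n=5):
    """Stream the file line by line and return the n most common apis."""
    with open(path) as log:
        # A file object is an iterable of lines, so nothing is loaded up front
        return count_apis(log).most_common(n)
```

Because `count_apis` accepts any iterable of strings, it can be tried on a short list of lines before pointing `top_apis_from_file` at the real test.log.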