Merging files but outputting the header row only once

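I've seen some earlier posts with solutions that worked for other people, but for some reason they haven't worked for me. I am trying to write a Python script to 1) merge 3 files that share the same format, 2) remove only the duplicate headers, 3) sort the rows by Specimen_ID, and 4) add 2 new blank lines between each unique Specimen_ID (that is, after every three rows, except for the first instance, where the header means the break comes after the 4th line).

I have a script that works for the first two steps and part of the last step: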

import glob

read_files = glob.glob("*.txt")

header_saved = False
linecnt=0

with open("merged_data.txt", "wb") as outfile:
    for f in read_files:
        with open(f, "rb") as infile:
            header = next(infile)
            if not header_saved:
                outfile.write(header)
                header_saved = True
            for line in infile:
                outfile.write(line)
                linecnt=linecnt+1
                if (linecnt%3)==0:
                    outfile.write("\n\n")

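Any suggestions for sorting the rows? Also, I've found that if the data is exported from Excel as tab-delimited txt files, the script only produces output containing the contents of the first infile and nothing from the others. If I just copy and paste the data into new txt files and use those as infiles, I don't have any problems. Does anyone know why I'm running into this?

Example input file text (INFILE 1):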

Specimen_ID Measured_by_initals Measure_date Sex Beak_length Pronotal_width Right_fore_femur_length Right_fore_femur_width Left_fore_femur_length Left_fore_femur_width Right_hind_femur_length Right_hind_femur_width Left_hind_femur_length Left_hind_femur_width Right_hind_femur_area Left_hind_femur_area Right_hind_tibia_width Left_hind_tibia_width Notes 

a 1 30-Dec-16 M 4 4 4 4 4 4 4 4 4 4 4 4 4 4

b 1 30-Dec-16 F 4 4 4 4 4 4 4 4 4 4 4 4 4 4 beak bent

c 1 30-Dec-16 M 4 4 4 4 4 4 4 4 4 4 4 4 4 4

d 1 30-Dec-16 F 4 4 4 4 4 4 4 4 4 4 4 4 4 4

e 1 30-Dec-16 F 4 4 4 4 4 4 4 4 4 4 4 4 4 4 pronotum deformed

f 1 30-Dec-16 F 4 4 4 4 4 4 4 4 4 4 4 4 4 4

Example input file text (INFILE 2):

Specimen_ID Measured_by_initals Measure_date Sex Beak_length Pronotal_width Right_fore_femur_length Right_fore_femur_width Left_fore_femur_length Left_fore_femur_width Right_hind_femur_length Right_hind_femur_width Left_hind_femur_length Left_hind_femur_width Right_hind_femur_area Left_hind_femur_area Right_hind_tibia_width Left_hind_tibia_width Notes 

a 2 30-Dec-16 M 4.1 4.1 4.1 4.1 4.1 4.1 4.1 4.1 4.1 4.1 4.1 4.1 4.1 4.1

b 2 30-Dec-16 F 4.1 4.1 4.1 4.1 4.1 4.1 4.1 4.1 4.1 4.1 4.1 4.1 4.1 4.1

c 2 30-Dec-16 M 4.1 4.1 4.1 4.1 4.1 4.1 4.1 4.1 4.1 4.1 4.1 4.1 4.1 4.1

d 2 30-Dec-16 F 4.1 4.1 4.1 4.1 4.1 4.1 4.1 4.1 4.1 4.1 4.1 4.1 4.1 4.1

e 2 30-Dec-16 F 4.1 4.1 4.1 4.1 4.1 4.1 4.1 4.1 4.1 4.1 4.1 4.1 4.1 4.1

f 2 30-Dec-16 F 4.1 4.1 4.1 4.1 4.1 4.1 4.1 4.1 4.1 4.1 4.1 4.1 4.1 4.1

Answer:

Your solution should work perfectly unless there is something unexpected in the data files. I just added the code for your third item:

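import glob

read_files = glob.glob("*.txt")

header_saved = False
linecnt = 0

with open("merged_data.txt", "wb") as outfile:
    for f in read_files:
        with open(f, "rb") as infile:
            # The first line of every input file is its header.
            header = next(infile)
            # Write the header only once, from the first file.
            if not header_saved:
                outfile.write(header)
                header_saved = True
            for line in infile:
                outfile.write(line)
                linecnt = linecnt + 1
                # After every third data row, add two blank lines.
                if (linecnt % 3) == 0:
                    # Bytes literal, because the file is opened in binary mode.
                    outfile.write(b"\n\n")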

inputfile1.txt

Employee,Account,Currency,Amount,Location 

Test 1, Basic,USD,3000,Airport

Test 2, Net, USD,2000,Airport

Test 3, Basic,USD,4000,Town

Test 4, Net, USD,3000,Town

Test 5, Basic,GBP,5000,Town

Test 6, Net, GBP,4000,Town

inputfile2.txt

Employee,Account,Currency,Amount,Location 

Test 8, Basic,USD,3000,Airport

Test 9, Net, USD,2000,Airport

Test 10, Basic,USD,4000,Town

Test 11, Net, USD,3000,Town

Test 12, Basic,GBP,5000,Town

Test 13, Net, GBP,4000,Town

Output

Employee,Account,Currency,Amount,Location 

Test 1, Basic,USD,3000,Airport

Test 2, Net, USD,2000,Airport

Test 3, Basic,USD,4000,Town

Test 4, Net, USD,3000,Town

Test 5, Basic,GBP,5000,Town

Test 6, Net, GBP,4000,Town

Test 8, Basic,USD,3000,Airport

Test 9, Net, USD,2000,Airport

Test 10, Basic,USD,4000,Town

Test 11, Net, USD,3000,Town

Test 12, Basic,GBP,5000,Town

Test 13, Net, GBP,4000,Town
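The script above merges the files and keeps a single header, but it does not sort the rows. It also counts rows instead of comparing IDs, so the blank lines only land in the right place if every ID appears exactly three times. Below is a minimal sketch of one way to cover all four steps, assuming Python 3 and tab-delimited input with Specimen_ID in the first column. The function name merge_sorted and its parameters are made up for this illustration. Reading in text mode also addresses one plausible cause of the Excel symptom, which the question does not confirm: if Excel saved the files with bare carriage returns as line endings, a file opened with "rb" contains no "\n" at all, so next(infile) returns the entire file as the "header" and nothing from the later files is written.

```python
import glob
import os


def merge_sorted(pattern, out_path, delimiter="\t", blank_lines=2):
    """Merge files matching pattern into out_path (hypothetical helper).

    Keeps the header of the first file only, sorts the data rows by their
    first column, and writes blank_lines empty lines whenever that column
    changes.
    """
    header = None
    rows = []
    out_abs = os.path.abspath(out_path)

    for name in sorted(glob.glob(pattern)):
        # Skip the output of a previous run if the pattern matches it too.
        if os.path.abspath(name) == out_abs:
            continue
        # Text mode with the default newline=None turns "\r", "\r\n" and
        # "\n" into "\n", so files with any of these endings split correctly.
        with open(name, "r") as infile:
            lines = [ln.rstrip("\n") for ln in infile if ln.strip()]
        if not lines:
            continue
        if header is None:
            header = lines[0]
        rows.extend(lines[1:])

    # list.sort is stable: rows with the same ID keep their file order.
    rows.sort(key=lambda ln: ln.split(delimiter, 1)[0])

    with open(out_path, "w") as outfile:
        if header is not None:
            outfile.write(header + "\n")
        previous_id = None
        for ln in rows:
            current_id = ln.split(delimiter, 1)[0]
            # Separate groups of rows that belong to different IDs.
            if previous_id is not None and current_id != previous_id:
                outfile.write("\n" * blank_lines)
            outfile.write(ln + "\n")
            previous_id = current_id


if __name__ == "__main__":
    merge_sorted("*.txt", "merged_data.txt")
```

Because the blank lines are driven by a change in Specimen_ID instead of a row counter, the grouping still holds if one specimen was measured more or fewer than three times. The sort compares the IDs as strings, so numeric IDs would need a different key.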

