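I have a file like the one below with n lines. I want to compute the total of the third column and distribute the rows across 3 different files, based on each file's share of that total.
For example, adding up all the third-column values gives 516, and dividing that by 3 gives 172.
So I want to add rows to the first file as long as its sum does not exceed the 172 mark, do the same for the second file, and move all remaining rows to the third file.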
Input file
a aa 10
b ab 15
c ac 17
a dy 30
y ae 12
a dl 34
a fk 45
l ah 56
o aj 76
l ai 12
q al 09
d pl 34
e ik 30
f ll 10
g dl 15
h fr 17
i dd 23
j we 27
k rt 12
l yt 13
m tt 19
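The total and the per-file target quoted above can be reproduced with a short awk one-liner. This is only a minimal sketch: it assumes whitespace-separated fields with the numeric value in the third column, and the variable name `result` is illustrative.

```shell
# Sum the third column of the sample data and derive the per-file target.
# Assumption: fields are whitespace-separated and column 3 is numeric.
result=$(
awk '{ sum += $3 } END { printf "%d %d\n", sum, sum / 3 }' <<'EOF'
a aa 10
b ab 15
c ac 17
a dy 30
y ae 12
a dl 34
a fk 45
l ah 56
o aj 76
l ai 12
q al 09
d pl 34
e ik 30
f ll 10
g dl 15
h fr 17
i dd 23
j we 27
k rt 12
l yt 13
m tt 19
EOF
)
echo "$result"   # prints: 516 172
```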
Expected output
file1 (total: 163)
a aa 10
b ab 15
c ac 17
a dy 30
y ae 12
a dl 34
a fk 45
file2 (total: 153)
l ah 56
o aj 76
l ai 12
q al 9
file3 (total: 200)
d pl 34
e ik 30
f ll 10
g dl 15
h fr 17
i dd 23
j we 27
k rt 12
l yt 13
m tt 19
Could you please try the following, written and tested with the shown samples in GNU awk.
awk '
FNR==NR{
  sum+=$NF
  next
}
FNR==1{
  count=sum/3
}
{
  curr_sum+=$NF
}
(curr_sum>=count || FNR==1) && fileCnt<=2{
  close(out_file)
  out_file="file" ++fileCnt
  curr_sum=$NF
}
{
  print > (out_file)
}' Input_file Input_file
Explanation: a detailed, line-by-line explanation of the above.
awk '  ##Start of the awk program.
FNR==NR{  ##FNR==NR is TRUE only while Input_file is being read for the first time.
  sum+=$NF  ##Add the last field of every line to get the total for the whole Input_file.
  next  ##Skip all remaining statements for this line.
}
FNR==1{  ##First line of the second read of Input_file.
  count=sum/3  ##Set count to sum/3, the target total per output file.
}
{
  curr_sum+=$NF  ##Add the last field to the running total curr_sum.
}
(curr_sum>=count || FNR==1) && fileCnt<=2{  ##If the running total is >= count OR this is the first line (of the second read), AND fewer than 3 output files have been opened so far.
  close(out_file)  ##Close the previous output file; may NOT be needed here since there are only 3 output files.
  out_file="file" ++fileCnt  ##Build the next output file name (file1, file2, file3).
  curr_sum=$NF  ##Reset curr_sum to the value of the current line, since this line starts the new file.
}
{
  print > (out_file)  ##Print the current line into the current output file.
}' Input_file Input_file  ##Pass Input_file twice so that it is read two times.
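To check the result against the expected totals, the same two-pass logic can be run end to end on the sample and the last-column sum of each output file printed. The following is a minimal sketch, not part of the original answer: the `parts` and `prefix` awk variables and the `workdir` and `totals` shell variables are illustrative names, and the scratch directory is only there to avoid overwriting existing files.

```shell
# Work in a scratch directory so the sketch does not overwrite existing files.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Recreate the sample input from the question.
cat > Input_file <<'EOF'
a aa 10
b ab 15
c ac 17
a dy 30
y ae 12
a dl 34
a fk 45
l ah 56
o aj 76
l ai 12
q al 09
d pl 34
e ik 30
f ll 10
g dl 15
h fr 17
i dd 23
j we 27
k rt 12
l yt 13
m tt 19
EOF

# Same two-pass logic as above; "parts" and "prefix" are illustrative names.
awk -v parts=3 -v prefix="file" '
FNR == NR { sum += $NF; next }       # pass 1: grand total of the last column
FNR == 1  { limit = sum / parts }    # pass 2 begins: per-file target
{ curr_sum += $NF }                  # running total for the current output file
(curr_sum >= limit || FNR == 1) && fileCnt < parts {
  if (out_file != "") close(out_file)
  out_file = prefix (++fileCnt)      # file1, file2, file3
  curr_sum = $NF                     # the current line starts the new file
}
{ print > (out_file) }
' Input_file Input_file

# Report the last-column total of each output file.
totals=$(awk '{ t[FILENAME] += $NF }
  END { printf "%d %d %d\n", t["file1"], t["file2"], t["file3"] }' file1 file2 file3)
echo "$totals"   # prints: 163 153 200
```

Note that with `>=`, a line that brings the running total to exactly the target starts the next file. If a file should be allowed to reach the target exactly, `>` would be the comparison to use; the sample data gives the same split either way.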