If I understand correctly, then
awk 'NR == FNR { selected[$1] = 1; next } selected[FNR]' indexfile datafile
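should work, assuming either that the index is sorted in ascending order or that you want the lines printed in the order they have in the data file, regardless of how the index is ordered. It works as follows: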
NR == FNR {            # while processing the first file
    selected[$1] = 1   # remember if an index was seen
    next               # and do nothing else
}
selected[FNR]          # after that, select (print) the selected lines.
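To make the ordering behaviour concrete, here is a small sketch that runs the one-liner on sample input. The file contents and the temporary directory are assumptions chosen for illustration. Note that the index is deliberately unsorted, yet the output follows the data file's order.

```shell
# Hypothetical sample files, written to a temporary directory.
tmpdir=$(mktemp -d)
printf '%s\n' 1 4 3 > "$tmpdir/indexfile"
printf '%s\n' line1 line2 line3 line4 > "$tmpdir/datafile"

# Same one-liner as above; the output is captured so it can be inspected.
result=$(awk 'NR == FNR { selected[$1] = 1; next } selected[FNR]' \
    "$tmpdir/indexfile" "$tmpdir/datafile")

# Prints line1, line3, line4: data-file order, not index order.
printf '%s\n' "$result"

rm -r "$tmpdir"
```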
If the index is not sorted and the lines should be printed in the order in which they appear in the index:
NR == FNR {                    # processing the index:
    ++counter
    idx[$0] = counter          # remember that and at which position you saw
    next                       # the index
}
FNR in idx {                   # when processing the data file:
    lines[idx[FNR]] = $0       # remember selected lines by the position of
}                              # the index
END {                          # and at the end: print them in that order.
    for(i = 1; i <= counter; ++i) {
        print lines[i]
    }
}
This could also be written inline (with semicolons after ++counter and idx[$0] = counter), but I would probably put it in a file, say foo.awk, and run awk -f foo.awk indexfile datafile. With an index file
1
4
3
and a data file
line1
line2
line3
line4
this will print
line1
line4
line3
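For reference, the inline form mentioned above might look roughly like the following sketch. It uses the same sample data as the example. The temporary directory and the inline_result variable are assumptions added only so that the snippet runs on its own.

```shell
# Hypothetical sample files matching the example above.
tmpdir=$(mktemp -d)
printf '%s\n' 1 4 3 > "$tmpdir/indexfile"
printf '%s\n' line1 line2 line3 line4 > "$tmpdir/datafile"

# The program from foo.awk, inlined; statements on one line are
# separated by semicolons.
inline_result=$(awk '
    NR == FNR  { ++counter; idx[$0] = counter; next }
    FNR in idx { lines[idx[FNR]] = $0 }
    END        { for (i = 1; i <= counter; ++i) print lines[i] }
' "$tmpdir/indexfile" "$tmpdir/datafile")

# Prints line1, line4, line3: the order given by the index file.
printf '%s\n' "$inline_result"

rm -r "$tmpdir"
```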
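The remaining caveat is that this assumes the entries in the index are unique. If that is a problem as well, you will have to remember a list of index positions for each entry, split that list while scanning the data file, and remember the line for each position. That is: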
NR == FNR {
    ++counter
    idx[$0] = idx[$0] " " counter   # remember a list here
    next
}
FNR in idx {
    split(idx[FNR], pos)            # split that list
    for(p in pos) {
        lines[pos[p]] = $0          # and remember the line for
                                    # all positions in them.
    }
}
END {
    for(i = 1; i <= counter; ++i) {
        print lines[i]
    }
}
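As a quick check of the duplicate handling, the sketch below runs the same program on an index that lists line 1 twice. The sample files, the temporary directory and the dup_result variable are assumptions made for illustration.

```shell
# Hypothetical sample files; the index mentions line 1 twice.
tmpdir=$(mktemp -d)
printf '%s\n' 1 4 1 > "$tmpdir/indexfile"
printf '%s\n' line1 line2 line3 line4 > "$tmpdir/datafile"

dup_result=$(awk '
    NR == FNR {
        ++counter
        idx[$0] = idx[$0] " " counter   # e.g. idx["1"] becomes " 1 3"
        next
    }
    FNR in idx {
        split(idx[FNR], pos)            # default splitting skips the leading blank
        for (p in pos) {
            lines[pos[p]] = $0          # store the line once per position
        }
    }
    END {
        for (i = 1; i <= counter; ++i) {
            print lines[i]
        }
    }
' "$tmpdir/indexfile" "$tmpdir/datafile")

# Prints line1, line4, line1.
printf '%s\n' "$dup_result"

rm -r "$tmpdir"
```

If the index names a line number that does not exist in the data file, an empty line is printed for that position.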
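This last version, finally, is functionally equivalent to the code in the question. How complex a solution your use case needs is something you will have to decide.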