I have a folder containing many txt files. Each file has the following format:
Allowed overlap: -3
H-bond overlap reduction: 0.4
Ignore contacts between atoms separated by 4 bonds or less
Detect intra-residue contacts: False
Detect intra-molecule contacts: False
19 contacts
atom1 atom2 overlap distance
:128.B@BB :300.C@BB -1.676 4.996
:179.B@BB :17.C@BB -1.898 5.218
:182.B@BB :17.C@BB -2.015 5.335
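My goal is to loop over the files in the folder and combine them into one global output. Note that in the example I only want the lines from "19 contacts" onward, including that line (the number is different in each file), which means skipping the first six lines of the file.
A possible workflow: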
# Create a log file that will collect the output from every file processed in the loop below.
echo "This is the beginning of the global output" > ./final_output.txt
# Key pattern: marks the first line to take from each file ("<number> contacts").
# Note: no space is allowed after "=" in a shell assignment.
key='^[0-9]+ contacts'
# Loop over the files and append every line from ${key} onward (inclusive) to final_output.txt.
for file in "${folder}"/*.txt; do
    file_title=$(basename "$file")
    # 1 - print ${file_title} into final_output.txt
    # 2 - append the lines from the file to final_output.txt
    # NB: only the lines after (and including) the key phrase are needed
done
Answer 1
Take three files as an example.
file1
Allowed overlap: -3
H-bond overlap reduction: 0.4
Ignore contacts between atoms separated by 4 bonds or less
Detect intra-residue contacts: False
Detect intra-molecule contacts: False
19 contacts
atom1 atom2 overlap distance
:128.B@BB :300.C@BB -1.676 4.996
:179.B@BB :17.C@BB -1.898 5.218
:182.B@BB :17.C@BB -2.015 5.335
file3
Allowed overlap: -3
H-bond overlap reduction: 0.4
Ignore contacts between atoms separated by 4 bonds or less
Detect intra-residue contacts: False
Detect intra-molecule contacts: False
17 contacts
atom1 atom2 overlap distance
:128.B@BB :300.C@BB -1.676 4.996
:179.B@BB :17.C@BB -1.898 5.218
:182.B@BB :17.C@BB -2.015 5.335
file4
Allowed overlap: -3
H-bond overlap reduction: 0.4
Ignore contacts between atoms separated by 4 bonds or less
Detect intra-residue contacts: False
Detect intra-molecule contacts: False
12 contacts
atom1 atom2 overlap distance
:128.B@BB :300.C@BB -1.676 4.996
:179.B@BB :17.C@BB -1.898 5.218
:182.B@BB :17.C@BB -2.015 5.335
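The code below prints each file from its "19 contacts", "17 contacts", or "12 contacts" line through the end of the file and saves the combined result. The sed address range /^[0-9]/,$ selects everything from the first line that starts with a digit to the last line of the file.
for i in file1 file3 file4; do sed -n '/^[0-9]/,$p' "$i"; done > /var/tmp/outputfile.txt
Output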
19 contacts
atom1 atom2 overlap distance
:128.B@BB :300.C@BB -1.676 4.996
:179.B@BB :17.C@BB -1.898 5.218
:182.B@BB :17.C@BB -2.015 5.335
17 contacts
atom1 atom2 overlap distance
:128.B@BB :300.C@BB -1.676 4.996
:179.B@BB :17.C@BB -1.898 5.218
:182.B@BB :17.C@BB -2.015 5.335
12 contacts
atom1 atom2 overlap distance
:128.B@BB :300.C@BB -1.676 4.996
:179.B@BB :17.C@BB -1.898 5.218
:182.B@BB :17.C@BB -2.015 5.335
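The loop above does not record which file each block came from, and it lists the file names by hand, whereas the question asks for the file title and for a loop over a whole folder. A minimal sketch of that variant follows. The function name combine_contacts and its two arguments are hypothetical, and the pattern is tightened to "<number> contacts" on the assumption that every file contains such a line.

```shell
# Hypothetical helper: combine_contacts <folder> <output-file>
# Writes a header line, then for each *.txt file in <folder> its base name
# followed by every line from the "<number> contacts" line to end of file.
combine_contacts() {
    folder=$1
    output=$2

    echo "This is the beginning of the global output" > "$output"

    for file in "$folder"/*.txt; do
        [ -e "$file" ] || continue      # the glob matched nothing: skip
        # 1 - record which file the following block came from
        basename "$file" >> "$output"
        # 2 - append everything from the count line through the last line
        sed -n '/^[0-9][0-9]* contacts/,$p' "$file" >> "$output"
    done
}

# Example call (paths are placeholders):
#   combine_contacts ./contacts ./final_output.txt
```

Keep the output file outside the input folder, or give it an extension other than .txt; otherwise the glob would pick it up and the loop would read from the file it is appending to.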
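Answer 2
I found another approach that works on the same input files. Instead of searching for the count line, it deletes a fixed number of leading lines. This relies on the header always being exactly six lines long, as stated in the question; if your files differ (the sample files as shown above have five lines before the count line), adjust the range accordingly.
Code:
for i in file1 file3 file4; do sed '1,6d' "$i"; done > /var/tmp/outputfile.txt
Output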
19 contacts
atom1 atom2 overlap distance
:128.B@BB :300.C@BB -1.676 4.996
:179.B@BB :17.C@BB -1.898 5.218
:182.B@BB :17.C@BB -2.015 5.335
17 contacts
atom1 atom2 overlap distance
:128.B@BB :300.C@BB -1.676 4.996
:179.B@BB :17.C@BB -1.898 5.218
:182.B@BB :17.C@BB -2.015 5.335
12 contacts
atom1 atom2 overlap distance
:128.B@BB :300.C@BB -1.676 4.996
:179.B@BB :17.C@BB -1.898 5.218
:182.B@BB :17.C@BB -2.015 5.335
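Both answers start one sed process per file. As an alternative, here is a sketch that handles all files in a single awk call and also prints each file's path before its block. The wrapper name extract_contacts is hypothetical, and the sketch assumes the count line always has the form "<number> contacts".

```shell
# Hypothetical helper: extract_contacts <file>...
# Prints each file's path, then its lines from the "<number> contacts" line onward.
extract_contacts() {
    awk '
        FNR == 1 { keep = 0; print FILENAME }   # new file: reset flag, print its path
        /^[0-9]+ contacts/ { keep = 1 }         # count line reached: start keeping
        keep                                    # print the line while the flag is set
    ' "$@"
}

# Example call (paths are placeholders):
#   extract_contacts ./contacts/*.txt > /var/tmp/outputfile.txt
```

Because the match is on the count line itself rather than on a line number, this keeps working if the header grows or shrinks. FILENAME is printed exactly as passed on the command line, so it includes the directory part.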