Bash: combining the contents of multiple log files based on a search pattern

I have a folder containing many txt files. Each file has the following format:

Allowed overlap: -3
H-bond overlap reduction: 0.4
Ignore contacts between atoms separated by 4 bonds or less
Detect intra-residue contacts: False
Detect intra-molecule contacts: False

19 contacts
atom1  atom2  overlap  distance
:128.B@BB  :300.C@BB  -1.676  4.996
:179.B@BB  :17.C@BB   -1.898  5.218
:182.B@BB  :17.C@BB   -2.015  5.335

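My goal is to loop over the files in the folder and combine them into one global output. Note that in the example I only want to keep the lines after (and including) the "19 contacts" line (the number differs in each file), thereby skipping the first six lines of each file.

A possible workflow: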

# create a log file that will collect the info from all files looped over in the next step
echo "This is a beginning of the global output" > ./final_output.txt
# the key phrase marks the first line to take from each file: any number followed by " contacts"
key='^[0-9]+ contacts'

# now loop over the files and append every line after (and including) the ${key} line to final_output.txt
for file in "${folder}"/*.txt; do
  file_title=$(basename "$file")
  # 1 - print ${file_title} into final_output.txt
  # 2 - append the lines from the file to final_output.txt
  # NB! only the lines after (and including) the key phrase are needed

done

Answer 1

Take 3 files as an example:

file1

Allowed overlap: -3
H-bond overlap reduction: 0.4
Ignore contacts between atoms separated by 4 bonds or less
Detect intra-residue contacts: False
Detect intra-molecule contacts: False

19 contacts
atom1  atom2  overlap  distance
:128.B@BB  :300.C@BB  -1.676  4.996
:179.B@BB  :17.C@BB   -1.898  5.218
:182.B@BB  :17.C@BB   -2.015  5.335

file3

Allowed overlap: -3
H-bond overlap reduction: 0.4
Ignore contacts between atoms separated by 4 bonds or less
Detect intra-residue contacts: False
Detect intra-molecule contacts: False

17 contacts
atom1  atom2  overlap  distance
:128.B@BB  :300.C@BB  -1.676  4.996
:179.B@BB  :17.C@BB   -1.898  5.218
:182.B@BB  :17.C@BB   -2.015  5.335

file4

Allowed overlap: -3
H-bond overlap reduction: 0.4
Ignore contacts between atoms separated by 4 bonds or less
Detect intra-residue contacts: False
Detect intra-molecule contacts: False

12 contacts
atom1  atom2  overlap  distance
:128.B@BB  :300.C@BB  -1.676  4.996
:179.B@BB  :17.C@BB   -1.898  5.218
:182.B@BB  :17.C@BB   -2.015  5.335

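The code below prints each file from its first line that starts with a digit ("19 contacts", "17 contacts", "12 contacts") through the end of the file, and saves the combined result:

 for i in file1 file3 file4; do sed -n '/^[0-9]/,$p' "$i"; done > /var/tmp/outputfile.txt

Output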

19 contacts
atom1  atom2  overlap  distance
:128.B@BB  :300.C@BB  -1.676  4.996
:179.B@BB  :17.C@BB   -1.898  5.218
:182.B@BB  :17.C@BB   -2.015  5.335
17 contacts
atom1  atom2  overlap  distance
:128.B@BB  :300.C@BB  -1.676  4.996
:179.B@BB  :17.C@BB   -1.898  5.218
:182.B@BB  :17.C@BB   -2.015  5.335
12 contacts
atom1  atom2  overlap  distance
:128.B@BB  :300.C@BB  -1.676  4.996
:179.B@BB  :17.C@BB   -1.898  5.218
:182.B@BB  :17.C@BB   -2.015  5.335
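
The question also asks for each file's name to be written before its data, which the one-liner above does not do. A minimal sketch of that variant follows. The function name `combine_contacts` and its two arguments are hypothetical, and the stricter pattern assumes the marker line always has the form "<number> contacts".

```shell
#!/usr/bin/env bash
# combine_contacts FOLDER OUTPUT   (hypothetical helper, not from the original answer)
# For every *.txt file in FOLDER, append the file name to OUTPUT, followed by
# everything from the "<number> contacts" line to the end of that file.
combine_contacts() {
  local folder=$1 output=$2 file
  echo "This is a beginning of the global output" > "$output"
  for file in "$folder"/*.txt; do
    [ -e "$file" ] || continue        # skip if the glob matched nothing
    basename "$file" >> "$output"     # 1. the file title
    # 2. lines from "<digits> contacts" through the last line
    sed -n '/^[0-9][0-9]* contacts/,$p' "$file" >> "$output"
  done
}
```

It could be called as `combine_contacts "$folder" ./final_output.txt`. The output file should live outside the input folder (or not end in .txt), otherwise the loop would pick it up as an input.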

Answer 2

I found another method that works with the same input files: delete the first six lines of each file.

Code:

 for i in file1 file3 file4; do sed '1,6d' "$i"; done > /var/tmp/outputfile.txt

Output

19 contacts
atom1  atom2  overlap  distance
:128.B@BB  :300.C@BB  -1.676  4.996
:179.B@BB  :17.C@BB   -1.898  5.218
:182.B@BB  :17.C@BB   -2.015  5.335
17 contacts
atom1  atom2  overlap  distance
:128.B@BB  :300.C@BB  -1.676  4.996
:179.B@BB  :17.C@BB   -1.898  5.218
:182.B@BB  :17.C@BB   -2.015  5.335
12 contacts
atom1  atom2  overlap  distance
:128.B@BB  :300.C@BB  -1.676  4.996
:179.B@BB  :17.C@BB   -1.898  5.218
:182.B@BB  :17.C@BB   -2.015  5.335
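
Deleting a fixed number of lines only works while every header is exactly six lines long. As an alternative, here is a hedged sketch that handles all files in a single awk pass, matches the marker line instead of counting lines, and prints each file's name first. The function name `combine_with_awk` is hypothetical, and the sketch assumes the data always starts at a "<number> contacts" line.

```shell
#!/usr/bin/env bash
# combine_with_awk FILE...   (hypothetical helper)
# Prints, for each file, its base name and then every line from the
# "<number> contacts" line onward.
combine_with_awk() {
  awk '
    # first line of each file: reset the flag and print the file name without its directory
    FNR == 1 { keep = 0; name = FILENAME; sub(/.*\//, "", name); print name }
    # the marker line switches printing on for the rest of the file
    /^[0-9]+ contacts/ { keep = 1 }
    keep
  ' "$@"
}
```

For example, `combine_with_awk file1 file3 file4 > /var/tmp/outputfile.txt` would write the same data as above, with a file name line before each block.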
