Problem with sample names in a file created with awk


I have some data laid out as follows:

dir1
  |___dir2
         |___dir3
               |____files
                      |____TGH4_1.tar.gz
                      |____TGH4_2.tar.gz
                      |____IOP5_1.tar.gz
                      |____IOP5_2.tar.gz
                      |____RGH2_btre_1.tar.gz
                      |____RGH2_btre_2.tar.gz
                      |____QWE6_btre_1.tar.gz
                      |____QWE6_btre_2.tar.gz

Inside the dir3 folder I created a script, test.awk, with the following code:

BEGIN {
    FS="[/_]"; OFS="\t"
    print "sample", "Second", "Third"
}
NR%2 { second = $0; next }
{ print $2, second, $0 }

Using test.awk, I created a file:

printf '%s\n' "$PWD"/files/* | awk -f test.awk > test.txt

The output I get in test.txt looks like this:

sample  Second  Third
dir1    /dir1/dir2/dir3/H0032/files/TGH4_1.tar.gz   /dir1/dir2/dir3/H0032/files/TGH4_2.tar.gz
dir1    /dir1/dir2/dir3/H0032/files/IOP5_1.tar.gz   /dir1/dir2/dir3/H0032/files/IOP5_2.tar.gz
dir1    /dir1/dir2/dir3/H0032/files/RGH2_btre_1.tar.gz  /dir1/dir2/dir3/H0032/files/RGH2_btre_2.tar.gz
dir1    /dir1/dir2/dir3/H0032/files/QWE6_btre_1.tar.gz  /dir1/dir2/dir3/H0032/files/QWE6_btre_2.tar.gz

The output should look like this:

sample  Second  Third
TGH4    /dir1/dir2/dir3/H0032/files/TGH4_1.tar.gz   /dir1/dir2/dir3/H0032/files/TGH4_2.tar.gz
IOP5    /dir1/dir2/dir3/H0032/files/IOP5_1.tar.gz   /dir1/dir2/dir3/H0032/files/IOP5_2.tar.gz
RGH2_btre   /dir1/dir2/dir3/H0032/files/RGH2_btre_1.tar.gz  /dir1/dir2/dir3/H0032/files/RGH2_btre_2.tar.gz
QWE6_btre   /dir1/dir2/dir3/H0032/files/QWE6_btre_1.tar.gz  /dir1/dir2/dir3/H0032/files/QWE6_btre_2.tar.gz

Answer 1

Change:

BEGIN { FS="[/_]" ... }
...
{ print $2, second, $0 }

to:

BEGIN { FS="/" ... }
...
{
    sample = $NF
    sub(/_[^_]*$/,"",sample)
    print sample, second, $0
}
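
For reference, here is a sketch of how the complete script might look once the change is applied. The header line and the odd/even line pairing are taken unchanged from the question; only the sample extraction is new. The wrapper function name make_table is hypothetical and exists only so the example can be run on its own; in practice the awk program would live in test.awk as before.

```shell
# Hypothetical wrapper: reads one path per line on stdin and
# pairs each odd-numbered line with the even-numbered line after it.
make_table() {
    awk '
    BEGIN {
        FS = "/"; OFS = "\t"
        print "sample", "Second", "Third"
    }
    NR % 2 { second = $0; next }      # odd line: remember the first file of the pair
    {
        sample = $NF                  # last path component, e.g. TGH4_2.tar.gz
        sub(/_[^_]*$/, "", sample)    # strip the final underscore and what follows it
        print sample, second, $0
    }'
}

# Example run with one pair of paths from the question
printf '%s\n' \
    /dir1/dir2/dir3/H0032/files/RGH2_btre_1.tar.gz \
    /dir1/dir2/dir3/H0032/files/RGH2_btre_2.tar.gz | make_table
```

Because the regular expression is anchored at the end and `[^_]*` cannot cross an underscore, only the last `_N.tar.gz` part is removed, so names that contain an underscore themselves (RGH2_btre) are kept whole.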

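Your code prints dir1 because, with FS="[/_]", dir1 is the second field of /dir1/dir2/dir3/H0032/files/TGH4_2.tar.gz (the first field is the empty string before the leading slash), and your code says print $2. With FS="/", the last field $NF is the file name, and the sub() call strips the final underscore and everything after it, leaving the sample name.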
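
To see the field splitting directly, a small diagnostic along these lines can help. It is only an illustrative sketch: the function name show_fields is hypothetical, and the path is the one used in the question.

```shell
# Hypothetical helper: print every field awk sees when FS is "[/_]".
show_fields() {
    awk -F'[/_]' '{ for (i = 1; i <= NF; i++) printf "$%d=[%s]\n", i, $i }'
}

printf '%s\n' /dir1/dir2/dir3/H0032/files/TGH4_2.tar.gz | show_fields
```

The first field comes out empty because the path starts with a separator, which is why dir1 ends up in $2 and the sample name TGH4 only appears in $7.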
