I have the following zip archive structure:
$ unzip -l Undetermined_S0_L004_R1_001_fastqc.zip
Archive: Undetermined_S0_L004_R1_001_fastqc.zip
Length Date Time Name
-------- ---- ---- ----
0 10-10-14 14:44 Undetermined_S0_L004_R1_001_fastqc/
0 10-10-14 14:44 Undetermined_S0_L004_R1_001_fastqc/Icons/
0 10-10-14 14:44 Undetermined_S0_L004_R1_001_fastqc/Images/
1197 10-10-14 14:44 Undetermined_S0_L004_R1_001_fastqc/Icons/fastqc_icon.png
1450 10-10-14 14:44 Undetermined_S0_L004_R1_001_fastqc/Icons/warning.png
1561 10-10-14 14:44 Undetermined_S0_L004_R1_001_fastqc/Icons/error.png
1715 10-10-14 14:44 Undetermined_S0_L004_R1_001_fastqc/Icons/tick.png
782 10-10-14 14:44 Undetermined_S0_L004_R1_001_fastqc/summary.txt
9095 10-10-14 14:44 Undetermined_S0_L004_R1_001_fastqc/Images/per_base_quality.png
14381 10-10-14 14:44 Undetermined_S0_L004_R1_001_fastqc/Images/per_tile_quality.png
23205 10-10-14 14:44 Undetermined_S0_L004_R1_001_fastqc/Images/per_sequence_quality.png
30978 10-10-14 14:44 Undetermined_S0_L004_R1_001_fastqc/Images/per_base_sequence_content.png
31152 10-10-14 14:44 Undetermined_S0_L004_R1_001_fastqc/Images/per_sequence_gc_content.png
7861 10-10-14 14:44 Undetermined_S0_L004_R1_001_fastqc/Images/per_base_n_content.png
18356 10-10-14 14:44 Undetermined_S0_L004_R1_001_fastqc/Images/sequence_length_distribution.png
23040 10-10-14 14:44 Undetermined_S0_L004_R1_001_fastqc/Images/duplication_levels.png
9096 10-10-14 14:44 Undetermined_S0_L004_R1_001_fastqc/Images/adapter_content.png
58683 10-10-14 14:44 Undetermined_S0_L004_R1_001_fastqc/Images/kmer_profiles.png
355919 10-10-14 14:44 Undetermined_S0_L004_R1_001_fastqc/fastqc_report.html
301092 10-10-14 14:44 Undetermined_S0_L004_R1_001_fastqc/fastqc_data.txt
10117 10-10-14 14:44 Undetermined_S0_L004_R1_001_fastqc/fastqc.fo
-------- -------
899680 21 files
How can I run crimson on the fastqc_data.txt of each archive in parallel? At the moment I get the following error:
find `pwd`/*_fastqc.zip -type f | parallel -j 3 unzip -c {} {}/fastqc_data.txt | crimson fastqc {} | less
Usage: crimson fastqc [OPTIONS] INPUT [OUTPUT]
Error: Invalid value for "input": Path "{}" does not exist.
Answer 1
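You have a pipeline of four commands:
find, which lists the zip files.
parallel, which calls unzip to extract one file from each zip file. Since {} is replaced by the path to the zip file, you are trying to extract a file named /home/user977828/stuff/Undetermined_S0_L004_R1_001_fastqc.zip/fastqc_data.txt from the archive (if the current directory is /home/user977828/stuff). No member with that name exists in the archive.
crimson, which receives the concatenation of all extracted files on its standard input and is called with the arguments fastqc and the literal string {}.
less.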
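To make the mismatch concrete, here is a minimal sketch that prints the member name the original command requests next to the member name the archive actually contains. The path is an assumption modeled on the listing in the question; the variable names are illustrative only.

```sh
#!/bin/sh
# Hypothetical archive path, modeled on the listing in the question.
zip=/home/user977828/stuff/Undetermined_S0_L004_R1_001_fastqc.zip

# What `unzip -c {} {}/fastqc_data.txt` asks for once {} is substituted.
requested="$zip/fastqc_data.txt"

# What the archive contains: the bare archive name without ".zip",
# followed by the file name.
actual="$(basename "$zip" .zip)/fastqc_data.txt"

printf 'requested: %s\nactual:    %s\n' "$requested" "$actual"
```

The two names differ both in the leading directory and in the .zip suffix, which is why the suffix has to be stripped before the name is used as a member path.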
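parallel only substitutes {} within its own arguments. It has no influence over the other parts of the pipeline. If you want to call crimson separately on each fastqc_data.txt file, you need to pass the whole pipeline from unzip to crimson as an argument to parallel.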
find *_fastqc.zip -type f | sed 's/\.zip$//' |
parallel -j 3 'unzip -c {}.zip {}/fastqc_data.txt | crimson fastqc /dev/stdin' |
less
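
The substitution rule can be checked with a small experiment that needs neither parallel nor crimson. The sketch below uses xargs -I{} as a stand-in for parallel (an assumption made only so the example runs on any POSIX system; a.zip and b.zip are hypothetical names). The {} inside the xargs arguments is replaced, while the {} in the next pipeline stage reaches sed as a literal string.

```sh
#!/bin/sh
# xargs -I{} plays the role of parallel here (a stand-in, not the real tool).
# The first {} is one of xargs's own arguments, so it is substituted.
# The second {} belongs to a separate command after the pipe, so the
# shell hands it to sed unchanged.
out=$(printf '%s\n' a.zip b.zip |
  xargs -I{} echo "inside: {}" |
  sed 's/^/outside {} | /')

printf '%s\n' "$out"
# prints:
# outside {} | inside: a.zip
# outside {} | inside: b.zip
```

This is the same reason crimson in the original command received the literal path {}: only text inside the quoted command string handed to parallel is subject to replacement.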