Bash 脚本无法识别命令(Perl 和 Python 脚本)

Bash 脚本无法识别命令(Perl 和 Python 脚本)

我在 bash 里面有一个脚本,它定义为使用 perl 和 python 作为需要分析我的数据的子程序。

脚本执行正常,直到步骤 3 和步骤 4,然后崩溃command not found。我尝试了chmod u+x那些 .pl .py 脚本;我尝试明确定义 perl 和 python 的路径,但没有任何效果。脚本与我正在执行的 .sh 脚本位于同一目录中。我遗漏了一些东西,但我不知道是什么。

这是我在步骤 3 和步骤 4 中遇到的错误:

步骤3&4错误

这是步骤 3 和 4 中需要使用的 bash 脚本和 .pl 和 .py 脚本。

#!/bin/bash

if [ $# -lt 1 ]
then
    echo "Please type $0 -h for helps"
    exit 65
fi

FORMAT="fastq"
REF_PATH="/home/maja2004lts/16sPIP/"
MODE="fast"
THREAD=8
Version="0.1.1"
step="step1"
THREAD=8

while getopts ":i:r:f:s:p:m:t:hv" opt
do
    case "${opt}" in
        i) NGS=${OPTARG}
           HELP=0       
        ;;
        r) NGS_R2=${OPTARG}
        #echo "${OPTARG}"
        ;;
        f) FORMAT=${OPTARG}
        ;; 
        h) HELP=1
        #echo "HELP IS $HELP"
        ;;
        s) STEP=${OPTARG}  
        ;;
        v) VERIFICATION=1 
        ;;
        p) REF_PATH=${OPTARG} #reference DB path
        ;;
        m) MODE=${OPTARG} #fast or sensitive
        ;;
        t) THREAD=${OPTARG}
        ;;
        ?) echo "Option -${opt} requires an argument. Please type $0 -h for helps" >&2
        exit 1
        ;;
    esac
done

if [ $HELP -eq 1 ]
then
    cat <<USAGE

16sPIP version ${Version}

This program will run the 16sPIP pipeline with the supplied parameters.

Command Line Switches:

    -h  Show this help & ignore all other switches

    -i  Specify NGS file or pair-end forward reads for processing

    -r      Specify pair-end reverse reads for processing

    -f  Specify NGS file format (fastq/fasta/bam/sam/sff) [fastq]
        
        16sPIP will support more NGS file formats in the future
        
    -p  Specify the PATH for database (DB)

        16sPIP will search the reference DB under the Path provided.
        
             fast_16S_nucl_DB
             sensitive_nucl_DB

    -v  Verification mode

    -s      jump to that step [1]

    -m      Specify the analysis mode: fast or sensitive [fast]
    -t      number of threads [1]



Usage:

        $0 -i <NGSfile> -f <fastq/fasta/bam/sam/sff> -p <reference_path>

        $0 -i <forward> -r <reverse> -f <fastq> -p <reference_path> -m <fast> -s <step>


USAGE
    exit
fi

if [ ! -f $NGS ]
then
    echo "$NGS file doesnot exist. Please check it"
    exit 65
fi 

if [ $NGS_R2 -a "${FORMAT}" != "fastq" ]
then
        echo "$NGS_R2 must be in fastq format"
    exit 
fi

if [ "$MODE"x != "fast"x -a "$MODE"x != "sensitive"x ]
then
        echo "Specify the analysis mode: fast or sensitive [fast]"
    exit
fi

if [ "${FORMAT}" = "sam" -o "${FORMAT}" = "bam" ]
then
     picard-tools SamToFastq INPUT=$NGS FASTQ=${NGS}.tmp
     mv ${NGS}.tmp $NGS
elif [ "${FORMAT}" = "sff" ]
then
      python ${REF_PATH}/bin/sff_extract $NGS >${NGS}.fastq
      mv ${NGS}.fastq $NGS
fi

if [ "$step" != "" ] 
then 
  goto ${step}
fi

step1:
echo "Step 1: Quality control "
if [ "${FORMAT}" = "fastq" -o "${FORMAT}" = "sam" -o "${FORMAT}" = "bam" -o "${FORMAT}" = "sff" ]
then
    if [ ${NGS_R2} -a -f ${NGS_R2} ]
    then
             /usr/bin/perl ${REF_PATH}/bin/TrimmingReads.pl -i ${NGS} -irev ${NGS_R2} -q 20 -n 50
    else
             /usr/bin/perl ${REF_PATH}/bin/TrimmingReads.pl -i $NGS -q 20 -n 50
    fi
elif [ "${FORMAT}" = "fasta" ]
then
          /usr/bin/perl ${REF_PATH}/bin/TrimmingReads.pl -i $NGS -n 50
else
          echo "Specify NGS file format (fastq/fasta/bam/sam/sff) [fastq]"
      exit
fi

if [ $NGS_R2 -a -f $NGS_R2 ]
then
       goto step2
else
       goto step3
fi

step2:
echo "Step 2: Merge double-ended reads"
${REF_PATH}/bin/pear -f ${NGS}_trimmed -r ${NGS_R2}_trimmed -o $NGS -q 20 -t 50
mv ${NGS}.assembled.fastq ${NGS}_trimmed
rm ${NGS_R2}_trimmed

step3:
echo "Step 3: sequence filtering"
if [ "$FORMAT" = "fasta" ]
then
      perl ${REF_PATH}/bin/FilterReads.pl ${NGS}_trimmed fasta 10
else
      perl ${REF_PATH}/bin/FilterReads.pl ${NGS}_trimmed fastq 10
fi

step4:
echo "Step 4: Results report"
if [ "$FORMAT" = "fasta" ]
then
      python ${REF_PATH}/bin/basicStatistics.py ${NGS}_trimmed_filter fasta ${NGS}.basic_stat.txt
else
      python ${REF_PATH}/bin/basicStatistics.py ${NGS}_trimmed_filter fastq ${NGS}.basic_stat.txt
fi

if [ "$MODE" = "fast" ]
then
      bwa mem -t ${THREAD} ${REF_PATH}/db/155pathogens.fa ${NGS}_trimmed_filter > ${NGS}.sam
      /usr/bin/perl ${REF_PATH}/bin/pathogenSamMatch.pl $NGS.sam ${NGS}.pathon.match.txt
      cat ${NGS}.pathon.match.txt >> ${NGS}.basic_stat.txt
      rm ${NGS}.pathon.match.txt
      /usr/bin/perl ${REF_PATH}/bin/SamSingleResult.pl -i $NGS.sam -s 99 -l species -o $NGS.pathogen
      /usr/bin/perl ${REF_PATH}/bin/PathogenAnnotation.pl -i $NGS.pathogen -l ${REF_PATH}/db/155pathogens.list -o $NGS.pathogen.list
      /usr/bin/perl ${REF_PATH}/bin/FastReport.pl -l $NGS.pathogen.list -s ${NGS}.basic_stat.txt -o ${NGS}.pathogen.prediction.report
elif [ "$MODE" = "sensitive" ]
then
      bwa mem -t ${THREAD} ${REF_PATH}/db/16S-complete.fa ${NGS}_trimmed_filter > $NGS.com.sam &
      echo $! >Job_id
      bwa mem -t ${THREAD} ${REF_PATH}/db/155pathogens.fa ${NGS}_trimmed_filter > $NGS.sam
      /usr/bin/perl ${REF_PATH}/bin/SamSingleResult.pl -i $NGS.sam -l species -s 99 -o $NGS.pathogen
      /usr/bin/perl ${REF_PATH}/bin/PathogenAnnotation.pl -i $NGS.pathogen -l ${REF_PATH}/db/155pathogens.list -o $NGS.pathogen.list
      if [ "$FORMAT" = "fasta" ]
      then
         /usr/bin/perl ${REF_PATH}/bin/extractSequence.pl -i ${NGS}_trimmed_filter -o ${NGS} -l $NGS.pathogen.list -f fasta
      else
         /usr/bin/perl ${REF_PATH}/bin/extractSequence.pl -i ${NGS}_trimmed_filter -o ${NGS} -l $NGS.pathogen.list -f fastq
      fi
      for i in `ls ${NGS}.*.pathon.fa`
      do
       ${REF_PATH}/bin/blastn -query $i -out $i.blast -db ${REF_PATH}/db/16S-complete -outfmt 6 -evalue 1E-20 -num_threads ${THREAD} &
       echo $! >>Job_id
      done
      /usr/bin/perl ${REF_PATH}/bin/pathogenSamMatch.pl $NGS.sam ${NGS}.pathon.match.txt
      cat ${NGS}.pathon.match.txt >> ${NGS}.basic_stat.txt
      while [ 1 ]
      do
       tag=1
       for i in `cat Job_id`
       do
          ID=`ps |grep " $i "`
       if [ "$ID" != "" ]
       then
           tag=0
           sleep 10
       fi
       done
       if [ $tag == 1 ]
       then
            break
       fi
    done
    /usr/bin/perl ${REF_PATH}/bin/SamSingleResult.pl -i $NGS.com.sam -s 99 -l species 
    /usr/bin/perl ${REF_PATH}/bin/SamSingleResult.pl -i $NGS.com.sam -s 97 -l genus
    /usr/bin/perl ${REF_PATH}/bin/SamSingleResult.pl -i $NGS.com.sam -s 95 -l family
    /usr/bin/perl ${REF_PATH}/bin/completeAnnotation.pl $NGS.com.sam.species ${REF_PATH}/db/16S-complete.list species $NGS.com.sam.species.list
    /usr/bin/perl ${REF_PATH}/bin/completeAnnotation.pl $NGS.com.sam.genus ${REF_PATH}/db/16S-complete.list genus $NGS.com.sam.genus.list
    /usr/bin/perl ${REF_PATH}/bin/completeAnnotation.pl $NGS.com.sam.family ${REF_PATH}/db/16S-complete.list family $NGS.com.sam.family.list
    cat ${NGS}.*.pathon.fa.blast >$NGS.tmp.blast
    rm ${NGS}.*.pathon.fa*
    /usr/bin/perl ${REF_PATH}/bin/completeAnnotation.pl $NGS.tmp.blast ${REF_PATH}/db/16S-complete.list species $NGS.tmp.blast.list.species
    /usr/bin/perl ${REF_PATH}/bin/sensitiveReport.pl -l $NGS.pathogen.list -t ${NGS}.basic_stat.txt -g $NGS.com.sam.genus.list -f $NGS.com.sam.family.list -b $NGS.tmp.blast.list.species -s $NGS.com.sam.species.list -o ${NGS}.pathogen.prediction.report
fi

enscript -p ${NGS}.pathogen.prediction.report.ps ${NGS}.pathogen.prediction.report
ps2pdf ${NGS}.pathogen.prediction.report.ps ${NGS}.pathogen.prediction.report.pdf

--------------perl script-----------

#!/usr/bin/perl
# 
#    FilterReads.pl
#    Read filter procedures were based on the read ambiguous base (N) content.
#    
#    FilterReads.pl <seq> <fasta|fastq> <N_percentage> 
#
### Authors : JiaojiaoMiao <[email protected]>


use strict;
use warnings;

if($#ARGV<1){
    die "usage: $0 <seq> <fasta|fastq> <N_percentage>\n";
}

my $f=$ARGV[0];
my $ncutoff;
my $format=$ARGV[1];
if(defined($ARGV[2])){
    $ncutoff=$ARGV[2];
}
else{
    $ncutoff=10;
}

my $o=$f."_filter";
open(F,"<$f")or die "can not open file: $f\n";
open(O,">$o")or die "can not open file: $o\n";

if($format=~/fastq/i){
    while(my $l=<F>){
        my $l2=<F>;
        my $l3=<F>;
        my $l4=<F>;
        my $length=length($l2);
        my $ncount=$l2=~tr/N|n/N/;
        if($length-1 == 0){
            next;
        }
        if($ncount/($length-1)*100<=$ncutoff){
            print O "$l$l2$l3$l4";
        }
    }
}
else{
    my $line=<F>;
    my $ncount=0;
    my $seqlen=0;
    my $seq="";
    while(my $l=<F>){
        if($l=~/^>/){
            if($seqlen != 0){
                if($ncount/$seqlen*100 <=$ncutoff){
                    print O "$line$seq\n";
                }
            }
        $line=$l;
        $ncount=0;
        $seqlen=0;
        $seq="";
        }
        else{
            $l=~s/\s//g;
            $seqlen += length($l);
            $ncount += $l=~tr/N|n/N/;
            $seq .= $l;
        }
    }
    if($seqlen != 0){
        if($ncount/$seqlen*100 <= $ncutoff){
            print O "$line$seq\n";
        }
    }
}

close F;
close O;
-----------------------python script---------------------

#!/usr/bin/python
#-*- coding:utf-8 -*-
#
#

import re,os,sys

if len(sys.argv) < 3:
    print "usage: basicStatistics.py <parent_file> <fasta|fastq> <outFile>"
    sys.exit(-1)

file=sys.argv[1]
format=sys.argv[2]
outfile=sys.argv[3]

file_size=float(os.path.getsize(file))/1024/1024
#print ("%.3fM\n")%file_size
fh=open(file)
lines=fh.readlines()
if format == "fastq" :
    read=[line[:-1] for line in lines[1::4]]
    reads_num=len(read)
    seq="".join(read).upper()
    base_num=len(seq)
    gc_num=seq.count("C")+seq.count("G")
    gc_num_p=float(gc_num)/base_num*100
elif format == "fasta" :
    read=[line[:-1] for line in lines[1::2]]
    reads_num=len(read)
    seq="".join(read).upper()
    base_num=len(seq)
    gc_num=seq.count("C")+seq.count("G")
    gc_num_p=float(gc_num)/base_num*100
else:
    print "usage: basicStatistics.py <parent_file> <fasta|fastq>"
    sys.exit(-1)

#print ("%s\t%s\n%s\n%s\t%.3f\n")%(read,reads_num,seq,base_num,gc_num_p)

out=open(outfile,'w')
out.write("SampleFile: "+file+"\n")
out.write("Sum Data: %.3fM\n"%file_size)
out.write("Read Num: %i\n"%reads_num)
out.write("Read Len: %i\n"%(base_num/reads_num))
out.write("GC: %.3f\n"%gc_num_p)

fh.close()
out.close()

答案1

bash 中没有goto命令,因此脚本中包含此命令的行不正确,应将其删除。

此外,行(我猜应该是标签)step1:step2:和不正确,这些行实际上导致了错误消息。在您发布的屏幕截图上,消息很清晰:step3:step4:

16sPIP.sh: line 168: step3:: command not found

16sPIP.sh: line 177: step4:: command not found

这显然表明“未找到”的命令是step3:step4:。(您一定也收到过类似的step1:和消息step2:,但它们已经滚出屏幕,您可能没有注意到它们)。

与您的说法相反,脚本不会因以下消息而“崩溃”:bash 只是忽略未找到的命令并从脚本的下一行继续执行,这可以通过以下事实清楚地表明:文本(由脚本中的行后命令Step 4: Results report打印)出现在屏幕上并且 Python 脚本开始执行(但抛出语法错误)。echostep4:basicStatistics.py

您必须重新设计脚本以摆脱goto命令和标签,因为它们在 bash 中不存在,并用适当的控制结构替换它们。

您的脚本还存在另一个问题。如果变量为空,则if包含“裸”变量的命令(如下面的命令)将因语法错误而失败。$NGS_R2

if [ $NGS_R2 -a "${FORMAT}" != "fastq" ]
if [ ${NGS_R2} -a -f ${NGS_R2} ]
if [ $NGS_R2 -a -f $NGS_R2 ]

我假设您想测试$NGS_R2变量是否为空。在这种情况下,您应该使用-n运算符,如下所示:

if [ -n "$NGS_R2" -a "${FORMAT}" != "fastq" ]
if [ -n "${NGS_R2}" -a -f "${NGS_R2}" ]
if [ -n "$NGS_R2" -a -f "$NGS_R2" ]

(您在步骤 4 中执行的 Python 脚本中似乎也有一个语法错误basicStatistics.py- 屏幕截图上显示了一条错误消息 - 但这是一个完全不同的问题。)

相关内容