Using a "for statement" to simplify bash statement with multiple repeats

I'm subsetting files with grep and then concatenating the resulting files with cat. However, I'm still a bit confused about how to use a for statement, e.g. for ((i=1;i<23;i+=1)), to avoid repeating the same command.

Given my file file1.txt, I would like to grep sample1 as follows:

grep -w '^sample1' file1.txt > sample1_file.txt
grep -w '^sample2' file2.txt > sample2_file.txt
grep -w '^sample3' file3.txt > sample3_file.txt
....
grep -w '^sample22' file22.txt > sample22_file.txt

And then concatenate these:

cat  sample1_file.txt  sample2_file.txt  sample3_file.txt ...  sample22_file.txt > final_output.txt

Answer 1

Try:

for i in {1..22}
do
    grep -w "^sample$i" "file$i.txt"
done >final_output.txt

Notes:

  1. {1..22} runs through all the integers from 1 to 22. For people not familiar with C, it is probably more intuitive (but less flexible) than ((i=1;i<23;i+=1)).

  2. It is important that the expression ^sample$i be inside double-quotes rather than single-quotes so that the shell will expand $i.

  3. If all you want is final_output.txt, there is no need to create the intermediate files.

  4. Notice that it is efficient to place the redirection to final_output.txt after the done statement: in this way, the shell needs to open and close this file only once.
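To make note 2 concrete, here is a minimal sketch of how the two quoting styles treat $i. The variable names single and double are made up for illustration and are not part of the answer above.

```shell
i=3

# Single quotes: the shell leaves $i alone, so grep would search for the literal text.
single='^sample$i'

# Double quotes: the shell substitutes the value of i before grep sees the pattern.
double="^sample$i"

printf '%s\n' "$single" "$double"
# prints:
# ^sample$i
# ^sample3
```

With the single-quoted form, grep would receive the same unexpanded pattern on every iteration, which is why the loop needs double quotes.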

Answer 2

You reference parameters and variables with the dollar sign, so the loop counter i can be used as $i within the loop (within double quotes, not single quotes). Then you just need the do and done keywords to start and end the loopy part.

So, the straightforward conversion:

> final_output.txt
for (( i = 1 ; i < 23 ; i += 1)); do
    grep -w "^sample$i" "file$i.txt" > "sample${i}_file.txt"
    cat "sample${i}_file.txt" >> final_output.txt
done

Using the quotes around "file${i}.txt" is not strictly necessary as long as i only contains a number, but it's a good habit to quote any variable references, for lots of reasons.

Note that in the case of sample${i}_file.txt we need the braces in ${i}, since the underscore is valid in a variable name, and writing $i_file.txt would refer to the variable i_file.
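The difference the braces make can be checked with a small sketch. The names with_braces and without_braces are hypothetical, and i_file is unset explicitly so the result does not depend on the surrounding environment.

```shell
i=7
unset i_file   # make sure the accidentally referenced variable really is unset

with_braces="sample${i}_file.txt"     # expands i, then appends _file.txt
without_braces="sample$i_file.txt"    # expands the (unset) variable i_file, then appends .txt

printf '%s\n' "$with_braces" "$without_braces"
# prints:
# sample7_file.txt
# sample.txt
```

The period ends the variable name in both cases, since a period is not valid in a name; only the underscore needs the braces.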

The initial > final_output.txt is to clear the file at first, since we append to it within the loop. Of course, you can just skip creating the sample1_file.txt files if you don't need them, and just grep ... >> final_output.txt.

Alternatively, you could use brace expansion to generate a list of the numbers, instead of counting manually with the for (( ... )) loop, i.e. for i in {1..22}; do ... done.

Or, in standard POSIX shell:

i=1
while [ "$i" -lt 23 ] ; do
    grep ...
    i=$((i + 1))
done
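As a rough, self-contained sketch of that POSIX-style loop, the following builds three small made-up input files in a scratch directory and runs the same grep over them. The file contents, the workdir variable, and the limit of 3 instead of 22 are assumptions for the demo; mktemp -d is widely available but not itself part of POSIX.

```shell
# Hypothetical demo data: three small input files in a scratch directory.
workdir=$(mktemp -d)
for n in 1 2 3; do
    printf 'sample%s value%s\nother line\n' "$n" "$n" > "$workdir/file$n.txt"
done

# The counting loop, using only POSIX shell features.
i=1
while [ "$i" -lt 4 ]; do
    grep -w "^sample$i" "$workdir/file$i.txt"
    i=$((i + 1))
done > "$workdir/final_output.txt"   # one redirection for the whole loop

cat "$workdir/final_output.txt"
# prints:
# sample1 value1
# sample2 value2
# sample3 value3
```

Because the redirection sits after done, there is no need to clear the output file first or to append inside the loop.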

Answer 3

Use:

#!/bin/bash
:>final_output.txt
for i in {1..22}; do
    a=file"$i".txt
    b=sample"$i"_file.txt

    grep -w '^sample'"$i" "$a" > "$b"
    cat "$b" >> final_output.txt
done

The line :>final_output.txt truncates the file to 0 bytes (or creates it if it does not exist yet).
The "Brace expansion" {1..22} expands to the list of numbers 1 2 3 … 22.
The line for i in {1..22}; do loops over all numbers 1 … 22.
The "Append" (>>) is required because every iteration adds its output to the same file; a plain > would overwrite the file on each pass.
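The interplay of truncating and appending can be seen in a small sketch. The scratch file out stands in for final_output.txt, and the helper variable emptied is only there to report the result; both are assumptions for the demo.

```shell
out=$(mktemp)                  # hypothetical scratch file standing in for final_output.txt
printf 'stale line\n' > "$out" # leftover content from an earlier run

: > "$out"                     # truncate: the file still exists but is now empty
if [ -s "$out" ]; then emptied=no; else emptied=yes; fi

for n in 1 2 3; do
    printf 'line %s\n' "$n" >> "$out"   # >> adds to the end on every pass
done

echo "emptied: $emptied"
cat "$out"
# prints:
# emptied: yes
# line 1
# line 2
# line 3
```

Without the initial truncation, the stale line from the earlier run would still be at the top of the file.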