Cartesian product of two outputs in shell


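I have a shell script that extracts part of the file name and the header columns from each file in a directory. A sample file name from that directory is shown first, followed by the script: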

2222_AAA Accounting Statistic-42005_04May2020_0900-04May2020_1000.csv


#!/bin/bash

# Go to where the files are located
filedir=/home/vikrant_singh_rana/AAA_USP/sample-Files/*

for filename in $filedir
do
 #echo "Processing $filepath"
 # do something on $f
 printf '%s,%s\n' "$(basename "$filename" ".csv" | grep -oP '(?<=_).*(?=\-\d\d\d)' )" "$(head -n1 "$filename")"

done > test.txt;

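The shell script above produces the following output, which is the name taken from the file name followed by the header columns of the input file: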

cat test.txt
AAA Accounting Statistic,TIMESTAMP,C420050004,C420050005,C420050006,C420050007

What I expect instead is the Cartesian product of the file name and the columns in the file:

AAA Accounting Statistic,TIMESTAMP
AAA Accounting Statistic,C420050004
AAA Accounting Statistic,C420050005
AAA Accounting Statistic,C420050006
AAA Accounting Statistic,C420050007

Answer 1

You need a second loop to process the first line of $filename:

for filename in /home/vikrant_singh_rana/AAA_USP/sample-Files/*; do
    # ...
    b=$(basename "$filename" ".csv" | grep -oP '(?<=_).*(?=\-\d\d\d)' )
    for c in $(head -n1 "$filename" | sed 's/,/ /g'); do
        printf '%s,%s\n' "$b" "$c"
    done
done > test.txt

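PS: This assumes that the first line of $filename contains no whitespace characters, because the unquoted $(...) is split on whitespace. Shell glob characters such as * in the header would also be expanded.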
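If header fields may contain spaces, one way around that assumption is to split the line on commas only and read each field whole. The following is a minimal sketch, not part of the original answer: the function name `cross_header` is hypothetical, and it still assumes that no field contains a quoted, embedded comma.

```shell
#!/bin/sh
# Hypothetical helper: print "label,column" for every header field of a CSV file.
cross_header() {
    label=$1
    file=$2
    # Take only the first line, put each comma-separated field on its own line,
    # then read the fields one at a time so embedded spaces are preserved.
    # The "|| [ -n "$col" ]" part keeps the last field if the line has no
    # trailing newline.
    head -n 1 "$file" | tr ',' '\n' | while IFS= read -r col || [ -n "$col" ]; do
        printf '%s,%s\n' "$label" "$col"
    done
}
```

Inside the loop above it could be called as `cross_header "$b" "$filename"` in place of the inner `for` loop.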

Answer 2

#!/bin/sh

for pathname in /home/vikrant_singh_rana/AAA_USP/sample-Files/*.csv
do
    name=${pathname##*/}   # remove directory path
    name=${name#*_}        # remove *_ prefix (up to first underscore)
    name=${name%%-*}       # remove -* suffix (from first dash)

    awk -F , -v name="$name" 'BEGIN { OFS=FS } { for (i = 1; i <= NF; ++i) print name, $i; exit }' "$pathname"
done

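This iterates over all CSV files and removes the directory path and the initial NNNN_ string from each name, as well as everything from the first - character onward. The resulting string is saved in $name.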
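To see what each expansion does, the three steps can be applied to the sample file name from the question. This is only an illustrative sketch; the function name `extract_label` is hypothetical and does not appear in the answer.

```shell
#!/bin/sh
# Hypothetical helper wrapping the three parameter expansions from the answer.
extract_label() {
    name=${1##*/}      # longest prefix matching */ removed: the directory path
    name=${name#*_}    # shortest prefix matching *_ removed: the leading "2222_"
    name=${name%%-*}   # longest suffix matching -* removed: from the first dash on
    printf '%s\n' "$name"
}

extract_label '/home/vikrant_singh_rana/AAA_USP/sample-Files/2222_AAA Accounting Statistic-42005_04May2020_0900-04May2020_1000.csv'
# prints: AAA Accounting Statistic
```

Because the suffix is cut at the first dash, a label that itself contains a dash would be truncated there.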

A short awk program is then run on the file. It prints the fields of the file's first line on separate lines, each prefixed with the value stored in $name.

This assumes that the file is simple CSV, with no embedded commas or newlines in the fields of the first line.
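The awk program can be tried without any real files by piping in a small sample. In this sketch the function name `demo_awk` and the data row are made up for illustration; only the awk program itself is taken from the answer.

```shell
#!/bin/sh
# Hypothetical demo: feed a header line plus one made-up data row to the
# awk program. Only the header is printed, because the program calls
# exit after handling the first record.
demo_awk() {
    printf '%s\n' 'TIMESTAMP,C420050004,C420050005' '2020-05-04 09:00,1,2' |
        awk -F , -v name="$1" 'BEGIN { OFS=FS } { for (i = 1; i <= NF; ++i) print name, $i; exit }'
}

demo_awk 'AAA Accounting Statistic'
```

Each of the three header fields should appear on its own line, prefixed with the label and a comma.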


If you don't have thousands of files, you could also use GNU awk like this:

awk -F , '
    BEGIN { OFS=FS }
    BEGINFILE {
        name = FILENAME
        sub(".*/", "", name)       # remove directory path
        sub("^[^_]*_", "", name)   # remove *_ prefix (up to first underscore)
        sub("-.*", "", name)       # remove -* suffix (from first dash)
    }
    {
        for (i = 1; i <= NF; ++i) print name, $i
        nextfile
    }' /home/vikrant_singh_rana/AAA_USP/sample-Files/*.csv
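BEGINFILE is specific to GNU awk, and a glob that expands to a very large number of files can exceed the command-line length limit, which is presumably why the answer mentions "thousands of files". As a hedged alternative that is not part of the original answer, the sketch below uses `FNR == 1` instead of BEGINFILE and lets `find ... -exec ... {} +` batch the file names. The function name `header_product` is hypothetical. Note that `find` also descends into subdirectories, and that without `nextfile` awk reads each file to the end.

```shell
#!/bin/sh
# Hypothetical wrapper: for every *.csv under a directory, print the label
# derived from the file name crossed with each field of the header line.
header_product() {
    dir=$1
    find "$dir" -type f -name '*.csv' -exec awk -F , '
        BEGIN { OFS = FS }
        FNR == 1 {                     # first line of each input file
            name = FILENAME
            sub(".*/", "", name)       # remove directory path
            sub("^[^_]*_", "", name)   # remove prefix up to first underscore
            sub("-.*", "", name)       # remove suffix from first dash
            for (i = 1; i <= NF; ++i) print name, $i
        }' {} +
}
```

It could be invoked as `header_product /home/vikrant_singh_rana/AAA_USP/sample-Files > test.txt`. The order in which files are processed is whatever `find` produces.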
