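I have many files (more than 100) named as shown below, and I need to merge them by concatenating the files that share the same base name into one file each.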
AB_HCE_USERS_20221228_001.txt
AB_HCE_USERS_20221228_002.txt
AB_HCE_TASKS_20221228_001.txt
AB_HCE_TASKS_20221228_002.txt
AB_HCE_TASKS_20221228_003.txt
AB_HCE_ASSESSMENTS_20221228_001.txt
AB_HCE_ASSESSMENTS_20221228_002.txt
AB_HCE_CONTACT_20221228_003.txt
AB_HCE_CONTACT_20221228_004.txt
AB_HCE_CONSUMERS_20221228_001.txt
AB_HCE_VERIFICATION_20221228_001.txt
AB_HCE_VERIFICATION_20221228_002.txt
AB_HCE_CONSUMER_RELATIONSHIP_20221228_001.txt
AB_HCE_CONSUMER_RELATIONSHIP_20221228_002.txt
...
Expected output:
AB_HCE_USERS_20221228.txt
AB_HCE_TASKS_20221228.txt
AB_HCE_ASSESSMENTS_20221228.txt
AB_HCE_CONTACT_20221228.txt
AB_HCE_CONSUMERS_20221228.txt
AB_HCE_VERIFICATION_20221228.txt
AB_HCE_CONSUMER_RELATIONSHIP_20221228.txt
..
Answer 1
With awk:
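#!/bin/bash
# For each part, strip the trailing _NNN.txt and append the part to <prefix>.txt.
for file in AB*.txt; do
awk -F'_[0-9]+[.]txt$' '{
# $0 is the part name, $1 the name without its numeric suffix
system("cat \"" $0 "\" >> \"" $1 ".txt\"")
}' <<< "$file"
done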
Answer 2
With gawk:
gawk '
BEGINFILE {out = FILENAME; sub(/_[^_]*$/, ".txt", out)}
{print > out}' ./*_[[:digit:]][[:digit:]][[:digit:]].txt
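Note that this adds a line delimiter to a final line that lacks one. Replace print with printf "%s", $0 RT to avoid that. If all the source files for a given output are empty, the corresponding output file is not created or truncated. You can add printf "" > out to the BEGINFILE statement to work around that.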
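As a sketch of what the command looks like with both adjustments from the note above applied (the wrapper name merge_parts_gawk is hypothetical and only there so the command can be reused; GNU awk is assumed because BEGINFILE and RT are gawk extensions):

```shell
# Hypothetical wrapper around Answer 2's gawk command, with both tweaks applied.
# Assumes GNU awk: BEGINFILE and RT are not available in other awk implementations.
merge_parts_gawk() {
  gawk '
    BEGINFILE {
      out = FILENAME
      sub(/_[^_]*$/, ".txt", out)   # AB_X_20221228_001.txt -> AB_X_20221228.txt
      printf "" > out               # create/truncate the output even if every part is empty
    }
    { printf "%s", $0 RT > out }    # RT is the terminator actually read, so none is added
  ' "$@"
}
```

It would be called with the same glob as before, for example merge_parts_gawk ./*_[[:digit:]][[:digit:]][[:digit:]].txt.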
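With zsh, which does not have the limitations above and is not restricted to 000...999 numbers (the n glob qualifier ensures the file names are sorted numerically):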
typeset -A map=()
for f (*_<->.txt(NDn.)) map[${f%_*}]+=$f$'\0'
for out files (${(kv)map}) cat -- ${(0)files} > $out.txt
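For readers without zsh, the same idea (any number of digits, parts taken in numeric order) can be approximated in plain sh. This is only a sketch under stated assumptions: merge_numbered_parts is a hypothetical name, file names are assumed to contain no tabs or newlines, and the directory is assumed to hold only the numbered parts, since an output from an earlier run such as AB_HCE_USERS_20221228.txt would itself look like a numbered part.

```shell
# Hypothetical helper: merge <prefix>_<number>.txt parts into <prefix>.txt,
# ordering the parts numerically (so _2 comes before _10).
# Assumes file names contain no tab or newline characters.
merge_numbered_parts() {
  tab=$(printf '\t')
  for f in *_[0-9]*.txt; do
    [ -f "$f" ] || continue                       # glob matched nothing
    n=${f##*_}; n=${n%.txt}                       # text between last "_" and ".txt"
    case $n in ''|*[!0-9]*) continue ;; esac      # keep purely numeric suffixes only
    printf '%s\t%s\t%s\n' "${f%_*}" "$n" "$f"     # prefix, number, file name
  done |
  LC_ALL=C sort -t "$tab" -k1,1 -k2,2n |          # group by prefix, then numeric order
  while IFS=$tab read -r prefix n f; do
    if [ "$prefix" != "$prev" ]; then
      : > "$prefix.txt"                           # first part of a group: truncate output
      prev=$prefix
    fi
    cat -- "$f" >> "$prefix.txt"
  done
}
```

Because sort has to read every line before it prints anything, the list of parts is complete before the first output file is written.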
Answer 3
A simple pattern for bash:
for i in AB_*; do
# ${i%_*} drops the last "_" and everything after it (the _NNN.txt suffix)
cat "$i" >> "${i%_*}.txt"
done
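To avoid possible conflicts (the merged files also match AB_*), it is better to create a separate directory for the new files: cat "$i" >> "my_dir/${i%_*}.txt". You can also change the mask to: for i in AB_*_[0-9][0-9][0-9].txt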
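Putting those two suggestions together might look like the sketch below. The function name merge_into_dir and the default directory name merged are hypothetical; the loop itself is the one from this answer.

```shell
# Hypothetical helper combining the stricter mask with a separate output directory.
merge_into_dir() {
  outdir=${1:-merged}                       # assumed default directory name
  mkdir -p -- "$outdir" || return 1
  for i in AB_*_[0-9][0-9][0-9].txt; do
    [ -e "$i" ] || continue                 # mask matched nothing
    cat -- "$i" >> "$outdir/${i%_*}.txt"    # append the part to its merged file
  done
}
```

Since it appends with >>, running it twice into the same directory duplicates the content, so the output directory should be empty (or removed) before a rerun.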