Inserting delimiters at required positions in a Unix flat file

2024-6-4

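I am a beginner at Unix shell scripting. I have a huge text file with more than 100,000 records, and each line is nearly 600 characters long. I need to convert this flat file to CSV format by inserting delimiters at the required positions.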

Sample file 1.txt

1234567890
9876543210

Delimiter positions: [1,3,5,9]

Expected output

1,23,45,6789,0
9,87,65,4321,0

I tried the code below, and it works when 1.cfg contains 3 records.

Contents of 1.cfg:

4
2
1

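However, as soon as the config file contains 4 delimiter positions (6, 4, 2 and 1), record number 2 (that is, 4) is not applied, and only 6, 2 and 1 end up in the generated command.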

Here is my sample code

j=`cat 1.cfg |wc -l`
echo "Total split" $j
counter=0
set -x
for i in `cat 1.cfg`
do
counter=`expr $counter + 1`
echo "Printing value of counter " $counter

# If there is only one field in the config file
        if [ "$j" = 1 ]
        then
                COMMAND_FINAL=`echo "sed -i 's/./&,/$i' 1.txt"`
                #COMMAND_FINAL=`echo "`sed -i 's/./&,/$i' 1.txt`"`
        fi
# If there are more than one fields in the config file and for first record generating the command
        if [[ "$counter" != "$j" && "$counter" = 1 ]]
        then
                COMMAND=`echo "sed -i 's/./&,/$i;"`
                #COMMAND=`echo "`sed -i 's/./&,/$i;"`
                echo "Value of COMMAND VARIABLE is" $COMMAND
# For the 2nd field until the 2nd-to-last field, generating the command
        elif [[ "$counter" != "$j" && "$counter" != 1 ]]
        then
                COMMAND1=`echo "s/./&,/$i;"`
                COMMAND2=$COMMAND$COMMAND1
                echo "Value of command :" $COMMAND
                echo "Value of command1 :" $COMMAND1
                echo "Value of command2 :" $COMMAND2
                #echo "If i is not 1 and i is not last Printing middle records" $COMMAND2
# For the last field generating the command
        elif [[ "$counter" = "$j" && "$j" != 1 ]]
        then
                COMMAND3=`echo "s/./&,/$i' 1.txt"`
                #COMMAND3=`echo "s/./&,/$i' 1.txt"`
                COMMAND_FINAL=$COMMAND2$COMMAND3
                echo "Final Command is " $COMMAND_FINAL
        fi
done
set -x
echo "$COMMAND_FINAL" > execute.ksh
chmod 755 execute.ksh
./execute.ksh
echo "Executing the final command"

Answer 1

Using GNU awk:

awk '{$1=$1}1' FIELDWIDTHS='1 2 2 4 1' OFS=',' file

Or using GNU sed:

sed -r 's/^(.{1})(.{2})(.{2})(.{4})(.{1})$/\1,\2,\3,\4,\5/' file

Output:

1,23,45,6789,0
9,87,65,4321,0
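
If GNU awk is not available, the same split can probably be done with `substr` in any POSIX awk, taking the positions from the question ([1,3,5,9]) directly instead of converting them to field widths by hand. This is only a sketch: the function name `split_at_positions` is made up for illustration, and it assumes the positions are 1-based and given in ascending order.

```shell
# Hypothetical helper: insert a comma after each listed character position.
# Assumes positions are 1-based, space-separated and in ascending order.
split_at_positions() {
    awk -v positions="$1" '
    BEGIN { n = split(positions, pos, " ") }
    {
        out = ""
        start = 1
        for (i = 1; i <= n; i++) {
            # Take the characters from the previous cut up to this position.
            out = out substr($0, start, pos[i] - start + 1) ","
            start = pos[i] + 1
        }
        # Whatever is left after the last position becomes the final field.
        print out substr($0, start)
    }'
}

printf '1234567890\n9876543210\n' | split_at_positions "1 3 5 9"
```

For the sample lines this prints the same two rows as the commands above. Lines shorter than the last position simply end up with empty trailing fields.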

See: 8 Powerful Awk Built-in Variables – FS, OFS, RS, ORS, NR, NF, FILENAME, FNR
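
As for why the script in the question drops a position: in the middle branch, COMMAND2 is rebuilt each time from COMMAND (the first fragment) plus the current fragment, so every middle entry overwrites the one before it. With 6, 4, 2 and 1 in the config file, the fragment for 4 is replaced by the fragment for 2. A minimal sketch of the same idea without that problem is to emit one `s` command per config line and pass the result to sed. The function name `build_sed_script` is hypothetical, and the sketch assumes the positions are listed in descending order, as in the question's 1.cfg, so that an inserted comma never shifts a position that is still to be processed.

```shell
# Hypothetical helper: print one sed "s" command per position in the config file.
# Assumes one position per line, listed in descending order.
build_sed_script() {
    # The "|| [ -n ... ]" part also handles a last line without a trailing newline.
    while IFS= read -r pos || [ -n "$pos" ]; do
        # Skip blank lines in the config file.
        [ -n "$pos" ] || continue
        # s/./&,/N appends a comma to the N-th character of the line.
        printf 's/./&,/%s\n' "$pos"
    done < "$1"
}

# Demo on temporary copies of the question's files.
tmpdir=$(mktemp -d)
printf '6\n4\n2\n1\n' > "$tmpdir/1.cfg"
printf '1234567890\n' > "$tmpdir/1.txt"

# All four commands run in a single sed pass over the data file.
sed "$(build_sed_script "$tmpdir/1.cfg")" "$tmpdir/1.txt"

rm -rf "$tmpdir"
```

With those four positions the demo line comes out as 1,2,34,56,7890. The sketch writes to standard output; with GNU sed, `-i` could be added to edit the file in place as the question's script does.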
