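I am a beginner at Unix shell scripting. I have a large text file, say more than 100,000 records, with close to 600 characters per line. I need to convert this flat file to CSV format by inserting delimiters at given character positions.
Sample file 1.txt: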
1234567890
9876543210
Delimiter positions: [1,3,5,9]
Expected output:
1,23,45,6789,0
9,87,65,4321,0
I tried the code below. It works when 1.cfg contains 3 entries.
Contents of 1.cfg:
4
2
1
However, as soon as I use a config file with 4 delimiter positions (6, 4, 2 and 1), the second entry (i.e. 4) is dropped and only 6, 2 and 1 are applied.
Here is my sample code:
j=`cat 1.cfg |wc -l`
echo "Total split" $j
counter=0
set -x
for i in `cat 1.cfg`
do
counter=`expr $counter + 1`
echo "Printing value of counter " $counter
# If there is only one field in the config file
if [ "$j" = 1 ]
then
COMMAND_FINAL=`echo "sed -i 's/./&,/$i' 1.txt"`
#COMMAND_FINAL=`echo "`sed -i 's/./&,/$i' 1.txt`"`
fi
# If there are more than one fields in the config file and for first record generating the command
if [[ "$counter" != "$j" && "$counter" = 1 ]]
then
COMMAND=`echo "sed -i 's/./&,/$i;"`
#COMMAND=`echo "`sed -i 's/./&,/$i;"`
echo "Value of COMMAND VARIABLE is" $COMMAND
# For the 2nd field through the second-to-last field, generating the command
elif [[ "$counter" != "$j" && "$counter" != 1 ]]
then
COMMAND1=`echo "s/./&,/$i;"`
COMMAND2=$COMMAND$COMMAND1
echo "Value of command :" $COMMAND
echo "Value of command1 :" $COMMAND1
echo "Value of command2 :" $COMMAND2
#echo "If i is not 1 and i is not last Printing middle records" $COMMAND2
# For the last field generating the command
elif [[ "$counter" = "$j" && "$j" != 1 ]]
then
COMMAND3=`echo "s/./&,/$i' 1.txt"`
#COMMAND3=`echo "s/./&,/$i' 1.txt"`
COMMAND_FINAL=$COMMAND2$COMMAND3
echo "Final Command is " $COMMAND_FINAL
fi
done
set +x
echo "$COMMAND_FINAL" > execute.ksh
chmod 755 execute.ksh
./execute.ksh
echo "Executing the final command"
Answer 1
With GNU awk:
awk '{$1=$1}1' FIELDWIDTHS='1 2 2 4 1' OFS=',' file
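The widths in FIELDWIDTHS are the gaps between consecutive delimiter positions, plus one last width for whatever follows the final position. A minimal sketch of that arithmetic, assuming the positions are given in ascending order and the record length is known (the function name `widths_from_positions` is hypothetical, not part of awk):

```shell
# Hypothetical helper: turn ascending cut positions into field widths.
# $1 = space-separated positions, $2 = total record length
widths_from_positions() {
    prev=0
    widths=""
    for p in $1; do
        # width of this field = distance from the previous cut position
        widths="$widths$((p - prev)) "
        prev=$p
    done
    # the last field runs from the final position to the end of the record
    echo "$widths$(( $2 - prev ))"
}

widths_from_positions "1 3 5 9" 10    # prints: 1 2 2 4 1
```

The result could then be passed as FIELDWIDTHS="$(widths_from_positions "1 3 5 9" 10)" in the awk command above.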
Or with GNU sed:
sed -r 's/^(.{1})(.{2})(.{2})(.{4})(.{1})$/\1,\2,\3,\4,\5/' file
Output:
1,23,45,6789,0
9,87,65,4321,0
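As for why the script in the question loses the second entry: in the middle branch, COMMAND2 is rebuilt each time as $COMMAND$COMMAND1, and COMMAND only ever holds the first substitution, so every middle entry overwrites the one before it. With three entries there is only one middle entry, which is why that case works. A minimal sketch of the same idea that appends to a single variable instead, assuming the positions are listed in descending order as in 1.cfg (the function name `build_sed_program` is hypothetical):

```shell
# Hypothetical helper: build one sed program from positions listed in
# descending order (as in 1.cfg), e.g. "9 5 3 1".
build_sed_program() {
    prog=""
    for p in $1; do
        # s/./&,/N inserts a comma after the N-th character; appending to
        # prog keeps every earlier entry instead of overwriting it
        prog="${prog:+$prog;}s/./&,/$p"
    done
    printf '%s\n' "$prog"
}

prog=$(build_sed_program "9 5 3 1")
printf '1234567890\n9876543210\n' | sed "$prog"
```

Descending order matters because each inserted comma shifts everything to its right; working from the highest position down leaves the lower positions unchanged. To read the positions from the config file, "$(cat 1.cfg)" could be passed as the argument.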