I am post-processing a multi-column log file in the following format:
/Users/gleb/Desktop/scripts/analys_clusters/sub_folders_to_analyse/7000_cne_lig420.AllBoxes/7000_07_lig_cne_420,6, -5.3300, 201.2781, 0,, 26, 8, 1, -0.2132
/Users/gleb/Desktop/scripts/analys_clusters/sub_folders_to_analyse/7000_cne_lig420.AllBoxes/7000_10_lig_cne_420,5, -5.2300, 230.0910, 0,, 26, 8, 1, -0.2092
/Users/gleb/Desktop/scripts/analys_clusters/sub_folders_to_analyse/7000_cne_lig420.AllBoxes/7000_12_lig_cne_420,4, -5.1500, 222.2095, 0,, 26, 8, 1, -0.2060
/Users/gleb/Desktop/scripts/analys_clusters/sub_folders_to_analyse/7000_cne_lig420.AllBoxes/7000_07_lig_cne_420,5, -5.0500, 201.1757, 0,, 26, 8, 1, -0.2020
/Users/gleb/Desktop/scripts/analys_clusters/sub_folders_to_analyse/7000_cne_lig420.AllBoxes/7000_12_lig_cne_420,2, -5.0200, 233.0833, 0,, 26, 8, 1, -0.2008
/Users/gleb/Desktop/scripts/analys_clusters/sub_folders_to_analyse/7000_cne_lig420.AllBoxes/7000_05_lig_cne_420,5, -4.9500, 203.5671, 0,, 26, 8, 1, -0.1980
/Users/gleb/Desktop/scripts/analys_clusters/sub_folders_to_analyse/7000_cne_lig420.AllBoxes/7000_08_lig_cne_420,4, -4.9500, 227.0462, 0,, 26, 8, 1, -0.1980
/Users/gleb/Desktop/scripts/analys_clusters/sub_folders_to_analyse/7000_cne_lig420.AllBoxes/7000_12_lig_cne_420,14, -4.7700, 231.9237, 0,, 26, 8, 1, -0.1908
/Users/gleb/Desktop/scripts/analys_clusters/sub_folders_to_analyse/7000_cne_lig420.AllBoxes/7000_03_lig_cne_420,5, -4.7200, 194.9009, 0,, 26, 8, 1, -0.1888
/Users/gleb/Desktop/scripts/analys_clusters/sub_folders_to_analyse/7000_cne_lig420.AllBoxes/7000_11_lig_cne_420,3, -4.6700, 217.3995, 0,, 26, 8, 1, -0.1868
/Users/gleb/Desktop/scripts/analys_clusters/sub_folders_to_analyse/7000_cne_lig420.AllBoxes/7000_07_lig_cne_420,1, -4.6400, 200.7227, 0,, 26, 8, 1, -0.1856
/Users/gleb/Desktop/scripts/analys_clusters/sub_folders_to_analyse/7000_cne_lig420.AllBoxes/7000_09_lig_cne_420,1, -4.5900, 184.7898, 0,, 26, 8, 1, -0.1836
/Users/gleb/Desktop/scripts/analys_clusters/sub_folders_to_analyse/7000_cne_lig420.AllBoxes/7000_11_lig_cne_420,3, -4.5500, 215.7487, 0,, 26, 8, 1, -0.1820
/Users/gleb/Desktop/scripts/analys_clusters/sub_folders_to_analyse/7000_cne_lig420.AllBoxes/7000_08_lig_cne_420,3, -4.4500, 198.2857, 0,, 26, 8, 1, -0.1780
/Users/gleb/Desktop/scripts/analys_clusters/sub_folders_to_analyse/7000_cne_lig420.AllBoxes/7000_07_lig_cne_420,1, -4.4200, 204.6418, 0,, 26, 8, 1, -0.1768
/Users/gleb/Desktop/scripts/analys_clusters/sub_folders_to_analyse/7000_cne_lig420.AllBoxes/7000_08_lig_cne_420,6, -4.3700, 199.5359, 0,, 26, 8, 1, -0.1748
/Users/gleb/Desktop/scripts/analys_clusters/sub_folders_to_analyse/7000_cne_lig420.AllBoxes/7000_06_lig_cne_420,6, -4.3500, 232.3248, 0,, 26, 8, 1, -0.1740
/Users/gleb/Desktop/scripts/analys_clusters/sub_folders_to_analyse/7000_cne_lig420.AllBoxes/7000_06_lig_cne_420,3, -4.2700, 234.3468, 0,, 26, 8, 1, -0.1708
/Users/gleb/Desktop/scripts/analys_clusters/sub_folders_to_analyse/7000_cne_lig420.AllBoxes/7000_05_lig_cne_420,1, -4.2500, 195.9439, 0,, 26, 8, 1, -0.1700
/Users/gleb/Desktop/scripts/analys_clusters/sub_folders_to_analyse/7000_cne_lig420.AllBoxes/7000_03_lig_cne_420,7, -4.2400, 198.9363, 0,, 26, 8, 1, -0.1696
/Users/gleb/Desktop/scripts/analys_clusters/sub_folders_to_analyse/7000_cne_lig420.AllBoxes/7000_03_lig_cne_420,1, -4.1600, 208.6377, 0,, 26, 8, 1, -0.1664
/Users/gleb/Desktop/scripts/analys_clusters/sub_folders_to_analyse/7000_cne_lig420.AllBoxes/7000_01_lig_cne_420,3, -4.1500, 179.4341, 0,, 26, 8, 1, -0.1660
/Users/gleb/Desktop/scripts/analys_clusters/sub_folders_to_analyse/7000_cne_lig420.AllBoxes/7000_12_lig_cne_420,4, -4.1300, 233.9607, 0,, 26, 8, 1, -0.1652
/Users/gleb/Desktop/scripts/analys_clusters/sub_folders_to_analyse/7000_cne_lig420.AllBoxes/7000_09_lig_cne_420,1, -4.1200, 189.5660, 0,, 26, 8, 1, -0.1648
/Users/gleb/Desktop/scripts/analys_clusters/sub_folders_to_analyse/7000_cne_lig420.AllBoxes/7000_10_lig_cne_420,1, -4.1100, 209.8679, 0,, 26, 8, 1, -0.1644
/Users/gleb/Desktop/scripts/analys_clusters/sub_folders_to_analyse/7000_cne_lig420.AllBoxes/7000_11_lig_cne_420,5, -4.1000, 213.5573, 0,, 26, 8, 1, -0.1640
/Users/gleb/Desktop/scripts/analys_clusters/sub_folders_to_analyse/7000_cne_lig420.AllBoxes/7000_08_lig_cne_420,1, -4.0700, 227.6124, 0,, 26, 8, 1, -0.1628
/Users/gleb/Desktop/scripts/analys_clusters/sub_folders_to_analyse/7000_cne_lig420.AllBoxes/7000_05_lig_cne_420,3, -4.0400, 209.6345, 0,, 26, 8, 1, -0.1616
/Users/gleb/Desktop/scripts/analys_clusters/sub_folders_to_analyse/7000_cne_lig420.AllBoxes/7000_06_lig_cne_420,4, -3.9700, 233.5914, 0,, 26, 8, 1, -0.1588
/Users/gleb/Desktop/scripts/analys_clusters/sub_folders_to_analyse/7000_cne_lig420.AllBoxes/7000_11_lig_cne_420,4, -3.9500, 223.9189, 0,, 26, 8, 1, -0.1580
/Users/gleb/Desktop/scripts/analys_clusters/sub_folders_to_analyse/7000_cne_lig420.AllBoxes/7000_05_lig_cne_420,1, -3.9000, 180.8133, 0,, 26, 8, 1, -0.1560
/Users/gleb/Desktop/scripts/analys_clusters/sub_folders_to_analyse/7000_cne_lig420.AllBoxes/7000_11_lig_cne_420,1, -3.9000, 224.1828, 0,, 26, 8, 1, -0.1560
/Users/gleb/Desktop/scripts/analys_clusters/sub_folders_to_analyse/7000_cne_lig420.AllBoxes/7000_02_lig_cne_420,1, -3.8800, 204.1735, 0,, 26, 8, 1, -0.1552
/Users/gleb/Desktop/scripts/analys_clusters/sub_folders_to_analyse/7000_cne_lig420.AllBoxes/7000_09_lig_cne_420,1, -3.8500, 195.5399, 0,, 26, 8, 1, -0.1540
/Users/gleb/Desktop/scripts/analys_clusters/sub_folders_to_analyse/7000_cne_lig420.AllBoxes/7000_10_lig_cne_420,2, -3.8400, 227.9037, 0,, 26, 8, 1, -0.1536
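Note that columns 1 and 2 are separated by a comma (,), while the remaining columns are separated by a comma followed by a space (, ). From this log file I need to:
- replace everything in the first column (the long Unix-style path /Users/gleb/Desktop/scripts/...) with the corresponding line number (just N for the Nth line);
- delete columns 6-9 (the last four columns).
The resulting log should contain the same number of lines, but only columns 1 (with the replacement!) through 5 (the last column being 0,).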
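What I have managed so far is a sed substitution on the first column, but it only strips the path and does not put the corresponding line number in its place: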
sed -i '' -e 's|\/Users/gleb/Desktop/scripts/analys_clusters/sub_folders_to_analyse/*.*/||' log.txt
Answer 1
gawk -F'^[^,]*,|, ' '{ print NR, $2, $3, $4, $5; }' OFS=', ' infile
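To skip the first N lines, add the condition NR>N to the awk program so that those lines are not printed; to skip only the first line, for example, you can do: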
gawk -F'^[^,]*,|, ' 'NR> 1{ print NR, $2, $3, $4, $5; }' OFS=', ' infile
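You then need to change NR to NR-1 so that the numbering starts from 1 rather than 2, or replace it with a separate counter variable, for example: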
gawk -F'^[^,]*,|, ' 'NR> 1{ print ++lineNumber, $2, $3, $4, $5; }' OFS=', ' infile
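^[^,]*, matches everything from the start of the line up to and including the first comma;
', ' (a comma followed by a space) matches the comma-space separator.
Both patterns together are defined as the field separator (joined with |), and the corresponding fields are printed on that basis. Because the path itself is consumed by the separator, $1 is empty and the data starts at $2; $5 comes out as 0, because the row contains ",, " at that point and only the second comma belongs to the separator. NR in awk holds the current line number.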
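Anchoring ^ inside the field separator is something gawk handles as described, but other awk implementations may treat it differently. A minimal sketch of a variant that avoids it, assuming a POSIX awk: split on ', ' only and strip the path from the first field with sub(). The shortened /tmp/demo/ path and the skip variable are hypothetical, used here only for illustration.

```shell
# Three sample rows; the /tmp/demo/ prefix is a hypothetical stand-in for the long path.
sample='/tmp/demo/7000_07_lig_cne_420,6, -5.3300, 201.2781, 0,, 26, 8, 1, -0.2132
/tmp/demo/7000_10_lig_cne_420,5, -5.2300, 230.0910, 0,, 26, 8, 1, -0.2092
/tmp/demo/7000_12_lig_cne_420,4, -5.1500, 222.2095, 0,, 26, 8, 1, -0.2060'

# Number of leading lines to skip (hypothetical parameter; 0 keeps every line).
skip=1

result=$(printf '%s\n' "$sample" |
  awk -F', ' -v OFS=', ' -v skip="$skip" 'NR > skip {
    sub(/^[^,]*,/, "", $1)             # drop the path and the first comma from field 1
    print NR - skip, $1, $2, $3, $4    # renumbered from 1, then the four kept values
  }')

printf '%s\n' "$result"
# 1, 5, -5.2300, 230.0910, 0,
# 2, 4, -5.1500, 222.2095, 0,
```

With skip=0 the same command numbers every line of the input, which corresponds to the first gawk command above.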
Another option is to use cut and nl:
<infile cut -d',' -f2-6 |nl -w1 -s', '
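The cut command extracts fields 2 through 6, and nl numbers the lines, placing ', ' between the number and the rest of the line; -w1 sets the width of the number column to 1.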
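To see what each stage contributes, here is a small sketch of the same pipeline on two sample rows. The shortened /tmp/demo/ path is hypothetical; the values are taken from the first two rows of the log.

```shell
# Two sample rows; the /tmp/demo/ prefix is a hypothetical stand-in for the long path.
sample='/tmp/demo/7000_07_lig_cne_420,6, -5.3300, 201.2781, 0,, 26, 8, 1, -0.2132
/tmp/demo/7000_10_lig_cne_420,5, -5.2300, 230.0910, 0,, 26, 8, 1, -0.2092'

# cut keeps comma-delimited fields 2-6; field 6 is the empty one between ",,",
# which is why each output line ends with a trailing comma.
# nl then prefixes each line with its number followed by ", ".
result=$(printf '%s\n' "$sample" | cut -d',' -f2-6 | nl -w1 -s', ')

printf '%s\n' "$result"
# 1, 6, -5.3300, 201.2781, 0,
# 2, 5, -5.2300, 230.0910, 0,
```

One thing to keep in mind: by default nl does not number empty lines, so if the log could contain blank lines, adding -b a makes it number every line the way NR does in awk.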