我有一个 CSV 文件,如下所示:
我想INITIAL OFFER
从此文件中删除“”块并仅保留“ FINAL OFFER
”块我还想从第一个字段中删除逗号(,),并从最后一列中删除多余的空格,以便更轻松地搜索这些列。
输入
500076592, INITIAL OFFER
500076592,|11|1|1|100 MB|2 Minutes|1.0 SAR
500076592,|11|2|3|300 MB|5 Minutes|3.0 SAR
500076592,|1|1|1|100 MB|NA|0.5 SAR
500076592,|1|2|3|300 MB|NA|1.5 SAR
500076592,|1|4|7|1000 MB|NA|5.0 SAR
500076592,|2|1|1|4096 MB|NA|1.5 SAR
500076592,|2|2|3|6144 MB|NA|2.0 SAR
500076592,|2|4|7|10240 MB|NA|4.0 SAR
500076592,|5|1|1|4096 MB|NA|2.0 SAR
500076592,|5|2|3|6144 MB|NA|2.5 SAR
500076592,|5|4|7|10240 MB|NA|5.0 SAR
500076592,|6|1|1|NA|2 Minutes|0.5 SAR
500076592,|6|2|3|NA|5 Minutes|1.5 SAR
500076592,|6|4|7|NA|10 Minutes|3.0 SAR
500076592,
500076592,|FINAL OFFER
500076592,|2|1|1|4096 MB|NA|1.5 SAR
500076592,|2|2|3|6144 MB|NA|2.0 SAR
500076592,|2|4|7|10240 MB|NA|4.0 SAR
500076592,|5|1|1|4096 MB|NA|2.0 SAR
500076592,|5|2|3|6144 MB|NA|2.5 SAR
500076592,|5|4|7|10240 MB|NA|5.0 SAR
500076592,|1|1|1|100 MB|NA|0.5 SAR
500076592,|1|2|3|300 MB|NA|1.5 SAR
500076592,|1|4|7|1000 MB|NA|5.0 SAR
500076592,|11|1|1|100 MB|2 Minutes|1.0 SAR
500076592,|11|2|3|300 MB|5 Minutes|3.0 SAR
500076592,|6|1|1|NA|2 Minutes|0.5 SAR
500076592,|6|2|3|NA|5 Minutes|1.5 SAR
500076592,|6|4|7|NA|10 Minutes|3.0 SAR
500076592,
500028952, INITIAL OFFER
500028952,|11|1|1|250 MB|2 Minutes|3.0 SAR
500028952,|11|2|3|650 MB|10 Minutes|8.0 SAR
500028952,|11|4|7|1550 MB|30 Minutes|18.5 SAR
500028952,|1|1|1|250 MB|NA|2.5 SAR
500028952,|1|2|3|650 MB|NA|6.5 SAR
500028952,|1|4|7|1550 MB|NA|15.5 SAR
500028952,|2|1|1|4096 MB|NA|1.5 SAR
500028952,|2|2|3|6144 MB|NA|2.0 SAR
500028952,|2|4|7|10240 MB|NA|4.0 SAR
500028952,|5|1|1|4096 MB|NA|2.0 SAR
500028952,|5|2|3|6144 MB|NA|2.5 SAR
500028952,|5|4|7|10240 MB|NA|5.0 SAR
500028952,|6|1|1|NA|2 Minutes|0.5 SAR
500028952,|6|2|3|NA|10 Minutes|1.5 SAR
500028952,|6|4|7|NA|30 Minutes|3.0 SAR
500028952,
500028952,|FINAL OFFER
500028952,|2|1|1|4096 MB|NA|1.5 SAR
500028952,|2|2|3|6144 MB|NA|2.0 SAR
500028952,|2|4|7|10240 MB|NA|4.0 SAR
500028952,|1|1|1|250 MB|NA|2.5 SAR
500028952,|1|2|3|650 MB|NA|6.5 SAR
500028952,|1|4|7|1550 MB|NA|15.5 SAR
500028952,|11|1|1|250 MB|2 Minutes|3.0 SAR
500028952,|11|2|3|650 MB|10 Minutes|8.0 SAR
500028952,|11|4|7|1550 MB|30 Minutes|18.5 SAR
500028952,|5|1|1|4096 MB|NA|2.0 SAR
500028952,|5|2|3|6144 MB|NA|2.5 SAR
500028952,|5|4|7|10240 MB|NA|5.0 SAR
500028952,|6|1|1|NA|2 Minutes|0.5 SAR
500028952,|6|2|3|NA|10 Minutes|1.5 SAR
500028952,|6|4|7|NA|30 Minutes|3.0 SAR
500028952,
输出
500076592,|FINAL OFFER
500076592,|2|1|1|4096 MB|NA|1.5 SAR
500076592,|2|2|3|6144 MB|NA|2.0 SAR
500076592,|2|4|7|10240 MB|NA|4.0 SAR
500076592,|5|1|1|4096 MB|NA|2.0 SAR
500076592,|5|2|3|6144 MB|NA|2.5 SAR
500076592,|5|4|7|10240 MB|NA|5.0 SAR
500076592,|1|1|1|100 MB|NA|0.5 SAR
500076592,|1|2|3|300 MB|NA|1.5 SAR
500076592,|1|4|7|1000 MB|NA|5.0 SAR
500076592,|11|1|1|100 MB|2 Minutes|1.0 SAR
500076592,|11|2|3|300 MB|5 Minutes|3.0 SAR
500076592,|6|1|1|NA|2 Minutes|0.5 SAR
500076592,|6|2|3|NA|5 Minutes|1.5 SAR
500076592,|6|4|7|NA|10 Minutes|3.0 SAR
500028952,|FINAL OFFER
500028952,|2|1|1|4096 MB|NA|1.5 SAR
500028952,|2|2|3|6144 MB|NA|2.0 SAR
500028952,|2|4|7|10240 MB|NA|4.0 SAR
500028952,|1|1|1|250 MB|NA|2.5 SAR
500028952,|1|2|3|650 MB|NA|6.5 SAR
500028952,|1|4|7|1550 MB|NA|15.5 SAR
500028952,|11|1|1|250 MB|2 Minutes|3.0 SAR
500028952,|11|2|3|650 MB|10 Minutes|8.0 SAR
500028952,|11|4|7|1550 MB|30 Minutes|18.5 SAR
500028952,|5|1|1|4096 MB|NA|2.0 SAR
500028952,|5|2|3|6144 MB|NA|2.5 SAR
500028952,|5|4|7|10240 MB|NA|5.0 SAR
500028952,|6|1|1|NA|2 Minutes|0.5 SAR
500028952,|6|2|3|NA|10 Minutes|1.5 SAR
500028952,|6|4|7|NA|30 Minutes|3.0 SAR
500028952,
答案1
sed -e '/FINAL OFFER/p;/INITIAL OFFER/,/FINAL OFFER/ d' input.csv > output.csv
这会再次打印 FINAL OFFER 行,因为它即将被范围删除/INITIAL OFFER/,/FINAL OFFER/
。
答案2
如果您使用管道作为分隔符,则可以轻松awk
根据字段数量过滤数据,例如:
awk -F'|' 'NF==2 { f=1 } NF==1 { f=0 } f' infile
打高尔夫球:
awk -F\| 'NF==1{f=0}NF==2{f=1}f'
答案3
您可以使用sed
删除INITIAL OFFER
和 之间的所有内容,只包含数字和一个逗号:
$ sed '/INITIAL OFFER/,/^[0-9][0-9]*,$/d' file
500076592,|FINAL OFFER
500076592,|2|1|1|4096 MB|NA|1.5 SAR
500076592,|2|2|3|6144 MB|NA|2.0 SAR
500076592,|2|4|7|10240 MB|NA|4.0 SAR
500076592,|5|1|1|4096 MB|NA|2.0 SAR
500076592,|5|2|3|6144 MB|NA|2.5 SAR
500076592,|5|4|7|10240 MB|NA|5.0 SAR
500076592,|1|1|1|100 MB|NA|0.5 SAR
500076592,|1|2|3|300 MB|NA|1.5 SAR
500076592,|1|4|7|1000 MB|NA|5.0 SAR
500076592,|11|1|1|100 MB|2 Minutes|1.0 SAR
500076592,|11|2|3|300 MB|5 Minutes|3.0 SAR
500076592,|6|1|1|NA|2 Minutes|0.5 SAR
500076592,|6|2|3|NA|5 Minutes|1.5 SAR
500076592,|6|4|7|NA|10 Minutes|3.0 SAR
500076592,
500028952,|FINAL OFFER
500028952,|2|1|1|4096 MB|NA|1.5 SAR
500028952,|2|2|3|6144 MB|NA|2.0 SAR
500028952,|2|4|7|10240 MB|NA|4.0 SAR
500028952,|1|1|1|250 MB|NA|2.5 SAR
500028952,|1|2|3|650 MB|NA|6.5 SAR
500028952,|1|4|7|1550 MB|NA|15.5 SAR
500028952,|11|1|1|250 MB|2 Minutes|3.0 SAR
500028952,|11|2|3|650 MB|10 Minutes|8.0 SAR
500028952,|11|4|7|1550 MB|30 Minutes|18.5 SAR
500028952,|5|1|1|4096 MB|NA|2.0 SAR
500028952,|5|2|3|6144 MB|NA|2.5 SAR
500028952,|5|4|7|10240 MB|NA|5.0 SAR
500028952,|6|1|1|NA|2 Minutes|0.5 SAR
500028952,|6|2|3|NA|10 Minutes|1.5 SAR
500028952,|6|4|7|NA|30 Minutes|3.0 SAR
500028952,
如果您不想包含500076592,
和行,请使用500028952,
@cas的更简单的方法,或者你可以这样做:
$ sed '/INITIAL OFFER/,/^[0-9][0-9]*,$/d; /^[0-9][0-9]*,$/d' file
500076592,|FINAL OFFER
500076592,|2|1|1|4096 MB|NA|1.5 SAR
500076592,|2|2|3|6144 MB|NA|2.0 SAR
500076592,|2|4|7|10240 MB|NA|4.0 SAR
500076592,|5|1|1|4096 MB|NA|2.0 SAR
500076592,|5|2|3|6144 MB|NA|2.5 SAR
500076592,|5|4|7|10240 MB|NA|5.0 SAR
500076592,|1|1|1|100 MB|NA|0.5 SAR
500076592,|1|2|3|300 MB|NA|1.5 SAR
500076592,|1|4|7|1000 MB|NA|5.0 SAR
500076592,|11|1|1|100 MB|2 Minutes|1.0 SAR
500076592,|11|2|3|300 MB|5 Minutes|3.0 SAR
500076592,|6|1|1|NA|2 Minutes|0.5 SAR
500076592,|6|2|3|NA|5 Minutes|1.5 SAR
500076592,|6|4|7|NA|10 Minutes|3.0 SAR
500028952,|FINAL OFFER
500028952,|2|1|1|4096 MB|NA|1.5 SAR
500028952,|2|2|3|6144 MB|NA|2.0 SAR
500028952,|2|4|7|10240 MB|NA|4.0 SAR
500028952,|1|1|1|250 MB|NA|2.5 SAR
500028952,|1|2|3|650 MB|NA|6.5 SAR
500028952,|1|4|7|1550 MB|NA|15.5 SAR
500028952,|11|1|1|250 MB|2 Minutes|3.0 SAR
500028952,|11|2|3|650 MB|10 Minutes|8.0 SAR
500028952,|11|4|7|1550 MB|30 Minutes|18.5 SAR
500028952,|5|1|1|4096 MB|NA|2.0 SAR
500028952,|5|2|3|6144 MB|NA|2.5 SAR
500028952,|5|4|7|10240 MB|NA|5.0 SAR
500028952,|6|1|1|NA|2 Minutes|0.5 SAR
500028952,|6|2|3|NA|10 Minutes|1.5 SAR
500028952,|6|4|7|NA|30 Minutes|3.0 SAR
答案4
使用GNU sed扩展正则表达式模式打开-E
sed -En '
/^[^|]*\|?[^|]*$/h
G;/\n.*\|/P
' file
笔记:
- 停止