Suppose I have a text file:
aaa-123;bread;apple;banana
aaa-123;bread;apple;banana
aaa-123;bread;apple;banana
bbb-123;bread;apple;banana
bbb-1234;bread;app-le;banana
bbb-222;bread;apple;banana
I need to remove everything between the - and the ; in the first column using awk. Expected result:
aaa;bread;apple;banana
aaa;bread;apple;banana
aaa;bread;apple;banana
bbb;bread;apple;banana
bbb;bread;app-le;banana
bbb;bread;apple;banana
Answer 1
This works no matter which fields contain a - and whether or not the first field contains one, using any awk in any shell on every Unix box:
$ awk 'BEGIN{FS=OFS=";"} {sub(/-.*/,"",$1)} 1' file
aaa;bread;apple;banana
aaa;bread;apple;banana
aaa;bread;apple;banana
bbb;bread;apple;banana
bbb;bread;app-le;banana
bbb;bread;apple;banana
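For readers who want the same field-based logic outside awk, here is a minimal Python sketch. It assumes the same `;`-separated layout as the sample file; the function name `strip_first_field` is made up for illustration and is not part of the original answer.

```python
import re

def strip_first_field(line, sep=";"):
    """Drop everything from the first hyphen onward, in the first field only."""
    fields = line.rstrip("\n").split(sep)
    # Same idea as sub(/-.*/, "", $1) in the awk command: only field 1 is touched.
    fields[0] = re.sub(r"-.*", "", fields[0])
    return sep.join(fields)

sample = [
    "aaa-123;bread;apple;banana",
    "bbb-1234;bread;app-le;banana",
]
for line in sample:
    print(strip_first_field(line))
```

Because the substitution is applied to the first field only, the hyphen in `app-le` is left alone.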
Answer 2
With sed, using a negated character class to get a non-greedy (shortest possible) match:
sed 's/-[^;]*;/;/' infile
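One thing worth checking with the sed command above is that its pattern is not tied to the first field: it replaces the first `-...;` found anywhere on the line. The Python sketch below reproduces that behaviour with the same pattern and compares it with an anchored variant. The input line starting with `ccc` is a hypothetical case that is not in the sample file, and the anchored pattern is a suggestion rather than part of the original answer.

```python
import re

# Same pattern as the sed command: the first "-...;" anywhere on the line.
unanchored = re.compile(r"-[^;]*;")
# Hypothetical stricter variant: the hyphen must sit inside the first field.
anchored = re.compile(r"^([^;-]*)-[^;]*")

def sed_like(line):
    # count=1 mirrors sed's s/// without the g flag.
    return unanchored.sub(";", line, count=1)

def first_field_only(line):
    return anchored.sub(r"\1", line, count=1)

print(sed_like("bbb-1234;bread;app-le;banana"))      # first field has a hyphen
print(sed_like("ccc;bread;app-le;banana"))           # it does not: "app-le" is hit
print(first_field_only("ccc;bread;app-le;banana"))   # anchored: line left unchanged
```

On the sample file both forms give the expected result, since every first field there contains a hyphen. The difference only matters for input where it does not.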
Answer 3
With awk, using the split() function:
awk -v FS=';' 'split($1,a,/-/) {$1=a[1];print $1, $2, $3, $4}' OFS=';' file
aaa;bread;apple;banana
aaa;bread;apple;banana
aaa;bread;apple;banana
bbb;bread;apple;banana
bbb;bread;app-le;banana
bbb;bread;apple;banana
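The split-based idea can also be sketched without any regular expression. The Python version below uses `str.partition` and assumes the same `;`-separated input; the name `strip_with_partition` is illustrative only.

```python
def strip_with_partition(line, sep=";"):
    """Keep the part of the first field that precedes its first hyphen."""
    # Split off the first field; `rest` holds every remaining field untouched.
    first, delim, rest = line.rstrip("\n").partition(sep)
    # partition("-")[0] is the text before the first hyphen, or the whole
    # field when there is no hyphen at all.
    return first.partition("-")[0] + delim + rest

print(strip_with_partition("bbb-1234;bread;app-le;banana"))
```

Unlike the awk one-liner, which prints exactly four fields, this sketch passes through however many fields the line has.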
Answer 4
awk '{gsub(/-[0-9]*/,"",$1);print }' filename
sed 's/-[0-9]*//g' filename
Python
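#!/usr/bin/env python3
import re

# Matches a hyphen followed by any run of digits (possibly none).
pattern = re.compile(r'-[0-9]*')

with open('filename') as infile:
    for line in infile:
        # Remove every match on the line, then drop the trailing newline.
        print(pattern.sub('', line).strip())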
Output (the pattern is applied to the whole line, so the hyphen in app-le is removed as well):
aaa;bread;apple;banana
aaa;bread;apple;banana
aaa;bread;apple;banana
bbb;bread;apple;banana
bbb;bread;apple;banana
bbb;bread;apple;banana