I have a list of customer emails and want to delete the ones that end in .br. I usually run the following command:
sed -i '/.br/d' customers.csv
But this also deletes customer emails such as [email protected].
A customer record looks like this:
"Phone Number","[email protected]","NAME"
How can I delete only the customer emails that end in .br?
Answer 1
Use Miller (mlr) to read the file as a header-less CSV file, then filter it so that only the records whose second field does not end in .br are kept:
mlr --csv -N filter '$2 !=~ "\.br$"' file
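If you want every field in the output to be quoted, add --quote-all after -N. If the file has a header, remove -N and use the header name instead of $2, for example $email !=~ "\.br$".
Test: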
$ cat file
"Phone Number","[email protected]","NAME"
"Phone Number2","[email protected]","NAME2"
"Phone Number3","[email protected]","NAME3"
"Phone Number","[email protected]","NAME"
"Phone Number","[email protected]","NAME.br"
$ mlr --csv -N filter '$2 !=~ "\.br$"' file
Phone Number,[email protected],NAME
Phone Number3,[email protected],NAME3
Phone Number,[email protected],NAME
Phone Number,[email protected],NAME.br
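Answer 2
You need to escape the . so that it matches a literal dot rather than any character; otherwise the pattern also matches addresses such as "[email protected]". You can also require that .br appears after an @ and at the end of the quoted field.
Try: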
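To see why the unescaped dot matters, here is a small sketch using two made-up addresses (both hypothetical, not taken from the question's data). In a regular expression an unescaped . matches any single character, so .br also matches the "mbr" in "umbrella", whereas \.br" only matches a literal dot followed by br and the closing quote of the field.

```shell
# Hypothetical sample data: one address ends in .br, the other merely contains "br".
sample='"111","alice@example.com.br","Alice"
"222","bob@umbrella.example","Bob"'

# Unescaped dot: ".br" means "any character followed by br", so both lines are deleted.
loose=$(printf '%s\n' "$sample" | sed '/.br/d')

# Escaped dot plus the closing quote: only the address ending in .br is deleted.
strict=$(printf '%s\n' "$sample" | sed '/@[^"]*\.br"/d')

printf 'loose:  [%s]\n' "$loose"
printf 'strict: [%s]\n' "$strict"
```

Neither command uses -i, so nothing is modified; the results are only captured and printed.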
sed -i '/".*@[^"]*\.br"/d' customers.csv
Here is an example run:
~$ echo '"Phone Number","[email protected]","NAME"
> "Phone Number2","[email protected]","NAME2"
> "Phone Number3","[email protected]","NAME3"
> "Phone Number","[email protected]","NAME"
> "Phone Number","[email protected]","NAME.br"' > customers.csv
~$ cat customers.csv
"Phone Number","[email protected]","NAME"
"Phone Number2","[email protected]","NAME2" <-- should get deleted
"Phone Number3","[email protected]","NAME3"
"Phone Number","[email protected]","NAME"
"Phone Number","[email protected]","NAME.br"
~$ sed -i '/".*@[^"]*\.br"/d' customers.csv
~$ cat customers.csv
"Phone Number","[email protected]","NAME"
"Phone Number3","[email protected]","NAME3"
"Phone Number","[email protected]","NAME"
"Phone Number","[email protected]","NAME.br"
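Because sed -i rewrites the file in place, it can be worth previewing the match first. The following is a minimal sketch, assuming a throwaway file with made-up addresses (the temporary path and the data are hypothetical): sed -n with the p command prints the lines that the d command would delete, and -i.bak keeps a backup copy of the original file.

```shell
# Work in a temporary directory with hypothetical sample data.
tmpdir=$(mktemp -d)
csv="$tmpdir/customers.csv"
printf '%s\n' \
  '"111","alice@example.com.br","Alice"' \
  '"222","bob@example.com","Bob.br"' > "$csv"

# A quoted field that contains an @ and ends in a literal ".br".
pattern='"[^"]*@[^"]*\.br"'

# Preview: print only the lines the delete command would remove.
preview=$(sed -n "/$pattern/p" "$csv")
printf 'would delete: %s\n' "$preview"

# Delete in place, keeping the original as customers.csv.bak.
sed -i.bak "/$pattern/d" "$csv"
remaining=$(cat "$csv")
printf 'remaining: %s\n' "$remaining"
```

The second record survives because its .br is in the name field, which contains no @, so the pattern cannot match across the field's quotes.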