Replace specific characters inside quotes

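I am trying to mask some sensitive data in a log file.

First I need to pick out specific lines of the file using a matching pattern. On those lines I then need to replace any text inside double quotes while keeping any text that is not inside double quotes.

On every line that matches the pattern, everything inside double quotes must be masked: each A-Z becomes X, each a-z becomes x, and each digit 0-9 becomes 0.

A line can contain several quoted strings. The quoted text can also contain special characters such as ",", "-", "." and "@"; these must be kept as they are.

Sample file content (the filter word in this example is "KEYWORD"):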

2020-04-18 15:01:12 [EVENT] :log-event-with-KEYWORD: {:entry1 {:entry2 {:value "Replace This"}}} -> {:entry1 {:entry2 {:value "Replace ALSO this."}}}
2020-04-18 15:01:13 [EVENT] :log-event-with-KEYWORD: {:entry1 {:entry2 {:value "REplace. THIS 12345"}}}
2020-04-18 15:01:15 [EVENT] :this_has--the-KEYWORD: {:entry1 {:entry2 {:value "[email protected]"}}} -> {:entry1 {:entry2 {:value "[email protected]"}}}
2020-04-18 15:01:18 [EVENT] :log-event-without-keyword: {:entry1 {:entry2 {:value "Do NOT replace this."}}} -> {:entry1 {:entry2 {:value "Do-NoT replace this either"}}}

Given this file as input, the result should be the following output:

2020-04-18 15:01:12 [EVENT] :log-event-with-KEYWORD: {:entry1 {:entry2 {:value "Xxxxxxx Xxxx"}}} -> {:entry1 {:entry2 {:value "Xxxxxxx XXXX xxxx."}}}
2020-04-18 15:01:13 [EVENT] :log-event-with-KEYWORD: {:entry1 {:entry2 {:value "XXxxxxx. XXXX 00000"}}}
2020-04-18 15:01:15 [EVENT] :this_has--the-KEYWORD: {:entry1 {:entry2 {:value "[email protected]"}}} -> {:entry1 {:entry2 {:value "[email protected]"}}}
2020-04-18 15:01:18 [EVENT] :log-event-without-keyword: {:entry1 {:entry2 {:value "Do NOT replace this."}}} -> {:entry1 {:entry2 {:value "Do-NoT replace this either"}}}

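Either the changed lines are updated in the file itself, or the whole file with these modifications is written to standard output. Lines without the keyword, the line order and all other details must be preserved.

Can this be done with a bash script or command-line tools such as grep and/or sed?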

Answer 1

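awk '/KEYWORD/{
    # split the line on double quotes: even-numbered pieces are the quoted strings
    n=split($0,a,"\"")
    for(i=2;i<=n;i=i+2){
        gsub(/[A-Z]/,"X",a[i])
        gsub(/[a-z]/,"x",a[i])
        gsub(/[0-9]/,"0",a[i])
    }
    # print the pieces again with the quotes put back between them
    sep=""
    for (i=1;i<=n;i++){
        printf "%s%s",sep,a[i]
        sep="\""
    }
    printf "\n"
    next
}
1' file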

For example, on the updated input file

2020-04-18 15:01:12 [EVENT] :log-event-with-KEYWORD: {:entry1 {:entry2 {:value "Replace This"}}} -> {:entry1 {:entry2 {:value "Replace ALSO this."}}}
2020-04-18 15:01:13 [EVENT] :log-event-with-KEYWORD: {:entry1 {:entry2 {:value "REplace. THIS 12345"}}}
2020-04-18 15:01:15 [EVENT] :this_has--the-KEYWORD: {:entry1 {:entry2 {:value "[email protected]"}}} -> {:entry1 {:entry2 {:value "[email protected]"}}}
2020-04-18 15:01:18 [EVENT] :log-event-without-keyword: {:entry1 {:entry2 {:value "Do NOT replace this."}}} -> {:entry1 {:entry2 {:value "Do-NoT replace this either"}}}

this awk command produces the desired output

2020-04-18 15:01:12 [EVENT] :log-event-with-KEYWORD: {:entry1 {:entry2 {:value "Xxxxxxx Xxxx"}}} -> {:entry1 {:entry2 {:value "Xxxxxxx XXXX xxxx."}}}
2020-04-18 15:01:13 [EVENT] :log-event-with-KEYWORD: {:entry1 {:entry2 {:value "XXxxxxx. XXXX 00000"}}}
2020-04-18 15:01:15 [EVENT] :this_has--the-KEYWORD: {:entry1 {:entry2 {:value "[email protected]"}}} -> {:entry1 {:entry2 {:value "[email protected]"}}}
2020-04-18 15:01:18 [EVENT] :log-event-without-keyword: {:entry1 {:entry2 {:value "Do NOT replace this."}}} -> {:entry1 {:entry2 {:value "Do-NoT replace this either"}}}
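
The awk program above relies on one property of split: when a line is split on the double-quote character, the odd-numbered pieces lie outside quotes and the even-numbered pieces lie inside them. A minimal sketch that makes this visible, assuming balanced quotes; the one-line input and the variable name fields are made up for illustration:

```shell
# Hypothetical input line with two quoted strings.
line='id=7 {:value "Ab 12"} -> {:value "c-D"}'

# Split on the double quote and label each piece by its index parity.
fields=$(printf '%s\n' "$line" | awk '{
    n = split($0, a, "\"")
    for (i = 1; i <= n; i++)
        printf "a[%d] (%s): %s\n", i, (i % 2 ? "outside" : "inside"), a[i]
}')

printf '%s\n' "$fields"
```

This prints five pieces; a[2] is Ab 12 and a[4] is c-D, which are exactly the pieces the gsub calls touch.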

Answer 2

Using sed

sed -E '/KEYWORD/{
        :lower s/("[^"]*)[a-z]([^"]*")/\1_\2/; t lower;
        :upper s/("[^"]*)[A-Z]([^"]*")/\1-\2/; t upper;
        :digit s/("[^"]*)[0-9]([^"]*")/\1*\2/; t digit;
}; y/*_-/0xX/' infile

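/KEYWORD/{...} runs the commands inside the block only on lines that match the string KEYWORD.

("[^"]*)[###]([^"]*") matches a " followed by anything up to a lowercase [a-z], uppercase [A-Z] or digit [0-9] character (whichever class stands in for [###]), followed by anything up to the next quote.

Each part loops until all such characters have been converted: lowercase to _, uppercase to - and digits to *. (Note: the placeholders are needed because replacing directly with x, X or 0 would send sed into an infinite loop, since the replacement characters would match the same class again. The final y command sits outside the block and works on whole lines, so any *, _ or - already present in the file, on any line and outside quotes too, is converted as well; choose different placeholder characters if these can occur in your file.)

When the loops are done, the characters *_- are translated to 0xX.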
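
To see the placeholder stage described above, the loops can be run without the final y step. This is a minimal sketch on a made-up one-line input with a single quoted string; the variable name masked is hypothetical, and the labels are passed as separate -e expressions only to keep the example portable:

```shell
# Run the three loops only; the quoted text ends up as placeholders.
masked=$(printf '%s\n' 'user "ab1C"' |
    sed -E -e ':lower' -e 's/("[^"]*)[a-z]([^"]*")/\1_\2/' -e 't lower' \
           -e ':upper' -e 's/("[^"]*)[A-Z]([^"]*")/\1-\2/' -e 't upper' \
           -e ':digit' -e 's/("[^"]*)[0-9]([^"]*")/\1*\2/' -e 't digit')

# Placeholder stage: user "__*-"
printf '%s\n' "$masked"

# Final translation of the placeholders: user "xx0X"
final=$(printf '%s\n' "$masked" | sed 'y/*_-/0xX/')
printf '%s\n' "$final"
```

The word user stays unchanged here only because it contains none of the placeholder characters.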

Add the -i option to the command above to write the changes back to the input file, e.g. sed -i -E ....
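
Since -i overwrites the file, it can be worth keeping a backup while testing. GNU sed accepts an optional suffix attached to -i for this. A minimal sketch in a temporary directory; the file name app.log, the variable names and the one-line substitution are stand-ins for the real log and the full script:

```shell
# Work on a throwaway copy so nothing real is overwritten.
workdir=$(mktemp -d)
printf '%s\n' 'a KEYWORD "secret"' 'b other "public"' > "$workdir/app.log"

# -i.bak edits app.log in place and keeps the original as app.log.bak.
# The substitution is a trivial stand-in for the masking script.
sed -i.bak -E '/KEYWORD/ s/"[^"]*"/"xxxxxx"/' "$workdir/app.log"

edited=$(cat "$workdir/app.log")
backup=$(cat "$workdir/app.log.bak")
printf 'edited:\n%s\nbackup:\n%s\n' "$edited" "$backup"

rm -rf "$workdir"
```

Only the KEYWORD line changes in app.log; app.log.bak still holds both original lines.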


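Update: a command for the revised question. The extra leading group skips over complete quoted strings, so the text between two quoted strings is not touched. The placeholders here are ~, + and *, since _ and - occur outside quotes in the sample lines, and y sits inside the block so that lines without KEYWORD are left alone:

sed -E '/KEYWORD/{
        :lower s/^(([^"]*("[^"]*"){0,1})*)("[^"]*)[a-z]([^"]*")/\1\4~\5/; t lower;
        :upper s/^(([^"]*("[^"]*"){0,1})*)("[^"]*)[A-Z]([^"]*")/\1\4+\5/; t upper;
        :digit s/^(([^"]*("[^"]*"){0,1})*)("[^"]*)[0-9]([^"]*")/\1\4*\5/; t digit;
        y/*~+/0xX/
}' infile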

Answer 3

Using Perl:

$ perl -ne 'if ( /KEYWORD/ and /^(.*?)"(.*?)"(.*)$/ ) {
  my ($first, $matched, $last) = ($1, $2, $3);
  # mask the quoted text: a-z -> x, A-Z -> X, 0-9 -> 0
  $matched =~ tr/a-z/x/; $matched =~ tr/A-Z/X/; $matched =~ tr/0-9/0/;
  print $first, "\"", $matched, "\"", $last, "\n";
  }
  else { print }' inputFile

Edit: if the pattern occurs more than once on a line, the following will work:

$ perl -ne ' {
  if ( /KEYWORD/ ) {
  chomp(my $rest = $_); my $out = "";
  # take one quoted string at a time from the front of the remaining text
  while ( $rest =~ /^(.*?)"(.*?)"(.*)$/ ) {
  my ($first, $matched, $last) = ($1, $2, $3);
  $matched =~ tr/a-z/x/; $matched =~ tr/A-Z/X/; $matched =~ tr/0-9/0/;
  $out .= $first . "\"" . $matched . "\"";
  $rest = $last;
  }
  print "$out$rest\n";
  } else { print } }' inputFile
