Print blocks of text that match a pattern

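Using only shell scripting, how can I search a text file and list every whole block of lines that contains some text (a simple grep criterion)?

The text file consists of blocks of lines separated by "-----------------" (to be exact, each block starts with "\n\n\n--------------------"... about 50 "-" characters).

A sample could be: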

-------------------------------
Abracadabra, blablablalbalba
blablablabla, banana



-------------------------------
Text, sample text, sample text, sample text
Text, sample text, sample text, sample text
Text, sample text, sample text, sample text
Text, sample text, sample text, sample text


-------------------------------  
Text, sample text, sample text, sample text
banana. Sample text, sample text, sample text, sample text
Text, sample text, sample text, sample text

Let's take the word "banana" as the search criterion. The blocks listed would then be:

-------------------------------
Abracadabra, blablablalbalba
blablablabla, banana


-------------------------------
Text, sample text, sample text, sample text
banana. Sample text, sample text, sample text, sample text
Text, sample text, sample text, sample text

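Edit:

I tested the answers that suggest awk, for example awk 'BEGIN{RS="\n------------"}/INFO/{print}', where INFO is the text being searched for, but I could not get the whole block. So here is a real sample and the result:

Real sample (including the 3 leading newlines):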


-------------------------------------------------
Diretório separado do nome o arquivo: adis, IWZLM (/home/interx/adis/src/IWZLM.SRC)
Gerando rotina em linguagem C:
(yla5 adis IWZLM -if)
.INFO =>Rotina BLOQUEADA (status 'M'): Geracao ignorada (use -is para ignorar checagem do status)

[  OK-I ] IWZLM (adis) - Lista lay: Geracao ignorada do codigo em C.



-------------------------------------------------
Diretório separado do nome d arquivo: adis, ADISA (/home/interx/adis/src/ADISA.SRC)
Gerando rotina em linguagem C:
(yla5 adis ADISA -if)
.ERRO: Falha inesperada

Compilando o programa:
(ycomp adis ADISA -exe adis/exe/ADISA.temp.exe )
adis/exe/ADISA.temp.exe => adis/exe/ADISA

[  OK   ] ADISA (adis) - Menu A : Gerada e compilada com sucesso.



-------------------------------------------------
Diretório separado do nome o arquivo: adis, ADISD1 (/home/interx/adis/src/ADISD1.SRC)
Gerando rotina em linguagem C:
(yla5 adis ADISD1 -if)
.ATENCAO: Definition localized

Compilando o programa:
(ycomp adis ADISD1 -exe adis/exe/ADISD1.temp.exe )
adis/exe/ADISD1.temp.exe => adis/exe/ADISD1

[  OK   ] ADISD1 (adis) - Menu : Gerada e compilada com sucesso.

I cannot get the whole block, only the line that contains "INFO", just like a plain grep, whether or not I set ORS:

$ cat file  | awk 'BEGIN{RS="\n------------"}/INFO/{print}' 
.INFO =>Rotina BLOQUEADA (status 'M'): Geracao ignorada (use -is para ignorar checagem do status)

Note: this is the awk shipped with AIX 7.1, not gawk.

Answer 1

awk '
{
  if (/-------------------------------------------------/) {
    if (hold ~ /INFO/) {
      print hold;
    }
    hold="";
  } else {
    hold=hold "\n" $0
  }
} 
END {
  if (hold ~ /INFO/) {
    print hold;
  }
}' file

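This uses a "hold" variable (à la sed) to accumulate the lines between separator lines. When a new block starts (or at EOF), the held text is printed only if it matches the /INFO/ pattern.

(Re: the older comments, I removed my earlier, inadequate awk and perl answers to clean up this answer.)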
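The script above drops the separator line itself and hard-codes the pattern. As a minimal sketch of the same accumulate-and-test idea, the hypothetical shell function print_blocks below keeps each separator line with its block and takes the pattern as an argument. The function name, the awk variable names, and the assumption that a separator is a line made of five or more dashes (optionally followed by blanks) are illustrative choices, not part of the original answer.

```shell
# print_blocks PATTERN [FILE...]
# Hypothetical helper: prints every dash-delimited block, separator line
# included, whose text matches the extended regular expression PATTERN.
print_blocks() {
  pattern=$1
  shift
  awk -v pat="$pattern" '
    # Print the accumulated block if it matches, then start over.
    function emit() {
      if (block ~ pat) printf "%s", block
      block = ""
    }
    # Assumption: a separator is a line of 5+ dashes, trailing blanks allowed.
    /^-----+[ \t]*$/ { emit() }
    # Accumulate every line, including the separator that opens the block.
    { block = block $0 "\n" }
    END { emit() }
  ' "$@"
}
```

Called as print_blocks banana file, it would print the two banana blocks of the sample together with their dashed header lines. Because the pattern is tested against the whole block, an anchor such as ^ only matches at the very start of the block, and a pattern containing backslashes needs them doubled since awk processes escape sequences in -v assignments.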

Answer 2

This should be easy with awk if you don't need all the - characters in the output:

awk -vRS='----' '/banana/{print}' file

Or with pcregrep:

pcregrep -M '^-+[^-]*banana[^-]*' file
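One caveat about the awk command: POSIX leaves the behavior of a multi-character RS unspecified, and some awk implementations use only the first character of RS. With RS="\n------------" that first character is a newline, which would explain the grep-like, line-by-line result reported in the question. A possible workaround, sketched below, is to turn each separator line into a single character first and use that character as RS. The function name blocks_single_rs and the choice of "@" are assumptions for illustration, and the sketch only works if "@" never occurs in the data.

```shell
# blocks_single_rs PATTERN FILE
# Hypothetical helper: prints the blocks of FILE that match PATTERN,
# without their dashed separator lines.
# Assumption: the character "@" never occurs anywhere in FILE.
blocks_single_rs() {
  # Replace each line of 4+ dashes (trailing blanks allowed) by a lone "@" ...
  sed 's/^-----*[[:space:]]*$/@/' "$2" |
    # ... then split records on that single character, which is portable.
    awk -v pat="$1" 'BEGIN { RS = "@" } $0 ~ pat { printf "%s", $0 }'
}
```

As with the command above, the dashes are not part of the output; each printed record starts with the newline that followed its separator.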

Answer 3

If you don't mind losing the leading blank lines, here is a sed solution:

sed '/---/b end                      # if line matches pattern branch to : end
//!{H                                # if it doesn't match, append to hold space
$!d                                  # and if not on the last line, delete it
$b end                               # if it's the last line branch to : end
}
: end                                # label end
x                                    # exchange hold buffer and pattern space
/PATTERN/!d                          # if pattern space doesn't match, delete it
' <infile

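Answer 4

One caveat: when the regular expression is passed in as a variable, any backslashes in it must be escaped. This was tested against the input given under "Real sample".

Parsing code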

#!/usr/bin/nawk -f
BEGIN{ORS=RS="\n\n\n"}   # three consecutive newlines separate records on input and output
$0~var1{print}           # print the record when it matches the regex held in var1

Execution

## the pattern is passed in var1 and matches OK as a whole word
parrsel -v var1=paragraphs -vvar1='\\<OK\\>' data

-------------------------------------------------
Diretório separado do nome o arquivo: adis, IWZLM (/home/interx/adis/src/IWZLM.SRC)
Gerando rotina em linguagem C:
(yla5 adis IWZLM -if)
.INFO =>Rotina BLOQUEADA (status 'M'): Geracao ignorada (use -is para ignorar checagem do status)

[  OK-I ] IWZLM (adis) - Lista lay: Geracao ignorada do codigo em C.



-------------------------------------------------
Diretório separado do nome d arquivo: adis, ADISA (/home/interx/adis/src/ADISA.SRC)
Gerando rotina em linguagem C:
(yla5 adis ADISA -if)
.ERRO: Falha inesperada

Compilando o programa:
(ycomp adis ADISA -exe adis/exe/ADISA.temp.exe )
adis/exe/ADISA.temp.exe => adis/exe/ADISA

[  OK   ] ADISA (adis) - Menu A : Gerada e compilada com sucesso.
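
The doubled backslashes are needed because awk processes escape sequences in a -v assignment the same way it does in a string literal, so one level of backslashes is consumed before the value is used as a regular expression. The sketch below contrasts that with passing the pattern through the environment, where the value reaches awk unchanged. Both function names and the environment variable RE are hypothetical, chosen only for this illustration.

```shell
# match_v PATTERN: the pattern goes through -v, so each backslash
# must be written twice by the caller.
match_v() {
  awk -v re="$1" '$0 ~ re'
}

# match_env PATTERN: the pattern goes through the environment and is
# read from ENVIRON, so it is used exactly as typed.
match_env() {
  RE="$1" awk '$0 ~ ENVIRON["RE"]'
}

# Both calls select only the line with a literal dot:
#   printf 'aXb\na.b\n' | match_v   'a\\.b'
#   printf 'aXb\na.b\n' | match_env 'a\.b'
```

Which form is more convenient depends on where the pattern comes from; a pattern typed by a user or read from a file is often easier to hand over through ENVIRON.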
