Extract words from a file using a separate word list

Answer 1

You can use the return value of awk's index function to determine whether a line in b.txt contains any of the substrings listed in a.txt.

index(in, find)

    Search the string in for the first occurrence of the string find, and return
    the position in characters where that occurrence begins in the string in.
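
To make the return value concrete, here is a minimal sketch with hard-coded strings. Positions are counted from 1, and index returns 0 when the string is not found, which is why a "> 0" test works as a contains check. The variable name positions is only an illustrative choice.

```shell
# Capture what index returns for a match and for a miss
positions=$(awk 'BEGIN {
  print index("onetwothree", "two")   # "two" starts at character 4
  print index("onetwothree", "four")  # not present, so the result is 0
}')

printf '%s\n' "$positions"
```

This prints 4 and then 0.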

For example:

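awk '
  # First file (a.txt): store each search string as an array key
  NR==FNR{strings[$1]; next}
  # Second file (b.txt): collect every stored string found in the line
  {
    m = ""
    for(s in strings){
      if(index($0,s) > 0) m = (m=="") ? s : m ", " s
    }
  }
  # Print the line and its matches only if something matched
  m != "" {print $0, ">", m}
' a.txt b.txt

Output:

threetwo > three, two
onetwothree > three, two, one
twozero > two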

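Note that awk does not guarantee the order of array traversal (in this case, the array of substrings built from a.txt), so the matches on each line may be listed in a different order.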
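
If a stable order matters, one option is to keep the strings in a numerically indexed array and loop over the indices instead of using for (s in strings). The sketch below is a variation on the command above, not part of the original answer. The contents of a.txt and b.txt are hypothetical, inferred from the sample output, and the temporary directory and the variable names dir, result, n and i are assumptions made for illustration.

```shell
# Hypothetical sample inputs, inferred from the output shown earlier
dir=$(mktemp -d)
printf '%s\n' one two three > "$dir/a.txt"
printf '%s\n' threetwo onetwothree twozero zero > "$dir/b.txt"

result=$(awk '
  # a.txt: remember each string together with its line number
  NR==FNR{strings[++n] = $1; next}
  # b.txt: test the strings in the order they appeared in a.txt
  {
    m = ""
    for(i = 1; i <= n; i++){
      if(index($0, strings[i]) > 0) m = (m=="") ? strings[i] : m ", " strings[i]
    }
  }
  # Print the line and its matches only if something matched
  m != "" {print $0, ">", m}
' "$dir/a.txt" "$dir/b.txt")

printf '%s\n' "$result"
rm -r "$dir"
```

With these inputs the matches are always listed in a.txt order, for example "onetwothree > one, two, three". A string that appears twice in a.txt would be reported twice, since the array is no longer keyed by the string itself.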
