Match only one line across two files and return the whole column from file 2

Answer 1

Using egrep (grep -E) should help. Try:

grep -E '(905894|1197693|3703749|92108275|114940633)' file2

This checks whether the patterns taken from file1 appear in file2. When I tested it, I got this result:

[rkahil@xxxxxx ~]$ grep -E '(905894|1197693|3703749|92108275|114940633)' file2
1 mapping   905894  SNV 1   C   T       Heterozygous    41
1 mapping   1197693 SNV 1   G   A       Heterozygous    23
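
The alternation does not have to be typed by hand. Here is a minimal sketch, assuming file1 holds one position per line with no header. The sample files are made up for illustration and written to a scratch directory:

```shell
# Hypothetical sample data in a scratch directory.
tmp=$(mktemp -d)
printf '%s\n' 905894 1197693 3703749 > "$tmp/file1"
printf '1 mapping\t%s\tSNV\n' 883625 905894 1197693 > "$tmp/file2"

# Join the lines of file1 with "|" to form the alternation.
pattern="($(paste -sd'|' "$tmp/file1"))"
echo "$pattern"

# Same search as above, with the pattern generated from file1.
result=$(grep -E "$pattern" "$tmp/file2")
echo "$result"
```

For a long list of positions the generated pattern can get unwieldy, and reading the patterns straight from the file with grep -f is usually simpler.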

Answer 2

To match a single number, you can use grep, which returns the matching line:

$ grep 883625 file2
1 mapping   883625  SNV 1   A   G       Homozygous  23

If you want to output every line of file2 that contains a number from file1:

$ grep -f file1 file2
Mapping  Reference Position Type    Length  Reference   Allele  Linkage Zygosity    Count
1 mapping   905894  SNV 1   C   T       Heterozygous    41
1 mapping   1197693 SNV 1   G   A       Heterozygous    23

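That is, grep's -f option reads the patterns from file1 and looks for matches in file2. Here the header of file2 is printed as well, because it matches the first line of file1. From man grep: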

-f FILE, --file=FILE
     Obtain  patterns  from  FILE,  one per line.  The empty file
     contains zero patterns, and therefore matches nothing.
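
To see why the header shows up, here is a small sketch with made-up sample files. It assumes the first line of file1 is the header text Reference Position, and the file contents are shortened for illustration:

```shell
# Hypothetical sample data in a scratch directory.
tmp=$(mktemp -d)
printf '%s\n' 'Reference Position' 905894 1197693 > "$tmp/file1"
printf '%s\n' \
  'Mapping Reference Position Type' \
  '1 mapping 883625 SNV' \
  '1 mapping 905894 SNV' \
  '1 mapping 1197693 SNV' > "$tmp/file2"

# Every line of file1 is used as a pattern, including its header line,
# so the header of file2 is printed along with the two data rows.
matches=$(grep -f "$tmp/file1" "$tmp/file2")
echo "$matches"
```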

Answer 3

Using `grep` to read patterns from a file

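We can use each whole line of file1 as a fixed-string pattern. The -F option tells grep not to interpret the patterns as regular expressions.
grep also has the -f option to read patterns from a file, one pattern per line. That is exactly what we have, so the patterns can be read directly from the file.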

$ grep -F -f file1 file2       
Mapping  Reference Position Type    Length  Reference   Allele  Linkage Zygosity    Count
1 mapping   905894  SNV 1   C   T       Heterozygous    41
1 mapping   1197693 SNV 1   G   A       Heterozygous    23

Avoiding substring matches

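The command above uses the patterns from file1, such as 905894. If file2 had a line containing 9058940, the pattern 905894 would match it, because it matches the first six characters. That would be wrong, so we need to match whole words only. We could rewrite each pattern to match at word boundaries, such as '\b905894\b' (which would mean giving up -F, since that is a regular expression), but grep has an option for this common case, -w: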

grep -F -w -f file1 file2
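
A quick way to check the difference is to count matches with and without -w. This is only a sketch on made-up data, where file2 contains both 905894 and the longer 9058940:

```shell
# Hypothetical sample data in a scratch directory.
tmp=$(mktemp -d)
printf '%s\n' 905894 > "$tmp/file1"
printf '%s\n' '1 mapping 905894 SNV' '1 mapping 9058940 SNV' > "$tmp/file2"

# -c prints the number of matching lines instead of the lines themselves.
without_w=$(grep -F -c -f "$tmp/file1" "$tmp/file2")
with_w=$(grep -F -w -c -f "$tmp/file1" "$tmp/file2")
echo "without -w: $without_w, with -w: $with_w"
```

Note that -w only protects against partial numbers. The pattern can still match in any column of file2, not just the position column.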

Excluding the header

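We do not want the header in the output, even though printing it is technically correct, because the header text Reference Position appears in both files.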

With tail -n +2 we select the lines starting from the second one, and we pass - as the pattern file name so that grep reads tail's output from stdin:

$ tail -n +2 file1 | grep -F -w -f - file2
1 mapping   905894  SNV 1   C   T       Heterozygous    41
1 mapping   1197693 SNV 1   G   A       Heterozygous    23

In bash, process substitution does almost the same thing:

grep -F -w -f <(tail -n +2 file1) file2
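
If the match has to be limited to the position column, grep alone cannot express that. One alternative, which is a different technique from the grep commands above, is awk. The sketch below assumes tab-separated files with the position in the second field of file2 and a header line in both files. The sample data is made up, and file1 deliberately contains 23, which also occurs as a count:

```shell
# Hypothetical sample data in a scratch directory.
tmp=$(mktemp -d)
printf '%s\n' 'Reference Position' 905894 23 > "$tmp/file1"
printf '%s\t%s\t%s\t%s\n' \
  Mapping 'Reference Position' Type Count \
  '1 mapping' 883625 SNV 23 \
  '1 mapping' 905894 SNV 41 > "$tmp/file2"

# grep -w matches "23" in the Count column too, so it reports two rows.
grep_count=$(tail -n +2 "$tmp/file1" | grep -F -w -c -f - "$tmp/file2")

# awk, first file (NR == FNR): remember each position, skipping the header.
# awk, second file: print rows whose second field is a remembered position.
rows=$(awk -F'\t' 'NR == FNR { if (FNR > 1) want[$1]; next }
                   FNR > 1 && ($2 in want)' "$tmp/file1" "$tmp/file2")
echo "grep -w rows: $grep_count"
echo "$rows"
```

The field number and separator would need adjusting to the real layout of file2.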
