我有一个文本文件(大小超过 1GB)并且包含如下行:
10830110bcdf9002a6ade209c5cafbc02e90f84696b04c166c7029c427d1ef4a56580dbbce84a0574ba1fc17c8035ccec4679e5dcb6a6a331ebdb15d6cc0661378f409c3
1083021106e581c71003b987a75f18543cf5858b9fcfc5e04c0dddd79cd18764a865ba86d027de6d1900dc171e4d90a0564abbce99b812b821bd0d7d37aad72ead19c17
10840110dbd43121ef0c51a8ba62193eac247f57f1909e270eeb53d68da60ad61519f19cfb0511ec2431ca54e2fcabf6fa985615ec06def5ba1b753e8ad96d0564aa4c
1084011028375c62fd132d5a4e41ffef2419da345b6595fba8a49b5136de59a884d878fc9789009843c49866a0dc97889242b9fb0b8c112f1423e3b220bc04a2d7dfbdff
10880221005f0e261be654e4c52034d8d05b5c4dc0456b7868763367ab998b7d5886d64fbb24efd14cea668d00bfe8048eb8f096c3306bbb31aaea3e06710fa8c0bb8fca71
108501103461fca7077fc2f0d895048606b828818047a64611ec94443e52cc2d39c968363359de5fc76df48e0bf3676b73b1f8fea5780c2af22c507f83331cc0fbfe6ea9
1085022100a4ce8a09d1f28e78530ce940d6fcbd3c1fe2cb00e7b212b893ce78f8839a11868281179b4f2c812b8318f8d3f9a598b4da750a0ba6054d7e1b743bb67896ee62
1086022100638681ade4b306295815221c5b445ba017943ae59c4c742f0b1442dae4902a56d173a6f859dc6088b6364224ec17c4e2213d9d3c96bd9992b696d7c13b234b50
我有这些字符串:
10830110
1083021
10840110
10840110
1088022100
10850110
1085022100
1086022100
我需要从文本文件的每一行的开头删除这些字符串,以便最终结果保存到新文件中,上面的行将如下所示:
bcdf9002a6ade209c5cafbc02e90f84696b04c166c7029c427d1ef4a56580dbbce84a0574ba1fc17c8035ccec4679e5dcb6a6a331ebdb15d6cc0661378f409c3
106e581c71003b987a75f18543cf5858b9fcfc5e04c0dddd79cd18764a865ba86d027de6d1900dc171e4d90a0564abbce99b812b821bd0d7d37aad72ead19c17
dbd43121ef0c51a8ba62193eac247f57f1909e270eeb53d68da60ad61519f19cfb0511ec2431ca54e2fcabf6fa985615ec06def5ba1b753e8ad96d0564aa4c
28375c62fd132d5a4e41ffef2419da345b6595fba8a49b5136de59a884d878fc9789009843c49866a0dc97889242b9fb0b8c112f1423e3b220bc04a2d7dfbdff
5f0e261be654e4c52034d8d05b5c4dc0456b7868763367ab998b7d5886d64fbb24efd14cea668d00bfe8048eb8f096c3306bbb31aaea3e06710fa8c0bb8fca71
3461fca7077fc2f0d895048606b828818047a64611ec94443e52cc2d39c968363359de5fc76df48e0bf3676b73b1f8fea5780c2af22c507f83331cc0fbfe6ea9
a4ce8a09d1f28e78530ce940d6fcbd3c1fe2cb00e7b212b893ce78f8839a11868281179b4f2c812b8318f8d3f9a598b4da750a0ba6054d7e1b743bb67896ee62
638681ade4b306295815221c5b445ba017943ae59c4c742f0b1442dae4902a56d173a6f859dc6088b6364224ec17c4e2213d9d3c96bd9992b696d7c13b234b50
答案1
如果我理解正确的话,您有一个名为的文件file1
,其中包含如下行:
10830110bcdf9002a6ade209c5cafbc02e90f84696b04c166c7029c427d1ef4a56580dbbce84a0574ba1fc17c8035ccec4679e5dcb6a6a331ebdb15d6cc0661378f409c3
1083021106e581c71003b987a75f18543cf5858b9fcfc5e04c0dddd79cd18764a865ba86d027de6d1900dc171e4d90a0564abbce99b812b821bd0d7d37aad72ead19c17
10840110dbd43121ef0c51a8ba62193eac247f57f1909e270eeb53d68da60ad61519f19cfb0511ec2431ca54e2fcabf6fa985615ec06def5ba1b753e8ad96d0564aa4c
1084011028375c62fd132d5a4e41ffef2419da345b6595fba8a49b5136de59a884d878fc9789009843c49866a0dc97889242b9fb0b8c112f1423e3b220bc04a2d7dfbdff
10880221005f0e261be654e4c52034d8d05b5c4dc0456b7868763367ab998b7d5886d64fbb24efd14cea668d00bfe8048eb8f096c3306bbb31aaea3e06710fa8c0bb8fca71
108501103461fca7077fc2f0d895048606b828818047a64611ec94443e52cc2d39c968363359de5fc76df48e0bf3676b73b1f8fea5780c2af22c507f83331cc0fbfe6ea9
1085022100a4ce8a09d1f28e78530ce940d6fcbd3c1fe2cb00e7b212b893ce78f8839a11868281179b4f2c812b8318f8d3f9a598b4da750a0ba6054d7e1b743bb67896ee62
1086022100638681ade4b306295815221c5b445ba017943ae59c4c742f0b1442dae4902a56d173a6f859dc6088b6364224ec17c4e2213d9d3c96bd9992b696d7c13b234b50
以及一个名为的文件file2
,其中包含如下行:
10830110
1083021
10840110
10840110
1088022100
10850110
1085022100
1086022100
并且您想要创建一个名为 的新文件file3
,其中包含从中file1
减去(从每行的开头)匹配行中的字符串,file2
如下所示:
bcdf9002a6ade209c5cafbc02e90f84696b04c166c7029c427d1ef4a56580dbbce84a0574ba1fc17c8035ccec4679e5dcb6a6a331ebdb15d6cc0661378f409c3
106e581c71003b987a75f18543cf5858b9fcfc5e04c0dddd79cd18764a865ba86d027de6d1900dc171e4d90a0564abbce99b812b821bd0d7d37aad72ead19c17
dbd43121ef0c51a8ba62193eac247f57f1909e270eeb53d68da60ad61519f19cfb0511ec2431ca54e2fcabf6fa985615ec06def5ba1b753e8ad96d0564aa4c
28375c62fd132d5a4e41ffef2419da345b6595fba8a49b5136de59a884d878fc9789009843c49866a0dc97889242b9fb0b8c112f1423e3b220bc04a2d7dfbdff
5f0e261be654e4c52034d8d05b5c4dc0456b7868763367ab998b7d5886d64fbb24efd14cea668d00bfe8048eb8f096c3306bbb31aaea3e06710fa8c0bb8fca71
3461fca7077fc2f0d895048606b828818047a64611ec94443e52cc2d39c968363359de5fc76df48e0bf3676b73b1f8fea5780c2af22c507f83331cc0fbfe6ea9
a4ce8a09d1f28e78530ce940d6fcbd3c1fe2cb00e7b212b893ce78f8839a11868281179b4f2c812b8318f8d3f9a598b4da750a0ba6054d7e1b743bb67896ee62
638681ade4b306295815221c5b445ba017943ae59c4c742f0b1442dae4902a56d173a6f859dc6088b6364224ec17c4e2213d9d3c96bd9992b696d7c13b234b50
可以在 bash 中这样做:
while read p; do
read l <&4
n=${l#"$p"}
echo "${n}" >> file3
done <"file2" 4<"file1"
另一方面,如果您没有file2
,而是有一组类似的字符串10830110, 1083021 ... and 1086022100
,您想在它们出现在任何行的开头时删除它们,file1
并将修改后的行保存到名为 的新文件中,file3
那么可以使用 sed 来完成,如下所示:
sed 's/^10830110\|^1083021\|^10840110\|^10840110\|^1088022100\|^10850110\|^1085022100\|^1086022100//g' file1 > file3