Imagine a text file containing arbitrary text and three unique markers:
01 text text text
02 text text text
03 __DELETE_THIS_LINE_FIRST__
04 text text text
05 text text text
06 text text text
07 text text text
08 __DELETE_THIS_LINE_SECOND__
09 a few
10 interesting
11 lines follow
12 __DELETE_THIS_LINE_THIRD__
13 text text text
14 text text text
15 text text text
16 text text text
17 __DELETE_THIS_LINE_FIRST__
18 text text text
19 text text text
20 text text text
21 text text text
22 __DELETE_THIS_LINE_SECOND__
23 even
24 more
25 interesting lines
26 __DELETE_THIS_LINE_THIRD__
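I would like a Python expression that moves the interesting lines between the SECOND (end) marker and the THIRD marker to just before the preceding FIRST (begin) marker, and deletes all three markers. This should result in: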
01 text text text
02 text text text
09 a few
10 interesting
11 lines follow
04 text text text
05 text text text
06 text text text
07 text text text
13 text text text
14 text text text
15 text text text
16 text text text
23 even
24 more
25 interesting lines
18 text text text
19 text text text
20 text text text
21 text text text
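The three markers always come as a triple, and the triple occurs multiple times in the file. The FIRST marker always appears before the SECOND marker, and the SECOND marker always appears before the THIRD marker.
Any ideas?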
Related: 126325
Answer 1
Here is a Python script that does the job.
#!/usr/bin/env python
buffer = []        # lines in their final output order
markerBuffer = []  # lines held back between the FIRST and SECOND markers
beginFound = False
endFound = False
begin_marker = "__DELETE_THIS_LINE_FIRST__"
end_marker = "__DELETE_THIS_LINE_SECOND__"
line_count_marker = "__DELETE_THIS_LINE_THIRD__"
with open('hello.txt') as inFile:
    with open('hello_cleaned.txt', 'w') as outFile:
        for line in inFile:
            # FIRST marker: start holding lines back; drop the marker itself
            if begin_marker in line:
                beginFound = True
                continue
            # SECOND marker: following lines go straight to the output
            if end_marker in line:
                assert beginFound
                endFound = True
                continue
            # Between FIRST and SECOND: hold the line back for later
            if beginFound and not endFound:
                markerBuffer.append(line)
                continue
            # Between SECOND and THIRD: the "interesting" lines, emitted first
            if beginFound and endFound and line_count_marker not in line:
                buffer.append(line)
                continue
            # THIRD marker: emit the held-back lines and reset the state
            if beginFound and endFound and line_count_marker in line:
                buffer.extend(markerBuffer)
                markerBuffer = []
                beginFound = False
                endFound = False
                continue
            # Outside any triple: copy the line unchanged
            buffer.append(line)
        for line in buffer:
            outFile.write(line)
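
Since the question asks for an expression, the same reordering can also be sketched as a single regular-expression substitution. This is only a sketch under a few assumptions: the whole file fits in memory as one string, the marker names are exactly those shown in the question, and every triple is complete (an incomplete triple is simply left untouched). The names TRIPLE and reorder_triples are made up for this illustration.

```python
import re

# One match covers a whole triple: the FIRST marker line, the lines to hold
# back ("body"), the SECOND marker line, the interesting lines ("moved"),
# and the THIRD marker line. [^\n]* keeps each marker match on a single line.
TRIPLE = re.compile(
    r"^[^\n]*__DELETE_THIS_LINE_FIRST__[^\n]*\n"
    r"(?P<body>.*?)"
    r"^[^\n]*__DELETE_THIS_LINE_SECOND__[^\n]*\n"
    r"(?P<moved>.*?)"
    r"^[^\n]*__DELETE_THIS_LINE_THIRD__[^\n]*(?:\n|\Z)",
    re.DOTALL | re.MULTILINE,
)


def reorder_triples(text):
    """Drop the three marker lines and put the 'moved' lines before 'body'."""
    # A function is used as the replacement so that backslashes in the
    # file contents are never interpreted as escape sequences.
    return TRIPLE.sub(lambda m: m.group("moved") + m.group("body"), text)


if __name__ == "__main__":
    sample = (
        "01 text\n"
        "03 __DELETE_THIS_LINE_FIRST__\n"
        "04 text\n"
        "08 __DELETE_THIS_LINE_SECOND__\n"
        "09 a few\n"
        "12 __DELETE_THIS_LINE_THIRD__\n"
        "13 text\n"
    )
    print(reorder_triples(sample), end="")
```

To apply it to a file, read the file into a string, pass it to reorder_triples, and write the result back out. The line-by-line script above is the safer choice for very large files, because it never needs a single regex match to span a whole triple.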