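I am trying to trim down a massive database to extract the relevant information for use in a JSON file. It has some very long lines (about 400 characters each) and a few thousand entries, and depending on the line I need to omit everything from ( onward, everything from http onward, or everything from MISSING onward.
Most lines do not contain the () [] information, but every line contains the http information. On lines that have both, the http information always follows the () [] information.
Here is an example, with the lines shortened for obvious reasons.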
PCSH10160 Attack of the Toy Tanks (3.61+!) [3.69] http://zeu
PCSH10162 Paradox Soul http://zeus.dl.playstation.net/cdn
PCSH10146 Hoggy2 http://zeus.dl.playstation.net/cdn/HP2005/
PCSB01394 Mekabolt http://zeus.dl.playstation.net/cdn/EP0
PCSH10186 Himno http://zeus.dl.playstation.net/cdn/HP2
PCSG01285 MELLKISS http://zeus.dl.playstation.net/cdn/JP0
PCSB01365 Habroxia http://zeus.dl.playstation.net/cdn/EP5
PCSE01423 Color Slayer http://zeus.dl.playstation.net/cdn
PCSE01396 Habroxia http://zeus.dl.playstation.net/cdn/UP4
PCSG01127 Sen no Hatou, Tsukisome no Kouki http://zeus.dl
PCSB01396 Tic-Tac-Letters by POWGI http://zeus.dl.playsta
PCSH10203 Gravity Duck http://zeus.dl.playstation.net
PCSH10175 Crossovers by POWGI http://zeus.dl.playstation
PCSH10169 Mixups by POWGI (3.61+!) [3.69] http://zeus.dl
PCSH10167 One Word by POWGI http://zeus.dl.playstation
PCSH10166 Word Search by POWGI http://zeus.dl.playsta
PCSH10179 Word Wheel by POWGI http://zeus.dl.playstation
PCSH10180 Wordsweeper by POWGI http://zeus.dl.playsta
PCSH10168 Word Sudoku by POWGI http://zeus.dl.playsta
PCSB00625 SENRAN KAGURA: Bon Appétit! Stacked Soundtrack ht
The end result should be:
PCSH10160 Attack of the Toy Tanks
PCSH10162 Paradox Soul
PCSH10146 Hoggy2
PCSB01394 Mekabolt
PCSH10186 Himno
PCSG01285 MELLKISS
PCSB01365 Habroxia
PCSE01423 Color Slayer
PCSE01396 Habroxia
PCSG01127 Sen no Hatou, Tsukisome no Kouki
PCSB01396 Tic-Tac-Letters by POWGI
PCSH10203 Gravity Duck
PCSH10175 Crossovers by POWGI
PCSH10169 Mixups by POWGI
PCSH10167 One Word by POWGI
PCSH10166 Word Search by POWGI
PCSH10179 Word Wheel by POWGI
PCSH10180 Wordsweeper by POWGI
PCSH10168 Word Sudoku by POWGI
PCSB00625 SENRAN KAGURA: Bon Appétit! Stacked Soundtrack
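I am not concerned with the spacing between the ID and the title, as that can be fixed manually.
Oops, I made a mistake. After running the provided expression, I noticed that a few lines contain the word MISSING followed by various information. Is there a way to include that in the expression, alongside ( and http?
Alternatively, a separate expression would do. It just needs to be case-sensitive, because I worry that the word "missing" appears in a title somewhere and the title would be cut off from that point.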
PCSG00742 Kiss Ato
PCSG00744 One Piece: Burning Blood - Gold Edition
PCSG00747 Zero Escape: Zero Time Dilemma
PCSG00748 Jikkyou Powerful Pro Yakyuu 2016 MISSING KO5ifR1dQ+d7
PCSG00750 Kai-ri-Sei Million Arthur
PCSG00751 Arcana Famiglia -La Storia Della Arcana Famiglia- Ancora
PCSG00752 Touhou Soujinengi V
PCSG00753 Eikoku Tantei Mysteria: The Crown MISSING KO5ifR1dQ+d7
PCSG00756 I am Setsuna
Answer 1
I need to omit everything from ( onward, or everything from http onward
Before:
PCSH10160 Attack of the Toy Tanks (3.61+!) [3.69] http://zeu
PCSH10162 Paradox Soul http://zeus.dl.playstation.net/cdn
PCSH10146 Hoggy2 http://zeus.dl.playstation.net/cdn/HP2005/
PCSB01394 Mekabolt http://zeus.dl.playstation.net/cdn/EP0
PCSH10186 Himno http://zeus.dl.playstation.net/cdn/HP2
PCSG01285 MELLKISS http://zeus.dl.playstation.net/cdn/JP0
PCSB01365 Habroxia http://zeus.dl.playstation.net/cdn/EP5
PCSE01423 Color Slayer http://zeus.dl.playstation.net/cdn
PCSE01396 Habroxia http://zeus.dl.playstation.net/cdn/UP4
PCSG01127 Sen no Hatou, Tsukisome no Kouki http://zeus.dl
PCSB01396 Tic-Tac-Letters by POWGI http://zeus.dl.playsta
PCSH10203 Gravity Duck http://zeus.dl.playstation.net
PCSH10175 Crossovers by POWGI http://zeus.dl.playstation
PCSH10169 Mixups by POWGI (3.61+!) [3.69] http://zeus.dl
PCSH10167 One Word by POWGI http://zeus.dl.playstation
PCSH10166 Word Search by POWGI http://zeus.dl.playsta
PCSH10179 Word Wheel by POWGI http://zeus.dl.playstation
PCSH10180 Wordsweeper by POWGI http://zeus.dl.playsta
PCSH10168 Word Sudoku by POWGI http://zeus.dl.playsta
PCSB00625 SENRAN KAGURA: Bon Appétit! Stacked Soundtrack ht
After:
PCSH10160 Attack of the Toy Tanks
PCSH10162 Paradox Soul
PCSH10146 Hoggy2
PCSB01394 Mekabolt
PCSH10186 Himno
PCSG01285 MELLKISS
PCSB01365 Habroxia
PCSE01423 Color Slayer
PCSE01396 Habroxia
PCSG01127 Sen no Hatou, Tsukisome no Kouki
PCSB01396 Tic-Tac-Letters by POWGI
PCSH10203 Gravity Duck
PCSH10175 Crossovers by POWGI
PCSH10169 Mixups by POWGI
PCSH10167 One Word by POWGI
PCSH10166 Word Search by POWGI
PCSH10179 Word Wheel by POWGI
PCSH10180 Wordsweeper by POWGI
PCSH10168 Word Sudoku by POWGI
PCSB00625 SENRAN KAGURA: Bon Appétit! Stacked Soundtrack ht
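Notes:
- The last example line is not correct here because the sample input is truncated (it ends in ht rather than http); it will be correct when you apply the expression to the untruncated file.
- To also truncate lines containing MISSING, change "Find what" to
\(.*?$|http.*?$|MISSING.*?$
According to the conversation in the comments, the fastest regex is
\h+(?:\(|http|MISSING).+$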
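The difference between the two expressions above can be checked outside the editor. The following is a minimal Python sketch, not part of the original answer. It assumes Python's built-in re module, which as far as I know does not accept \h, so [ \t]+ is used as a stand-in for horizontal whitespace. The variable names are made up for illustration.

```python
import re

# The alternation from the note above, copied unchanged.
LAZY = re.compile(r"\(.*?$|http.*?$|MISSING.*?$")

# Stand-in for \h+(?:\(|http|MISSING).+$
# Assumption: [ \t]+ replaces \h+, which Python's re does not support.
ANCHORED = re.compile(r"[ \t]+(?:\(|http|MISSING).+$")

# One of the sample lines from the question.
line = "PCSH10169 Mixups by POWGI (3.61+!) [3.69] http://zeus.dl"

lazy_result = LAZY.sub("", line)
anchored_result = ANCHORED.sub("", line)

# repr() makes any trailing whitespace visible.
print(repr(lazy_result))      # 'PCSH10169 Mixups by POWGI '
print(repr(anchored_result))  # 'PCSH10169 Mixups by POWGI'
```

Besides any speed difference, the second expression also consumes the whitespace in front of the removed part, so no trailing space is left behind.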
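Answer 2
Improved performance (thanks to @IsmaelMiguel) and updated to cover the new requirement.
- Ctrl+H
- Find what: \h+(?:\(|http|MISSING).+$
- Replace with: LEAVE EMPTY
- CHECK Match case
- CHECK Wrap around
- CHECK Regular expression
- UNCHECK . matches newline
- Replace all
Explanation: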
\h+ # 1 or more horizontal spaces
(?: # non capture group
\( # opening parenthesis
| # OR
http # literally
| # OR
MISSING # literally
) # end group
.+ # 1 or more of any character except newline
$ # end of line
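For anyone who would rather script the cleanup than run it in the editor, the same idea can be sketched in Python. This is an illustration under stated assumptions, not the answer's method: [ \t] stands in for \h (which Python's re module does not support, to my knowledge), strip_trailers is a hypothetical helper name, and the last sample line is an invented title added only to show case-sensitive matching.

```python
import re

# Stand-in for \h+(?:\(|http|MISSING).+$ with [ \t] replacing \h (assumption).
# re.MULTILINE lets $ match at the end of every line, not only at the end of the text.
# re.IGNORECASE is deliberately not set, so MISSING is matched case-sensitively.
TRAILER = re.compile(r"[ \t]+(?:\(|http|MISSING).+$", re.MULTILINE)


def strip_trailers(text: str) -> str:
    """Cut each line at the first whitespace run followed by (, http or MISSING."""
    return TRAILER.sub("", text)


sample = "\n".join([
    "PCSH10160 Attack of the Toy Tanks (3.61+!) [3.69] http://zeu",
    "PCSG00748 Jikkyou Powerful Pro Yakyuu 2016 MISSING KO5ifR1dQ+d7",
    "PCSG00756 I am Setsuna",
    # Hypothetical entry: "Missing" in mixed case must survive in the title.
    "PCSX00000 The Missing Link http://example.invalid/x",
])

cleaned = strip_trailers(sample)
print(cleaned)
```

Because the pattern is case-sensitive, a title containing "Missing" is left intact, while a title containing MISSING in capitals after a space would still be cut at that point.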