Parsing a large text file and writing each case to a separate file

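I am working with the following dataset: a very large plain-text file (400 MB) containing nearly every English legal case from the Norman Conquest to the nineteenth century. The document is just one very long text file containing the cases, separated by each case's citation.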

For example, the cases are laid out like this:

CITATION NUMBER (e.g., 1 Report 10)

THE TEXT OF THE CASE .... blah blah blah...

CITATION NUMBER (e.g., 1 Report 11)

ETC...

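I want to build a document index and a search feature that anyone can use online for free, so I first want to split the document into a separate .txt file for each case.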

(It is worth mentioning that a citation always contains two letters, e.g. ER, which stands for English Reports, followed by another number.)

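How can I run a script that follows this logic:

  1. Find the first citation by looking for the first instance of number + "ER" + another number.
  2. Find the next instance of a case citation by looking for [number + "ER" + another number].
  3. Print all text between the first citation instance and the next citation instance to a file, excluding the next citation instance itself.
  4. Name the output file after the value of the citation reference that was found.
  5. Repeat this for all subsequent citation instances until no further instance is found (i.e. the end of the document).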

======

Any ideas or guidance on where I should start?

I have used cut on Linux to process CSV files, and I think what I am doing here is similar. For example, with cut, I could say something like: cut -d[citation instance consisting of a pattern of (number)_"ER"_(number)] | cat > nameoffile.txt

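Below is the case of Lambert v Lambert. The English Reports citation (1 ER 764) appears at the start of the case, but it looks like the citation is repeated at every page break: 1 ER 765, 1 ER 767, 1 ER 768, 1 ER 769, 1 ER 770. The next case then starts at 1 ER 770. Also, in my document, each page break is set off by a blank line.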

==BEGIN===
`Lambert v Lambert [1767] 2 Brown PC 18, 1 ER 764
Report Date: 1767
[2-Brown-18]  CASE 5 WALTER LAMBERT, Appellant; CATHERINE LAMBERT, Respondent  [18th May 176 7].
[Mew's Dig. vii. 1132.]
[Where a husband by force, etc. compels his wife to execute a deed of separation and thereby to accept of a very small maintenance, much inferior to his rank and fortune; a Court of Equity will relieve the wife against this deed, and refer it to a Master, to settle a proper maintenance.]
    Arthur Humphrey, gent. the respondent's former husband, was in his lifetime entitled to a very valuable interest in the lands of Gortern and Ballybrien, in the county of Hallway, containing 300 acres, by virtue of a lease for three lives from Frederick French, esq. He died in the year 1740, leaving the respondent his widow and several children, whose principal subsistence was to arise from the profits of this lease; and at the time of his death, he owed Mr. French, for rent, £119 18s. 9d. for which arrear Mr. French threatened to bring an ejectment against the respondent. Upon which occasion, in May 1740, she applied to the appellant, who was then considered as a very wealthy man, and proposed to grant him a lease of part of the lands at a low rent, if he would pay off the arrear, to which proposal the appellant readily agreed; and accordingly, the respondent executed a lease of 130 acres, part of the said lands, to the appellant, at the yearly rent of 6s. 4d. by the acre, although they were then worth considerably more; and assigned to him the original lease, as a security for the money which he was to advance without interest.
    In 1741, the appellant came to the respondent's house, and solicited her in marriage; and to induce her to comply, proposed to give up his mortgage of the said lease; and accordingly, on the 1st of September 1741, the appellant executed a writing directed to his son Charles Lambert, in the words following: " I do hereby order " that the lease of Gorteran, assigned to me by the widow Catherine Humphrey, as " a security for what money I paid that was due on the said farm, may be given to "her without any demand to it. September 1st, 1741. Walter Lambert.-To my " son Charles Lambert.-The above lease is in the upper drawer."
    Things remained in this situation till the year 1743, when a marriage was agreed on between the appellant and the respondent, and articles were duly executed, dated the 22d of September 1743, whereby the appellant covenanted, that the respondent should have a provision of £20 a year, and should be acquitted from the said sum of £119 18s. 9d. in case any part thereof should remain due at the appellant's death which provision was to be in full satisfaction of all dower, thirds, or jointure.
    Soon after the execution of these articles, the appellant and respondent intermarried; and from the time of their marriage, for a long series of years, lived together as man and wife in perfect [2-Brown-19] harmony and affection. And the appellant (who for many years had been afflicted with disorders and infirmities) many times expressed his acknowledgments and sense of the respondent's great tenderness and affection for him, and of her care in the preservation of his substance; and often declared he 

2 Brown 20, 1 ER p765
would make an ample provision for her, in case she should survive him. This raised the jealousy of the appellant's children by his former wives, and a scheme was formed to supplant the respondent in his affection and esteem; and one Edward Cloran was pitched upon, as a proper instrument for carrying it into execution. This man acted in the appellant's house in the character of overseer, and by his pretended honesty acquired the confidence of the appellant, to whom he insinuated that the respondent had embezzled his substance, to supply the wants of her children by a former husband. At length his insolence arose to such a pitch, that he abused and beat the respondent's eldest son, who frequently visited at the appellant's house.
    The respondent complained to the appellant of the treatment her son had met with; but the appellant's mind had been so poisoned against the respondent by the false insinuations of Cloran, that instead of redressing the complaint, he flew into a violent passion, called the respondent many opprobrious names, and swore she should never lie in the same bed or room with him; and upon the respondent's expostulating with the appellant, he gave her a violent punch of a bill-hook in her side, threw her down, and seized her violently by the throat. And although a very sickly woman, and advanced in years, she was soon after, by the directions of the appellant, confined in a small cold damp room, and fed with the leavings and fragments of Cloran and one Lynch a thatcher, who frequented the appellant's house, and was called the Governor, with intent to starve her into a compliance with their schemes. This Lynch was employed to bar the room door where the respondent was confined, and to fix an iron chain and padlock to it every night; and the respondent, from the cold and damp of the room, lost the sight of one of her eyes: and the appellant often, during her confinement, told her, that if she would not agree to quit his house, and take a separate maintenance, he would lock up all the doors, and would not leave one living creature in the house but herself; and that she should have neither fire or candle light, or any subsistence whatsoever; and that if she did not take £20 a year, she should be still confined, and should never have so good an over made to her again.
    The respondent in this distressed situation, was obliged to execute the following instrument, which the appellant, or his son Charles, had caused to be drawn up. " I do hereby promise and agree, to pay my wife, Catherine Lambert, otherwise " Rolleston, the sum of £20 sterling yearly, during our separation, to be paid in two " payments; that is to say, £10 every May, and £10 every November. And in case " the said Catherine Lambert, otherwise Rolleston, should survive me, this instrument " to be [2-Brown-20] then void, and she is to have the benefit of our marriage articles, and no " more; which articles are witnessed by her brother Francis Rolleston, esq. and John " Leary. In witness whereof, we have hereunto set our hands and seals, the 29th " day of November 1762, Walter Lambert, Catherine Lambert. Present John Lynch " John Butler."
    At the time the respondent signed the above writing, her treatment was such that she was in dread of her life, and would have signed any paper they produced to her, in order to procure her liberty; but after the execution of the writing, having been visited by some of her friends, she was advised by them not to quit the appellant's house at any rate, until she was actually turned out; for that the provision made for her by the appellant, was too poor for the wife of a man of so considerable a fortune and thereupon the respondent absolutely refused to quit her husband's house: upon which, the appellant and his son Charles Lambert redoubled their cruelty to the respondent; kept her more closely confined in the damp room, and turned off a servant for bringing her a little turf for fire to warm herself. But finding that all this barbarous treatment did not produce the intended effect of making the respondent quit the house, the said Charles Lambert prevailed on the appellant to order all the furniture and kitchen utensils to be removed into the brewhouse, and to quit his own house, and to reside with him, leaving the respondent confined under the controul and dominion of the two instruments of his cruelty, Cloran and Lynch.
    In some time afterwards, the appellant, with his son Charles, returned to his house where the respondent was still confined, and seised all the papers belonging to the respondent, which they could find; and immediately after, Cloran, by the directions of the appellant and his son Charles, forcibly dragged the respondent out of the appellant's house, and greatly abused and cut her.
    Not yet satisfied with what had been done, the appellant and his confederates 

2 Brown 21, 1 ER p766
carried their cruelty to the respondent to the most infamous extremity. They stabbed her reputation, which had ever been unblemished; they traduced her as a thief; and even dared to deny her marriage with the appellant, weakly imagining to apologise to the world for the wanton cruelty exercised against her, and maliciously intending to deprive her of all resources from the friendship of her relations and friends, to enable her to seek for justice, or even to procure the means of necessary subsistence. But the respondent's character was too well established to suffer material ally by this wicked device; and her friends, convinced of her innocence, and shocked at the treatment she had received, advised her to seek redress in a Court of Justice.
    Accordingly, on the 2d of December 1763, the respondent, by her brother and next friend Francis Rolleston, esq. exhibited her bill in the Court of Chancery in Ireland against her said husband, the appellant, and also against Charles Lambert and Megg his wife, [2-Brown-21] John Lambert and Mary his wife, Thomas Lambert and Mary his wife, Mary Lambert widow of Peter Lambert, who were the sons of the appellant; and against Robert Hamilton, esq. Brother-in-law to the appellant, and the said Cloran; stating several of the matters and acts of cruelty before mentioned and also charging her marriage and cohabitation with the appellant for nineteen years, during which time, the appellant had frequently represented to his friends and relations, her care and tenderness of him: her being introduced by the appellant to, and visited by ladies of the first distinction in the country, as his wife. That she was always called mother by the appellant's sons and daughters-in-law, and was received as such by them; and that she stood sponsor to several of the appellant's grandchildren. That she from time to tine, received letters from his sons, addressed to her as their mother-in-law, and written in a dutiful manner. That the appellant joined the respondent in making two leases of her farm of Gortern, in each of which leases there was a proviso, that the respondent should take the profits to and for her sole use, notwithstanding her coverture. That the appellant executed several wills which were all in his own handwriting; the first of which was dated the 3d of December 1751; another dated the 17th of March 1752; another dated the 15th of June 1756 another dated in March 1757; and another dated the 25th of November 1758; in each of which wills he made certain provisions for the respondent, and in every one of them called her his wife, and even willed to her her said farm of Gortern and Ballybrien. 
That the appellant, upon the marriage of his second son John Lambert, with the daughter of Sir Henry Burke, in the year 1756, having occasion to levy a fine of some part of his estate, which was to be settled on the marriage, applied to the respondent to Join In levying such fine; and in order to induce her so to do, he signed the following writing: " I do hereby assure my wife Catherine Lambert, that " she shall not suffer in any shape by her levying fines for my son John Lambert. " Witness my hand, September 10th, 1756, Walter Lambert. Present William " Nethercott." That she accordingly joined in the said fine, as the wife of the appellant. That the appellant had, notwithstanding his agreement with the respondent, received the issues and profits of her said farm, and converted the same to his own use for upwards of ten years, which amounted to £1500, and that the respondent had, by means of the waste which the appellant had committed thereon, lost her said farm the lives for which the same was held having drops, and Colonel French having refused to grant a renewal thereof, which the respondent charged would be well worth to her and her children, upwards of £3000 if the same had been renewed. That the appellant was possessed of a real estate of £1500 a year, and of a personal estate to the amount of £12,000 and upwards. 
And the bill prayed, that the deed of the 29th of November 1762, whereby the appellant agreed to give the respondent his wife the sum of £20 a year, by way of a separate maintenance, might be set aside; and that the appellant might be compelled to give the respondent [2-Brown-22] such maintenance from the time of her separation from him, as the Court should judge reasonable to support her as his wife, and to continue as long as she should live separate from him and that her bill should be taker as a bill of discovery, against such of the defendants as it was improper to pray relief against; and that the respondent might have such other and further relief, as the nature of her case required.
    To this bill the appellant put in two answers, and admitted his agreeing in the year 1740, to discharge the arrear due to Colonel French, and the execution of the deed of the 25th of November 1740. He admitted, that he frequently called upon 

2 Brown 23, 1 ER p767
the respondent at her house, but never solicited her to marry him; and although in his first answer he said, he did not believe that he had executed the writing of the 1st of September 1741, yet in his second answer he recollected, that he was persuaded and accordingly did execute such writing. He denied his marriage with the respondent, but said, that in 1743, Francis Rolleston, esq. the respondent's brother by the contrivance of the respondent as he believed, proposed to him to marry the respondent, representing her as a careful, industrious, good, humane woman and that he consented to such proposal, and admitted that thereupon such articles of the Id of September 1743, as are before stated, were drawn up and executed by him. He said, that before such marriage could be had, he met two persons, who, on his asking them some questions concerning the respondent, represented her to him as a turbulent troublesome woman, and that thereupon he determined not to marry her, and that in some time after he acquainted her with his resolution; but said, that the respondent thereupon made frequent applications to him, and requested that he would permit her to live in his house; that he then looking upon her to be a person capable of managing his family affairs, agreed to it, at the same time informing her, that for several reasons he never would marry her; and that she accordingly came to his house, and cohabited with him. That he was prevailed upon by her, to agree that she should go by his name, and be called his wife, and that she for some time behaved in a manner very agreeable to him and his friends; but that she afterwards behaved otherwise, and he was so much ashamed, that he would not let any of his family or friends know that he was not married to her, as he had before consented that she should pass for his wife. 
He said, that the respondent, as he was informed by Edmund Claron, embezzled his substance, and exercised her supposed authority in his family in a most arbitrary manner; that she misbehaved to his children and relations, and ill-treated the appellant himself, on his not listening to a charge which she had made against Cloran; he denied striking the respondent with a bill-book, or seizing her by the throat. He said, that on his refusing to let her lie in his room, she made choice of a bed-chamber for herself; and that he being informed, that the respondent frequently went about the house at unseasonable hours in the night, and sent away his goods, he ordered a padlock to be fixed on her chamber door and that it should [2-Brown-23] be locked every night after she went to bed, and believed the same was accordingly done; but he said he did not mean thereby to confine her He said he was prevailed upon to give her £20 a year, provided she would remove from his house, and live separate from him, and that thereupon the deed of the 29th of November 1762 was drawn, and that the respondent freely executed the same. He said he never gave directions that the respondent should be treated with any cruelty, nor did he believe that she received such treatment. He said he expected that she would have left the house immediately on the perfection of the deed of separation, but she put off her departure from time to time, and he not thinking himself safe in the house with her, went to his son's house, where he continued several weeks expecting the respondent would withdraw; but on his return he found she still continued there, and upon his telling her that he never would cohabit with her, she went away voluntarily. He admitted that he was possessed of an estate of £1500 a year, and of a very considerable personal estate, but refused to discover how much. He admitted his making the wills before mentioned, and believed, that in every of them he called her his wife. 
He admitted, that he joined with the respondent in making leases of her farm; and that the respondent joined with him in levying the fine stated in the bill, but said he did not believe he executed any instrument to induce her to join therein. He admitted, that he treated the respondent as his wife, and introduced her to all his relations, friends, and acquaintance as such, and that she stood sponsor to several of his grandchildren, and believed she was esteemed in the country to be a prudent, virtuous woman. He denied that any waste committed by him was the cause of the respondent's losing her farm; but admitted, that she had lost the benefit of the said lease. He said, that on the 3d of September 1760, he made a will, which was of his own hand writing, and admitted he therein stiled her his wife, and devised to her £30 a year, in addition to the £20 a year mentioned in the articles, and he bequeathed to her £100. He admitted, that for some part of the time, the respondent was constant in her care and seeming tenderness for him, and that he often expressed his acknowledgments and sense of her care and tenderness, 

2 Brown 24, 1 ER p768
and often declared he would make a good provision for her, in case she survived him. And finally, he insisted, that the matters sought by the bill to be relieved in, were properly cognizable in the Ecclesiastical Court.
    The several other defendants also put in their answers, and all of them admitted, that the appellant and respondent lied together as man and wife.
    Issue having been joined in the cause, several witnesses were examined on both sides. The respondent, on her part, produced many witnesses, persons of character and reputation, and proved every material part of her case; and even as to the actual solemnisation of the marriage, it was proved by the Rev. Dean Crowe, that he had been sent for in order to marry the appellant to the [2-Brown-24] respondent. That the day being exceedingly wet, the then Bishop of Clonfert prevailed on the Dean not to venture his life on that day, by undertaking such a journey. That on the next day he went to Gortern, in order to marry them. That the appellant then told the Dean, that he intended he should be the person to marry him to the respondent, but as he did not come when sent for, he had that ceremony performed by another, and at the same time introduced the respondent to the Dean as his wife. This evidence was confirmed by the deposition of Samuel Simpson, esq. who, amongst other particulars swore, that the winter before the marriage, the appellant had told him, as a secret that he intended to marry the respondent. That some time afterwards he met the appellant, and having heard that the appellant had been privately married to the respondent, he as heard the appellant, whether he might wish him joy ? and that the appellant told him, he might, for that he was married to the respondent; and at the same time he told the deponent, that he had a resentment against Dean Crowe for not coming to marry him when sent for, and as he did not chose to wait, that he procured another clergyman for that purpose. The evidence on the part of the appellant went to prove several instances of the respondent's ill behaviour, that she wasted his substance, embezzled his effects, procured false keys to his locks, stole away his papers, and attempted his life. 
Two papers were also proved to have been accidentally dropped by the respondent out of an handkerchief, which were certificates of her marriage in her own hand writing, with the appellant's name subscribed thereto, but which was not of his hand writing.
    Publication having passed, the cause came on to be heard before the Lord Chan-cellor of Ireland, on the 17th of November 1766, and to be further heard on the 18th, 19th, and 20th of the same month, when his Lordship was pleased to decree, that the deed of the 29th of November 1762, so far as the same might prevent the respondent's recovering a maintenance, during the separation between her and the appellant should be set aside; and it was referred to a Master, to enquire into and report the circumstances of the estate and fortune, both of the appellant and respondent, and what would be proper to allow the respondent annually for her maintenance, during the said separation.
    The respondent being in very great distress, and being likely to meet with every possible delay to retard the proceedings before the Master, on the 22d of November 1766, applied to the Court, upon an affidavit, stating her distress, for a sum of money to maintain her, and to enable her to carry on the suit; whereupon, and upon hearing counsel on behalf of the appellant, his Lordship was pleased to order the appellant to pay the respondent, in a month, the sum of £200, subject to the further order of the Court.
From this decree and order the appellant appealed, insisting (W. de Grey, A. Forrester, D. Graham), that no actual marriage was proved to have been solemnised [2-Brown-25] between him and the respondent, and he had positively denied it upon oath. That had there really been a marriage, the respondent might have proved it by various kinds of evidence, which she not only had not attempted, but from her own bill, and the evidence produced in support of it, it clearly appeared there never was any marriage between them. The bill charged the marriage to have been in the year 1742 but did not state the day, or the month, or the place where, or the person by whom the ceremony was performed, nor whether any one was present at it. Simpson, her own witness, contradicted her, by fixing the marriage to be in the year 1740; and both were contradicted by the articles made previous to the supposed marriage, which could not mistake, and were not made till September 174.3; thereby plainly proving both the charge and the testimony to be false. Dean Crowe, another of the respon-

2 Brown 26, 1 ER p769
dent's witnesses, said, he was sent for to marry them, and that the appellant told him, he had been married the day before, but he did not recollect what year this was in. Besides, the respondent's attempting to support her pretended marriage by fictitious and forged evidence, were clear proofs against the reality of it. That cohabitation and acknowledgment of marriage may be sufficient, as between the reputed husband and the creditors of the supposed wife, to oblige him to pay her debts; but would not be good as between him and her, to entitle her to dower out of his estate. And in this case, where the question was between the reputed husband and wife, evidence of cohabitation and acknowledgment was not sufficient for the Court of Chancery, if it had jurisdiction at all, to found a decree for alimony. The time when it was solemnized, the place where, and the person by whom the marriage was performed, ought to be fully and indisputably proved; and where the marriage was denied, the Court of Chancery, until it was clearly established, could have no jurisdiction to set aside the agreement of November 1762; for if there was no marriage, there could be no constraint or force in the execution of that agreement.
    [...] On the part of the respondent it was contended (F. Norton, A. Wedderburn), that her marriage was established by every imaginable circumstance; and the deed of separation itself, which was sought to be set aside, was conclusive against the appellant as to the fact of the marriage, which he had been induced to dispute by the same artifices, that had prevailed upon him to treat the respondent in the barbarous manner he had done. But a mere denial of the marriage under such circumstances, and opposed by a course of twenty years public cohabitation, could not even raise a doubt upon the question. The respondent's bill was filed to set aside a deed extorted from her by the most infamous means, of which the Court of Chancery undoubtedly had cognizance: the husband in his answer had declared, that he never would cohabit with her; the reference to the Master to enquire what would be proper to be allowed for her maintenance, and the subsequent order of the 22d of November 1766, were consequential to the original relief; and it would have been absurd in such a case, to have turned the respondent round to sue in another jurisdiction for alimony; especially as the Court had done no more in this case, than it would have done upon a supplicavit, where the husband had refused to maintain his wife. That the Court had as yet made no order with regard to the quantum of maintenance, and as to the £200 directed to be paid to the respondent by the order of November 1766, it, could not be thought too large, either with respect to the appellant's fortune, or the respondent's condition; who had lost by his misconduct her own separate fortune, and been for above four years destitute of any provision, and engaged in a most expensive litigation. As therefore the decree and order were equitable and just, and the appeal frivolous, vexatious, and oppressive, it ought to be dismissed with most exemplary costs.
    Accordingly, after hearing counsel on this appeal, it was ORDERED and ADJUDGED, that the same should be dismissed, and the decree and order therein complained of, affirmed: and it was further ORDERED, that the appellant should pay the respondent £200 for her costs in respect of the said appeal. (Jour. vol. 31. p. 604.)

Answer 1

I would use the csplit command.

csplit -z citations '/ v .*[0-9] ER [0-9]/' '{*}'

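This splits the file at every line that contains the following sequence of characters:

space, v, space, any other characters, digit, space, E, R, space, digit

and stores each piece in its own file.

Once the file has been split, the pieces can be moved to their proper names.

A complete solution script, which takes a filename argument or reads standard input: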

#!/bin/sh

csplit -z "${1:--}" '/ v .*[0-9] ER [0-9]/' '{*}'

find . -maxdepth 1 -name 'xx*' |
while IFS= read -r filename
do
    # Rename each piece after its first line (the case header).
    mv "$filename" "$(head -n 1 "$filename")"
done
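
For comparison, here is a minimal Python sketch of the same splitting idea, in case csplit is not available. It assumes a case header is any line matching the csplit pattern above. The names CASE_HEADER and split_cases are made up for this illustration, and unlike csplit it simply drops any text that comes before the first header.

```python
import re

# Same idea as the csplit pattern: " v ", then a digit, " ER ", and a digit.
# Page-break markers such as "2 Brown 20, 1 ER p765" do not match.
CASE_HEADER = re.compile(r" v .*[0-9] ER [0-9]")


def split_cases(lines):
    """Return a list of (header, body_lines) pairs, one per case.

    Hypothetical helper: `lines` is any iterable of text lines,
    for example an open file object.
    """
    cases = []
    for line in lines:
        line = line.rstrip("\n")
        if CASE_HEADER.search(line):
            # A new case starts here; its header becomes the key.
            cases.append((line, []))
        elif cases:
            # Everything else belongs to the most recent case.
            cases[-1][1].append(line)
    return cases
```

Because the function accepts any iterable of lines, passing an open file object lets it read the 400 MB input one line at a time; the resulting list still holds every case in memory, so writing each case out as soon as the next header appears would be a natural refinement.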

Answer 2

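The following perl script extracts the case name ($outfile) whenever a line matches the pattern (space, E, R, space) and is not a page-number line (m/ ER (?!p\d+)/), and writes all subsequent lines (until the pattern is found again, giving a new filename) to a file named out/$outfile.txt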

#! /usr/bin/perl

use strict;

my $outfile='/dev/stdout';
open(OUTFILE,">","$outfile") || die "couldn't open $outfile for write: $!\n";

while (<>) {
    chomp;
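    # A case header contains " ER " that is not followed by a page marker like "p765".
    if (m/ ER (?!p\d+)/) {
       # Use at most the first 200 characters of the header as the filename.
       $outfile = substr($_,0,200);
       open(OUTFILE,">>","./out/$outfile.txt") || die "couldn't open ./out/$outfile.txt for write: $!\n";

       # If the file already has content, this is a repeated case name: add a separator.
       if (-s "./out/$outfile.txt") {
           print OUTFILE "\n\n-=-=-=-=-=-=-=-=\n\n";
       }
    }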
    print OUTFILE $_,"\n";
}

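The output is too long to show here, but I tested the script against the input you provided and it works as expected. If you can make all or part of the file (including more cases) available for download, I can test (and possibly refine) the script further.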

Using list-of-cases-volume-1.txt (with the line numbers stripped and saved as cases2.txt), the output (taking the Lambert* cases as an example) is:

$ mkdir -p out/
$ ./hef.pl cases2.txt
$ ls -1 out/Lambert*
out/Lambert v Aeretree 1 Lord Raymond 223, 91 ER 1045.txt
out/Lambert v Atkins and Another 2 Campbell 272, 170 ER 1153.txt
out/Lambert v Cook 1 Lord Raymond 237, 91 ER 1055.txt
out/Lambert v Oakes 1 Lord Raymond 443, 91 ER 1194.txt
out/Lambert v Pack 1 Salkeld 127, 91 ER 120.txt
out/Lambert v Peyton [1860] 7 House of Lords Cases 423, 11 ER 169.txt

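Some of the input lines (4176 of 12861) are repeated case names, so I modified the script above to append the extra lines for such a case to the existing file, with -=-=-=-=-=-=-=-= as a separator.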
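
The append-with-separator behaviour can also be sketched in Python, which may make the intent easier to see. This is only an illustration: merge_duplicates and SEPARATOR are hypothetical names, and the input is assumed to be a list of (title, text) pairs that has already been split out of the big file.

```python
# Same separator the perl script writes between repeated cases.
SEPARATOR = "\n\n-=-=-=-=-=-=-=-=\n\n"


def merge_duplicates(cases):
    """Combine texts that share a case title (hypothetical helper).

    `cases` is an iterable of (title, text) pairs. Returns a dict that
    maps each title to its text, with repeats joined by SEPARATOR.
    """
    merged = {}
    for title, text in cases:
        if title in merged:
            # Repeated case name: append after the separator.
            merged[title] += SEPARATOR + text
        else:
            merged[title] = text
    return merged
```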

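Some case titles are too long to use as filenames, so I used substr($_,0,200) to limit filenames to the first 200 characters. An alternative would be to use an md5sum hash of the case name as the filename, which would give filenames that are not human-readable. The case name would still be on the first line of the file.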
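
As a rough illustration of the two naming options, here is a small Python sketch that keeps the title when it fits and falls back to an MD5 hash when it does not. This is a variation on the script above rather than what it does (the script always truncates), and the helper name case_filename, the max_len parameter, and the replacement of "/" with "_" are all assumptions made for the example.

```python
import hashlib


def case_filename(title, max_len=200):
    """Return a filename for a case title (hypothetical helper)."""
    # Assumption: "/" cannot appear in a filename, so swap it out.
    safe = title.replace("/", "_")
    if len(safe) <= max_len:
        return safe + ".txt"
    # Title too long: use the MD5 hex digest of the full title instead.
    digest = hashlib.md5(title.encode("utf-8")).hexdigest()
    return digest + ".txt"
```

The hash branch always yields a 32-character hexadecimal name, so it stays well under typical filename length limits, at the cost of readability.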

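One, hopefully final, comment. It would not be difficult to modify the script above to use the perl DBI module and a postgres or mysql database to store all of these records in a searchable database, with the case name as an indexed title field and the text in a text field.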
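
To give a feel for the database idea, here is a minimal sketch that uses Python's built-in sqlite3 module in place of DBI with postgres or mysql. The table name cases, the column names, the index name, and the helpers store_cases and search_titles are all invented for this example; a real setup would probably want proper full-text search rather than LIKE.

```python
import sqlite3


def store_cases(conn, cases):
    """Insert (title, body) pairs into a 'cases' table (hypothetical schema)."""
    conn.execute("CREATE TABLE IF NOT EXISTS cases (title TEXT, body TEXT)")
    # Index the title column so lookups by case name do not scan the table.
    conn.execute("CREATE INDEX IF NOT EXISTS idx_cases_title ON cases (title)")
    conn.executemany("INSERT INTO cases (title, body) VALUES (?, ?)", cases)
    conn.commit()


def search_titles(conn, term):
    """Return the titles that contain `term`, in sorted order."""
    rows = conn.execute(
        "SELECT title FROM cases WHERE title LIKE ? ORDER BY title",
        ("%" + term + "%",),
    )
    return [row[0] for row in rows]
```

Usage would look something like conn = sqlite3.connect("cases.db"), then store_cases(conn, pairs) and search_titles(conn, "Lambert"), where cases.db is a placeholder filename.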

Answer 3

Perl is well suited to this... The following is untested, but should work.

Edit: modified to read a fixed file.

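#!/usr/bin/perl

use strict;
use warnings;

open(my $in, "<", "this_is_input") or die "couldn't open this_is_input: $!\n";
open(my $out, ">", "before_ER") or die "couldn't open before_ER: $!\n";
while (<$in>) {
  if (/^\d+\s+ER\s+\d+$/) {
     # Line containing <number><spaces>ER<spaces><number> only:
     # close the current file and start a new one named after the citation.
     chomp;
     close $out;
     open($out, ">", $_) or die "couldn't open $_: $!\n";
  }
  else {
    print $out $_;
  }
}
close $out;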
