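I have a LaTeX project (a research paper) that uses a .bib file (old.bib). I received a new .bib file (new.bib) that contains BibTeX entries with more complete information than old.bib, but that sometimes uses different entry names.
How can I merge new.bib with old.bib so that new.bib uses the entry names from old.bib for entries that exist in both new.bib and old.bib? Note that I don't want to add any entries to new.bib; I only want to change the entry names of the entries that also exist in old.bib.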
Example (my actual .bib files are much longer):
Input:
new.bib:
@inproceedings{lesterpaper,
title={The Power of Scale for Parameter-Efficient Prompt Tuning},
author={Lester, Brian and Al-Rfou, Rami and Constant, Noah},
booktitle={Empirical Methods in Natural Language Processing},
pages={3045--3059},
publisher = {Association for Computational Linguistics},
year={2021}
}
@article{bommasani2023holistic,
title={Holistic Evaluation of Language Models},
author={Bommasani, Rishi and Liang, Percy and Lee, Tony},
journal={Annals of the New York Academy of Sciences},
year={2023},
publisher={Wiley Online Library}
}
old.bib:
@inproceedings{lester,
title={The Power of Scale for Parameter-Efficient Prompt Tuning},
author={Lester, Brian and Al-Rfou, Rami and Constant, Noah},
booktitle={EMNLP},
publisher = {Association for Computational Linguistics},
year={2021}
}
@inproceedings{tokpo2022text,
title={Text Style Transfer for Bias Mitigation using Masked Language Modeling},
author={Tokpo, Ewoenam Kwaku and Calders, Toon},
booktitle={NAACL: HLT-SRW},
pages={163--171},
publisher = {Association for Computational Linguistics},
year={2022}
}
Output (the new new.bib):
@inproceedings{lester,
title={The Power of Scale for Parameter-Efficient Prompt Tuning},
author={Lester, Brian and Al-Rfou, Rami and Constant, Noah},
booktitle={Empirical Methods in Natural Language Processing},
pages={3045--3059},
publisher = {Association for Computational Linguistics},
year={2021}
}
@article{bommasani2023holistic,
title={Holistic Evaluation of Language Models},
author={Bommasani, Rishi and Liang, Percy and Lee, Tony},
journal={Annals of the New York Academy of Sciences},
year={2023},
publisher={Wiley Online Library}
}
The only change to new.bib in this example is that the BibTeX entry name lesterpaper is changed to lester.
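Answer 1
A Reddit user (Commander) wrote this Python script, based on the Python library bibtexparser, which merges two BibTeX files by matching entries on title, author, and year: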
import bibtexparser

# Define bibtex files
old_bib = "old.bib"
new_bib = "new.bib"

# Open bibtex files and index their entries by entry name (ID)
with open(old_bib) as bibtex_file:
    old_bib_database = bibtexparser.load(bibtex_file)
old_entries = {entry['ID']: entry for entry in old_bib_database.entries}

with open(new_bib) as bibtex_file:
    new_bib_database = bibtexparser.load(bibtex_file)
new_entries = {entry['ID']: entry for entry in new_bib_database.entries}

# Compare bibtex files
for new_id, new_entry in new_entries.items():
    for old_id, old_entry in old_entries.items():
        # Parameters for bibtex entry comparison
        if old_entry.get('title') == new_entry.get('title') and \
           old_entry.get('author') == new_entry.get('author') and \
           old_entry.get('year') == new_entry.get('year'):
            # Same paper: give the new entry the old entry name
            new_entries[new_id]['ID'] = old_id

# Aggregate merged bibtex information
new_bib_database.entries = list(new_entries.values())

# Write the result to a new file
with open('merged.bib', 'w') as bibtex_file:
    bibtexparser.dump(new_bib_database, bibtex_file)
This works.
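One caveat: the script compares title, author, and year with exact string equality, so entries that differ only in capitalization, protective braces, or spacing will not be matched. Below is a minimal sketch of a more tolerant comparison, assuming the entries are plain dicts with an 'ID' key like the ones the script reads from `entries`. The helper names (`normalize`, `match_key`, `rename_to_old_ids`) are hypothetical, and matching on normalized title plus year only (ignoring author formatting) is an assumption that may be too loose for some bibliographies.

```python
import re


def normalize(text):
    """Lowercase, drop braces and punctuation, collapse whitespace."""
    text = re.sub(r"[{}]", "", text or "")
    text = re.sub(r"[^\w\s]", " ", text.lower())
    return " ".join(text.split())


def match_key(entry):
    """Hypothetical matching key: normalized title plus year."""
    return (normalize(entry.get("title")), (entry.get("year") or "").strip())


def rename_to_old_ids(old_entries, new_entries):
    """Return copies of new_entries that reuse the old ID when an old entry
    has the same matching key, plus a dict of the renames performed."""
    old_ids = {}
    for entry in old_entries:
        key = match_key(entry)
        if key[0]:  # skip entries without a title
            old_ids.setdefault(key, entry["ID"])

    renamed, changes = [], {}
    for entry in new_entries:
        entry = dict(entry)  # leave the caller's entry untouched
        key = match_key(entry)
        old_id = old_ids.get(key) if key[0] else None
        if old_id is not None and old_id != entry["ID"]:
            changes[entry["ID"]] = old_id
            entry["ID"] = old_id
        renamed.append(entry)
    return renamed, changes


# Small in-memory example modeled on the question's entries.
old = [{"ID": "lester", "year": "2021",
        "title": "The Power of Scale for Parameter-Efficient Prompt Tuning"}]
new = [
    {"ID": "lesterpaper", "year": "2021",
     "title": "{The Power of Scale for Parameter-Efficient Prompt Tuning}"},
    {"ID": "bommasani2023holistic", "year": "2023",
     "title": "Holistic Evaluation of Language Models"},
]
renamed, changes = rename_to_old_ids(old, new)
print(changes)
```

To use this with the script above, one could pass `old_bib_database.entries` and `new_bib_database.entries` and assign the returned list back to `new_bib_database.entries` before writing the file. Reviewing the `changes` dict before writing is a cheap way to catch false matches.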