Grouping a bibliography by author (alphabetical order)

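Edit: I have simplified and clarified the question.

I am working with a very large bibliography (500+ entries) using the default nty sorting: (sort)author, title, date. I would like to single out one or more authors with headings or (pseudo-)sectioning commands.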

Aristotle, Title
Cicero, Title...
___ Immanuel Kant
Kant, I., Title
Kant, I., Title...
Knuth, D. E., Title...
___ Friedrich Nietzsche
Nietzsche, F., Title...

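Does biblatex have tools to do this?

  1. Print a single bibliography, but with a mechanism that identifies the first entry of a particular (sort)author and prints an arbitrary element before it?
  2. Set up a filter that classifies my entries into alphabetical segments (sortauthor is A-Kans, Kant-Nietzschd, Nietzsche-Z) and then use that filter to print several partial bibliographies? (Each starting with an appropriate prenote or heading.)

So far, all my attempts have been of the second kind. I can easily write a sourcemap that checks whether the author is exactly Kant or Nietzsche: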

\DeclareSourcemap{
\maps[datatype=bibtex]{ 
  \map{
    \step[fieldsource=author, match={Kant, Immanuel}, final]
    \step[fieldset=keywords, fieldvalue={by-kant}]
    }
}}

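But I have not found a way to create a filter that can tell whether an author sorts alphabetically between two given names, e.g. before Kant or before Nietzsche: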

\step[fieldsource=sortauthor, alphabetizebefore={Kant, Immanuel}],

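Does biblatex have a built-in option or formula for checking (strict) alphabetical order?

I would strongly prefer not to build a series of regular expressions, because that is laborious: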

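% check whether strictly between "Kant" and "Nietzsche"
IF ((first letter is L-M)
OR (first letter = K AND second letter is B-Z)
OR (second letter = A AND third letter is O-Z)
OR (third letter = N AND fourth letter is U-Z))
OR ((first letter = N AND second letter is A-H)
OR (second letter = I AND third letter is A-D)
OR (third letter = E AND fourth letter is A-S)
... OR (...))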

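It is also prone to pitfalls: when a name contains characters that are not strictly alphabetic, such as "D'Alembert", the resulting order is uncertain.

Surely there must be a way to access the same native mechanism by which the sortauthors are compared and ordered for the .bbl output, and to use it to filter/split the output?

Example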

\documentclass{article}

\usepackage[citestyle=authoryear,bibstyle=numeric]{biblatex}
\addbibresource{biblatex-examples.bib}

%\DeclareSourcemap{     % How to implement this?
%\maps[datatype=bibtex]{    
%  \map{
%    \step[fieldsource=sortauthor, alphabetizebefore={Kant, Immanuel}]  % How?
%    \step[fieldset=keywords, fieldvalue={before-kant}]
%    }
%}}
%\DeclareSourcemap{     % How to implement this?
%\maps[datatype=bibtex]{    
%  \map{
%    \step[fieldsource=sortauthor, alphabetizeafter={Kant, Immanuel}]   % How?
%    \step[fieldsource=sortauthor, alphabetizebefore={Nietzsche, Friedrich}]    % How?
%    \step[fieldset=keywords, fieldvalue={before-nietzsche}]
%    }
%}}
%\DeclareSourcemap{     % How to implement this?
%\maps[datatype=bibtex]{    
%  \map{
%    \step[fieldsource=sortauthor, alphabetizeafter={Nietzsche, Friedrich}] % How?
%    \step[fieldset=keywords, fieldvalue={after-nietzsche}]
%    }
%}}

\defbibnote{by-kant}{\bfseries Kant}
\defbibnote{by-nietzsche}{\bfseries Nietzsche}

\begin{document}

Authors before Kant.
\cite{aristotle:anima}, \cite{cicero}

Kant
\cite{kant:kpv}
\cite{kant:ku}

Authors before Nietzsche
\cite{knuth:ct:a}

Nietzsche
\cite{nietzsche:ksa}

\section*{Split bibliography}
\printbibliography[keyword=before-kant,heading=none]
\printbibliography[keyword=before-nietzsche,heading=none,prenote=by-kant]
\printbibliography[keyword=after-nietzsche,heading=none,prenote=by-nietzsche]
\end{document}

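Answer 1

This is possible if the bib file can be preprocessed with an external tool. Given a bib file, Python can be used to determine the author names and sort the entries accordingly. biblatex's category feature is then used to print them.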
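As a minimal sketch of the comparison the question asks for (it is not part of the linked repository): once the names are available in Python, a range check reduces to ordinary string comparison. The `BOUNDARIES` and `LABELS` lists and the `classify` helper are hypothetical, with labels taken from the keywords in the question. Plain lowercase comparison is an assumption here and will not necessarily agree with biber's collation for names such as "D'Alembert".

```python
from bisect import bisect_right

# Hypothetical segment boundaries (lowercase family names) and the
# keyword names used in the question's example document.
BOUNDARIES = ["kant", "nietzsche"]
LABELS = ["before-kant", "before-nietzsche", "after-nietzsche"]


def classify(sort_name):
    """Return the label of the alphabetical segment that sort_name falls into.

    Assumption: names are compared as plain lowercase strings,
    which is simpler than the collation biber applies.
    """
    key = sort_name.strip().lower()
    # bisect_right counts how many boundaries are <= key,
    # which is exactly the index of the segment.
    return LABELS[bisect_right(BOUNDARIES, key)]


for name in ["Aristotle", "Cicero", "Kant, Immanuel",
             "Knuth, Donald E.", "Nietzsche, Friedrich"]:
    print(f"{name}: {classify(name)}")
```

Adding a further boundary only requires one more entry in each list, which is the part that becomes laborious with hand-written regular expressions.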

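The source code of the main components is shown below. For the complete source, see https://github.com/xziyue/latex-auto-categorized-bib

Explanation

  • The Python script analyzes the bibliography files and generates an output file for LaTeX to read. The output file looks like this: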

    ...
    \BibAuthorInfo{google-llc}{android_formats}
    \BibAuthorInfo{greenwald-j}{greenwald_2019}
    \BibAuthorInfo{grinstein-eric}{grinstein2018audio}
    \BibAuthorInfo{grobman-s}{grobman_2019}
    \BibAuthorInfo{gu-quanquan}{gu2011linear}
    \BibAuthorInfo{guan-haiying}{guan2019mfc}
    ...
    \BibAllAuthors{adobe-inc,almutairi-zaynab,altinisik-enes,apple-inc,avidemux-contributors,ba-lei-jimmy,bansal-vipin,bartusiak-r-emily,bayram-sevinc,bestagini-paolo,bhagtani-kratika,bharati-a,bianchi-tiziano,...}
    

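    It contains the bibliography items associated with each author, as well as a list of authors sorted by name.

  • On the LaTeX side, if --shell-escape is enabled, the Python script can be invoked automatically at compile time. Otherwise, the user can run the Python script manually (if the bib file does not change often). The example runs the Python script automatically and reads the output back in.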

    \immediate\write18{python3 bib_categorizer.py \jobname-bibinfo.tex example.bib}
    % load the generated file
    \input{\jobname-bibinfo.tex}
    
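  • \BibAuthorInfo creates a new category for each author name.

  • The user can call \PrintBibBetween to print the bibliography entries of all authors between two given authors in the sorted list. The example uses \PrintBibBetween{ito-keith}{kabir-mohsin-muhammad}.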
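The selection that \PrintBibBetween performs can be sketched in Python as follows. This is only an illustration of the loop logic, not code from the repository. The function name `authors_between` is hypothetical, and the author ids are the ones from the sample output above.

```python
def authors_between(sorted_ids, start, end):
    """Return the ids from start through end (inclusive).

    Walks the sorted list once, like the LaTeX loop: collecting begins
    when start is seen and the walk stops right after end is seen.
    """
    selected = []
    start_found = False
    end_found = False
    for author_id in sorted_ids:
        if author_id == start:
            start_found = True
        if author_id == end:
            end_found = True
        if start_found:
            selected.append(author_id)
        if end_found:
            break
    if not start_found:
        raise ValueError(f'cannot find bib start item "{start}"')
    if not end_found:
        raise ValueError(f'cannot find bib end item "{end}"')
    return selected


ids = ["google-llc", "greenwald-j", "grinstein-eric",
       "grobman-s", "gu-quanquan", "guan-haiying"]
print(authors_between(ids, "greenwald-j", "grobman-s"))
# prints ['greenwald-j', 'grinstein-eric', 'grobman-s']
```

Each selected id corresponds to one category, so the LaTeX side issues one \printbibliography call per id. If the end id appears before the start id, the walk stops early and the start id is reported as missing.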

LaTeX source (test.tex)

\documentclass{article}
\usepackage[citestyle=authoryear,bibstyle=numeric]{biblatex}
\addbibresource{example.bib}


\ExplSyntaxOn

\cs_new:Npn \BibAuthorInfo #1#2
{
    \DeclareBibliographyCategory{#1}
    \addtocategory{#1}{#2}
}

\clist_new:N \g_bibinfo_all_authors_clist
\clist_new:N \l_bibinfo_tmp_clist
\cs_new:Npn \BibAllAuthors #1
{
    \clist_gset:Nn \g_bibinfo_all_authors_clist {#1}
}


\bool_new:N \l_bibinfo_start_found_bool
\bool_new:N \l_bibinfo_end_found_bool
\bool_new:N \l_bibinfo_loop_end_bool
\tl_new:N \l_bibinfo_tmpa_tl
\tl_new:N \l_bibinfo_tmpb_tl
\tl_new:N \l_bibinfo_tmpc_tl
\cs_new:Npn \PrintBibBetween #1#2
{
    \clist_set_eq:NN \l_bibinfo_tmp_clist \g_bibinfo_all_authors_clist
    \bool_set_false:N \l_bibinfo_start_found_bool
    \bool_set_false:N \l_bibinfo_end_found_bool
    \bool_set:Nn \l_bibinfo_loop_end_bool {\clist_if_empty_p:N \l_bibinfo_tmp_clist}

    \bool_until_do:nn {\l_bibinfo_loop_end_bool}
    {
        \clist_pop:NN \l_bibinfo_tmp_clist \l_bibinfo_tmpa_tl
        \exp_args:NV \str_if_eq:nnT  \l_bibinfo_tmpa_tl {#1}
        {
            \bool_set_true:N \l_bibinfo_start_found_bool
        }
        \exp_args:NV \str_if_eq:nnT \l_bibinfo_tmpa_tl {#2}
        {
            \bool_set_true:N \l_bibinfo_end_found_bool
        }
        \bool_if:nT {\l_bibinfo_start_found_bool}
        {
            \tl_set:Nx \l_bibinfo_tmpc_tl {\exp_not:N\printbibliography[category={\l_bibinfo_tmpa_tl},heading=none]}
            \tl_use:N \l_bibinfo_tmpc_tl
        }
        \bool_set:Nn \l_bibinfo_loop_end_bool {\clist_if_empty_p:N \l_bibinfo_tmp_clist || \l_bibinfo_end_found_bool}
    }

    \bool_if:nF {\l_bibinfo_start_found_bool}
    {
        \GenericError{}{Cannot find bib start item "#1"}{}{}
    }

    \bool_if:nF {\l_bibinfo_end_found_bool}
    {
        \GenericError{}{Cannot find bib end item "#2"}{}{}
    }
}


\ExplSyntaxOff


% make sure --shell-escape is enabled (if you want this process to be done automatically when compiling in LaTeX)
% call Python script to process bibliography files
% if there is more than one file, append the others to the argument list
\immediate\write18{python3 bib_categorizer.py \jobname-bibinfo.tex example.bib}
% load the generated file
\input{\jobname-bibinfo.tex}

\begin{document}

Is it working?

\nocite{*}

\PrintBibBetween{ito-keith}{kabir-mohsin-muhammad}


\end{document}

Python script (bib_categorizer.py)

The Python script requires the bibtexparser package.

import bibtexparser
import sys
import os
import re
import string

assert len(sys.argv) >= 3, 'invalid number of arguments'
output_filename = sys.argv[1]
input_filenames = sys.argv[2:]


# only allow certain characters in names
def clean_name_segments(s:str)->str:
    new_s = ''
    cand = string.ascii_letters + ' '
    for c in s:
        if c in cand:
            new_s += c

    return new_s

# return a list of authors
# each author name is represented using a list, where name segments are ordered from first name to last name
def process_author_names(s:str)->list:
    s = s.strip()

    # raw string avoids invalid-escape warnings for the regex backslashes
    braces_match = re.match(r'\{(.*)\}', s)
    if braces_match:
        return [[clean_name_segments(braces_match.group(1))]]
    
    ret = []

    parts = s.split(' and ')
    for part in parts:
        if ',' in part:
            last, _, first = part.partition(',')
            ret.append(first.split(' ') + [last])
        else:
            ret.append(part.split(' '))

    for item in ret:
        new_item = [clean_name_segments(x.strip()) for x in item]
        new_item = [x for x in new_item if x]
        item.clear()
        item.extend(new_item)

    return ret
    

def get_author_sort_key(e:list):
    return tuple(map(lambda x: x.lower(), reversed(e)))

author_lut = dict()

for fn in input_filenames:
    assert os.path.exists(fn), f'input file "{fn}" does not exist'

    with open(fn) as f:
        bib = bibtexparser.load(f)
    
    for entry in bib.entries:
        entry_name = entry['ID']

        if 'author' not in entry:
            print(f'Entry {entry_name} does not have author field, it is skipped')
            continue
        
        authors = process_author_names(entry['author'].replace('\r', ' ').replace('\n', ' '))

        # sort based on first author
        first_author_key = get_author_sort_key(authors[0])
        if first_author_key not in author_lut:
            author_lut[first_author_key] = []

        author_lut[first_author_key].append(entry_name)

author_sorted = sorted(list(author_lut.keys()))

output_lines = []
author_name_ids = []
for ind, author_name_seg in enumerate(author_sorted):
    author_bib_items = author_lut[author_name_seg]
    author_name_id = (' '.join(author_name_seg)).replace(' ', '-')
    author_name_ids.append(author_name_id)
    output_lines.append(r'\BibAuthorInfo{%s}{%s}' % (author_name_id, ','.join(author_bib_items)))

output_lines.append(r'\BibAllAuthors{%s}' % ','.join(author_name_ids))

with open(output_filename, 'w') as f:
    f.write('\n'.join(output_lines))

Improved author sorting

def get_author_sort_key(e:list):
    ret = [x.lower() for x in e]
    if len(ret) <= 2:
        return tuple(reversed(ret))
    else:
        # family name first, then first name, then any middle names
        ret_ = [ret[-1]] + [ret[0]] + ret[1:-1]
        return tuple(ret_)
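To see what the change affects, the following sketch compares the two sort keys on two made-up names. The function names `key_reversed` and `key_family_first` and the sample names are hypothetical. The sketch assumes the improved key is meant to order the segments as family name, first name, then middle names.

```python
def key_reversed(segments):
    """Original key: all name segments lowercased, in reverse order."""
    return tuple(s.lower() for s in reversed(segments))


def key_family_first(segments):
    """Assumed improved key: family name, first name, then middle names."""
    parts = [s.lower() for s in segments]
    if len(parts) <= 2:
        return tuple(reversed(parts))
    return tuple([parts[-1]] + [parts[0]] + parts[1:-1])


# Segments run from first name to family name, as in the script.
names = [["Anna", "Zoe", "Smith"], ["Bob", "A", "Smith"]]

# Reversed key compares middle names before first names.
print(sorted(names, key=key_reversed))
# prints [['Bob', 'A', 'Smith'], ['Anna', 'Zoe', 'Smith']]

# Family-first key compares first names right after the family name.
print(sorted(names, key=key_family_first))
# prints [['Anna', 'Zoe', 'Smith'], ['Bob', 'A', 'Smith']]
```

Note that switching keys also changes the generated author ids for names with three or more segments, so the arguments passed to \PrintBibBetween have to use the new segment order.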
