Question

I am preparing materials for an NLP course next semester, and I want to include text in the following format:

Spanish  Farm  Minister  Loyola  de  Palacio  had  earlier  accused  
7        0     0         1       2   2        0    0        0        

Fischler  at  an  EU  farm  ministers  '  meeting  of  causing  unjustified  
1         0   0   3   0     0          0  0        0   0        0            

alarm  through  "  dangerous  generalisation  .  "  
0      0        0  0          0               0  0  

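where the alignment between the tokens and the numbers is produced by a Python script.

However, copying and pasting this text directly breaks the alignment. I tried putting the tokens and numbers into a table with the rules made invisible, but the result looks ugly.

I think TikZ would be a good fit for this. Can anyone help?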

Edit

I am looking for a LaTeX solution that can reproduce the Python output.

The Python script used to create the alignment:

from collections import defaultdict

def print_tuple(tuple_list, max_char_length=50):
    # tuple_list: [(a1, b1, c1, d1, ...), (a2, b2, c2, d2, ...), ...]
    # Each tuple is one column; its i-th entry is printed on the i-th row.

    # If a tuple has more than n_token entries, the extra entries are ignored.
    n_token = min(map(len, tuple_list))
    token_list_dict = defaultdict(list)

    length = 0
    full_string = ""
    string_format = ""
    for tup in tuple_list:    
        # length    
        max_len = max(map(len, tup))
        length += max_len

        # print format
        string_format += "{:<%d" % (max_len + 2) + "}"

        for i in range(n_token): token_list_dict[i].append(tup[i])

        if length >= max_char_length:
            # append
            for token_list in token_list_dict.values():
                full_string += "%s\n" % string_format.format(*token_list)
            full_string += "\n"

            # reset
            length = 0
            string_format = ""
            token_list_dict = defaultdict(list)
    
    # Append the leftover columns that did not fill a full max_char_length line.
    for token_list in token_list_dict.values():
        full_string += "%s\n" % string_format.format(*token_list)

    print(full_string)

sample = [('Spanish', '7'), ('Farm', '0'), ('Minister', '0'), ('Loyola', '1'),
          ('de', '2'), ('Palacio', '2'), ('had', '0'), ('earlier', '0'),
          ('accused', '0'), ('Fischler', '1'), ('at', '0'), ('an', '0'),
          ('EU', '3'), ('farm', '0'), ('ministers', '0'), ("'", '0'),
          ('meeting', '0'), ('of', '0'), ('causing', '0'), ('unjustified', '0'),
          ('alarm', '0'), ('through', '0'), ('"', '0'), ('dangerous', '0'),
          ('generalisation', '0'), ('.', '0'), ('"', '0')]
print_tuple(sample)
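
The alignment in the script above comes from the `{:<N}` format specification, which left-aligns a string and pads it to `N` characters. As a pared-down sketch of just that step (the function name `align_pair_rows` and the `gap` parameter are hypothetical, and line wrapping is left out), each column is made as wide as the longer of its two entries plus a gap:

```python
def align_pair_rows(pairs, gap=2):
    """Return (token_row, tag_row) with every column left-aligned.

    pairs: list of (token, tag) string tuples.
    gap:   extra spaces after the widest entry of each column
           (the script above uses 2).
    """
    # Column width = longer of token/tag, plus the gap.
    widths = [max(len(tok), len(tag)) + gap for tok, tag in pairs]
    # "{:<{w}}" left-aligns the value and pads it with spaces to width w.
    token_row = "".join("{:<{w}}".format(tok, w=w)
                        for (tok, _), w in zip(pairs, widths))
    tag_row = "".join("{:<{w}}".format(tag, w=w)
                      for (_, tag), w in zip(pairs, widths))
    return token_row, tag_row


token_row, tag_row = align_pair_rows([("Spanish", "7"), ("Farm", "0")])
print(token_row)
print(tag_row)
```

Because the padding is counted in characters, the columns only line up in a monospaced font, which is why pasting the output into ordinary LaTeX text loses the alignment.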

Answer 1

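Here are two different approaches: the first uses three tabular environments, one per line of output, and the second uses the listings package to typeset the Python output as is. The document below contains both.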

\documentclass{article}
\usepackage{geometry}
\usepackage{listings}

\begin{document}

\begin{tabular}{@{}*{9}{l}}
Spanish  &Farm  &Minister  &Loyola  &de  &Palacio  &had  &earlier  &accused \\  
7        &0     &0         &1       &2   &2        &0    &0        &0        
\end{tabular}\smallskip

\begin{tabular}{@{}*{11}{l}}
Fischler  &at  &an  &EU  &farm  &ministers  &'  &meeting  &of  &causing  &unjustified \\ 
1         &0   &0   &3   &0     &0          &0  &0        &0   &0        &0            
\end{tabular}\smallskip

\begin{tabular}{@{}*{7}{l}}
alarm  &through  &"  &dangerous  &generalisation  &.  &" \\ 
0      &0        &0  &0          &0               &0  &0        
\end{tabular}

\begin{lstlisting}
Spanish  Farm  Minister  Loyola  de  Palacio  had  earlier  accused  
7        0     0         1       2   2        0    0        0        

Fischler  at  an  EU  farm  ministers  '  meeting  of  causing  unjustified  
1         0   0   3   0     0          0  0        0   0        0            

alarm  through  "  dangerous  generalisation  .  "  
0      0        0  0          0               0  0  
\end{lstlisting}



\end{document}
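
The tabular blocks above are typed by hand. As a rough sketch, they could instead be generated from the question's list of (token, tag) pairs. The helper names `latex_escape`, `chunk_pairs` and `to_tabular` are hypothetical and not part of the answer; the chunking is assumed to follow the same width rule as `print_tuple` in the question, and the escape table only covers the usual LaTeX special characters.

```python
# Hypothetical helpers (not from the original answer): turn (token, tag)
# pairs into one two-row tabular environment per output line.

LATEX_SPECIALS = {
    "\\": r"\textbackslash{}",
    "&": r"\&", "%": r"\%", "$": r"\$", "#": r"\#",
    "_": r"\_", "{": r"\{", "}": r"\}",
    "~": r"\textasciitilde{}", "^": r"\textasciicircum{}",
}


def latex_escape(text):
    """Escape characters that LaTeX treats specially in running text."""
    return "".join(LATEX_SPECIALS.get(ch, ch) for ch in text)


def chunk_pairs(pairs, max_char_length=50):
    """Split pairs into lines using the same width rule as print_tuple."""
    chunks, current, length = [], [], 0
    for pair in pairs:
        current.append(pair)
        length += max(map(len, pair))
        if length >= max_char_length:
            chunks.append(current)
            current, length = [], 0
    if current:  # leftover pairs that did not fill a whole line
        chunks.append(current)
    return chunks


def to_tabular(pairs, max_char_length=50):
    """Return LaTeX source: one tabular per chunk, separated by \\smallskip."""
    blocks = []
    for chunk in chunk_pairs(pairs, max_char_length):
        tokens = " & ".join(latex_escape(tok) for tok, _ in chunk)
        tags = " & ".join(latex_escape(tag) for _, tag in chunk)
        blocks.append(
            "\\begin{tabular}{@{}*{%d}{l}}\n%s \\\\\n%s\n\\end{tabular}"
            % (len(chunk), tokens, tags)
        )
    return "\\smallskip\n\n".join(blocks)


print(to_tabular([("Spanish", "7"), ("Farm", "0"), ("Minister", "0")]))
```

Calling `to_tabular(sample)` with the full list from the question should yield three environments with 9, 11 and 7 columns, matching the hand-written ones.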

Answer 2

A solution using TikZ: each word/number pair is drawn in its own tikzpicture environment.

Code

\documentclass[11pt, a4paper]{article}
\usepackage{tikz}
\begin{document}

\noindent
\foreach \stg/\i in {Spanish/7, Farm/0, Minister/0, Loyola/1, de/2,
  Palacio/2, had/0, earlier/0, accused/0, Fischler/1, at/0, an/0,
  EU/3, farm/0, ministers/0, '/0, meeting/0, of/0, causing/0,
  unjustified/0, alarm/0, through/0, "/0, dangerous/0,
  generalisation/0, ./0, "/0}{%
  \begin{tikzpicture}[baseline=-6ex,
    every node/.style={text depth=0, anchor=west}]
    \path (0, 0) node {\stg};
    \path (0, -3ex) node {\i};
  \end{tikzpicture}
}
\end{document}
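
The word/number list in the `\foreach` loop above can be produced from the same Python data instead of being typed in. The following is only a sketch: `foreach_item` and `to_foreach_list` are hypothetical names, the escape table covers just the common LaTeX special characters, and wrapping a token that contains a comma in braces relies on the assumption that pgffor treats a braced item as a single list entry. Tokens containing `/` are rejected, because that character separates the two loop variables.

```python
# Hypothetical helpers (not from the original answer): build the
# "word/number" list consumed by \foreach \stg/\i in {...}.

LATEX_SPECIALS = {
    "\\": r"\textbackslash{}",
    "&": r"\&", "%": r"\%", "$": r"\$", "#": r"\#",
    "_": r"\_", "{": r"\{", "}": r"\}",
    "~": r"\textasciitilde{}", "^": r"\textasciicircum{}",
}


def latex_escape(text):
    """Escape characters that LaTeX treats specially in running text."""
    return "".join(LATEX_SPECIALS.get(ch, ch) for ch in text)


def foreach_item(token, tag):
    """Format one (token, tag) pair as a 'token/tag' list entry."""
    if "/" in token or "/" in tag:
        # '/' splits the entry into \stg and \i; not handled in this sketch.
        raise ValueError("entries containing '/' are not supported")
    token, tag = latex_escape(token), latex_escape(tag)
    if "," in token:
        # Assumption: a brace-wrapped item protects the comma from the
        # list parser.
        token = "{" + token + "}"
    return "%s/%s" % (token, tag)


def to_foreach_list(pairs):
    """Join all pairs into the comma-separated list body."""
    return ", ".join(foreach_item(tok, tag) for tok, tag in pairs)


print(to_foreach_list([("Spanish", "7"), ("Farm", "0"), ("Minister", "0")]))
```

The returned string is meant to be pasted between the braces after `in`; long lists can be wrapped by hand, since line breaks after a comma are harmless there.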
