Can xmpincl handle unicode text? If not, are there alternatives?

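I am trying to use the xmpincl package to include the Creative Commons licence information for a paper in the PDF in a machine-readable way. This has worked well in the past, but for the current document a unicode character in one of the author names breaks it.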

Specifically, the XMP file generated by the creativecommons website contains the attribution name

<cc:attributionName>Rodrigo Gutiérrez-Cuevas and Emilio Pisanty</cc:attributionName>

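and xmpincl+pdflatex do not complain during compilation, but if I actually look at the metadata in the file (using exiftool, or just by inspecting the PDF in a text editor), the accented é is completely garbled: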

Attribution Name: Rodrigo Guti\unhbox \voidb@x \bgroup \let \unhbox \voidb@x \setbox 
                  \@tempboxa \hbox {e\global \mathchardef \accent@spacefactor 
                  \spacefactor }\let \begingroup \endgroup \relax \let \ignorespaces 
                  \relax \accent 19 e\egroup \spacefactor \accent@spacefactor rrez-Cuevas 
                  and Emilio Pisanty

Is there a way to include unicode characters in the XMP metadata, using xmpincl or otherwise?

(If not, that is a big problem. A system that can only handle ASCII author names is, frankly, discriminatory.)


Edit: here is an MWE:

\documentclass[pra]{revtex4-2}

\usepackage{xmpincl}
\includexmp{metadata}

\usepackage[utf8]{inputenc}
\usepackage[T1]{fontenc}
\usepackage{hyperref}

\begin{document}
\title{MWE}
\author{Rodrigo Guti\'{e}rrez-Cuevas}
\author{Emilio Pisanty}
\begin{abstract}
Test.
\end{abstract}
\maketitle
Test.
\end{document}

with the file metadata.xmp reading

<?xpacket begin='' id=''?>
<x:xmpmeta xmlns:x='adobe:ns:meta/'>
  <rdf:RDF xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'>
    <rdf:Description rdf:about=''
      xmlns:cc='http://creativecommons.org/ns#'>
      <cc:attributionName>Rodrigo Gutiérrez-Cuevas and Emilio Pisanty</cc:attributionName>
    </rdf:Description>
  </rdf:RDF>
</x:xmpmeta>
<?xpacket end='r'?>

I kept the hyperref and inputenc/fontenc packages only because they are in my original document, but they do not affect the output.

Answer 1

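The xmpincl package is indeed buggy: it does things like \immediate\write\xmpinclWrite{\mcs@xmpinclStart}.

Why is this wrong? Because \write fully expands its argument, so the macro \mcs@xmpinclStart is expanded all the way. The same happens to \xmpinclReadln, which holds each line read from the .xmp file. With inputenc, the bytes of é are active characters, so they expand into the accent-building macros seen in the garbled output.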

If I modify the code by adding

\newcommand\xmpincl@write[2]{\immediate\write#1{\unexpanded\expandafter{#2}}}

and then replace every occurrence of \immediate\write with \xmpincl@write, the resulting .xmpi file is

<?xpacket begin='' id='W5M0MpCehiHzreSzNTczkc9d'?> 
<x:xmpmeta xmlns:x='adobe:ns:meta/'> 
<rdf:RDF xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'> 
<rdf:Description rdf:about='' 
xmlns:cc='http://creativecommons.org/ns#'> 
<cc:attributionName>Rodrigo Gutiérrez-Cuevas and Emilio Pisanty</cc:attributionName> 
</rdf:Description> 
</rdf:RDF> 
</x:xmpmeta> 
<?xpacket end='r'?> 
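
The difference between the two forms of \write can be seen in isolation. The following is a minimal sketch, not part of xmpincl: the names \demoWrite and \demoLine and the .demo extension are made up for illustration, and it assumes pdflatex with a UTF-8 encoded source file. After compilation, the first line of the .demo file should contain expanded accent macros and the second the untouched text.

```latex
\documentclass{article}
\usepackage[utf8]{inputenc}

% \demoWrite and \demoLine are hypothetical names used only for this sketch.
\newwrite\demoWrite

\begin{document}
% Store a line containing a non-ASCII character in a macro,
% much as \read does for each line of the .xmp file.
\def\demoLine{Gutiérrez}

\immediate\openout\demoWrite=\jobname.demo

% \write fully expands its argument, so the active bytes of é are
% turned into accent macros before they reach the file.
\immediate\write\demoWrite{\demoLine}

% \expandafter expands \demoLine exactly once; \unexpanded then
% stops any further expansion, so the original bytes are written.
\immediate\write\demoWrite{\unexpanded\expandafter{\demoLine}}

\immediate\closeout\demoWrite
Test.
\end{document}
```

The second form is what \xmpincl@write wraps: one expansion step to get at the stored line, and nothing more.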

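Here is a modified copy of xmpincl.sty. I also removed some \immediate tokens that did nothing. There are a few unprotected end-of-lines, but they should not be a problem.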

%%
%% This is file `xmpincl.sty',
%% generated with the docstrip utility.
%%
%% The original source files were:
%%
%% xmpincl.dtx  (with options: `package')
%% 
%% This is a generated file.
%% 
%% Copyright (C) 2005 by Maarten Sneep <[email protected]>
%% 
%% This work is licensed under the CC-GNU GPL, the human readable license
%% can be found here, with a link to the full text on this page.
%% http://creativecommons.org/licenses/GPL/2.0/
%% 
\NeedsTeXFormat{LaTeX2e}[1999/12/01]
\ProvidesPackage{xmpincl}
    [2008/05/10 v2.2 Include XMP data in pdflatex -- modified by egreg]
\RequirePackage{ifpdf}
\ifpdf\else
\PackageWarningNoLine{xmpincl}%
  {Only pdflatex is supported by the xmpincl package}
\newcommand{\includexmp}[1]{%
  \PackageError{xmpincl}%
  {latex is not supported by the \protect\includexmp\space package}%
  {You tried to include XMP metadata in DVI production.\MessageBreak
   That doesn't work, and I friendly tried to warn you.\MessageBreak
   Just continue and pretend nothing is wrong,\MessageBreak
   but please remove the package or switch to pdflatex.}
}
\relax\expandafter\endinput
\fi
\RequirePackage{ifthen}
\newcommand\xmpincl@write[2]{\immediate\write#1{\unexpanded\expandafter{#2}}}%<-- added
\newcommand*{\mcs@xmpincl@patchFile}[1]{
\begingroup
\newwrite\xmpinclWrite
\newread\xmpinclRead
\openin\xmpinclRead #1.xmp
\immediate\openout\xmpinclWrite #1.xmpi
\newcommand{\mcs@xmpinclStart}%
  {<?xpacket begin='' id='W5M0MpCehiHzreSzNTczkc9d'?> }
\newcommand{\mcs@xmpinclStartAlt}%
  {<?xpacket begin='' id=''?> }
\newcommand{\mcs@xmpinclEnd}%
  {<?xpacket end='r'?> }
\catcode`\#=12
\catcode`\~=12
\catcode`\&=12
\read\xmpinclRead to\xmpinclReadln%
\ifthenelse{%
  \equal{\mcs@xmpinclStart}{\xmpinclReadln}%
  \or%
  \equal{\mcs@xmpinclStartAlt}{\xmpinclReadln}%
}%
{%
  \xmpincl@write\xmpinclWrite{\mcs@xmpinclStart}%<--- modified
}%
{%
  \xmpincl@write\xmpinclWrite{\mcs@xmpinclStart}%<--- modified
  \xmpincl@write\xmpinclWrite{\xmpinclReadln}%<--- modified
}%
\loop%
  \read\xmpinclRead to\xmpinclReadln%
  \ifthenelse{%
    \equal{\mcs@xmpinclEnd}{\xmpinclReadln}%
    }{% Note: no if.
    }{%
    \if\par\xmpinclReadln\else%
      \xmpincl@write\xmpinclWrite{\xmpinclReadln}%<--- modified
    \fi%
  }%
  \ifeof\xmpinclRead\else%
\repeat
\xmpincl@write\xmpinclWrite{\mcs@xmpinclEnd}%<--- modified
\closein\xmpinclRead
\immediate\closeout\xmpinclWrite
\endgroup
}
\newcommand{\includexmp}[1]{%
  \IfFileExists{#1.xmp}{
    \mcs@xmpincl@patchFile{#1}
    \begingroup
      \pdfcompresslevel=0
      \immediate\pdfobj stream attr {/Type /Metadata /Subtype /XML}
      file{#1.xmpi}
      \pdfcatalog{/Metadata \the\pdflastobj\space 0 R}
    \endgroup
  }
  {\newcommand{\mcs@xmpincl@filename}{#1.xmp}
    \PackageError{xmpincl}%
    {The file \mcs@xmpincl@filename\space was not found}
    {The file \mcs@xmpincl@filename\space The metadata file
     wasn't found.\MessageBreak Oops.}
  }
}
\endinput
%%
%% End of file `xmpincl.sty'.

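Below is an expl3 version of xmpincl. Note that an expert on PDF internals warns that it is incompatible with hyperxmp and pdfx. The problem lies in how the catalog entry is added: other packages may also try to add /Metadata, and the combination will fail.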

\NeedsTeXFormat{LaTeX2e}[2020/10/01]
\ProvidesPackage{xmpincl3}
    [2021/03/19 v0.1 Include XMP data in pdflatex]

\RequirePackage{expl3}

\ExplSyntaxOn

\msg_new:nnn { xmpincl3 } { not-pdf } {Only~pdf~mode~is~supported~by~the~xmpincl3~package}

\msg_new:nnnn { xmpincl3 } { not-pdf-error }
 { DVI~mode~is~not~supported~by~the~xmpincl3~package }
 {
   You~tried~to~include~XMP~metadata~in~DVI~production.^^J
   That~doesn't~work,~and~I~friendly~tried~to~warn~you.^^J
   Just~continue~and~pretend~nothing~is~wrong,^^J
   but~please~remove~the~package~or~switch~to~pdflatex.
 }

\msg_new:nnnn { xmpincl3 } { file-not-found }
 { The~file~#1~was~not~found }
 { The~metadata~file~#1~wasn't~found.^^J Oops. }

\sys_if_output_pdf:F
 {
  \msg_warning:nn { xmpincl3 } { not-pdf }
  \NewDocumentCommand{\includexmp}{m}{ \msg_error:nn { xmpincl3 } { not-pdf-error } }
 }
\sys_if_output_pdf:F { \endinput }

\ior_new:N \g__xmpincl_input_ior
\iow_new:N \g__xmpincl_output_iow
\str_const:Nn \c__xmpincl_start_str { <?xpacket~begin=''~id='W5M0MpCehiHzreSzNTczkc9d'?> }
\str_const:Nn \c__xmpincl_alt_str { <?xpacket~begin=''~id=''?> }
\str_const:Nn \c__xmpincl_end_str  { <?xpacket~end='r'?> }
\str_new:N \l__xmpincl_first_str

\cs_new_protected:Nn \__xmpincl_writeline:n
 {
  \iow_now:Nn \g__xmpincl_output_iow { #1 }
 }
\cs_generate_variant:Nn \__xmpincl_writeline:n { V }

\cs_new_protected:Nn \__xmpincl_patch:n
 {
  \ior_open:Nn \g__xmpincl_input_ior { #1.xmp }
  \iow_open:Nn \g__xmpincl_output_iow { #1.xmpi }
  \ior_str_get:NN \g__xmpincl_input_ior \l__xmpincl_first_str
  \bool_lazy_or:nnTF
   { \str_if_eq_p:NN \c__xmpincl_start_str \l__xmpincl_first_str }
   { \str_if_eq_p:NN \c__xmpincl_alt_str \l__xmpincl_first_str }
   { \__xmpincl_writeline:V \c__xmpincl_start_str }
   {
    \__xmpincl_writeline:V \l__xmpincl_first_str
   }
  \ior_str_map_inline:Nn \g__xmpincl_input_ior
   {
    \str_if_eq:VnTF \c__xmpincl_end_str { ##1 }
     {
      \ior_map_break:
     }
     {
      \str_if_eq:nnF { ##1 } { } { \__xmpincl_writeline:n { ##1 } }
     }
   }
  \__xmpincl_writeline:V \c__xmpincl_end_str
  \iow_close:N \g__xmpincl_output_iow
  \ior_close:N \g__xmpincl_input_ior
 }

\NewDocumentCommand{\includexmp}{m}
 {
  \file_if_exist:nTF {#1.xmp}
   {
    \__xmpincl_patch:n { #1 }
    \pdf_object_unnamed_write:nx{fstream}{{/Type~/Metadata~/Subtype~/XML}{#1.xmpi}}
    \pdfcatalog{/Metadata~\the\pdflastobj\space 0~R}
  }
  {
   \msg_error:nnn { xmpincl3 } { file-not-found } { #1 }
  }
 }
\endinput

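It is essentially a direct port of xmpincl.sty, but without the problem with special characters, because the lines of the file are read and compared as "strings".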
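
For completeness, a minimal usage sketch. It assumes that the expl3 code above has been saved as xmpincl3.sty next to the document, that metadata.xmp from the question is in the same directory, and that the LaTeX installation provides the expl3 functions the package calls. The article class is used here only to keep the example small.

```latex
\documentclass{article}

% Assumes the expl3 code above is saved as xmpincl3.sty next to this file.
\usepackage{xmpincl3}

% Assumes metadata.xmp (as shown in the question) is in the same directory.
% Do not load hyperxmp or pdfx as well: they also add a /Metadata entry.
\includexmp{metadata}

\begin{document}
Test.
\end{document}
```

Running exiftool on the resulting PDF, as in the question, is one way to check that the attribution name comes through intact.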
