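I am trying to use the xmpincl package to embed the Creative Commons license information for a paper in the PDF in a machine-readable way. This has worked well in the past, but in the current document a Unicode character in one of the author names breaks.
Specifically, the XMP file generated by the Creative Commons website contains the attribution name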
<cc:attributionName>Rodrigo Gutiérrez-Cuevas and Emilio Pisanty</cc:attributionName>
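and xmpincl+pdflatex does not complain during compilation. However, if I actually look at the metadata in the file (with exiftool, or just by inspecting the PDF in a text editor), the accented é is completely garbled: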
Attribution Name: Rodrigo Guti\unhbox \voidb@x \bgroup \let \unhbox \voidb@x \setbox
\@tempboxa \hbox {e\global \mathchardef \accent@spacefactor
\spacefactor }\let \begingroup \endgroup \relax \let \ignorespaces
\relax \accent 19 e\egroup \spacefactor \accent@spacefactor rrez-Cuevas
and Emilio Pisanty
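Is there a way to include Unicode characters in the XMP metadata, using xmpincl or otherwise?
(If not, this is a serious problem. A system that can only handle ASCII author names is, to put it bluntly, discriminatory.)
Edit: here is an MWE: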
\documentclass[pra]{revtex4-2}
\usepackage{xmpincl}
\includexmp{metadata}
\usepackage[utf8]{inputenc}
\usepackage[T1]{fontenc}
\usepackage{hyperref}
\begin{document}
\title{MWE}
\author{Rodrigo Guti\'{e}rrez-Cuevas}
\author{Emilio Pisanty}
\begin{abstract}
Test.
\end{abstract}
\maketitle
Test.
\end{document}
with the file metadata.xmp reading
<?xpacket begin='' id=''?>
<x:xmpmeta xmlns:x='adobe:ns:meta/'>
<rdf:RDF xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'>
<rdf:Description rdf:about=''
xmlns:cc='http://creativecommons.org/ns#'>
<cc:attributionName>Rodrigo Gutiérrez-Cuevas and Emilio Pisanty</cc:attributionName>
</rdf:Description>
</rdf:RDF>
</x:xmpmeta>
<?xpacket end='r'?>
I kept the hyperref and inputenc/fontenc packages only because they are in my original document; they do not affect the output.
Answer 1
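The xmpincl package does indeed contain mistakes such as
\immediate\write\xmpinclWrite{\mcs@xmpinclStart}
Why is this wrong? Because \write expands its argument all the way, so the macro \mcs@xmpinclStart is fully expanded. The same happens to \xmpinclReadln, which holds each line read from the file, and that is where the é is turned into accent-building code.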
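The effect can be reproduced without xmpincl. The following is a minimal sketch, assuming pdflatex and a setup like the one in the question; \demoLine is a hypothetical stand-in for a line read from the .xmp file, and stream 16 is used only so that the text lands in the terminal and the log instead of a file.

```latex
\documentclass{article}
\usepackage[utf8]{inputenc}
% \demoLine is a hypothetical stand-in for a line read from the .xmp file
\newcommand*\demoLine{Gutiérrez-Cuevas}
% Stream 16 is not attached to a file, so the text goes to the terminal and the log.
% \write expands \demoLine all the way, including the active bytes that make up é.
\immediate\write16{\demoLine}
\begin{document}
Test.
\end{document}
```

On such a setup the log line should show expanded accent code in place of the é, similar to the garbled attribution name in the question, although the exact text depends on the LaTeX kernel version and the font encoding in force.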
If I modify the code by adding
\newcommand\xmpincl@write[2]{\immediate\write#1{\unexpanded\expandafter{#2}}}
and then replace every occurrence of \immediate\write with \xmpincl@write, the resulting .xmpi file is
<?xpacket begin='' id='W5M0MpCehiHzreSzNTczkc9d'?>
<x:xmpmeta xmlns:x='adobe:ns:meta/'>
<rdf:RDF xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'>
<rdf:Description rdf:about=''
xmlns:cc='http://creativecommons.org/ns#'>
<cc:attributionName>Rodrigo Gutiérrez-Cuevas and Emilio Pisanty</cc:attributionName>
</rdf:Description>
</rdf:RDF>
</x:xmpmeta>
<?xpacket end='r'?>
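To see what the helper does in isolation, here is a small sketch that defines it outside the package and writes to the terminal. It assumes xmpincl itself is not loaded; \demoLine is again a hypothetical stand-in for \xmpinclReadln.

```latex
\documentclass{article}
\usepackage[utf8]{inputenc}
\makeatletter
% Same helper as in the answer: \expandafter unpacks #2 exactly once,
% and \unexpanded stops \write from expanding the result any further.
\newcommand\xmpincl@write[2]{\immediate\write#1{\unexpanded\expandafter{#2}}}
% \demoLine is a hypothetical stand-in for a line read from the .xmp file
\newcommand*\demoLine{Gutiérrez-Cuevas}
% Stream 16 is not attached to a file, so the text goes to the terminal and the log
\xmpincl@write{16}{\demoLine}
\makeatother
\begin{document}
Test.
\end{document}
```

Because the contents of \demoLine are written without further expansion, the bytes of the é should pass through untouched; how the terminal displays them depends on the TeX distribution's settings.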
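Here is a modified copy of xmpincl.sty. I also removed a few \immediate tokens that did nothing. There are several unprotected end-of-lines, but they should not cause problems.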
%%
%% This is file `xmpincl.sty',
%% generated with the docstrip utility.
%%
%% The original source files were:
%%
%% xmpincl.dtx (with options: `package')
%%
%% This is a generated file.
%%
%% Copyright (C) 2005 by Maarten Sneep <[email protected]>
%%
%% This work is licensed under the CC-GNU GPL, the human readable license
%% can be found here, with a link to the full text on this page.
%% http://creativecommons.org/licenses/GPL/2.0/
%%
\NeedsTeXFormat{LaTeX2e}[1999/12/01]
\ProvidesPackage{xmpincl}
[2008/05/10 v2.2 Include XMP data in pdflatex -- modified by egreg]
\RequirePackage{ifpdf}
\ifpdf\else
\PackageWarningNoLine{xmpincl}%
{Only pdflatex is supported by the xmpincl package}
\newcommand{\includexmp}[1]{%
\PackageError{xmpincl}%
{latex is not supported by the \protect\includexmp\space package}%
{You tried to include XMP metadata in DVI production.\MessageBreak
That doesn't work, and I friendly tried to warn you.\MessageBreak
Just continue and pretend nothing is wrong,\MessageBreak
but please remove the package or switch to pdflatex.}
}
\relax\expandafter\endinput
\fi
\RequirePackage{ifthen}
\newcommand\xmpincl@write[2]{\immediate\write#1{\unexpanded\expandafter{#2}}}%<-- added
\newcommand*{\mcs@xmpincl@patchFile}[1]{
\begingroup
\newwrite\xmpinclWrite
\newread\xmpinclRead
\openin\xmpinclRead #1.xmp
\immediate\openout\xmpinclWrite #1.xmpi
\newcommand{\mcs@xmpinclStart}%
{<?xpacket begin='' id='W5M0MpCehiHzreSzNTczkc9d'?> }
\newcommand{\mcs@xmpinclStartAlt}%
{<?xpacket begin='' id=''?> }
\newcommand{\mcs@xmpinclEnd}%
{<?xpacket end='r'?> }
\catcode`\#=12
\catcode`\~=12
\catcode`\&=12
\read\xmpinclRead to\xmpinclReadln%
\ifthenelse{%
\equal{\mcs@xmpinclStart}{\xmpinclReadln}%
\or%
\equal{\mcs@xmpinclStartAlt}{\xmpinclReadln}%
}%
{%
\xmpincl@write\xmpinclWrite{\mcs@xmpinclStart}%<--- modified
}%
{%
\xmpincl@write\xmpinclWrite{\mcs@xmpinclStart}%<--- modified
\xmpincl@write\xmpinclWrite{\xmpinclReadln}%<--- modified
}%
\loop%
\read\xmpinclRead to\xmpinclReadln%
\ifthenelse{%
\equal{\mcs@xmpinclEnd}{\xmpinclReadln}%
}{% Note: no if.
}{%
\if\par\xmpinclReadln\else%
\xmpincl@write\xmpinclWrite{\xmpinclReadln}%<--- modified
\fi%
}%
\ifeof\xmpinclRead\else%
\repeat
\xmpincl@write\xmpinclWrite{\mcs@xmpinclEnd}%<--- modified
\closein\xmpinclRead
\immediate\closeout\xmpinclWrite
\endgroup
}
\newcommand{\includexmp}[1]{%
\IfFileExists{#1.xmp}{
\mcs@xmpincl@patchFile{#1}
\begingroup
\pdfcompresslevel=0
\immediate\pdfobj stream attr {/Type /Metadata /Subtype /XML}
file{#1.xmpi}
\pdfcatalog{/Metadata \the\pdflastobj\space 0 R}
\endgroup
}
{\newcommand{\mcs@xmpincl@filename}{#1.xmp}
\PackageError{xmpincl}%
{The file \mcs@xmpincl@filename\space was not found}
{The file \mcs@xmpincl@filename\space The metadata file
wasn't found.\MessageBreak Oops.}
}
}
\endinput
%%
%% End of file `xmpincl.sty'.
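The following is an expl3 version of xmpincl. Note that experts on PDF internals warn that it is incompatible with hyperxmp and pdfx. The problem lies in how the catalog entry is added: other packages may also try to add /Metadata, and the combination will fail.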
\NeedsTeXFormat{LaTeX2e}[2020/10/01]
\ProvidesPackage{xmpincl3}
[2021/03/19 v0.1 Include XMP data in pdflatex]
\RequirePackage{expl3}
\ExplSyntaxOn
\msg_new:nnn { xmpincl3 } { not-pdf } {Only~pdf~mode~is~supported~by~the~xmpincl3~package}
\msg_new:nnnn { xmpincl3 } { not-pdf-error }
{ DVI~mode~is~not~supported~by~the~xmpincl3~package }
{
You~tried~to~include~XMP~metadata~in~DVI~production.^^J
That~doesn't~work,~and~I~friendly~tried~to~warn~you.^^J
Just~continue~and~pretend~nothing~is~wrong,^^J
but~please~remove~the~package~or~switch~to~pdflatex.
}
\msg_new:nnnn { xmpincl3 } { file-not-found }
{ The~file~#1~was~not~found }
{ The~metadata~file~#1~wasn't~found.^^J Oops. }
\sys_if_output_pdf:F
{
\msg_warning:nn { xmpincl3 } { not-pdf }
\NewDocumentCommand{\includexmp}{m}{ \msg_error:nn { xmpincl3 } { not-pdf-error } }
}
\sys_if_output_pdf:F { \endinput }
\ior_new:N \g__xmpincl_input_ior
\iow_new:N \g__xmpincl_output_iow
\str_const:Nn \c__xmpincl_start_str { <?xpacket~begin=''~id='W5M0MpCehiHzreSzNTczkc9d'?> }
\str_const:Nn \c__xmpincl_alt_str { <?xpacket~begin=''~id=''?> }
\str_const:Nn \c__xmpincl_end_str { <?xpacket~end='r'?> }
\str_new:N \l__xmpincl_first_str
\cs_new_protected:Nn \__xmpincl_writeline:n
{
\iow_now:Nn \g__xmpincl_output_iow { #1 }
}
\cs_generate_variant:Nn \__xmpincl_writeline:n { V }
\cs_new_protected:Nn \__xmpincl_patch:n
{
\ior_open:Nn \g__xmpincl_input_ior { #1.xmp }
\iow_open:Nn \g__xmpincl_output_iow { #1.xmpi }
\ior_str_get:NN \g__xmpincl_input_ior \l__xmpincl_first_str
\bool_lazy_or:nnTF
{ \str_if_eq_p:NN \c__xmpincl_start_str \l__xmpincl_first_str }
{ \str_if_eq_p:NN \c__xmpincl_alt_str \l__xmpincl_first_str }
{ \__xmpincl_writeline:V \c__xmpincl_start_str }
{
\__xmpincl_writeline:V \c__xmpincl_start_str
\__xmpincl_writeline:V \l__xmpincl_first_str
}
\ior_str_map_inline:Nn \g__xmpincl_input_ior
{
\str_if_eq:VnTF \c__xmpincl_end_str { ##1 }
{
\ior_map_break:
}
{
\str_if_eq:nnF { ##1 } { } { \__xmpincl_writeline:n { ##1 } }
}
}
\__xmpincl_writeline:V \c__xmpincl_end_str
\iow_close:N \g__xmpincl_output_iow
\ior_close:N \g__xmpincl_input_ior
}
\NewDocumentCommand{\includexmp}{m}
{
\file_if_exist:nTF {#1.xmp}
{
\__xmpincl_patch:n { #1 }
\pdf_object_unnamed_write:nx{fstream}{{/Type~/Metadata~/Subtype~/XML}{#1.xmpi}}
\pdfcatalog{/Metadata~\the\pdflastobj\space 0~R}
}
{
\msg_error:nnn { xmpincl3 } { file-not-found } { #1 }
}
}
\endinput
It is essentially a direct copy of xmpincl.sty, without the problem with special characters, because the lines of the file are examined as "strings".
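A minimal usage sketch, assuming the expl3 code above has been saved as xmpincl3.sty next to the document, that metadata.xmp from the question is in the same directory, and that a LaTeX kernel from 2020-10-01 or later with a matching expl3 is installed:

```latex
\documentclass{article}
% xmpincl3.sty is assumed to contain the expl3 code above
\usepackage{xmpincl3}
% Reads metadata.xmp, writes metadata.xmpi, and attaches it as the /Metadata stream
\includexmp{metadata}
\begin{document}
Test.
\end{document}
```

After compiling with pdflatex, the attribution name can be checked with exiftool or in a text editor, as in the question. hyperxmp and pdfx should not be loaded alongside it, for the reason given above.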