是否可以生成包含不可复制文本的 PDF?我的意思是,当您想从 PDF 中复制文本时,您无法复制它,或者您复制的是无意义的字符。
答案1
除了将所有文本转换为图像之外,据我所知,一种方法是销毁字体的 Cmap。我们可以使用cmap
包和一个特殊的 cmap 文件来实现此目的。此 cmap 文件是在 VerbatimOut 环境中生成的。
(警告:生成不可复制的 PDF 没有什么意义。OCR 现在非常容易。)
% pdflatex is required
\documentclass{article}
\usepackage[resetfonts]{cmap}
\usepackage{fancyvrb}
\begin{VerbatimOut}{ot1.cmap}
%!PS-Adobe-3.0 Resource-CMap
%%DocumentNeededResources: ProcSet (CIDInit)
%%IncludeResource: ProcSet (CIDInit)
%%BeginResource: CMap (TeX-OT1-0)
%%Title: (TeX-OT1-0 TeX OT1 0)
%%Version: 1.000
%%EndComments
/CIDInit /ProcSet findresource begin
12 dict begin
begincmap
/CIDSystemInfo
<< /Registry (TeX)
/Ordering (OT1)
/Supplement 0
>> def
/CMapName /TeX-OT1-0 def
/CMapType 2 def
1 begincodespacerange
<00> <7F>
endcodespacerange
8 beginbfrange
<00> <01> <0000>
<09> <0A> <0000>
<23> <26> <0000>
<28> <3B> <0000>
<3F> <5B> <0000>
<5D> <5E> <0000>
<61> <7A> <0000>
<7B> <7C> <0000>
endbfrange
40 beginbfchar
<02> <0000>
<03> <0000>
<04> <0000>
<05> <0000>
<06> <0000>
<07> <0000>
<08> <0000>
<0B> <0000>
<0C> <0000>
<0D> <0000>
<0E> <0000>
<0F> <0000>
<10> <0000>
<11> <0000>
<12> <0000>
<13> <0000>
<14> <0000>
<15> <0000>
<16> <0000>
<17> <0000>
<18> <0000>
<19> <0000>
<1A> <0000>
<1B> <0000>
<1C> <0000>
<1D> <0000>
<1E> <0000>
<1F> <0000>
<21> <0000>
<22> <0000>
<27> <0000>
<3C> <0000>
<3D> <0000>
<3E> <0000>
<5C> <0000>
<5F> <0000>
<60> <0000>
<7D> <0000>
<7E> <0000>
<7F> <0000>
endbfchar
endcmap
CMapName currentdict /CMap defineresource pop
end
end
%%EndResource
%%EOF
\end{VerbatimOut}
\usepackage{lipsum}
\begin{document}
\lipsum
\end{document}
答案2
路特克斯允许在define_font
回调中操作字体。Luaotfload 在字体加载器完成其工作后立即安装了一个额外的钩子,从而进一步简化了此过程:回调
luaotfload.patch_font
。通常,它用于严肃且有建设性的任务,例如设置几个字体尺寸或确保数据结构中的向后兼容性。当然,它也可能被滥用于诸如禁用复制和粘贴之类的肮脏黑客行为。
在应用回调时patch_font
,字体已经定义并可供使用。所有必要的表都已创建并放置在 Luatex 需要它们的位置。其中包括characters
保存有关字形的预处理信息的表。在下面的代码中,我们修改了tounicode
每个字形的字段,以便它映射到可打印 ASCII 范围内的某个随机位置。请注意,这不会影响字形的形状和度量,因为它们与实际代码点无关。因此,PDF 将包含无法复制的清晰文本。
打包文件obfuscate.lua
:
packagedata = packagedata or { }
local mathrandom = math.random
local stringformat = string.format
--- this is the callback by means of which we will obfuscate
--- the tounicode values so they map to random characters of
--- the printable ascii range (between 0x21 / 33 and 0x7e / 126)
local obfuscate = function (tfmdata, _specification)
if not tfmdata or type (tfmdata) ~= "table" then
return
end
local characters = tfmdata.characters
if characters then
for codepoint, char in next, characters do
char.tounicode = stringformat ([[%0.4X]], mathrandom (0x21, 0x7e))
end
end
end
--- we also need some functions to toggle the callback activation so
--- we can obfuscate fonts selectively
local active = false
packagedata.obfuscate_begin = function ()
if not active then
luatexbase.add_to_callback ("luaotfload.patch_font", obfuscate,
"user.obfuscate_font", 1)
active = true
end
end
packagedata.obfuscate_end = function ()
if active then
luatexbase.remove_from_callback ("luaotfload.patch_font",
"user.obfuscate_font")
active = false
end
end
使用演示:
%% we will need these packages
\input luatexbase.sty
\input luaotfload.sty
%% for inspecting the pdf with an ordinary editor
\pdfcompresslevel0
\pdfobjcompresslevel0
%% load obfuscation code
\RequireLuaModule {obfuscate}
%% convenience macro
\def \packagecmd #1{\directlua {packagedata.#1}}
%% the obfuscate environment, mapping to Lua functions that enable and
%% disable tounicode obfuscation
\def \beginobfuscate {\packagecmd {obfuscate_begin ()}}
\def \endobfuscate {\packagecmd {obfuscate_end ()}}
%%···································································%%
%% Demo
%%···································································%%
%% firstly, load some fonts. within the “obfuscate” environment all
%% fonts will get their cmaps scrambled ...
\beginobfuscate
\font \mainfont = "file:Iwona-Regular.otf:mode=base"
\font \italicfont = "file:Iwona-Italic.otf:mode=base"
\endobfuscate
%% ... while fonts defined outside will have the mapping intact
\font \boldfont = "file:Iwona-Bold.otf:mode=base"
\font \bolditalicfont = "file:Iwona-BoldItalic.otf:mode=base"
%% now we can use them in our document like any ordinary font
\mainfont
obfuscated text before {\italicfont obfuscated too} and after \par
obfuscated text before {\boldfont not obfuscated} and after \par
obfuscated text before {\bolditalicfont not obfuscated} and after \par
\bye
PDF 查看器中的结果:
将此与以下输出进行对比pdftotext
:
\rf2yC'I_J I_dI r_f\{_ 9;H`bp<<L& <99 '5J 'fI_{
\rf2yC'I_J I_dI r_f\{_ not obfuscated '5J 'fI_{
\rf2yC'I_J I_dI r_f\{_ not obfuscated '5J 'fI_{
但请立即忘记这一切,并且不要混淆生产文本——不要对你的读者刻薄!
编辑 因为慷慨的业力捐赠者特别要求一个上下文解决方案,所以我会把它作为奖励。它更优雅,因为它依赖于字体好东西 允许将后处理器应用于特定字体的机制,之后可以像普通字体功能一样使用。
\startluacode
local mathrandom = math.random
local stringformat = string.format
--- create a postprocessor
local obfuscate = function (tfmdata)
fonts.goodies.registerpostprocessor (tfmdata, function (tfmdata)
if not tfmdata or type (tfmdata) ~= "table" then
return
end
local characters = tfmdata.characters
if characters then
for codepoint, char in next, characters do
char.tounicode = stringformat ([[%0.4X]], mathrandom (0x21, 0x7e))
end
end
end)
end
--- now register as a font feature
fonts.handlers.otf.features.register {
name = "obfuscate",
description = "treat the reader like a piece of garbage",
default = false,
initializers = {
base = obfuscate,
node = obfuscate,
}
}
\stopluacode
%%···································································%%
%% demonstration
%%···································································%%
%% we can now treat the obfuscation postprocessor like any other
%% font feature
\definefontfeature [obfuscate] [obfuscate=yes]
\definefont [mainfont] [file:Iwona-Regular.otf*obfuscate]
\definefont [italicfont] [file:Iwona-Italic.otf*obfuscate]
\definefont [boldfont] [file:Iwona-Bold.otf]
\definefont [bolditalicfont] [file:Iwona-BoldItalic.otf]
\starttext
\mainfont
obfuscated text before {\italicfont obfuscated too} and after \par
obfuscated text before {\boldfont not obfuscated} and after \par
obfuscated text before {\bolditalicfont not obfuscated} and after \par
\stoptext
答案3
评论
我使用一个小脚本,将所有字体转换为路径。该脚本使用第一个参数作为 -file 的输入.pdf
,并将输出写入具有相同名称和扩展名的文件中-rst.pdf
你需要Ghostscript以便运行我的脚本。
执行
运行于bash
#!/bin/sh
GS=/usr/bin/gs
$GS -sDEVICE=ps2write -dNOCACHE -sOutputFile=- -q -dBATCH -dNOPAUSE "$1" -c quit | ps2pdf - > "${1%%.*}-rst.pdf"
if [ $? -eq 0 ]; then
echo "Output written to ${1%%.*}-rst.pdf"
else
echo "There were errors. See the output."
fi
现在使用 ps2write (而不是 pswrite)这里。
结果
答案4
如果内容可以被查看,那么它就可以被复制。无论使用何种加密和限制,内容在某个时候都必须被公开才能发挥作用。对于所有数字内容和大多数大于纳米级的物理内容来说,情况可能都是如此……
例如,PDF:
- 光栅化:Printscreen => OCR
- 任何保护:重新输入
- 内容保护:开源阅读器的修改版本
网页内容:
- 右键弹出:Opera=>阻止页面接收内容菜单事件
- 右键单击弹出窗口:任何现代键盘上的“菜单”按钮
- Flash:下载 SWF 文件,使用免费软件进行反编译
- 查看页面源代码,使用 Chrome/Opera/Firefox 调试器获取所需内容的 URL
音频(例如 HDCP):
- 电视上的耳机插孔 => 电脑上的线路输入插孔
- 焊接到前置放大器 => PC 上的线路输入插座
视频(例如 HDCP):
- 很多很多的选择...快速的谷歌搜索就会显示出来。
某人的笔记本电脑/U盘上的加密内容:
- 数字暴力破解:暴力破解加密密钥
- 物理蛮力:http://imgs.xkcd.com/comics/security.png