How can I find out which Unicode code points are defined in a TTF file?

Question 1

otfinfo looks promising:

-u, --unicode
  Print each Unicode code point supported by the font, followed by
  the glyph number representing that code point (and, if present,
  the name of the corresponding glyph).

For example, DejaVuSans-Bold knows about the fl ligature (ﬂ):

$ otfinfo -u /usr/share/fonts/TTF/DejaVuSans-Bold.ttf |grep ^uniFB02
uniFB02 4899 fl

Answer


Question 2

I found a Python library, fontTools (pypi), that can do this with a bit of Python scripting.

Here is a simple script that lists all fonts that have a specified glyph:

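#!/usr/bin/env python3

from fontTools.ttLib import TTFont
import sys

# Code point to look for; base=0 accepts decimal or 0x-prefixed hex.
char = int(sys.argv[1], base=0)

print("Looking for U+%X (%c)" % (char, chr(char)))

for arg in sys.argv[2:]:
    try:
        font = TTFont(arg)

        # A font can carry several cmap subtables; only the Unicode ones
        # are keyed by code point.
        for cmap in font['cmap'].tables:
            if cmap.isUnicode():
                if char in cmap.cmap:
                    print("Found in", arg)
                    break  # report each font at most once
    except Exception as e:
        print("Failed to read", arg)
        print(e)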

The first argument is the code point (decimal, or hexadecimal with a 0x prefix); the rest are the font files to search.
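To illustrate the argument format: the script relies on Python's built-in int() with base=0, which picks the base from the prefix. A minimal sketch (parse_codepoint is only an illustrative name, not part of the script above):

```python
def parse_codepoint(text):
    """Parse a code point given as decimal ("64258") or prefixed hex ("0xFB02")."""
    # base=0 makes int() honour the 0x / 0o / 0b prefixes, like a Python literal.
    return int(text, base=0)


print(parse_codepoint("0xFB02"))  # hexadecimal form
print(parse_codepoint("64258"))   # the same code point in decimal
```

Note that a bare "FB02" or "U+FB02" is rejected with a ValueError, so the 0x prefix is required for hex input.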

I didn't bother trying to make it work for .ttc files (it needs some extra parameter).
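For what it's worth, here is one way .ttc support might look. This is an untested-on-real-collections sketch, assuming fontTools' TTCollection class (which exposes the member fonts as a .fonts list) and deciding by file extension alone; has_codepoint and fonts_in_file are names I made up for the example.

```python
import sys


def has_codepoint(font, codepoint):
    """True if any Unicode cmap subtable of `font` maps `codepoint`."""
    return any(
        table.isUnicode() and codepoint in table.cmap
        for table in font["cmap"].tables
    )


def fonts_in_file(path):
    """Yield (index, font) for each font stored in `path`."""
    # Imported lazily so the helper above stays usable without fontTools.
    from fontTools.ttLib import TTCollection, TTFont

    if path.lower().endswith(".ttc"):
        # Assumption: a .ttc extension means a TrueType collection.
        yield from enumerate(TTCollection(path).fonts)
    else:
        yield 0, TTFont(path)


def main(argv):
    if len(argv) < 3:
        print("usage: CODEPOINT FONT...")
        return
    char = int(argv[1], base=0)
    for path in argv[2:]:
        try:
            for index, font in fonts_in_file(path):
                if has_codepoint(font, char):
                    print("Found in %s (font #%d)" % (path, index))
        except Exception as e:
            print("Failed to read", path)
            print(e)


if __name__ == "__main__":
    main(sys.argv)
```

Opening a single member with TTFont(path, fontNumber=N) should also work if you already know which index you want.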

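Note: I first tried the otfinfo tool, but it only gave me characters in the Basic Multilingual Plane (<= U+FFFF). The Python script finds characters in the supplementary planes just fine.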
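To answer the original question more directly (list every code point in one font, supplementary planes included), something along these lines should work. It is a sketch that assumes fontTools' TTFont.getBestCmap(), which returns a dict of code point to glyph name, or None when the font has no Unicode cmap; describe and load_best_cmap are illustrative names.

```python
import sys


def describe(cmap):
    """Format a {code point: glyph name} mapping as sorted 'U+XXXX name' lines."""
    return ["U+%04X %s" % (cp, name) for cp, name in sorted(cmap.items())]


def load_best_cmap(path):
    """Return the font's preferred Unicode cmap as a dict (empty if none)."""
    # Imported lazily so describe() can be used without fontTools installed.
    from fontTools.ttLib import TTFont

    return TTFont(path).getBestCmap() or {}


def main(argv):
    if len(argv) != 2:
        print("usage: FONT")
        return
    cmap = load_best_cmap(argv[1])
    for line in describe(cmap):
        print(line)
    # Code points above U+FFFF live outside the Basic Multilingual Plane.
    beyond_bmp = sum(1 for cp in cmap if cp > 0xFFFF)
    print("%d code points, %d beyond the BMP" % (len(cmap), beyond_bmp))


if __name__ == "__main__":
    main(sys.argv)
```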

Answer