How do I build the latest HandBrake on Linux with FDO (PGO) + LTO?

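Passing CFLAGS and CXXFLAGS to the build of the latest HandBrake release (v1.3.3 at the time of writing) works until you add -flto, which fails the entire build.

How can I build HandBrake with LTO (-flto) and, as a stretch goal, with FDO (feedback-directed optimization, also known as PGO)?

Most of the codecs in HandBrake are developed with hand-written assembly, so many people assert that compiler optimizations will not gain much.
I want to test and challenge that assertion!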

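Answer 1

Edit 01/08/2021: everything below applies to HandBrake v1.3.3. See my new answer for HandBrake v1.4.0.

I answered a question on GitHub similar to the one I asked here, and figured the answer would serve people with similar questions better on Stack Exchange than buried in a GitHub issue: https://github.com/HandBrake/HandBrake/issues/1072#issuecomment-865630524

The observed gains should also serve those who are willing to put in the effort, saving them a lot of encoding/transcoding time. They can benchmark their own build afterwards to verify the claim.

Most of the process was derived, with some experimentation, from the notes here: https://github.com/griff/HandBrake/blob/master/doc/BUILD-Linux

As that document states, using CFLAGS/CXXFLAGS to steer the compilation or build is not recommended. Instead, use the built-in configuration mechanism to set the gcc flags.

How?

HandBrake is largely a front end for a lot of "contrib" modules. To see how each contrib module will be built, you can use the "make" reports that are available for each contrib in the build (target) directory before you build.

To get a build directory, you need to do an initial configure via: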

$  ./configure --build=build --optimize=speed

if you haven't already.

Make reports

For example, say you are building HandBrake in a folder named "build" (the value passed in the configure command above); then:

$  cd ./build
$  make report.help
  AVAILABLE MAKEFILE VARS REPORTS
  ----------------------------------------------------------------
  report.main            global general vars
  report.gcc             global gcc vars (inherited by module GCC)
  report.var             usage: make report.var name=VARNAME
  x265.report            X265-scoped vars
  x265_8.report          X265_8-scoped vars
  x265_10.report         X265_10-scoped vars
  x265_12.report         X265_12-scoped vars
  libdav1d.report        LIBDAV1D-scoped vars
  ffmpeg.report          FFMPEG-scoped vars
  libdvdread.report      LIBDVDREAD-scoped vars
  libdvdnav.report       LIBDVDNAV-scoped vars
  libbluray.report       LIBBLURAY-scoped vars
  nvenc.report           NVENC-scoped vars
  libhb.report           LIBHB-scoped vars
  test.report            TEST-scoped vars
  gtk.report             GTK-scoped vars
  pkg.report             PKG-scoped vars

The first column of each line above is the name of a report. You can then view a report via

$  make <report_name>

replacing <report_name> with the report you want.
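
If you want to keep the reports around for comparison, a small helper along these lines could dump several of them to text files. This is only a sketch: the function name dump_reports and the MAKE override are illustrative assumptions, not part of HandBrake; the report names are the ones listed by make report.help above.

```shell
#!/usr/bin/env bash
# Sketch: save selected HandBrake make reports to text files.
# dump_reports is a hypothetical helper, not something HandBrake ships.
# Usage: dump_reports <build_dir> <out_dir> <report_name>...
dump_reports() {
    local build_dir=$1 out_dir=$2 name
    shift 2
    mkdir -p "$out_dir" || return 1
    for name in "$@"; do
        # Run "make <report_name>" inside the build directory; the redirect
        # is resolved relative to the directory the function was called from.
        ( cd "$build_dir" && "${MAKE:-make}" "$name" ) > "$out_dir/$name.txt" || return 1
    done
}

# Example (run from the HandBrake source root after ./configure):
#   dump_reports ./build ./reports report.gcc x265.report ffmpeg.report
```

Reports saved before and after editing custom.defs can then be compared with diff to confirm that an override took effect.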

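It is worth noting that there is a hierarchy and inheritance among the reports above, and even within each report. report.gcc can be treated as the root for the gcc flags.

In my case, I had chosen to configure the build for "speed":

$  ./configure --build=build --optimize=speed

which maps to the key GCC.args.O.speed in report.gcc.

Another important key in that report is GCC.args.extra, which basically "may" append extra compiler flags after the former. As you may know, when gcc options conflict, the last one wins. Since we cannot easily tell whether a given module uses one of these keys or both, I tend to make sure that everything in the first key is also included in the second. But the second can contain more! You can see the default values by inspecting the report.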

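You can override the above by creating a text configuration file named "custom.defs" in the root of the HandBrake source folder (if you git cloned it, this is the top HandBrake folder, the one where you would run git pull).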

/HandBrake$ ls -h
AUTHORS.markdown  CODE_OF_CONDUCT.md  CONTRIBUTING.md  download  gtk      macosx         pkg              scripts      THANKS.markdown
build             configure           COPYING          gccFDO    libhb    make           preset           SECURITY.md  TRANSLATION.markdown
build2            contrib             custom.defs    graphics  LICENSE  NEWS.markdown  README.markdown  test         win

FDO (aka PGO)

I do FDO (feedback-directed optimization, aka PGO, profile-guided optimization) on my system, so I usually build first with custom.defs defined as

$ cat custom.defs 
GCC.args.O.speed = -march=native -O3 -pipe -fprofile-generate=../gccFDO -fprofile-update=atomic
GCC.args.extra = -mfpmath=sse -march=native -O3 -pipe -fprofile-generate=../gccFDO -fprofile-update=atomic

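Then I run HandBrake, transcoding several videos with different codecs, filters, and settings; it takes a few days to generate the profiles. Then I use the generated profiles...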

$ cat custom.defs 
GCC.args.O.speed = -march=native -O3 -pipe -fprofile-use=../gccFDO -fprofile-correction -fprofile-partial-training
GCC.args.extra = -mfpmath=sse -march=native -O3 -pipe -fprofile-use=../gccFDO -fprofile-correction -fprofile-partial-training

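in a brand-new build directory. The usual suspects worth profiling are your typical source types going to your typical target encodes. My typical target is x265_10bit with AAC audio:

  1. From x264 to x265_10bit
  2. From x265 to x265_10bit
  3. From the various forms of AC3 to the typical AAC you use
  4. From the various forms of DTS to the typical AAC you use
  5. Any typical preprocessing, filtering, denoising, etc. that you use.

As you can imagine, depending on your hardware, this can take a while! My profiling took a week!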
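
The training matrix above can be scripted. The sketch below only prints the commands (a dry run) so they can be reviewed first. The path ./build/HandBrakeCLI, the output naming, and the function name are assumptions for illustration, and the encoder names reflect an x265_10bit + AAC target, so swap in your own.

```shell
#!/usr/bin/env bash
# Sketch: print one training transcode command per source file (dry run).
# HB_CLI is an assumed location for the instrumented CLI binary; adjust it.
HB_CLI=${HB_CLI:-./build/HandBrakeCLI}

# Usage: training_commands <out_dir> <source_file>...
training_commands() {
    local out_dir=$1 src base
    shift
    for src in "$@"; do
        # Strip the directory and the extension to build the output name.
        base=$(basename "${src%.*}")
        # %q shell-quotes each path so the printed line is safe to execute.
        printf '%q -i %q -o %q -e x265_10bit -E av_aac\n' \
            "$HB_CLI" "$src" "$out_dir/$base.x265_10bit.mkv"
    done
}

# Hypothetical sample files covering the x264 and x265 source cases.
training_commands /tmp/hb-training sample-x264.mkv sample-x265.mkv
```

Once the list looks right, piping the output into bash runs the transcodes against the instrumented build; filters and denoise options you normally use would be appended to the printf format in the same way.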

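You can use the report process described above to fine-tune the compiler flags and optimizations of each module: override a key by setting it to the value you want in the custom.defs file, just as in the GCC.args.* defaults example above.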

For all of the above to work, remember not to export CFLAGS or CXXFLAGS. You can check which flags are set in your bash session via:

$  export -p | grep FLAGS
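
If that prints anything for CFLAGS or CXXFLAGS, they need to be cleared before configuring. A minimal sketch for the current bash session follows; the function name is an illustrative assumption, and it only touches the two variables discussed here.

```shell
#!/usr/bin/env bash
# Sketch: unset CFLAGS/CXXFLAGS if (and only if) they are exported.
# clear_exported_flags is a hypothetical helper name.
clear_exported_flags() {
    local var
    for var in CFLAGS CXXFLAGS; do
        # "declare -p" prints e.g. 'declare -x CFLAGS="..."' for exported
        # variables; the x attribute is what marks a variable as exported.
        if declare -p "$var" 2>/dev/null | grep -q '^declare -[a-zA-Z]*x'; then
            echo "unsetting exported $var" >&2
            unset "$var"
        fi
    done
}

clear_exported_flags
```

The function has to be pasted or sourced into the shell you will run ./configure from; running it as a separate script would only change that script's own environment.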

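LTO + FDO:

Link-time optimization (LTO) combines remarkably well with FDO; benchmarks for many programs are easy to find with a web search.

Unfortunately, enabling LTO with -flto for the FFMPEG module, or in the GCC.args.* defaults, fails the entire build. It is a boolean "or": the build fails with one, the other, or both!

However, LTO can be added to all the other modules!

Here is my custom.defs...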

$ cat custom.defs
GCC.args.O.speed = -march=native -O3 -pipe -fprofile-use=../gccFDO -fprofile-correction -fprofile-partial-training
GCC.args.extra = -mfpmath=sse -march=native -O3 -pipe -fprofile-use=../gccFDO -fprofile-correction -fprofile-partial-training
X265.GCC.args.O.speed = -march=native -O3 -pipe -flto -fprofile-use=../gccFDO -fprofile-correction -fprofile-partial-training
X265.GCC.args.extra = -mfpmath=sse -march=native -O3 -pipe -flto -fprofile-use=../gccFDO -fprofile-correction -fprofile-partial-training
X265_8.GCC.args.O.speed = -march=native -O3 -pipe -flto -fprofile-use=../gccFDO -fprofile-correction -fprofile-partial-training
X265_8.GCC.args.extra = -mfpmath=sse -march=native -O3 -pipe -flto -fprofile-use=../gccFDO -fprofile-correction -fprofile-partial-training
X265_10.GCC.args.O.speed = -march=native -O3 -pipe -flto -fprofile-use=../gccFDO -fprofile-correction -fprofile-partial-training
X265_10.GCC.args.extra = -mfpmath=sse -march=native -O3 -pipe -flto -fprofile-use=../gccFDO -fprofile-correction -fprofile-partial-training
X265_12.GCC.args.O.speed = -march=native -O3 -pipe -flto -fprofile-use=../gccFDO -fprofile-correction -fprofile-partial-training
X265_12.GCC.args.extra = -mfpmath=sse -march=native -O3 -pipe -flto -fprofile-use=../gccFDO -fprofile-correction -fprofile-partial-training
LIBHB.GCC.args.O.speed = -march=native -O3 -pipe -flto -fprofile-use=../gccFDO -fprofile-correction -fprofile-partial-training
LIBHB.GCC.args.extra = -mfpmath=sse -march=native -O3 -pipe -flto -fprofile-use=../gccFDO -fprofile-correction -fprofile-partial-training
LIBDAV1D.GCC.args.O.speed = -march=native -O3 -pipe -flto -fprofile-use=../gccFDO -fprofile-correction -fprofile-partial-training
LIBDAV1D.GCC.args.extra = -mfpmath=sse -march=native -O3 -pipe -flto -fprofile-use=../gccFDO -fprofile-correction -fprofile-partial-training
GTK.GCC.args.O.speed = -march=native -O3 -pipe -flto -fprofile-use=../gccFDO -fprofile-correction -fprofile-partial-training
GTK.GCC.args.extra = -mfpmath=sse -march=native -O3 -pipe -flto -fprofile-use=../gccFDO -fprofile-correction -fprofile-partial-training
LIBDVDREAD.GCC.args.O.speed = -march=native -O3 -pipe -flto -fprofile-use=../gccFDO -fprofile-correction -fprofile-partial-training
LIBDVDREAD.GCC.args.extra = -mfpmath=sse -march=native -O3 -pipe -flto -fprofile-use=../gccFDO -fprofile-correction -fprofile-partial-training
LIBDVDNAV.GCC.args.O.speed = -march=native -O3 -pipe -flto -fprofile-use=../gccFDO -fprofile-correction -fprofile-partial-training
LIBDVDNAV.GCC.args.extra = -mfpmath=sse -march=native -O3 -pipe -flto -fprofile-use=../gccFDO -fprofile-correction -fprofile-partial-training
LIBBLURAY.GCC.args.O.speed = -march=native -O3 -pipe -flto -fprofile-use=../gccFDO -fprofile-correction -fprofile-partial-training
LIBBLURAY.GCC.args.extra = -mfpmath=sse -march=native -O3 -pipe -flto -fprofile-use=../gccFDO -fprofile-correction -fprofile-partial-training
TEST.GCC.args.O.speed = -march=native -O3 -pipe -flto -fprofile-use=../gccFDO -fprofile-correction -fprofile-partial-training
TEST.GCC.args.extra = -mfpmath=sse -march=native -O3 -pipe -flto -fprofile-use=../gccFDO -fprofile-correction -fprofile-partial-training
NVENC.GCC.args.O.speed = -march=native -O3 -pipe -flto -fprofile-use=../gccFDO -fprofile-correction -fprofile-partial-training
NVENC.GCC.args.extra = -mfpmath=sse -march=native -O3 -pipe -flto -fprofile-use=../gccFDO -fprofile-correction -fprofile-partial-training
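
Since every module line repeats the same flags, the file above can also be generated rather than typed. A sketch, assuming the same module names and flags as the custom.defs shown above (FFMPEG is deliberately left out of the module list, and the global GCC.args.* keys get no -flto, because either one breaks the build); the variable and function names are illustrative only:

```shell
#!/usr/bin/env bash
# Sketch: generate the FDO + LTO custom.defs for HandBrake v1.3.3.
# Global keys carry no -flto; each listed module gets -flto added.
base="-march=native -O3 -pipe"
fdo="-fprofile-use=../gccFDO -fprofile-correction -fprofile-partial-training"

# Modules that accepted -flto in this answer. FFMPEG is intentionally absent.
modules=(X265 X265_8 X265_10 X265_12 LIBHB LIBDAV1D GTK
         LIBDVDREAD LIBDVDNAV LIBBLURAY TEST NVENC)

gen_custom_defs() {
    local m
    printf 'GCC.args.O.speed = %s %s\n' "$base" "$fdo"
    printf 'GCC.args.extra = -mfpmath=sse %s %s\n' "$base" "$fdo"
    for m in "${modules[@]}"; do
        printf '%s.GCC.args.O.speed = %s -flto %s\n' "$m" "$base" "$fdo"
        printf '%s.GCC.args.extra = -mfpmath=sse %s -flto %s\n' "$m" "$base" "$fdo"
    done
}

gen_custom_defs
```

Redirecting the output into custom.defs in the source root reproduces the 26 lines above. For the training build, changing the fdo variable to the -fprofile-generate flags from the earlier custom.defs gives the instrumented variant.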

Edit 01/08/2021: everything above was done for HandBrake v1.3.3.

For v1.4.0, the process above failed for me. See my other answer for v1.4.0.

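Answer 2

I have retried with the latest tagged release, HandBrake v1.4.0, under both GCC-11 and CLANG-12. Some changes to the configuration were needed for the build to succeed. For example, the GCC-11 build failed on some modules because it could not resolve the paths to the profiles after training (the gcda files at absolute paths).

Below are the training and FDO configurations for v1.4.0 with GCC-11 and CLANG-12, which differ from the process in my earlier answer for HandBrake v1.3.3.

GCC-11:

Configure and build command for GCC-11: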

./configure --harden --optimize=speed --enable-fdk-aac --disable-nvenc --build=build-v1.4.0 && cd ./build-v1.4.0 && time make -j$(( $(nproc) + 1 ));

Training/profiling stage, HandBrake v1.4.0 --> GCC-11 custom.defs file:

GCC.args.O.speed = -march=native -msse2avx -O3 -pipe -fprofile-generate -fprofile-update=atomic
GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -msse2avx -O3 -pipe -fprofile-generate -fprofile-update=atomic
X265.GCC.args.O.speed = -march=native -msse2avx -O3 -pipe -fprofile-generate -fprofile-update=atomic -flto
X265.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -msse2avx -O3 -pipe -fprofile-generate -fprofile-update=atomic -flto
X265_8.GCC.args.O.speed = -march=native -msse2avx -O3 -pipe -fprofile-generate -fprofile-update=atomic -flto
X265_8.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -msse2avx -O3 -pipe -fprofile-generate -fprofile-update=atomic -flto
X265_10.GCC.args.O.speed = -march=native -msse2avx -O3 -pipe -fprofile-generate -fprofile-update=atomic -flto
X265_10.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -msse2avx -O3 -pipe -fprofile-generate -fprofile-update=atomic -flto
X265_12.GCC.args.O.speed = -march=native -msse2avx -O3 -pipe -fprofile-generate -fprofile-update=atomic -flto
X265_12.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -msse2avx -O3 -pipe -fprofile-generate -fprofile-update=atomic -flto
LIBHB.GCC.args.O.speed = -march=native -msse2avx -O3 -pipe -fprofile-generate -fprofile-update=atomic -flto
LIBHB.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -msse2avx -O3 -pipe -fprofile-generate -fprofile-update=atomic -flto
LIBDAV1D.GCC.args.O.speed = -march=native -msse2avx -O3 -pipe -fprofile-generate -fprofile-update=atomic -flto
LIBDAV1D.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -msse2avx -O3 -pipe -fprofile-generate -fprofile-update=atomic -flto
GTK.GCC.args.O.speed = -march=native -msse2avx -O3 -pipe -fprofile-generate -fprofile-update=atomic -flto
GTK.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -msse2avx -O3 -pipe -fprofile-generate -fprofile-update=atomic -flto
LIBDVDREAD.GCC.args.O.speed = -march=native -msse2avx -O3 -pipe -fprofile-generate -fprofile-update=atomic -flto
LIBDVDREAD.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -msse2avx -O3 -pipe -fprofile-generate -fprofile-update=atomic -flto
LIBDVDNAV.GCC.args.O.speed = -march=native -msse2avx -O3 -pipe -fprofile-generate -fprofile-update=atomic -flto
LIBDVDNAV.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -msse2avx -O3 -pipe -fprofile-generate -fprofile-update=atomic -flto
LIBBLURAY.GCC.args.O.speed = -march=native -msse2avx -O3 -pipe -fprofile-generate -fprofile-update=atomic -flto
LIBBLURAY.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -msse2avx -O3 -pipe -fprofile-generate -fprofile-update=atomic -flto
TEST.GCC.args.O.speed = -march=native -msse2avx -O3 -pipe -fprofile-generate -fprofile-update=atomic -flto
TEST.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -msse2avx -O3 -pipe -fprofile-generate -fprofile-update=atomic -flto
FDKAAC.GCC.args.O.speed = -march=native -msse2avx -O3 -pipe -fprofile-generate -fprofile-update=atomic -flto
FDKAAC.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -msse2avx -O3 -pipe -fprofile-generate -fprofile-update=atomic -flto
ZIMG.GCC.args.O.speed = -march=native -msse2avx -O3 -pipe -fprofile-generate -fprofile-update=atomic -flto
ZIMG.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -msse2avx -O3 -pipe -fprofile-generate -fprofile-update=atomic -flto
LIBDAV1D.GCC.args.O.speed = -march=native -msse2avx -O3 -pipe -fprofile-generate -fprofile-update=atomic -flto
LIBDAV1D.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -msse2avx -O3 -pipe -fprofile-generate -fprofile-update=atomic -flto

FDO stage, HandBrake v1.4.0 --> GCC-11 custom.defs file:

GCC.args.O.speed = -march=native -msse2avx -O3 -pipe -fprofile-use -fprofile-correction -fprofile-partial-training
GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -msse2avx -O3 -pipe -fprofile-use -fprofile-correction -fprofile-partial-training
X265.GCC.args.O.speed = -march=native -msse2avx -O3 -pipe -fprofile-use -fprofile-correction -fprofile-partial-training -flto
X265.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -msse2avx -O3 -pipe -fprofile-use -fprofile-correction -fprofile-partial-training -flto
X265_8.GCC.args.O.speed = -march=native -msse2avx -O3 -pipe -fprofile-use -fprofile-correction -fprofile-partial-training -flto
X265_8.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -msse2avx -O3 -pipe -fprofile-use -fprofile-correction -fprofile-partial-training -flto
X265_10.GCC.args.O.speed = -march=native -msse2avx -O3 -pipe -fprofile-use -fprofile-correction -fprofile-partial-training -flto
X265_10.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -msse2avx -O3 -pipe -fprofile-use -fprofile-correction -fprofile-partial-training -flto
X265_12.GCC.args.O.speed = -march=native -msse2avx -O3 -pipe -fprofile-use -fprofile-correction -fprofile-partial-training -flto
X265_12.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -msse2avx -O3 -pipe -fprofile-use -fprofile-correction -fprofile-partial-training -flto
LIBHB.GCC.args.O.speed = -march=native -msse2avx -O3 -pipe -fprofile-use -fprofile-correction -fprofile-partial-training -flto
LIBHB.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -msse2avx -O3 -pipe -fprofile-use -fprofile-correction -fprofile-partial-training -flto
LIBDAV1D.GCC.args.O.speed = -march=native -msse2avx -O3 -pipe -fprofile-use -fprofile-correction -fprofile-partial-training -flto
LIBDAV1D.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -msse2avx -O3 -pipe -fprofile-use -fprofile-correction -fprofile-partial-training -flto
GTK.GCC.args.O.speed = -march=native -msse2avx -O3 -pipe -fprofile-use -fprofile-correction -fprofile-partial-training -flto
GTK.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -msse2avx -O3 -pipe -fprofile-use -fprofile-correction -fprofile-partial-training -flto
LIBDVDREAD.GCC.args.O.speed = -march=native -msse2avx -O3 -pipe -fprofile-use -fprofile-correction -fprofile-partial-training -flto
LIBDVDREAD.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -msse2avx -O3 -pipe -fprofile-use -fprofile-correction -fprofile-partial-training -flto
LIBDVDNAV.GCC.args.O.speed = -march=native -msse2avx -O3 -pipe -fprofile-use -fprofile-correction -fprofile-partial-training -flto
LIBDVDNAV.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -msse2avx -O3 -pipe -fprofile-use -fprofile-correction -fprofile-partial-training -flto
LIBBLURAY.GCC.args.O.speed = -march=native -msse2avx -O3 -pipe -fprofile-use -fprofile-correction -fprofile-partial-training -flto
LIBBLURAY.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -msse2avx -O3 -pipe -fprofile-use -fprofile-correction -fprofile-partial-training -flto
TEST.GCC.args.O.speed = -march=native -msse2avx -O3 -pipe -fprofile-use -fprofile-correction -fprofile-partial-training -flto
TEST.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -msse2avx -O3 -pipe -fprofile-use -fprofile-correction -fprofile-partial-training -flto
FDKAAC.GCC.args.O.speed = -march=native -msse2avx -O3 -pipe -fprofile-use -fprofile-correction -fprofile-partial-training -flto
FDKAAC.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -msse2avx -O3 -pipe -fprofile-use -fprofile-correction -fprofile-partial-training -flto
ZIMG.GCC.args.O.speed = -march=native -msse2avx -O3 -pipe -fprofile-use -fprofile-correction -fprofile-partial-training -flto
ZIMG.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -msse2avx -O3 -pipe -fprofile-use -fprofile-correction -fprofile-partial-training -flto
LIBDAV1D.GCC.args.O.speed = -march=native -msse2avx -O3 -pipe -fprofile-use -fprofile-correction -fprofile-partial-training -flto
LIBDAV1D.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -msse2avx -O3 -pipe -fprofile-use -fprofile-correction -fprofile-partial-training -flto

LLVM-12/CLANG-12/LLD-12:

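Clang handles PGO slightly differently from GCC. What stands out is that, unlike GCC and its default tooling, Clang/LLVM/LLD has no problem resolving the absolute paths of the modules. However, Clang needs an extra merge step to combine the raw profiles that FDO requires.

So there are 3 steps:

  1. Training/profiling stage
  2. Merge the raw profile data
  3. FDO stage

The detailed commands for each step follow. The custom.defs files for steps 1 and 3 are listed after the three steps. This part only shows the commands each step needs, not the custom.defs, so make sure the right custom.defs is in place before running the configure and build commands:

  1. Configure and build command for LLVM-12/CLANG-12/LLD-12:
LDFLAGS="-fuse-ld=lld" ./configure --ar /usr/bin/llvm-ar --ranlib /usr/bin/llvm-ranlib --strip /usr/bin/llvm-strip --cc /usr/bin/clang --optimize=speed --enable-fdk-aac --disable-nvenc --build=build-v1.4.0-CLANG && cd ./build-v1.4.0-CLANG && time LDFLAGS="-fuse-ld=lld" make -j$(( $(nproc) + 1 ));

After the build, train/profile the same way as for GCC, or as in the earlier v1.3.3 instructions if you tried those.

  2. After training/profiling, merge the raw profile data. Replace <Absolute-Path> and <Absolute-Path-To-Profile-Files> with the correct locations for your build.
llvm-profdata merge -output=<Absolute-Path>/handbrake.profdata <Absolute-Path-To-Profile-Files>/default_*.profraw
  3. FDO build: exactly the same one-line command as in step 1. The difference is the custom.defs file.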
LDFLAGS="-fuse-ld=lld" ./configure --ar /usr/bin/llvm-ar --ranlib /usr/bin/llvm-ranlib --strip /usr/bin/llvm-strip --cc /usr/bin/clang --optimize=speed --enable-fdk-aac --disable-nvenc --build=build-v1.4.0-CLANG && cd ./build-v1.4.0-CLANG && time LDFLAGS="-fuse-ld=lld" make -j$(( $(nproc) + 1 ));
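
Before the merge in step 2, it can be worth checking that the training run actually wrote .profraw files, for example in case the path in custom.defs was mistyped. A sketch of such a guard: merge_profiles and the LLVM_PROFDATA override are hypothetical names for illustration, while the llvm-profdata invocation itself is the one from step 2.

```shell
#!/usr/bin/env bash
# Sketch: merge raw Clang profiles, refusing to run if none were written.
# Usage: merge_profiles <profile_dir> <merged_profdata_path>
merge_profiles() {
    local profile_dir=$1 merged=$2
    # Expand the same glob the manual command uses.
    local raw=( "$profile_dir"/default_*.profraw )
    # With no match, bash leaves the pattern unexpanded, so test for a real file.
    if [[ ! -e ${raw[0]} ]]; then
        echo "no default_*.profraw files found in $profile_dir" >&2
        return 1
    fi
    "${LLVM_PROFDATA:-llvm-profdata}" merge -output="$merged" "${raw[@]}"
}

# Example, with placeholders to fill in as in step 2:
#   merge_profiles <Absolute-Path-To-Profile-Files> <Absolute-Path>/handbrake.profdata
```

The merged file is what the FDO-stage custom.defs then points at through -fprofile-instr-use.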

Training/profiling stage, HandBrake v1.4.0 --> LLVM-12/CLANG-12/LLD-12 custom.defs file:

Remember to replace <Absolute-Path-To-Profile-Files> with the correct absolute path.

GCC.args.O.speed = -march=native -O3 -pipe -fprofile-instr-generate=<Absolute-Path-To-Profile-Files>/default_%m.profraw -fprofile-update=atomic
GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -O3 -pipe -fprofile-instr-generate=<Absolute-Path-To-Profile-Files>/default_%m.profraw -fprofile-update=atomic
X265.GCC.args.O.speed = -march=native -O3 -pipe -fprofile-instr-generate=<Absolute-Path-To-Profile-Files>/default_%m.profraw -fprofile-update=atomic -flto=thin
X265.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -O3 -pipe -fprofile-instr-generate=<Absolute-Path-To-Profile-Files>/default_%m.profraw -fprofile-update=atomic -flto=thin
X265_8.GCC.args.O.speed = -march=native -O3 -pipe -fprofile-instr-generate=<Absolute-Path-To-Profile-Files>/default_%m.profraw -fprofile-update=atomic -flto=thin
X265_8.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -O3 -pipe -fprofile-instr-generate=<Absolute-Path-To-Profile-Files>/default_%m.profraw -fprofile-update=atomic -flto=thin
X265_10.GCC.args.O.speed = -march=native -O3 -pipe -fprofile-instr-generate=<Absolute-Path-To-Profile-Files>/default_%m.profraw -fprofile-update=atomic -flto=thin
X265_10.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -O3 -pipe -fprofile-instr-generate=<Absolute-Path-To-Profile-Files>/default_%m.profraw -fprofile-update=atomic -flto=thin
X265_12.GCC.args.O.speed = -march=native -O3 -pipe -fprofile-instr-generate=<Absolute-Path-To-Profile-Files>/default_%m.profraw -fprofile-update=atomic -flto=thin
X265_12.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -O3 -pipe -fprofile-instr-generate=<Absolute-Path-To-Profile-Files>/default_%m.profraw -fprofile-update=atomic -flto=thin
LIBHB.GCC.args.O.speed = -march=native -O3 -pipe -fprofile-instr-generate=<Absolute-Path-To-Profile-Files>/default_%m.profraw -fprofile-update=atomic -flto=thin
LIBHB.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -O3 -pipe -fprofile-instr-generate=<Absolute-Path-To-Profile-Files>/default_%m.profraw -fprofile-update=atomic -flto=thin
LIBDAV1D.GCC.args.O.speed = -march=native -O3 -pipe -fprofile-instr-generate=<Absolute-Path-To-Profile-Files>/default_%m.profraw -fprofile-update=atomic -flto=thin
LIBDAV1D.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -O3 -pipe -fprofile-instr-generate=<Absolute-Path-To-Profile-Files>/default_%m.profraw -fprofile-update=atomic -flto=thin
GTK.GCC.args.O.speed = -march=native -O3 -pipe -fprofile-instr-generate=<Absolute-Path-To-Profile-Files>/default_%m.profraw -fprofile-update=atomic -flto=thin
GTK.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -O3 -pipe -fprofile-instr-generate=<Absolute-Path-To-Profile-Files>/default_%m.profraw -fprofile-update=atomic -flto=thin
LIBDVDREAD.GCC.args.O.speed = -march=native -O3 -pipe -fprofile-instr-generate=<Absolute-Path-To-Profile-Files>/default_%m.profraw -fprofile-update=atomic -flto=thin
LIBDVDREAD.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -O3 -pipe -fprofile-instr-generate=<Absolute-Path-To-Profile-Files>/default_%m.profraw -fprofile-update=atomic -flto=thin
LIBDVDNAV.GCC.args.O.speed = -march=native -O3 -pipe -fprofile-instr-generate=<Absolute-Path-To-Profile-Files>/default_%m.profraw -fprofile-update=atomic -flto=thin
LIBDVDNAV.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -O3 -pipe -fprofile-instr-generate=<Absolute-Path-To-Profile-Files>/default_%m.profraw -fprofile-update=atomic -flto=thin
LIBBLURAY.GCC.args.O.speed = -march=native -O3 -pipe -fprofile-instr-generate=<Absolute-Path-To-Profile-Files>/default_%m.profraw -fprofile-update=atomic -flto=thin
LIBBLURAY.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -O3 -pipe -fprofile-instr-generate=<Absolute-Path-To-Profile-Files>/default_%m.profraw -fprofile-update=atomic -flto=thin
TEST.GCC.args.O.speed = -march=native -O3 -pipe -fprofile-instr-generate=<Absolute-Path-To-Profile-Files>/default_%m.profraw -fprofile-update=atomic -flto=thin
TEST.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -O3 -pipe -fprofile-instr-generate=<Absolute-Path-To-Profile-Files>/default_%m.profraw -fprofile-update=atomic -flto=thin
FDKAAC.GCC.args.O.speed = -march=native -O3 -pipe -fprofile-instr-generate=<Absolute-Path-To-Profile-Files>/default_%m.profraw -fprofile-update=atomic -flto=thin
FDKAAC.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -O3 -pipe -fprofile-instr-generate=<Absolute-Path-To-Profile-Files>/default_%m.profraw -fprofile-update=atomic -flto=thin
ZIMG.GCC.args.O.speed = -march=native -O3 -pipe -fprofile-instr-generate=<Absolute-Path-To-Profile-Files>/default_%m.profraw -fprofile-update=atomic -flto=thin
ZIMG.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -O3 -pipe -fprofile-instr-generate=<Absolute-Path-To-Profile-Files>/default_%m.profraw -fprofile-update=atomic -flto=thin
LIBDAV1D.GCC.args.O.speed = -march=native -O3 -pipe -fprofile-instr-generate=<Absolute-Path-To-Profile-Files>/default_%m.profraw -fprofile-update=atomic -flto=thin
LIBDAV1D.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -O3 -pipe -fprofile-instr-generate=<Absolute-Path-To-Profile-Files>/default_%m.profraw -fprofile-update=atomic -flto=thin

FDO stage, HandBrake v1.4.0 --> LLVM-12/CLANG-12/LLD-12 custom.defs file:

Remember to replace <Absolute-Path-To-Merged-Profile> with the correct absolute path.

GCC.args.O.speed = -march=native -O3 -pipe -fprofile-instr-use=<Absolute-Path-To-Merged-Profile>/handbrake.profdata
GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -O3 -pipe -fprofile-instr-use=<Absolute-Path-To-Merged-Profile>/handbrake.profdata
X265.GCC.args.O.speed = -march=native -O3 -pipe -fprofile-instr-use=<Absolute-Path-To-Merged-Profile>/handbrake.profdata -flto=thin
X265.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -O3 -pipe -fprofile-instr-use=<Absolute-Path-To-Merged-Profile>/handbrake.profdata -flto=thin
X265_8.GCC.args.O.speed = -march=native -O3 -pipe -fprofile-instr-use=<Absolute-Path-To-Merged-Profile>/handbrake.profdata -flto=thin
X265_8.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -O3 -pipe -fprofile-instr-use=<Absolute-Path-To-Merged-Profile>/handbrake.profdata -flto=thin
X265_10.GCC.args.O.speed = -march=native -O3 -pipe -fprofile-instr-use=<Absolute-Path-To-Merged-Profile>/handbrake.profdata -flto=thin
X265_10.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -O3 -pipe -fprofile-instr-use=<Absolute-Path-To-Merged-Profile>/handbrake.profdata -flto=thin
X265_12.GCC.args.O.speed = -march=native -O3 -pipe -fprofile-instr-use=<Absolute-Path-To-Merged-Profile>/handbrake.profdata -flto=thin
X265_12.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -O3 -pipe -fprofile-instr-use=<Absolute-Path-To-Merged-Profile>/handbrake.profdata -flto=thin
LIBHB.GCC.args.O.speed = -march=native -O3 -pipe -fprofile-instr-use=<Absolute-Path-To-Merged-Profile>/handbrake.profdata -flto=thin
LIBHB.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -O3 -pipe -fprofile-instr-use=<Absolute-Path-To-Merged-Profile>/handbrake.profdata -flto=thin
LIBDAV1D.GCC.args.O.speed = -march=native -O3 -pipe -fprofile-instr-use=<Absolute-Path-To-Merged-Profile>/handbrake.profdata -flto=thin
LIBDAV1D.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -O3 -pipe -fprofile-instr-use=<Absolute-Path-To-Merged-Profile>/handbrake.profdata -flto=thin
GTK.GCC.args.O.speed = -march=native -O3 -pipe -fprofile-instr-use=<Absolute-Path-To-Merged-Profile>/handbrake.profdata -flto=thin
GTK.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -O3 -pipe -fprofile-instr-use=<Absolute-Path-To-Merged-Profile>/handbrake.profdata -flto=thin
LIBDVDREAD.GCC.args.O.speed = -march=native -O3 -pipe -fprofile-instr-use=<Absolute-Path-To-Merged-Profile>/handbrake.profdata -flto=thin
LIBDVDREAD.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -O3 -pipe -fprofile-instr-use=<Absolute-Path-To-Merged-Profile>/handbrake.profdata -flto=thin
LIBDVDNAV.GCC.args.O.speed = -march=native -O3 -pipe -fprofile-instr-use=<Absolute-Path-To-Merged-Profile>/handbrake.profdata -flto=thin
LIBDVDNAV.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -O3 -pipe -fprofile-instr-use=<Absolute-Path-To-Merged-Profile>/handbrake.profdata -flto=thin
LIBBLURAY.GCC.args.O.speed = -march=native -O3 -pipe -fprofile-instr-use=<Absolute-Path-To-Merged-Profile>/handbrake.profdata -flto=thin
LIBBLURAY.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -O3 -pipe -fprofile-instr-use=<Absolute-Path-To-Merged-Profile>/handbrake.profdata -flto=thin
TEST.GCC.args.O.speed = -march=native -O3 -pipe -fprofile-instr-use=<Absolute-Path-To-Merged-Profile>/handbrake.profdata -flto=thin
TEST.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -O3 -pipe -fprofile-instr-use=<Absolute-Path-To-Merged-Profile>/handbrake.profdata -flto=thin
FDKAAC.GCC.args.O.speed = -march=native -O3 -pipe -fprofile-instr-use=<Absolute-Path-To-Merged-Profile>/handbrake.profdata -flto=thin
FDKAAC.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -O3 -pipe -fprofile-instr-use=<Absolute-Path-To-Merged-Profile>/handbrake.profdata -flto=thin
ZIMG.GCC.args.O.speed = -march=native -O3 -pipe -fprofile-instr-use=<Absolute-Path-To-Merged-Profile>/handbrake.profdata -flto=thin
ZIMG.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -O3 -pipe -fprofile-instr-use=<Absolute-Path-To-Merged-Profile>/handbrake.profdata -flto=thin
LIBDAV1D.GCC.args.O.speed = -march=native -O3 -pipe -fprofile-instr-use=<Absolute-Path-To-Merged-Profile>/handbrake.profdata -flto=thin
LIBDAV1D.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -O3 -pipe -fprofile-instr-use=<Absolute-Path-To-Merged-Profile>/handbrake.profdata -flto=thin

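There you go: you can now build HandBrake v1.4.0 with PGO+LTO using either GCC or LLVM/CLANG/LLD. Feel free to pick whichever of the two suits your preference or your benchmarks! :-)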
