Problems merging SRT subtitles into an MP4 transcoded for iOS (HEVC/x265/mov_text): "pts has no value" and "out of range for mov/mp4 format"

Answer 1

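Adding a fake subtitle at the start of the SRT file, spanning from the beginning of the video to the point where the real subtitles start, seems to solve the problem.

This is obviously a hack, but it works.

While working out how to add a fake subtitle at the start of the SRT file, I realized that SRT subtitles, according to the SRT specification, allow HTML tags. So I added the following fake subtitle, and everything worked!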

0
00:00:00,000 --> 00:16:75,000
<b></b>

1
00:52:33,123 --> 00:52:50,123
It was a dark and stormy night…

That's it! Adding an empty bold tag was enough to make everything work and get the subtitles merged.
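For a longer file, the same placeholder can be added programmatically. The following is a minimal sketch, not a tested tool: `prepend_placeholder` and its `placeholder` parameter are hypothetical names, and it assumes the placeholder should end exactly where the first real cue starts, as described above. The placeholder text is a parameter because which text works appears to depend on the FFmpeg build.

```python
import re

# Matches an SRT timing line such as "00:52:33,123 --> 00:52:50,123"
TIMING = re.compile(r"(\d{2}:\d{2}:\d{2},\d{3})\s*-->\s*(\d{2}:\d{2}:\d{2},\d{3})")


def prepend_placeholder(srt_text: str, placeholder: str = "<b></b>") -> str:
    """Return srt_text with a cue 0 added, running from 00:00:00,000
    to the start time of the first cue found in the text."""
    match = TIMING.search(srt_text)
    if match is None:
        raise ValueError("no SRT timing line found")
    first_start = match.group(1)
    cue = f"0\n00:00:00,000 --> {first_start}\n{placeholder}\n\n"
    # Drop a leading byte order mark so the new cue is really first
    return cue + srt_text.lstrip("\ufeff")


sample = "1\n00:52:33,123 --> 00:52:50,123\nIt was a dark and stormy night…\n"
print(prepend_placeholder(sample))
```

Passing `placeholder="."` produces the same cue with a visible character instead of an empty tag.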

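But, as I said at the start, this is clearly a hack, and I would like to hear from people who know FFmpeg better. I can only assume this is not the intended behavior and that there must be a more elegant way to handle this case. Or is it a bug (rather than a feature) that should be reported?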

Answer 2

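I ran into the same problem with a media file and spent most of a day trying different commands and different ways of rebuilding the pts, all to no avail, until I finally stumbled on this post. Sadly, the workaround proposed here did not work for me.

I am on Windows 11, running the Windows binaries built by gyan.dev, but I also tried Ubuntu under WSL (Windows Subsystem for Linux), with FFmpeg installed via apt, running the same command.

With an empty tag inserted, I kept getting the same error: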

0
00:00:00,000 --> 00:18:00,000
<b></b>

1
00:43:21,472 --> 00:43:24,933
Mysterious translated text here

[mp4 @ 0000023630f5cf80] Packet duration: 2601472000 / dts: 2601472000 is out of range
[mp4 @ 0000023630f5cf80] pts has no value
[mp4 @ 0000023630f5cf80] Packet duration: 2601681999 / dts: 2605143000 is out of range
[mp4 @ 0000023630f5cf80] pts has no value
[mp4 @ 0000023630f5cf80] Packet duration: 2602474998 / dts: 2608605000 is out of range
[mp4 @ 0000023630f5cf80] pts has no value
[mp4 @ 0000023630f5cf80] Packet duration: 2603642997 / dts: 2611441000 is out of range
[mp4 @ 0000023630f5cf80] pts has no value
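The numbers in this log can be checked by hand. The first dts, 2601472000, is exactly the start time of the first real subtitle (00:43:21,472) expressed in microseconds, and it is larger than the biggest signed 32-bit integer, 2147483647, which corresponds to roughly 35 minutes 47 seconds. That suggests the muxer rejects a gap between packets that does not fit in 32 bits; this is an inference from the log values, not something confirmed in this thread. A small sketch of the arithmetic (`srt_time_to_us` is a hypothetical helper):

```python
INT32_MAX = 2**31 - 1  # 2147483647, assumed here to be the muxer's limit


def srt_time_to_us(stamp: str) -> int:
    """Convert an SRT timestamp 'HH:MM:SS,mmm' to microseconds."""
    clock, millis = stamp.strip().split(",")
    hours, minutes, seconds = (int(part) for part in clock.split(":"))
    total_ms = ((hours * 60 + minutes) * 60 + seconds) * 1000 + int(millis)
    return total_ms * 1000


first_cue_us = srt_time_to_us("00:43:21,472")
print(first_cue_us)              # 2601472000, the dts in the first log line
print(first_cue_us > INT32_MAX)  # True
print(INT32_MAX / 1_000_000)     # the limit in seconds, about 2147.48
```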

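If I insert a visible character (a period, for example), the srt file merges without trouble. Of course, that leaves a single period on screen for the first 40 minutes of the video.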

0
00:00:00,000 --> 00:18:00,000
.

1
00:43:21,472 --> 00:43:24,933
Mysterious translated text here

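I tried bold tags, italic tags, and every other tag I could think of, even an unrecognized paragraph tag. No empty tag holds the time span, and each one results in the "pts has no value" error.

Out of desperation I went for quantity over quality and added 10 entries, spaced 2 minutes apart, each with a very short duration, so that instead of a permanently visible period I would see one intermittently. That worked. On a whim, I then set the start and end times of those first 10 entries to the same value; the period was hidden, and the subtitles merged without errors: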

1
00:02:00,000 --> 00:02:00,000
.

2
00:04:00,000 --> 00:04:00,000
.

3
00:06:00,000 --> 00:06:00,000
.

4
00:08:00,000 --> 00:08:00,000
.

5
00:10:00,000 --> 00:10:00,000
.

6
00:12:00,000 --> 00:12:00,000
.

7
00:14:00,000 --> 00:14:00,000
.

8
00:16:00,000 --> 00:16:00,000
.

9
00:18:00,000 --> 00:18:00,000
.

10
00:20:01,000 --> 00:20:01,000
.

11
00:43:21,472 --> 00:43:24,933
Mysterious translated text here

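Like the original post here, this is a hacky workaround rather than a proper solution, but, also like the original post, I could not find a proper one. I hope it helps others who hit this problem and stumble on this post as I did.

If you have a large srt and don't want to renumber it by hand, I had GPT write this small Python script for me (I could have written it myself, but by then I had nearly solved the problem and figured it was simple enough for GPT to handle):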

import re


def add_entries_to_srt(existing_file_path, new_file_path, num_entries=10, duration=0.001, buffer_time=120):
    # duration defaults to 1 ms, per the edit note below; buffer_time is the spacing in seconds.
    # utf-8-sig drops a leading BOM, which would otherwise break int() on the first index.
    with open(existing_file_path, 'r', encoding='utf-8-sig') as existing_file:
        existing_content = existing_file.read()

    # Parse existing entries: normalize CRLF line endings, then split on blank lines
    normalized = existing_content.replace('\r\n', '\n').strip()
    entries = re.split(r'\n\s*\n', normalized) if normalized else []

    # Generate the placeholder entries, one every buffer_time seconds.
    # Nothing here checks that they end before the first real subtitle starts.
    new_entries = []
    for i in range(1, num_entries + 1):
        start_time = i * buffer_time
        end_time = start_time + duration
        entry_text = f"{i}\n{format_time(start_time)} --> {format_time(end_time)}\n<i>.</i>"
        new_entries.append(entry_text)

    # Shift the index of every existing entry by num_entries
    for i, entry in enumerate(entries):
        entry_lines = entry.split('\n')
        entry_lines[0] = str(int(entry_lines[0]) + num_entries)
        entries[i] = '\n'.join(entry_lines)

    # Combine new and existing entries and write them to the new file
    with open(new_file_path, 'w', encoding='utf-8') as new_file:
        new_file.write('\n\n'.join(new_entries + entries) + '\n')


def format_time(seconds):
    # Format seconds as an SRT timestamp (HH:MM:SS,mmm), keeping the milliseconds
    total_ms = round(seconds * 1000)
    hours, remainder = divmod(total_ms, 3_600_000)
    minutes, remainder = divmod(remainder, 60_000)
    secs, millis = divmod(remainder, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


# Replace these paths with your actual file paths
existing_file_path = r'C:\Temp\ffmpeg\Subtitles.srt'
new_file_path = r'C:\Temp\ffmpeg\Subtitles.EDIT.srt'

add_entries_to_srt(existing_file_path, new_file_path)

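Edit: It turns out that, depending on your player, giving the first 10 entries identical start and end times leaves the period either hidden or visible for 5 seconds. A 1-millisecond offset seems to keep the period hidden more reliably.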

1
00:02:00,000 --> 00:02:00,001
.

2
00:04:00,000 --> 00:04:00,001
.

3
00:06:00,000 --> 00:06:00,001
.

4
00:08:00,000 --> 00:08:00,001
.

5
00:10:00,000 --> 00:10:00,001
.

6
00:12:00,000 --> 00:12:00,001
.

7
00:14:00,000 --> 00:14:00,001
.

8
00:16:00,000 --> 00:16:00,001
.

9
00:18:00,000 --> 00:18:00,001
.

10
00:20:01,000 --> 00:20:01,001
.

11
00:43:21,472 --> 00:43:24,933
Mysterious translated text here
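To check whether a padded file still contains a stretch that is too long, the gaps between cue start times can be scanned. This is a rough sketch under two assumptions that are not confirmed in this thread: that the limit is the signed 32-bit maximum in microseconds (about 35 minutes 47 seconds, inferred from the dts values in the log above), and that measuring from one cue start to the next is a conservative stand-in for what the muxer actually measures. `cue_starts_us` and `oversized_gaps` are hypothetical names.

```python
import re

INT32_MAX_US = 2**31 - 1  # assumed limit, in microseconds

# Captures the start time of each SRT timing line
START = re.compile(r"^(\d{2}):(\d{2}):(\d{2}),(\d{3})\s*-->", re.MULTILINE)


def cue_starts_us(srt_text):
    """Return the start time of every cue, in microseconds."""
    starts = []
    for h, m, s, ms in START.findall(srt_text):
        total_ms = ((int(h) * 60 + int(m)) * 60 + int(s)) * 1000 + int(ms)
        starts.append(total_ms * 1000)
    return starts


def oversized_gaps(srt_text, limit_us=INT32_MAX_US):
    """Return (previous, next) start pairs whose distance exceeds limit_us.
    Time zero counts as the first point, so a late first cue is reported."""
    points = [0] + sorted(cue_starts_us(srt_text))
    return [(a, b) for a, b in zip(points, points[1:]) if b - a > limit_us]


unpadded = "1\n00:43:21,472 --> 00:43:24,933\nMysterious translated text here\n"
padded = (
    "1\n00:20:01,000 --> 00:20:01,001\n.\n\n"
    "2\n00:43:21,472 --> 00:43:24,933\nMysterious translated text here\n"
)
print(oversized_gaps(unpadded))  # [(0, 2601472000)]
print(oversized_gaps(padded))    # []
```

Under these assumptions, a single filler cue at 00:20:01 would already keep both gaps below the limit.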

Why can I merge the subtitles if I select only the last 50% of the video, where the subtitles appear, but not if I run the command on the whole video?
