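I am trying to reconstruct the velocity of objects from iPhone video, so I need to determine the time difference between frames. As far as I understand, iPhones (HEVC encoding, .mov files) record at a variable frame rate (VFR). Because I need high precision, I cannot work under the assumption of constant frame rate (CFR) video. So far I have only been able to recover pts_time values that effectively reflect CFR video, although other posts suggest that recovering the true frame timings should be possible. I am using ffmpeg 5.1.1 on macOS Monterey 12.6.
Here is what I have tried:
- Running the following from the command line on a "confirmed" VFR video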
mediainfo input.mov
with output:
Video
ID : 1
Format : HEVC
Format/Info : High Efficiency Video Coding
Format profile : Main@L5@Main
Codec ID : hvc1
Codec ID/Info : High Efficiency Video Coding
Duration : 1 s 372 ms
Source duration : 1 s 947 ms
Bit rate : 32.2 Mb/s
Width : 3 840 pixels
Height : 2 160 pixels
Display aspect ratio : 16:9
Rotation : 180°
Frame rate mode : Variable
Frame rate : 25.685 FPS
Minimum frame rate : 7.500 FPS
Maximum frame rate : 75.000 FPS
Color space : YUV
Chroma subsampling : 4:2:0
Bit depth : 8 bits
Bits/(Pixel*Frame) : 0.151
Stream size : 4.78 MiB (91%)
Source stream size : 5.23 MiB (99%)
Title : Core Media Video
Encoded date : UTC 2022-08-27 18:44:20
Tagged date : UTC 2022-08-27 18:44:20
Color range : Limited
Color primaries : BT.709
Transfer characteristics : BT.709
Matrix coefficients : BT.709
Codec configuration box : hvcC
Audio
ID : 2
Format : AAC LC
Format/Info : Advanced Audio Codec Low Complexity
Codec ID : mp4a-40-2
Duration : 1 s 372 ms
Source duration : 1 s 440 ms
Bit rate mode : Variable
Bit rate : 181 kb/s
Channel(s) : 2 channels
Channel layout : L R
Sampling rate : 44.1 kHz
Frame rate : 43.066 FPS (1024 SPF)
Compression mode : Lossy
Stream size : 30.3 KiB (1%)
Source stream size : 31.9 KiB (1%)
Title : Core Media Audio
Encoded date : UTC 2022-08-27 18:44:20
Tagged date : UTC 2022-08-27 18:44:20
Other #1
Type : meta
Duration : 1 s 372 ms
Source duration : 9 s 245 ms
Stream size : 0.00 Byte
Source stream size : 10.0 Bytes
Other #2
Type : meta
Duration : 1 s 372 ms
Source duration : 9 s 245 ms
Stream size : 0.00 Byte
Source stream size : 8.00 Bytes
- Running
ffmpeg -i input.mov -vf vfrdet -an -f null -
(which seems to show that it is not VFR?), with output:
MacBook-Pro-6:timing_validation $ ffmpeg -i 30_4k-3ft.mov -vf vfrdet -an -f null -
ffmpeg version 5.1.1 Copyright (c) 2000-2022 the FFmpeg developers
built with Apple clang version 13.1.6 (clang-1316.0.21.2.5)
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '30_4k-3ft.mov':
Metadata:
major_brand : qt
minor_version : 0
compatible_brands: qt
creation_time : 2022-08-27T18:44:20.000000Z
com.apple.quicktime.make: Apple
com.apple.quicktime.model: iPhone 13 Pro Max
com.apple.quicktime.software: 15.0
com.apple.quicktime.creationdate: 2022-08-27T11:21:43-0700
Duration: 00:00:01.37, start: 0.000000, bitrate: 32207 kb/s
Stream #0:0[0x1](und): Video: hevc (Main) (hvc1 / 0x31637668), yuv420p(tv, bt709), 3840x2160, 22539 kb/s, 25.68 fps, 30 tbr, 600 tbn (default)
Metadata:
creation_time : 2022-08-27T18:44:20.000000Z
handler_name : Core Media Video
vendor_id : [0][0][0][0]
encoder : HEVC
Side data:
displaymatrix: rotation of -180.00 degrees
Stream #0:1[0x2](und): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 181 kb/s (default)
Metadata:
creation_time : 2022-08-27T18:44:20.000000Z
handler_name : Core Media Audio
vendor_id : [0][0][0][0]
Stream #0:2[0x3](und): Data: none (mebx / 0x7862656D), 0 kb/s (default)
Metadata:
creation_time : 2022-08-27T18:44:20.000000Z
handler_name : Core Media Metadata
Stream #0:3[0x4](und): Data: none (mebx / 0x7862656D), 0 kb/s (default)
Metadata:
creation_time : 2022-08-27T18:44:20.000000Z
handler_name : Core Media Metadata
Stream mapping:
Stream #0:0 -> #0:0 (hevc (native) -> wrapped_avframe (native))
Press [q] to stop, [?] for help
Output #0, null, to 'pipe:':
Metadata:
major_brand : qt
minor_version : 0
compatible_brands: qt
com.apple.quicktime.creationdate: 2022-08-27T11:21:43-0700
com.apple.quicktime.make: Apple
com.apple.quicktime.model: iPhone 13 Pro Max
com.apple.quicktime.software: 15.0
encoder : Lavf59.27.100
Stream #0:0(und): Video: wrapped_avframe, yuv420p(tv, bt709, progressive), 3840x2160, q=2-31, 200 kb/s, 30 fps, 30 tbn (default)
Metadata:
creation_time : 2022-08-27T18:44:20.000000Z
handler_name : Core Media Video
vendor_id : [0][0][0][0]
encoder : Lavc59.37.100 wrapped_avframe
Side data:
displaymatrix: rotation of -0.00 degrees
frame= 1 fps=0.0 q=-0.0 size=N/A time=00:00:00.03 bitrate=N/A speed=0.0622x
frame= 16 fps= 15 q=-0.0 size=N/A time=00:00:00.53 bitrate=N/A speed=0.498x
frame= 41 fps= 28 q=-0.0 Lsize=N/A time=00:00:01.36 bitrate=N/A speed=0.921x
video:19kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: unknown
[Parsed_vfrdet_0 @ 0x7ff2c8836fc0] VFR:0.000000 (0/40)
- Running
ffprobe -select_streams v:0 -show_entries packet=pts_time,duration_time,stream_index input.mov
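shows several deviations from CFR (although duration_time does not reflect them...), but the range of pts_time values does not match the video duration shown in the earlier outputs: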
[PACKET]
stream_index=0
pts_time=-0.433333
duration_time=0.033333
[/PACKET]
[PACKET]
stream_index=0
pts_time=-0.300000
duration_time=0.033333
[/PACKET]
[PACKET]
stream_index=0
pts_time=-0.366667
duration_time=0.033333
[/PACKET]
[PACKET]
stream_index=0
pts_time=-0.166667
duration_time=0.033333
[/PACKET]
[PACKET]
stream_index=0
pts_time=-0.233333
duration_time=0.033333
[/PACKET]
[PACKET]
stream_index=0
pts_time=-0.033333
duration_time=0.033333
[/PACKET]
[PACKET]
stream_index=0
pts_time=-0.100000
duration_time=0.033333
[/PACKET]
[PACKET]
stream_index=0
pts_time=0.100000
duration_time=0.033333
[/PACKET]
[PACKET]
stream_index=0
pts_time=0.033333
duration_time=0.033333
[/PACKET]
[PACKET]
stream_index=0
pts_time=0.000000
duration_time=0.033333
[/PACKET]
[PACKET]
stream_index=0
pts_time=0.066667
duration_time=0.033333
[/PACKET]
[PACKET]
stream_index=0
pts_time=0.233333
duration_time=0.033333
[/PACKET]
[PACKET]
stream_index=0
pts_time=0.166667
duration_time=0.033333
[/PACKET]
[PACKET]
stream_index=0
pts_time=0.133333
duration_time=0.033333
[/PACKET]
[PACKET]
stream_index=0
pts_time=0.200000
duration_time=0.033333
[/PACKET]
[PACKET]
stream_index=0
pts_time=0.366667
duration_time=0.033333
[/PACKET]
[PACKET]
stream_index=0
pts_time=0.300000
duration_time=0.033333
[/PACKET]
[PACKET]
stream_index=0
pts_time=0.266667
duration_time=0.033333
[/PACKET]
[PACKET]
stream_index=0
pts_time=0.333333
duration_time=0.033333
[/PACKET]
[PACKET]
stream_index=0
pts_time=0.500000
duration_time=0.033333
[/PACKET]
[PACKET]
stream_index=0
pts_time=0.433333
duration_time=0.033333
[/PACKET]
[PACKET]
stream_index=0
pts_time=0.400000
duration_time=0.033333
[/PACKET]
[PACKET]
stream_index=0
pts_time=0.466667
duration_time=0.033333
[/PACKET]
[PACKET]
stream_index=0
pts_time=0.633333
duration_time=0.033333
[/PACKET]
[PACKET]
stream_index=0
pts_time=0.566667
duration_time=0.033333
[/PACKET]
[PACKET]
stream_index=0
pts_time=0.533333
duration_time=0.033333
[/PACKET]
[PACKET]
stream_index=0
pts_time=0.600000
duration_time=0.033333
[/PACKET]
[PACKET]
stream_index=0
pts_time=0.766667
duration_time=0.033333
[/PACKET]
[PACKET]
stream_index=0
pts_time=0.700000
duration_time=0.033333
[/PACKET]
[PACKET]
stream_index=0
pts_time=0.666667
duration_time=0.033333
[/PACKET]
[PACKET]
stream_index=0
pts_time=0.733333
duration_time=0.033333
[/PACKET]
[PACKET]
stream_index=0
pts_time=0.900000
duration_time=0.033333
[/PACKET]
[PACKET]
stream_index=0
pts_time=0.833333
duration_time=0.033333
[/PACKET]
[PACKET]
stream_index=0
pts_time=0.800000
duration_time=0.033333
[/PACKET]
[PACKET]
stream_index=0
pts_time=0.866667
duration_time=0.033333
[/PACKET]
[PACKET]
stream_index=0
pts_time=1.033333
duration_time=0.033333
[/PACKET]
[PACKET]
stream_index=0
pts_time=0.966667
duration_time=0.033333
[/PACKET]
[PACKET]
stream_index=0
pts_time=0.933333
duration_time=0.033333
[/PACKET]
[PACKET]
stream_index=0
pts_time=1.000000
duration_time=0.033333
[/PACKET]
[PACKET]
stream_index=0
pts_time=1.166667
duration_time=0.033333
[/PACKET]
[PACKET]
stream_index=0
pts_time=1.100000
duration_time=0.033333
[/PACKET]
[PACKET]
stream_index=0
pts_time=1.066667
duration_time=0.033333
[/PACKET]
[PACKET]
stream_index=0
pts_time=1.133333
duration_time=0.033333
[/PACKET]
[PACKET]
stream_index=0
pts_time=1.300000
duration_time=0.033333
[/PACKET]
[PACKET]
stream_index=0
pts_time=1.233333
duration_time=0.033333
[/PACKET]
[PACKET]
stream_index=0
pts_time=1.200000
duration_time=0.033333
[/PACKET]
[PACKET]
stream_index=0
pts_time=1.266667
duration_time=0.033333
[/PACKET]
[PACKET]
stream_index=0
pts_time=1.433333
duration_time=0.033333
[/PACKET]
[PACKET]
stream_index=0
pts_time=1.366667
duration_time=0.033333
[/PACKET]
[PACKET]
stream_index=0
pts_time=1.333333
duration_time=0.033333
[/PACKET]
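The packets above are listed in decode order, so the pts_time values are not monotonic and have to be sorted before they can be differenced. A minimal sketch of that step, assuming the default ffprobe [PACKET] text format shown above; the names parse_pts_times and frame_intervals are made up for illustration, and the sample string is just the first three packets of the output:

```python
import re

# Matches lines such as "pts_time=-0.433333" in ffprobe's default output.
PTS_LINE = re.compile(r"^pts_time=(-?\d+(?:\.\d+)?)\s*$", re.MULTILINE)


def parse_pts_times(ffprobe_text):
    """Return every pts_time value (in seconds) found in the ffprobe text."""
    return [float(value) for value in PTS_LINE.findall(ffprobe_text)]


def frame_intervals(pts_times):
    """Sort timestamps into presentation order and return successive gaps."""
    ordered = sorted(pts_times)
    return [later - earlier for earlier, later in zip(ordered, ordered[1:])]


# First three packets copied from the ffprobe output above.
sample = """\
[PACKET]
stream_index=0
pts_time=-0.433333
duration_time=0.033333
[/PACKET]
[PACKET]
stream_index=0
pts_time=-0.300000
duration_time=0.033333
[/PACKET]
[PACKET]
stream_index=0
pts_time=-0.366667
duration_time=0.033333
[/PACKET]
"""

pts = parse_pts_times(sample)
gaps = frame_intervals(pts)
for gap in gaps:
    # Print each inter-frame gap in milliseconds.
    print(f"{gap * 1000:.3f} ms")
```

For these three packets the gaps come out at about 66.7 ms each, which is twice the 33.3 ms reported as duration_time.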
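Some questions:
Are frames being dropped somewhere in this process, or are other assumptions being made?
What precision can be expected for the time differences between frames? Accuracy to the millisecond would be ideal.
If this is not realistic, are there other libraries/modules that can do this? I am interested in any open-source software that can be run from the command line or from Python/C++.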
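One thing I have not tried yet is reading timestamps per decoded frame instead of per packet. A rough sketch of how that might look from Python, assuming ffprobe is on the PATH and that its frame section exposes pts_time and best_effort_timestamp_time (as recent builds do). The helper names are my own, and I do not know whether this would yield timings any different from the packet values above:

```python
import json
import subprocess


def ffprobe_frame_command(path):
    """Build an ffprobe call that lists decoded-frame timestamps as JSON."""
    return [
        "ffprobe", "-v", "error",
        "-select_streams", "v:0",
        "-show_entries", "frame=pts_time,best_effort_timestamp_time",
        "-of", "json",
        path,
    ]


def frame_times_from_json(text):
    """Return sorted frame timestamps in seconds from ffprobe's JSON output."""
    times = []
    for frame in json.loads(text).get("frames", []):
        # Fall back to best_effort_timestamp_time if pts_time is absent.
        stamp = frame.get("pts_time", frame.get("best_effort_timestamp_time"))
        if stamp is not None and stamp != "N/A":
            times.append(float(stamp))
    return sorted(times)


def probe_frame_times(path):
    """Run ffprobe on `path` (requires ffprobe on PATH) and parse the result."""
    result = subprocess.run(
        ffprobe_frame_command(path),
        capture_output=True,
        text=True,
        check=True,
    )
    return frame_times_from_json(result.stdout)
```

Calling probe_frame_times("input.mov") would then return the sorted timestamps, and differencing consecutive entries gives the inter-frame gaps.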