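I'm dealing with a lot of zip files (DCPs for a film festival), between 4 GB and 40 GB in size. They are sent to me by different people, who use different programs to zip the folders they send. Each folder usually contains 5-10 files, 1-5 of which are larger than 3.7 GB.
Some of the archives extract without any problem, but for others I get the following errors: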
7z e ORE_SHR-1-30_F_XX-XX_AT-XX_20_2K_XX_20190312_FF_SMPTE_OV.zip
7-Zip [64] 16.02 : Copyright (c) 1999-2016 Igor Pavlov : 2016-05-21
p7zip Version 16.02 (locale=en_US.UTF-8,Utf16=on,HugeFiles=on,64 bits,8 CPUs Intel(R) Core(TM) i7-6700 CPU @ 3.40GHz (506E3),ASM,AES-NI)
Scanning the drive for archives:
1 file, 8033862438 bytes (7662 MiB)
Extracting archive: ORE_SHR-1-30_F_XX-XX_AT-XX_20_2K_XX_20190312_FF_SMPTE_OV.zip
ERROR: ORE_SHR-1-30_F_XX-XX_AT-XX_20_2K_XX_20190312_FF_SMPTE_OV.zip
Can not open the file as archive
Can't open as archive: 1
Files: 0
Size: 0
Compressed: 0
unzip ORE_SHR-1-30_F_XX-XX_AT-XX_20_2K_XX_20190312_FF_SMPTE_OV.zip
Archive: ORE_SHR-1-30_F_XX-XX_AT-XX_20_2K_XX_20190312_FF_SMPTE_OV.zip
warning [ORE_SHR-1-30_F_XX-XX_AT-XX_20_2K_XX_20190312_FF_SMPTE_OV.zip]: 4294967296 extra bytes at beginning or within zipfile
(attempting to process anyway)
file #1: bad zipfile offset (local header sig): 4294967296
(attempting to re-compensate)
creating: ORE_SHR-1-30_F_XX-XX_AT-XX_20_2K_XX_20190312_FF_SMPTE_OV/
inflating: ORE_SHR-1-30_F_XX-XX_AT-XX_20_2K_XX_20190312_FF_SMPTE_OV/ASSETMAP.xml
inflating: ORE_SHR-1-30_F_XX-XX_AT-XX_20_2K_XX_20190312_FF_SMPTE_OV/ORE_SHR-1-30_F_XX-XX_AT-XX_20_2K_XX_20190312_FF_SMPTE_OV_cpl.xml
inflating: ORE_SHR-1-30_F_XX-XX_AT-XX_20_2K_XX_20190312_FF_SMPTE_OV/ORE_SHR-1-30_F_XX-XX_AT-XX_20_2K_XX_20190312_FF_SMPTE_OV_pkl.xml
inflating: ORE_SHR-1-30_F_XX-XX_AT-XX_20_2K_XX_20190312_FF_SMPTE_OV/ORE_SHR-1-30_F_XX-XX_AT-XX_20_2K_XX_20190312_FF_SMPTE_OV_Reel_1_j2c.mxf
error: invalid compressed data to inflate
file #6: bad zipfile offset (local header sig): 3671207290
(attempting to re-compensate)
inflating: ORE_SHR-1-30_F_XX-XX_AT-XX_20_2K_XX_20190312_FF_SMPTE_OV/ORE_SHR-1-30_F_XX-XX_AT-XX_20_2K_XX_20190312_FF_SMPTE_OV_Reel_1_pcm.mxf
inflating: ORE_SHR-1-30_F_XX-XX_AT-XX_20_2K_XX_20190312_FF_SMPTE_OV/VOLINDEX.xml
zip -FF ORE_SHR-1-30_F_XX-XX_AT-XX_20_2K_XX_20190312_FF_SMPTE_OV.zip --out ore.zip
Fix archive (-FF) - salvage what can
Found end record (EOCDR) - says expect single disk archive
Scanning for entries...
copying: ORE_SHR-1-30_F_XX-XX_AT-XX_20_2K_XX_20190312_FF_SMPTE_OV/ (0 bytes)
copying: ORE_SHR-1-30_F_XX-XX_AT-XX_20_2K_XX_20190312_FF_SMPTE_OV/ASSETMAP.xml (588 bytes)
copying: ORE_SHR-1-30_F_XX-XX_AT-XX_20_2K_XX_20190312_FF_SMPTE_OV/ORE_SHR-1-30_F_XX-XX_AT-XX_20_2K_XX_20190312_FF_SMPTE_OV_cpl.xml (631 bytes)
copying: ORE_SHR-1-30_F_XX-XX_AT-XX_20_2K_XX_20190312_FF_SMPTE_OV/ORE_SHR-1-30_F_XX-XX_AT-XX_20_2K_XX_20190312_FF_SMPTE_OV_pkl.xml (613 bytes)
copying: ORE_SHR-1-30_F_XX-XX_AT-XX_20_2K_XX_20190312_FF_SMPTE_OV/ORE_SHR-1-30_F_XX-XX_AT-XX_20_2K_XX_20190312_FF_SMPTE_OV_Reel_1_j2c.mxf
zip warning: no end of stream entry found: ORE_SHR-1-30_F_XX-XX_AT-XX_20_2K_XX_20190312_FF_SMPTE_OV/ORE_SHR-1-30_F_XX-XX_AT-XX_20_2K_XX_20190312_FF_SMPTE_OV_Reel_1_j2c.mxf
zip warning: rewinding and scanning for later entries
unzip ore.zip
Archive: ore.zip
warning [ore.zip]: 4294967296 extra bytes at beginning or within zipfile
(attempting to process anyway)
file #1: bad zipfile offset (local header sig): 4294967296
(attempting to re-compensate)
creating: ORE_SHR-1-30_F_XX-XX_AT-XX_20_2K_XX_20190312_FF_SMPTE_OV/
inflating: ORE_SHR-1-30_F_XX-XX_AT-XX_20_2K_XX_20190312_FF_SMPTE_OV/ASSETMAP.xml
inflating: ORE_SHR-1-30_F_XX-XX_AT-XX_20_2K_XX_20190312_FF_SMPTE_OV/ORE_SHR-1-30_F_XX-XX_AT-XX_20_2K_XX_20190312_FF_SMPTE_OV_cpl.xml
inflating: ORE_SHR-1-30_F_XX-XX_AT-XX_20_2K_XX_20190312_FF_SMPTE_OV/ORE_SHR-1-30_F_XX-XX_AT-XX_20_2K_XX_20190312_FF_SMPTE_OV_pkl.xml
file #5: bad zipfile offset (local header sig): 2432
(attempting to re-compensate)
file #5: bad zipfile offset (local header sig): 2432
inflating: ORE_SHR-1-30_F_XX-XX_AT-XX_20_2K_XX_20190312_FF_SMPTE_OV/ORE_SHR-1-30_F_XX-XX_AT-XX_20_2K_XX_20190312_FF_SMPTE_OV_Reel_1_pcm.mxf
inflating: ORE_SHR-1-30_F_XX-XX_AT-XX_20_2K_XX_20190312_FF_SMPTE_OV/VOLINDEX.xml
So far, I have only been able to extract the archive with 7z on Windows 10.
There I do get...
Headers-Error
Warning: 32-Bit overflow in headers
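...but 7z on Windows 10 is still the only way I can extract the archive, despite the Headers Error and warning.
I work on a Linux machine, and I have also tried `unzip` on macOS. I would like to be able to extract the archives there instead of walking over to the Windows machine every time this happens.
- Why does the same file work with 7z on Windows 10, but not with 7z on macOS or Linux?
- How can I extract the archive on a Linux machine?
Additional information: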
Answer 1
If the ZIP file is larger than 4 GB and was created by OneDrive or Windows, you can try a separate utility that was written to fix the headers:
https://github.com/pmqs/Fix-OneDrive-Zip
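Apparently there is some disagreement about how the "Total Number of Disks" field in the Zip64 headers should be set. Microsoft products write 0 there, while other tools expect 1. Newer Windows builds of 7-Zip (version 19 and later) cope with this, but Linux only has p7zip, an older POSIX port at version 16 from 2016, which has no way of handling it. The utility linked above is a Perl script that rewrites the header so that the value changes from 0 to 1, after which other tools can process the file.
If you're curious, someone dug into this issue in depth here: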
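To illustrate what such a header fix amounts to, here is a minimal Python sketch. It is not the linked Perl script and makes no claim to match its logic. It assumes the standard layout from the ZIP specification, where the 20-byte Zip64 End of Central Directory locator sits directly before the End of Central Directory record and its last 4 bytes hold the total number of disks. The function name `fix_total_disks` and its return strings are made up for this example.

```python
import struct

EOCD_SIG = b"PK\x05\x06"      # End of Central Directory record
LOCATOR_SIG = b"PK\x06\x07"   # Zip64 End of Central Directory locator
# signature, disk holding the Zip64 EOCD, offset of the Zip64 EOCD, total disks
LOCATOR_FMT = "<4sIQI"
LOCATOR_SIZE = struct.calcsize(LOCATOR_FMT)   # 20 bytes
MAX_TAIL = 22 + 65535 + LOCATOR_SIZE          # EOCD + longest comment + locator


def fix_total_disks(f, dry_run=True):
    """Set 'total number of disks' to 1 if the Zip64 locator stores 0.

    `f` is a seekable binary file object opened for update ("r+b").
    Returns "no-zip64-locator", "ok", "would-patch" or "patched".
    """
    f.seek(0, 2)                      # jump to the end to learn the size
    size = f.tell()
    tail_start = max(0, size - MAX_TAIL)
    f.seek(tail_start)
    tail = f.read()                   # only the last ~64 KiB are needed

    # Simplification: take the last EOCD signature in the tail. A signature
    # embedded in the archive comment would confuse this sketch.
    eocd = tail.rfind(EOCD_SIG)
    loc = eocd - LOCATOR_SIZE
    if eocd < 0 or loc < 0 or tail[loc:loc + 4] != LOCATOR_SIG:
        return "no-zip64-locator"

    _sig, _disk, _offset, total_disks = struct.unpack_from(LOCATOR_FMT, tail, loc)
    if total_disks != 0:
        return "ok"
    if dry_run:
        return "would-patch"

    f.seek(tail_start + loc + 16)     # the field is the locator's last 4 bytes
    f.write(struct.pack("<I", 1))
    return "patched"
```

Run it on a copy of the archive rather than the original, for example by opening the copy with `open(path, "r+b")` and calling `fix_total_disks(f, dry_run=False)`. With the default `dry_run=True`, it only reports whether a patch would be applied.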
https://www.bitsgalore.org/2020/03/11/does-microsoft-onedrive-export-large-ZIP-files-that-are-corrupt