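I am running Linux 3.4.8 on an at91sam9g20.
I want to take a large record and split it into multiple files. I have tried several approaches, but none of them seem to work, for example: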
tar -c -M --tape-length=102400 --file=disk1.tar mytest.tar.gz
tar: invalid option -- M
BusyBox v1.20.2 (2012-09-24 16:21:25 CEST) multi-call binary.
Usage: tar -[cxthvO] [-X FILE] [-T FILE] [-f TARFILE] [-C DIR] [FILE]...
Create, extract, or list files from a tar file
Operation:
c Create
x Extract
t List
f Name of TARFILE ('-' for stdin/out)
C Change to DIR before operation
v Verbose
O Extract to stdout
h Follow symlinks
exclude File to exclude
X File with names to exclude
T File with names to include
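busybox appears to ship a stripped-down version of tar that does not accept some options.
When I try split, I get the following:
/:# split
-sh: split: not found
Is there a way to split a large file into multiple files using only the busybox command set?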
Currently defined functions:
[, [[, addgroup, adduser, ar, arping, ash, awk, basename, blkid,
bunzip2, bzcat, cat, catv, chattr, chgrp, chmod, chown, chroot, chrt,
chvt, cksum, clear, cmp, cp, cpio, crond, crontab, cut, date, dc, dd,
deallocvt, delgroup, deluser, devmem, df, diff, dirname, dmesg, dnsd,
dnsdomainname, dos2unix, du, dumpkmap, echo, egrep, eject, env,
ether-wake, expr, false, fdflush, fdformat, fgrep, find, fold, free,
freeramdisk, fsck, fuser, getopt, getty, grep, gunzip, gzip, halt,
hdparm, head, hexdump, hostid, hostname, hwclock, id, ifconfig, ifdown,
ifup, inetd, init, insmod, install, ip, ipaddr, ipcrm, ipcs, iplink,
iproute, iprule, iptunnel, kill, killall, killall5, klogd, last, less,
linux32, linux64, linuxrc, ln, loadfont, loadkmap, logger, login,
logname, losetup, ls, lsattr, lsmod, lsof, lspci, lsusb, lzcat, lzma,
makedevs, md5sum, mdev, mesg, microcom, mkdir, mkfifo, mknod, mkswap,
mktemp, modprobe, more, mount, mountpoint, mt, mv, nameif, netstat,
nice, nohup, nslookup, od, openvt, passwd, patch, pidof, ping,
pipe_progress, pivot_root, poweroff, printenv, printf, ps, pwd, rdate,
readlink, readprofile, realpath, reboot, renice, reset, resize, rm,
rmdir, rmmod, route, run-parts, runlevel, sed, seq, setarch,
setconsole, setkeycodes, setlogcons, setserial, setsid, sh, sha1sum,
sha256sum, sha512sum, sleep, sort, start-stop-daemon, strings, stty,
su, sulogin, swapoff, swapon, switch_root, sync, sysctl, syslogd, tail,
tar, tee, telnet, test, tftp, time, top, touch, tr, traceroute, true,
tty, udhcpc, umount, uname, uniq, unix2dos, unlzma, unxz, unzip,
uptime, usleep, uudecode, uuencode, vconfig, vi, vlock, watch,
watchdog, wc, wget, which, who, whoami, xargs, xz, xzcat, yes, zcat
Answer 1
You can use busybox's dd applet with its bs, count, and skip parameters to split a large file into multiple chunks.
The relevant part of the busybox dd help text:
dd [if=FILE] [of=FILE] [ibs=N] [obs=N] [bs=N] [count=N] [skip=N]
        [seek=N] [conv=notrunc|noerror|sync|fsync]

Copy a file with converting and formatting

        if=FILE         Read from FILE instead of stdin
        of=FILE         Write to FILE instead of stdout
        bs=N            Read and write N bytes at a time
        ibs=N           Read N bytes at a time
        obs=N           Write N bytes at a time
        count=N         Copy only N input blocks
        skip=N          Skip N input blocks
        seek=N          Skip N output blocks
        conv=notrunc    Don't truncate output file
        conv=noerror    Continue after read errors
        conv=sync       Pad blocks with zeros
        conv=fsync      Physically write data out before finishing
So basically you would do something like this:
$ dd if=bigfile of=part.0 bs=1024 count=1024 skip=0
$ dd if=bigfile of=part.1 bs=1024 count=1024 skip=1024
$ dd if=bigfile of=part.2 bs=1024 count=1024 skip=2048
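For each part.X file, dd writes count * bs bytes after skipping the first skip blocks (that is, skip * bs bytes) of the input file.
A very basic one-liner (combining the seq, xargs, and dd applets in busybox) could look like this: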
seq 0 19 | xargs -n1 sh -c 'dd if=bigfile of=part.$0 bs=1024 count=1024 skip=$(expr $0 \* 1024)'
This produces up to 20 part.X files of 1048576 bytes each.
Example of splitting bigfile:
$ ls -l
total 2940
-rw-rw-r-- 1 user user 3000000 Apr 27 13:21 bigfile
$ seq 0 19 | xargs -n1 sh -c 'dd if=bigfile of=part.$0 bs=1024 count=1024 skip=$(expr $0 \* 1024)'
1024+0 records in
1024+0 records out
1024+0 records in
1024+0 records out
881+1 records in
881+1 records out
0+0 records in
0+0 records out
[...]
$ ls -l
total 5968
-rw-rw-r-- 1 user user 3000000 Apr 27 13:21 bigfile
-rw-rw-r-- 1 user user 1048576 Apr 27 13:43 part.0
-rw-rw-r-- 1 user user 1048576 Apr 27 13:43 part.1
-rw-rw-r-- 1 user user 0 Apr 27 13:43 part.10
-rw-rw-r-- 1 user user 0 Apr 27 13:43 part.11
-rw-rw-r-- 1 user user 0 Apr 27 13:43 part.12
-rw-rw-r-- 1 user user 0 Apr 27 13:43 part.13
-rw-rw-r-- 1 user user 0 Apr 27 13:43 part.14
-rw-rw-r-- 1 user user 0 Apr 27 13:43 part.15
-rw-rw-r-- 1 user user 0 Apr 27 13:43 part.16
-rw-rw-r-- 1 user user 0 Apr 27 13:43 part.17
-rw-rw-r-- 1 user user 0 Apr 27 13:43 part.18
-rw-rw-r-- 1 user user 0 Apr 27 13:43 part.19
-rw-rw-r-- 1 user user 902848 Apr 27 13:43 part.2
-rw-rw-r-- 1 user user 0 Apr 27 13:43 part.3
-rw-rw-r-- 1 user user 0 Apr 27 13:43 part.4
-rw-rw-r-- 1 user user 0 Apr 27 13:43 part.5
-rw-rw-r-- 1 user user 0 Apr 27 13:43 part.6
-rw-rw-r-- 1 user user 0 Apr 27 13:43 part.7
-rw-rw-r-- 1 user user 0 Apr 27 13:43 part.8
-rw-rw-r-- 1 user user 0 Apr 27 13:43 part.9
Restoring the file is easily done with cat (or with dd again, using the seek parameter). The 0-byte files can be skipped:
$ cat part.0 part.1 part.2 > bigfile.res
$ diff bigfile bigfile.res
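The dd variant of the restore step could look roughly like the sketch below. The function name join_parts is made up for illustration, and the sketch assumes every part except the last is exactly 1024 blocks of 1024 bytes, as in the split commands earlier. The seek and conv=notrunc parameters are the ones listed in the busybox help text quoted above.

```shell
# join_parts OUT PART...  (hypothetical helper, not a busybox applet)
# Assumption: every part except the last holds exactly 1024 blocks of 1024 bytes.
join_parts() {
    out=$1
    shift
    : > "$out"                      # start from an empty output file
    i=0
    for part in "$@"; do
        # seek= skips i*1024 output blocks, so part i lands at offset i MiB;
        # conv=notrunc keeps the data written by earlier iterations.
        dd if="$part" of="$out" bs=1024 seek=$((i * 1024)) conv=notrunc 2>/dev/null
        i=$((i + 1))
    done
}
```

It would be called as join_parts bigfile.res part.0 part.1 part.2. The parts have to be listed in numeric order; a plain part.* glob sorts part.10 before part.2, as the ls output above shows.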
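Depending on your needs, you may prefer not to use seq with a fixed count, and instead compute the actual size of the big file and do everything in a shell script.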
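One way such a script might look is sketched below. The function name split_file and the FILE.part.N naming are assumptions made for this example; it relies only on wc, dd, and shell arithmetic, all of which appear in the applet list from the question. Because the number of chunks is derived from the file size, no empty part files are created.

```shell
# split_file FILE [CHUNK_KIB]  (hypothetical helper, not a busybox applet)
# Writes FILE.part.0, FILE.part.1, ... of CHUNK_KIB KiB each; the last may be shorter.
split_file() {
    file=$1
    chunk_kib=${2:-1024}                    # chunk size in KiB, default 1 MiB
    size=$(wc -c < "$file")                 # total size in bytes
    chunk_bytes=$((chunk_kib * 1024))
    # number of chunks needed, rounded up
    parts=$(( (size + chunk_bytes - 1) / chunk_bytes ))
    i=0
    while [ "$i" -lt "$parts" ]; do
        dd if="$file" of="$file.part.$i" bs=1024 \
            count="$chunk_kib" skip=$((i * chunk_kib)) 2>/dev/null
        i=$((i + 1))
    done
}
```

For the 3000000-byte bigfile from the example, split_file bigfile would write bigfile.part.0 through bigfile.part.2 and stop there.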
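Answer 2
I believe you can download a statically linked busybox binary for ARM-based microcontrollers: http://www.busybox.net/downloads/binaries/latest/
As far as I know, all of the binaries provided there include every utility described here: http://www.busybox.net/downloads/BusyBox.html
So you would be able to use the split utility in a very straightforward way.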
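Once a split applet is available on the PATH, usage might look like the sketch below. This assumes the downloaded binary really does include split and that it accepts -b with a size in bytes; the wrapper name split_and_join and the .split. and .res suffixes are made up for this example.

```shell
# split_and_join FILE  (hypothetical wrapper around split and cat)
# Assumption: a split applet supporting -b BYTES is installed.
split_and_join() {
    # Cut FILE into 1 MiB pieces named FILE.split.aa, FILE.split.ab, ...
    split -b 1048576 "$1" "$1.split."
    # Reassemble; the glob expands the alphabetic suffixes in the right order
    cat "$1".split.* > "$1.res"
}
```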