I currently run an OwnCloud server with about 25 accounts and 2.6 TB of data, growing moderately. Since the data is meant to be kept for decades, the OwnCloud data lives on a mirrored ZFS filesystem to preserve data integrity. I use rsnapshot to keep nightly, weekly, and monthly snapshots on an 8 TB drive (ext filesystem), which is periodically swapped with a second 8 TB drive kept off-site.
The simplicity of attaching the 8 TB drive to any Linux machine is appealing for file or system recovery. This has been running for 15 months. I have not yet needed to restore from backup, but two failed drives have been swapped out on the ZFS side.
Is there a significant advantage to using ZFS snapshots and/or ZFS on the backup drive for better file integrity? What is "best practice", or is my current setup good enough for now and for the future?
Answer 1
ZFS send/recv is "change-aware": on each subsequent backup only the changed blocks are transferred. By contrast, rsnapshot has to walk all of the metadata to discover any file that may have changed, and then read every modified file in full to extract the changes, so send/recv is clearly much faster. Rather than reinventing the wheel, I suggest you look at syncoid to schedule periodic incremental backups.
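A minimal sketch of what the incremental replication suggested above could look like; the pool/dataset names (tank/owncloud, backuppool/owncloud) and the host backuphost are placeholders, not taken from the question:
# manual incremental replication: send everything between two snapshots of the
# source dataset and receive it on the backup host (unmounted, forcing rollback)
zfs send -R -i tank/owncloud@last-week tank/owncloud@today | \
    ssh backuphost zfs recv -Fu backuppool/owncloud
# syncoid (from the sanoid package) does the same bookkeeping automatically,
# finding the newest common snapshot and sending only the delta
syncoid tank/owncloud root@backuphost:backuppool/owncloud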
That said, rsnapshot is great software, and I use it extensively when send/recv is not applicable (i.e. when the target runs on something other than ZFS and/or I need to attach it to a system without ZFS support).
Answer 2
I use rsnapshot to keep the hourly/daily data, and hand the monthly data off to zfs snapshots. This is done with a cron job that has zfs snapshot the dataset weekly.
I find it a significant advantage that the weekly data is handled by zfs: the data is easy to access, it sits under ZFS's integrity guarantees, and I can cut down the intervals kept by rsnapshot.
I can show you what I have set up.
# weekly; run before monthly, runs 2:35 friday. Make sure this
# runs after rsnapshot (particularly the hourly, which is
# frequently running). The benefit of using zfs snapshots here is you can
# list the snapshots to see the filespace difference added by each weekly,
# and additionally, you can now easily remove the 0B weeklies as
# clutter.
35 02 * * 5 /bin/nice -17 /usr/local/sbin/zfs-rsnapshot
# WEEKLY; runs 4:03 mondays
03 04 * * 1 /bin/nice -17 /bin/rsnapshot weekly
# DAILY; runs 5:03 daily
03 05 * * * /bin/nice -17 /bin/rsnapshot daily
# HOURLY; run sync first, runs 03 mins after each listed hour (6am, 12pm, 6pm, and midnight). The sync has taken up to 45 mins to complete so far (there may
# have been other processes running at the time), and the hourly copy took just under 10 minutes, for a total of about 52m runtime.
03 06,12,18,00 * * * /bin/nice -17 /bin/rsnapshot sync && /bin/nice -17 /bin/rsnapshot hourly
The problem I am running into is that, for some reason, the dataset seems to grow with every snapshot (3, 4, 5 GB each time), and I am having a hard time figuring out why. Assuming I am reading this right:
zfs list
NAME USED AVAIL REFER MOUNTPOINT
...
nas/live/rsnapshot 279G 1.16T 84.2G /nas/live/rsnapshot
~$ zfs list -o space
NAME AVAIL USED USEDSNAP USEDDS USEDREFRESERV USEDCHILD
nas 1.16T 12.0T 0B 59.6K 0B 12.0T
nas/live 1.16T 775G 45.9G 72.0K 0B 729G
nas/live/rsnapshot 1.16T 279G 195G 84.2G 0B 0B
zfs list -t snap | grep rsnap
NAME USED AVAIL REFER MOUNTPOINT
nas/live/rsnapshot@rsnap-weekly-2021-1001 449M - 96.6G -
nas/live/rsnapshot@rsnap-weekly-2021-1008 171K - 96.6G -
nas/live/rsnapshot@rsnap-weekly-2021-1009 171K - 96.6G -
nas/live/rsnapshot@rsnap-weekly-2021-1015 3.57G - 93.0G -
nas/live/rsnapshot@rsnap-weekly-2021-1022 5.01G - 96.3G -
nas/live/rsnapshot@rsnap-weekly-2021-1029 4.27G - 96.0G -
nas/live/rsnapshot@rsnap-weekly-2021-1105 4.55G - 96.8G -
nas/live/rsnapshot@rsnap-weekly-2021-1111 590M - 97.5G -
nas/live/rsnapshot@rsnap-weekly-2021-1112 712M - 97.6G -
nas/live/rsnapshot@rsnap-weekly-2022-0401 3.95G - 95.6G -
nas/live/rsnapshot@rsnap-weekly-2022-0408 2.92G - 95.6G -
nas/live/rsnapshot@rsnap-weekly-2022-0415 5.02G - 95.8G -
nas/live/rsnapshot@rsnap-weekly-2022-0422 4.26G - 95.9G -
nas/live/rsnapshot@rsnap-weekly-2022-0429 2.29G - 96.1G -
nas/live/rsnapshot@rsnap-weekly-2022-0506 2.26G - 96.5G -
nas/live/rsnapshot@rsnap-weekly-2022-0513 2.23G - 96.3G -
nas/live/rsnapshot@rsnap-weekly-2022-0520 3.09G - 96.1G -
nas/live/rsnapshot@rsnap-weekly-2022-0527 4.67G - 103G -
nas/live/rsnapshot@rsnap-weekly-2022-0603 4.45G - 102G -
nas/live/rsnapshot@rsnap-weekly-2022-0610 4.26G - 116G -
nas/live/rsnapshot@rsnap-weekly-2022-0617 3.94G - 118G -
nas/live/rsnapshot@rsnap-weekly-2022-0624 4.40G - 84.4G -
nas/live/rsnapshot@rsnap-weekly-2022-0701 3.08G - 84.4G -
nas/live/rsnapshot@rsnap-weekly-2022-0722 2.16G - 84.2G -
nas/live/rsnapshot@rsnap-weekly-2022-0729 2.97G - 85.0G -
nas/live/rsnapshot@rsnap-weekly-2022-0805 2.71G - 85.3G -
nas/live/rsnapshot@rsnap-weekly-2022-0812 2.13G - 84.4G -
nas/live/rsnapshot@rsnap-weekly-2022-0819 2.76G - 84.4G -
nas/live/rsnapshot@rsnap-weekly-2022-0826 2.16G - 83.9G -
nas/live/rsnapshot@rsnap-weekly-2022-0902 790M - 84.6G -
nas/live/rsnapshot@2022-1105_before_move_live-to-condor 798M - 83.8G -
nas/live/rsnapshot@rsnap-weekly-2022-1111 3.71G - 86.1G -
nas/live/rsnapshot@rsnap-weekly-2022-1118 3.72G - 84.6G -
zfs-rsnapshot
#!/bin/bash
###
## Run from cron to snapshot the last week of rsnapshot files.
# Doing this weekly to limit the number of snapshots, so rsnapshot
# has to keep at least 7 dailies plus the hourlies. If the cron job
# doesn't run one week, you miss 1 day of data for every day it is
# late: e.g. if cron stalls for 1 week, that's 7 dailies that will
# have been rotated away on the rsnapshot server in the meantime.
# To protect that data, keep extra days in rsnapshot to cover
# anticipated cron lag.
##
DATE=$(date +%Y-%m%d)
# going weekly, so add that to the name
SNAPNAME="rsnap-weekly-${DATE}"
# dataset to snapshot
DATASET="nas/live/rsnapshot"
# take the recursive snapshot
/sbin/zfs snapshot -r "${DATASET}@${SNAPNAME}"
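The script can be tried by hand once before the cron entry is relied on, for example:
chmod +x /usr/local/sbin/zfs-rsnapshot
/usr/local/sbin/zfs-rsnapshot
# confirm the new rsnap-weekly-* snapshot was created
zfs list -t snapshot -r nas/live/rsnapshot | tail -n 3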
rsnapshot.conf
# SNAPSHOT ROOT DIRECTORY #
snapshot_root /nas/live/rsnapshot/
no_create_root 1
cmd_cp /bin/cp
cmd_rm /bin/rm
cmd_rsync /bin/rsync
#cmd_ssh /bin/ssh
cmd_logger /bin/logger
cmd_du /bin/du
cmd_rsnapshot_diff /bin/rsnapshot-diff
#cmd_preexec /path/to/preexec/script
#cmd_postexec /path/to/postexec/script
# BACKUP LEVELS / INTERVALS
#NOTE, this one isn't USED automatically, only for manual runs
#retain hourly 6
#retain daily 7
#retain weekly 7
#retain monthly 4
# Incrementing only 4 times daily, be sure to sync first.
interval hourly 4
# don't need a 7th daily, since it's in the weekly.
interval daily 6
########
# NEW #
#######
# MAKING CHANGES HERE FOR THE ZFS RSNAPSHOT script that moves the monthly, quarterly, and annual
# copies over to ZFS to omit all the extra redundant rsnapshot copies (oh, they were hardlinked
# anyway, weren't they; oh well, maybe this will end up being cleaner).
# Only two weeks are needed in total, so that's 1 weekly which, combined with the dailies, covers
# the two weeks of cron-job redundancy, giving cron a week of error stalling before data loss
# of the first rsnapshot daily entries (one per day gets eaten by rsnapshot if the cron job
# doesn't get it to ZFS). So only 1 is needed, which is the redundant week of extra data.
interval weekly 1
# NO MORE ENTRIES ARE NEEDED FOR ZFS RSNAPSHOT
#
#######
# OLD #
#######
## don't need a 4th week, since it's in the monthly.
#interval weekly 3
## only 5 (to cover 6 months; the 1st quarterly contains the 6th month), then move to quarterly
#interval monthly 5
## only 3, since the yearly contains the 4th
#interval quarterly 3
## monthly and quarterly cover the first year; add 1 additional year.
#interval yearly 1
verbose 2
# Log level, same as verbose, but these get written to the logs
loglevel 3
logfile /var/log/rsnapshot
lockfile /var/run/rsnapshot.pid
#stop_on_stale_lockfile 0
#rsync_short_args -a
#rsync_long_args --delete --numeric-ids --relative --delete-excluded
#ssh_args -p 22
#du_args -csh
#one_fs 0
#include ???
#exclude ???
#include_file /path/to/include/file
#exclude_file /path/to/exclude/file
#link_dest 0
# using sync_first (1), so run the sync exactly before you run your hourly, as a general rule.
sync_first 1
#use_lazy_deletes 0
#rsync_numtries 0
backup_exec /bin/date "+ backup ended at %c"
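After editing the config it can be checked before the next cron run (keeping in mind that rsnapshot.conf fields must be separated by tabs, not spaces); these are stock rsnapshot invocations:
rsnapshot configtest     # syntax check, catches tab/space mistakes
rsnapshot -t sync        # dry run: print the commands a sync would execute
rsnapshot -t hourly      # dry run for the hourly rotation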
I also use zfs send to move these off-site.
( set -o pipefail && zfs send -Ri @rsnap-weekly-2022-0902 nas/live/rsnapshot@rsnap-weekly-2022-1118 | pv | ssh <ip-of-upstream> zfs recv -Fvs int/live/rsnapshot )
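Because the receive side uses -s, an interrupted transfer can be resumed from the saved token instead of restarting; a sketch of how that would look here (<token> is a placeholder for the value printed by the first command):
# on the receiving host: read the token left behind by the aborted receive
ssh <ip-of-upstream> zfs get -H -o value receive_resume_token int/live/rsnapshot
# on the sending host: resume the stream from that token
zfs send -t <token> | ssh <ip-of-upstream> zfs recv -vs int/live/rsnapshot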
I'm open to improvements, and I'd also like to understand why each backup takes up so much space. It could be logs and such changing on the Linux systems I'm snapshotting, or maybe I'm missing something. Not sure. Comments welcome.
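One way to check where each weekly's few GB of growth comes from would be zfs diff between two consecutive snapshots from the listing above, plus rsnapshot's own disk-usage report; a sketch:
# list files added (+), removed (-), modified (M), or renamed (R) between two weeklies
zfs diff nas/live/rsnapshot@rsnap-weekly-2022-1111 nas/live/rsnapshot@rsnap-weekly-2022-1118
# rsnapshot's view of how much unique space each backup tree consumes
rsnapshot du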