Backup solution for 10TB using ubuntu

Question 1

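I'd set this box up as a NAS, using NexentaStor Community Edition (ZFS!) or Openfiler.

Unless you plan to use it for something else as well, why use a full distribution? These are purpose-built with a small footprint, so there is less that can go wrong. Openfiler and NexentaStor each have their pros and cons, but for a pure storage appliance both are better choices than plain Ubuntu.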

Question 2

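Using git for this doesn't seem like a good fit. If you really like the approach, have a look at git bup, a git extension for smartly storing large binary files in a git repo.

That said, I'd recommend rsnapshot or rdiff-backup.

I would definitely not recommend using LVM snapshots for this (see note 1).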

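Write performance degrades severely
On volumes like these, snapshots lead to boot times of many minutes, if not hours (here)
Running out of space is fatal
Last I checked, things like rollback were still a distant promise
Note that even mounting a snapshot alongside the live filesystem can be quite tricky, because filesystems rely on the GUID in the fs header being unique
Also, unless you use iSCSI or DRBD (etc.), you are stuck on the same host as the live fs, which makes the backup a lot less useful (and degrades performance even further)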

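For this kind of thing I'd much prefer ZFS (send, receive). To be honest, I think zfs-fuse may be too slow at the moment (but do test it!), but the native Linux port seems to be progressing well and might get you a lot of mileage.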
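To make the send/receive idea a bit more concrete, here is a minimal Python sketch that only builds the commands for one incremental replication step; it does not run anything. The helper name and every value passed to it (dataset, snapshot names, remote host, target dataset) are made-up placeholders, not something from the setup discussed here.

```python
import shlex


def zfs_incremental_commands(dataset, prev_snap, new_snap, remote_host, remote_dataset):
    """Return (snapshot_command, send_receive_pipeline) as shell strings.

    All arguments are placeholders supplied by the caller; nothing is executed.
    """
    prev_full = f"{dataset}@{prev_snap}"
    new_full = f"{dataset}@{new_snap}"
    # Step 1: take a new snapshot of the live dataset.
    take_snapshot = ["zfs", "snapshot", new_full]
    # Step 2: send only the changes between the two snapshots ...
    send = ["zfs", "send", "-i", prev_full, new_full]
    # ... and apply them on the backup host (assumes ssh access to it).
    receive = ["ssh", remote_host, "zfs", "receive", remote_dataset]
    pipeline = " | ".join(shlex.join(cmd) for cmd in (send, receive))
    return shlex.join(take_snapshot), pipeline
```

Calling it with, say, `("tank/data", "mon", "tue", "backuphost", "backup/data")` gives the two command lines you would run by hand. Unlike an LVM snapshot, the backup then lives on a different host.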

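Note 1: I just dug up something I wrote about this topic earlier:

However, I have lost count of the different failure modes I have run into with snapshots. I have stopped using them altogether; it is just too dangerous.

The only exception I make now is the backup of my own personal mail server/web server, where I do the overnight backup from a temporary snapshot that is always sized equal to the source filesystem and is deleted right afterwards.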

The most important aspects to keep in mind:

if you have a big(ish) fs that has a snapshot, write performance is severely degraded
if you have a big(ish) fs that has a snapshot, boot time will be delayed by literally tens of minutes while the disk keeps churning during import of the volume group. No messages will be displayed. This effect is especially horrid if root is on lvm2.
if you have a snapshot it is very easy to run out of space. Once you run out of space, the snapshot is corrupt and cannot be repaired.
snapshots cannot be rolled back/merged at the moment (see http://kerneltrap.org/Linux/LVM_Snapshot_Merging). This means the only way to restore data from a snapshot is to actually copy (rsync?) it over. DANGER DANGER: you do not want to do this if the snapshot capacity is not at least the size of the source fs; if it isn't, you'll soon hit a brick wall and end up with both the source fs and the snapshot corrupted. (I've been there!)
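For the one exception described above (a temporary snapshot as large as the source, removed right after the backup), the sequence could look roughly like the following Python sketch. It only returns the commands in order and runs nothing. The function name, the snapshot naming scheme, and all paths are assumptions for illustration; the volume group also needs enough free extents to hold a snapshot of that size.

```python
def temporary_snapshot_plan(vg, lv, mountpoint, backup_dest):
    """Return the argv lists for a snapshot-backed backup, in execution order.

    The snapshot is given as many extents as the origin (100%ORIGIN), so it
    should not run out of space during the backup, and it is removed at the end.
    """
    snap = f"{lv}-backup-snap"  # hypothetical naming scheme
    origin_dev = f"/dev/{vg}/{lv}"
    snap_dev = f"/dev/{vg}/{snap}"
    return [
        # Create a snapshot sized like the source volume.
        ["lvcreate", "--snapshot", "--extents", "100%ORIGIN", "--name", snap, origin_dev],
        # Mount it read-only; some filesystems need extra options here
        # because of the identical identifiers in the fs header.
        ["mount", "-o", "ro", snap_dev, mountpoint],
        # Copy the frozen state away (ideally to another host).
        ["rsync", "-a", "--delete", f"{mountpoint}/", backup_dest],
        ["umount", mountpoint],
        # Drop the snapshot so it does not linger and slow down writes.
        ["lvremove", "--force", snap_dev],
    ]
```

Each entry can be handed to `subprocess.run(cmd, check=True)`; if any step fails you would still want to make sure the last two steps run, so the snapshot never stays around.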

Question 3

I'd highly suggest not using git for this. Sure it may work, but it's pretty sub-optimal.

You could use rsync, LVM, and snapshots if you'd like. My preferred backup method for instances like this is to use rsnapshot or rdiff-backup. They can leverage the optimizations that rsync gives you, while providing an incremental set of backups at the same time.
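To show why incremental sets like this stay cheap, here is a small Python sketch of the hard-link idea that rsnapshot builds on: unchanged files are hard-linked to the previous snapshot, and only new or changed files are copied. This is a simplified illustration, not rsnapshot's or rdiff-backup's actual code; the function names are made up, and the "unchanged" check (same size and same whole-second mtime) is an assumption that is cruder than what rsync does.

```python
import os
import shutil


def hardlink_snapshot(source, snapshot_dir, previous=None):
    """Copy `source` into `snapshot_dir`, hard-linking files unchanged since `previous`."""
    for root, _dirs, files in os.walk(source):
        rel = os.path.relpath(root, source)
        target_dir = os.path.normpath(os.path.join(snapshot_dir, rel))
        os.makedirs(target_dir, exist_ok=True)
        for name in files:
            src = os.path.join(root, name)
            dst = os.path.join(target_dir, name)
            old = os.path.normpath(os.path.join(previous, rel, name)) if previous else None
            if old and os.path.isfile(old) and _unchanged(src, old):
                os.link(old, dst)       # unchanged: share the data already stored
            else:
                shutil.copy2(src, dst)  # new or changed: store a fresh copy


def _unchanged(a, b):
    """Treat two files as identical when size and whole-second mtime match."""
    sa, sb = os.stat(a), os.stat(b)
    return sa.st_size == sb.st_size and int(sa.st_mtime) == int(sb.st_mtime)
```

Every snapshot directory looks like a full copy, yet an unchanged file takes up disk space only once. Both snapshots have to live on the same filesystem for the hard links to work.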

Question 4

I've set up a smaller backup server using BackupPC. It's in the Ubuntu repositories, and setting it up is a snap. It uses rsync for transfer and does file-level deduplication.

It will keep version history, and you can specify how many versions to keep going back in time. As they get older, it automatically removes some of them. The assumption is that the further into the past you go, the less granularity you will need. It can be adjusted to whatever you want, though.
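The thinning idea can be sketched in a few lines of Python. This is not BackupPC's actual algorithm or configuration, just an illustration of "less granularity further back": keep everything from the last few days, then one backup per week for a while, then nothing. The function name and the default numbers are assumptions.

```python
from datetime import date


def backups_to_keep(backup_dates, today, daily_days=7, weekly_weeks=4):
    """Return the sorted subset of backup_dates kept under a simple thinning policy.

    Backups younger than `daily_days` days are all kept. Older ones are grouped
    into 7-day buckets and only the newest backup per bucket is kept, for
    `weekly_weeks` buckets. Anything older than that is dropped.
    """
    keep = set()
    buckets_seen = set()
    for d in sorted(backup_dates, reverse=True):  # newest first
        age = (today - d).days
        if age < daily_days:
            keep.add(d)
            continue
        bucket = (age - daily_days) // 7
        if bucket < weekly_weeks and bucket not in buckets_seen:
            buckets_seen.add(bucket)
            keep.add(d)
    return sorted(keep)
```

With one backup per day over 40 days, this keeps the 7 most recent backups plus 4 weekly ones, so 11 instead of 40.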

Check it out, it's really good.

https://help.ubuntu.com/community/BackupPC

http://backuppc.sourceforge.net/faq/BackupPC.html

Answer