I have about 500GB of data (photos, documents, videos etc.) that I have accumulated over the years. Currently, I keep them on my computer and rsync all additions / changes once a month or so to an external hard drive. Do I need to be worried about data loss (sectors going bad, bit rot, bit flip, whatever it is called)?
To clarify,
- None of this is commercially important; I just don’t want to get into a situation where I look up an old family photo or video twenty years down the line and it has got corrupted.
- Both my computer and the external HD are HDDs. They are fairly cheap here (and very cheap if second hand). Buying SSDs or dedicated hardware would be expensive.
The 3-2-1 rule is always the gold standard.
I’d recommend at least adding an offsite backup. Set up rclone with a mounted folder (client side encryption is recommended) and sync the files to that as well.
I use Backblaze for about $6/TB/mo, pro-rated for whatever amount is actually used.
Seconded. For such a small amount of data, a Backblaze account would be cheap and more than enough. If OP is worried about security, then enabling a crypt endpoint in rclone is fairly simple.
3-2-1 OP. 3 copies of your data, across 2 different storage mediums, with at least 1 offsite.
$6 is about 500 rupees. I can get another HDD for double that price.
I do copy some important files to Google Drive, but I don’t pay for it, and I don’t rely on it.
If you don’t pay for it, you can’t rely on it.
Right, which is why I prefer to rely on local backups. Much cheaper in the long run.
I used to work with a guy who was religious about backing up his files to external drives. Until someone broke into his house and stole his computer AND his external drives. He lost everything.
It’s always a good idea to have an off-site backup (e.g. in case of fires, robbery, natural disasters, etc). If you prefer to manage them yourself, an option is to find someone else who also needs an off-site backup and exchange disk space. You do your off-site on their machine, and they do theirs on yours. With external HDDs, you can just have someone else hold on to it for you at a different location. You can come up with fancier schemes to reduce the chances of data loss or to make the process simpler if you care to do so.
I recommend kopia. It lets you backup automatically to a primary location, copy that data periodically to a secondary location, and it has a command that you can use to verify all the data is actually what it was when the backup was created.
Thank you. On that note, when backing up, is there a way to compare the two versions, see if one has become corrupted, and copy the good version to both? It would be sad if your primary copy got corrupted, and you overwrote all other copies with it.
Kopia uses content addressable storage. So basically when it copies things, it only copies what data is new. Files that haven’t changed will not be overwritten.
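To make "content addressable" a bit more concrete, here is a minimal Python sketch of the general idea. This is not Kopia's actual implementation or on-disk format; `store_blob` and `blob_is_intact` are hypothetical names made up for illustration. Each piece of data is saved under the hash of its contents, so identical data is stored only once and a stored blob can later be re-hashed to check that it still matches its name.

```python
import hashlib
from pathlib import Path


def store_blob(store: Path, data: bytes) -> tuple[str, bool]:
    """Save data under its SHA-256 digest; return (digest, was_written)."""
    digest = hashlib.sha256(data).hexdigest()
    blob_path = store / digest
    if blob_path.exists():
        # Same content is already stored, so nothing is copied or overwritten.
        return digest, False
    store.mkdir(parents=True, exist_ok=True)
    blob_path.write_bytes(data)
    return digest, True


def blob_is_intact(store: Path, digest: str) -> bool:
    """Re-hash a stored blob and compare it with the name it was stored under."""
    stored = (store / digest).read_bytes()
    return hashlib.sha256(stored).hexdigest() == digest
```

Storing the same bytes twice writes them only the first time, and a blob whose contents no longer hash to its own name has been damaged. Real tools add chunking, compression and encryption on top of this basic idea.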
You kind of need to run the verification command on both the source and the “backup copy” for maximum paranoia. If you’re running it on a local copy, that should be a relatively fast process as you don’t need to download stuff.
You’d basically connect on the command line to the copy you just updated via sync-to and then ask kopia to verify 100% of the file integrity … it should then run through everything and make sure it matches what’s supposed to be there. I’m not sure how you fix it if it detects something wrong, I’ve yet to run into that … I’m sure there’s a way 🙂
You could also use two backup drives and sync to both; then if you get an error restoring a particular file from one, you could in theory restore it from the other. A ZFS cluster with redundant copies and/or a RAID-1, RAID-5 or RAID-6 style setup could also help … but most people aren’t going to run an entire NAS just to turn it on periodically and back up their data “offline”. Most people are going to be better served (IMO) by cloud storage like B2 (where bitflips aren’t really a concern) or a NAS (where bitflips are similarly a minimal concern, ideally in another location), combined with a periodically updated offline copy (on, say, an external hard drive). That should be enough to protect most people’s data well.
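For the two-drive idea, and the earlier question about spotting which copy went bad, a tool-independent approach is to record a checksum for every file while the data is known to be good, then later compare both copies against that record. Below is a rough Python sketch under that assumption. The function names (`sha256_of`, `build_manifest`, `repair`) are made up for illustration and are not part of rsync, rclone or Kopia.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in 1 MiB chunks so large videos do not fill memory."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1024 * 1024), b""):
            h.update(chunk)
    return h.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 digest."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def repair(manifest: dict[str, str], copy_a: Path, copy_b: Path) -> dict[str, str]:
    """Check both copies against the manifest; heal a bad file from the good one.

    Returns {relative_path: status}, where status is one of
    "ok", "repaired_a", "repaired_b" or "both_bad".
    """
    report = {}
    for rel, expected in manifest.items():
        a, b = copy_a / rel, copy_b / rel
        a_ok = a.is_file() and sha256_of(a) == expected
        b_ok = b.is_file() and sha256_of(b) == expected
        if a_ok and b_ok:
            report[rel] = "ok"
        elif a_ok:
            b.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(a, b)  # copy A matches the record, overwrite B
            report[rel] = "repaired_b"
        elif b_ok:
            a.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(b, a)  # copy B matches the record, overwrite A
            report[rel] = "repaired_a"
        else:
            report[rel] = "both_bad"  # neither copy matches; nothing is touched
    return report
```

One caveat: a file you edited on purpose after the manifest was built will look "corrupted" to this check, so the manifest has to be rebuilt after intentional changes, and it should be kept in more than one place as well.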
Also going to link to what I’m talking about: