Using rsync for backups, because it's not shiny and new

mesa@piefed.social · 6 months ago

NuXCOM_90Percent@lemmy.zip · 6 months ago

I would generally argue that rsync is not a backup solution. But it is one of the best transfer/archiving solutions.

Yes, it is INCREDIBLY powerful and is often 90% of what people actually want/need. But to be an actual backup solution you still need infrastructure around that. Bare minimum is a crontab. But if you are actually backing something up (not just copying it to a local directory) then you need some logging/retry logic on top of that.

At which point you are building your own borg, as it were. Which, to be clear, is a great thing to do. But… backups are incredibly important and it is very much important to understand what a backup actually needs to be.
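For anyone wondering what "logging/retry logic on top" might look like, here is a minimal sketch, not a finished tool. The names `run_with_retry`, `LOG_FILE`, and `RETRY_DELAY`, along with the paths and host in the comments, are made up for illustration.

```shell
# Hypothetical helper: run a command up to N times and log every attempt.
# LOG_FILE and RETRY_DELAY are illustrative names, not rsync settings.
LOG_FILE="${LOG_FILE:-/tmp/backup.log}"
RETRY_DELAY="${RETRY_DELAY:-60}"

log() {
    printf '%s %s\n' "$(date '+%Y-%m-%dT%H:%M:%S')" "$*" >> "$LOG_FILE"
}

run_with_retry() {
    local max_attempts="$1"
    shift
    local attempt=1 status=0
    while [ "$attempt" -le "$max_attempts" ]; do
        log "attempt $attempt/$max_attempts: $*"
        if "$@"; then
            log "success"
            return 0
        else
            status=$?
        fi
        log "failed with exit code $status"
        attempt=$((attempt + 1))
        # Only wait if another attempt is coming.
        if [ "$attempt" -le "$max_attempts" ]; then
            sleep "$RETRY_DELAY"
        fi
    done
    return "$status"
}

# Example call (hypothetical paths and host):
#   run_with_retry 3 rsync -a --delete /srv/data/ backup-host:/srv/backup/
# Example crontab entry running a script that contains the call above:
#   0 2 * * * /usr/local/bin/nightly-backup.sh
```

Even this much still leaves out notification on final failure, which is the part tools like borgmatic or backupninja handle for you.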

tal@olio.cafe · 6 months ago

I would generally argue that rsync is not a backup solution.

Yeah, if you want to use rsync specifically for backups, you’re probably better off using something like rdiff-backup, which makes use of rsync to generate backups and store them efficiently, and driving it from something like backupninja, which will run the task periodically and notify you if it fails.

rsync: one-way synchronization

unison: bidirectional synchronization

git: synchronization of text files with good interactive merging.

rdiff-backup: rsync-based backups. I used to use this and moved to restic, as the backupninja target for rdiff-backup has kind of fallen into disrepair.

That doesn’t mean “don’t use rsync”. I mean, rsync’s a fine tool. It’s just…not really a backup program on its own.

non_burglar@lemmy.world · 6 months ago

I use rsync and a pruning script in crontab on my NFS mounts. I’ve tested it numerous times breaking containers and restoring them from backup. It works great for me at home because I don’t need anything older than 4 monthly, 4 weekly, and 7 daily backups.
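A pruning script of this kind could be sketched roughly as below. This is a simplification: it assumes one snapshot directory per day named `YYYY-MM-DD` and only keeps the newest N of them, so it covers just the "daily" tier of the retention scheme described above. `prune_snapshots` is a made-up name.

```shell
# Hypothetical layout: one directory per day, named YYYY-MM-DD, under $1.
# Keeps the newest $2 snapshots and deletes the older ones.
prune_snapshots() {
    local root="$1" keep="$2"
    local count=0 excess d
    # Count the directories that look like dated snapshots.
    for d in "$root"/20??-??-??; do
        if [ -d "$d" ]; then
            count=$((count + 1))
        fi
    done
    excess=$((count - keep))
    # Globs expand in sorted order and ISO dates sort chronologically,
    # so the oldest snapshots come first.
    for d in "$root"/20??-??-??; do
        [ "$excess" -gt 0 ] || break
        [ -d "$d" ] || continue
        rm -rf -- "$d"
        excess=$((excess - 1))
    done
}

# Example call (hypothetical path): prune_snapshots /mnt/nfs/backups 7
```

Anything that does not match the date pattern is left alone, which is a cheap safety net if the script is ever pointed at the wrong directory.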

However, in my job I prefer something like bacula. The extra features and granularity of restore options makes a world of difference when someone calls because they deleted prod files.

mesa@piefed.social · 6 months ago

I’ve personally used rsync for backups for about…15 years or so? It’s worked out great. This is an awesome video going over all the basics and what you can do with it.

Eldritch@piefed.world · 6 months ago

And I generally enjoy Veronica’s presentation. Knowledgeable and simple.

mesa@piefed.social · 6 months ago

Her https://tinkerbetter.tube/w/ffhBwuXDg7ZuPPFcqR93Bd made me learn a new way of looking at data. There were some tricks I haven’t done before. She has such good videos.

Eldritch@piefed.world · 6 months ago

Yep, I found her through YouTube. Her and Action Retro’s content is always great, with some Adrian Black on the side.

overload@sopuli.xyz · 6 months ago

Veronica is fantastic. Love her video editing, it reminds me more of the early days of YouTube.

Eager Eagle@lemmy.world · edit-2 6 months ago

It works fine if all you need is transfer; my issue with it is that it’s just not efficient. If you want a “time travel” feature, your only option is to duplicate data. Differential backups, compression, and encryption for off-site ones are where other tools shine.

bandwidthcrisis@lemmy.world · 6 months ago

I have it add a backup suffix based on the date. It moves changed and deleted files to another directory adding the date to the filename.

It can also do hard-link copies so that you can have multiple full directory trees while avoiding all that duplication.

No file deltas or compression, but it does mean that you can access the backups directly.
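The date-suffix setup described above probably looks something like this. The exact command isn't given in the thread, so treat this as a guess at the shape: `--backup`, `--backup-dir`, and `--suffix` are real rsync options, while the paths and the `mirror_keep_old` name are placeholders.

```shell
# Hypothetical paths; adjust to your own layout.
SRC="${SRC:-$HOME/documents/}"
DEST="${DEST:-/mnt/backup/current/}"
OLD="${OLD:-/mnt/backup/changed}"

mirror_keep_old() {
    local stamp
    stamp="$(date +%Y-%m-%d)"
    # --backup:     keep the previous version of changed or deleted files
    # --backup-dir: move those previous versions here instead of leaving
    #               them next to the live file
    # --suffix:     append the date to each moved file's name
    rsync -a --delete \
        --backup --backup-dir="$OLD" --suffix=".$stamp" \
        "$SRC" "$DEST"
}
```

One caveat with a date-only suffix: if a file changes twice on the same day, the second run overwrites the copy saved by the first.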

suicidaleggroll@lemmy.world · 6 months ago

If you want a “time travel” feature, your only option is to duplicate data.

Not true. Look at the --link-dest flag. Encryption, sure, rsync can’t do that, but incremental backups work fine and compression is better handled at the filesystem level anyway IMO.
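A minimal sketch of the `--link-dest` approach, assuming a layout of one directory per run plus a `latest` symlink. That layout, the paths, and the `snapshot` function name are illustrative choices, not something rsync requires.

```shell
# Hypothetical layout: $BACKUP_ROOT/<name>/ per run, plus a "latest" symlink.
# BACKUP_ROOT should be an absolute path, because a relative --link-dest
# is interpreted relative to the destination directory.
SRC="${SRC:-$HOME/documents/}"
BACKUP_ROOT="${BACKUP_ROOT:-/mnt/backup/snapshots}"

snapshot() {
    local name="${1:-$(date +%Y-%m-%d)}"
    mkdir -p "$BACKUP_ROOT"
    # Files unchanged since the previous snapshot are hard-linked to it,
    # so every snapshot looks complete but only changed files take new space.
    # On the first run "latest" does not exist yet; rsync warns and copies
    # everything.
    rsync -a --delete --link-dest="$BACKUP_ROOT/latest" \
        "$SRC" "$BACKUP_ROOT/$name/" || return 1
    # Point "latest" at the snapshot just made.
    ln -sfn "$name" "$BACKUP_ROOT/latest"
}
```

Each snapshot directory can be browsed or restored from directly with `cp` or rsync, which is the "time travel" part.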

Eager Eagle@lemmy.world · edit-2 6 months ago

Isn’t that creating hardlinks between source and dest? Hard links only work on the same drive. And I’m not sure how that gives you “time travel”, as in, browsing snapshots or file states at the different times you ran rsync.

Edit: ah the hard link is between dest and the link-dest argument, makes more sense.

I wouldn’t bundle fs and backup compression in the same bucket, because they have vastly different reqs. Backup compression doesn’t need to be optimized for fast decompression.

BCsven@lemmy.ca · 6 months ago

Snapper and BTRFS. It only records changes in data, so time travel is just pointing to what blocks changed and when, and not building a duplicate of the entire file or filesystem. A snapshot is instant, and new block changes belong to the current default.

NekuSoul@lemmy.nekusoul.de · 6 months ago

Agree. It’s neat for file transfers and simple one-shot backups, but if you’re looking for a proper backup solution then other tools/services have advanced virtually every aspect of backups so much it pretty much always makes sense to use one of those instead.

confusedpuppy@lemmy.dbzer0.com · 6 months ago

I use rsync for many of the reasons covered in the video. It’s widely available and has a long history. To me that feels important because it’s had time to become stable and reliable. Using Linux is a hobby for me so my needs are quite low. It’s nice to have a tool that just works.

I use it for all my backups and moving my backups to off network locations as well as file/folder transfers on my own network.

I even made my own tool (https://codeberg.org/taters/rTransfer) to simplify all my rsync commands into readable files because rsync commands can get quite long and overwhelming. It’s especially useful chaining multiple rsync commands together to run under a single command.

I’ve tried other backup and syncing programs and I’ve had bad experiences with all of them. Other backup programs have failed to restore my system. Syncing programs constantly stop working and I got tired of always troubleshooting. Rsync when set up properly has given me a lot less headaches.

state_electrician@discuss.tchncs.de · 6 months ago

Why videos? I feel like an old man yelling at clouds every time something that sounds interesting is presented in a fucking video. Videos are so damn awful. They take time, I need audio and I can’t copy&paste. Why have they become the default for things that should’ve been a blog post?

Wawe@lemmy.world · edit-2 6 months ago

They linked blog post with the video: https://vkc.sh/everyday-rsync/

czardestructo@lemmy.world · 6 months ago

Thank you for putting into words what I’ve subconsciously been thinking for years. Every search result prioritizes videos at the top and I’m still annoyed every time. Or even worse, I have to hunt through a 10 minute video for the 30 seconds of info I needed. Stoohhhhpppp internet of new! Make it good again!

vga@sopuli.xyz · 6 months ago

Ad money.

sugar_in_your_tea@sh.itjust.works · 6 months ago

Blogs can have ads.

kchr@lemmy.sdf.org · 6 months ago

Hear hear. Knowledge should be communicated in an easily shareable way that can also be archived as easily, in contrast to a video requiring hundreds of MBs.

northernlights@lemmy.today · 6 months ago

Especially for a command line tool

Matthew@midwest.social · 6 months ago

man rsync

atk007@lemmy.world · 6 months ago

Rsnapshot. It uses rsync, but provides snapshot management and multiple backup versioning.

BonkTheAnnoyed@lemmy.blahaj.zone · 6 months ago

Yah, I really like this approach. Same reason I set up Timeshift and Mint Backup on all the user machines in my house. For others rsync + cron is aces.

sugar_in_your_tea@sh.itjust.works · 6 months ago

Yeah it’s slow

What’s slow about rsync? If you have a reasonably fast CPU and are merely syncing differences, it’s pretty quick.

pathief@lemmy.world · 6 months ago

It’s single-threaded, one file at a time.

sugar_in_your_tea@sh.itjust.works · 6 months ago

That would only matter if it’s lots of small files, right? And after the initial sync, you’d have very few files, no?

Rsync is designed for incremental syncs, which is exactly what you want in a backup solution. If your multithreaded alternative doesn’t do a diff, rsync will win on larger data sets that don’t have rapid changes.

Newsteinleo@midwest.social · 6 months ago

For a home setup that seems fine. But I can understand why you wouldn’t want this for a whole enterprise.

quick_snail@feddit.nl · 6 months ago

It’s slow?!?

okamiueru@lemmy.world · 6 months ago

That part threw me off. Last time I used it, I did incremental backups of a 500 gig disk once a week or so, and it took 20 seconds max.

Biscuit@ani.social · 6 months ago

Yes but imagine… 18 seconds.

HereIAm@lemmy.world · 6 months ago

Compared to something multi threaded, yes. But there are obviously a number of bottlenecks that might diminish the gains of a multi threaded program.

clif@lemmy.world · 6 months ago

I’ll never not upvote Veronica Explains. Excellent creator and excellent info on everything I’ve seen.

ominous ocelot@leminal.space · 6 months ago

rsnapshot is a script for the purpose of repeatedly creating deduplicated copies (hardlinks) for one or more directories. You can choose how many hourly, daily, weekly,… copies you’d like to keep and it removes outdated copies automatically. It wraps rsync and ssh (public key auth), which need to be configured beforehand.

SayCyberOnceMore@feddit.uk · 6 months ago

Hardlinks need to be on the same filesystem, don’t they? I don’t see how that would work with a remote backup…?

suicidaleggroll@lemmy.world · 6 months ago

The hard links aren’t between the source and backup, they’re between Friday’s backup and Saturday’s backup
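To see why that works even for a remote backup, here is a tiny illustration using plain `ln` in a scratch directory. The file names are made up, and `ln` merely stands in for what rsync or rsnapshot does on the backup side for each unchanged file.

```shell
# Illustration only: everything happens inside one scratch directory,
# i.e. on one filesystem, just like snapshots on a single backup drive.
demo="$(mktemp -d)"
mkdir "$demo/friday" "$demo/saturday"
echo "holiday photo" > "$demo/friday/photo.jpg"

# Saturday's "copy" of an unchanged file is a second name for the same data.
ln "$demo/friday/photo.jpg" "$demo/saturday/photo.jpg"

# Both names show the same inode number, so the data is stored once.
ls -i "$demo/friday/photo.jpg" "$demo/saturday/photo.jpg"

# Deleting Friday's snapshot does not touch Saturday's copy.
rm -r "$demo/friday"
cat "$demo/saturday/photo.jpg"
```

The source machine never takes part in the linking; only the snapshots on the backup side need to share a filesystem.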

SayCyberOnceMore@feddit.uk · 6 months ago

Ahh, ok. Thanks for clarifying.

ryper@lemmy.ca · 6 months ago

I was planning to use rsync to ship several TB of stuff from my old NAS to my new one soon. Since we’re already talking about rsync, I guess I may as well ask if this is the right way to go?

Suburbanl3g3nd@lemmings.world · 6 months ago

I couldn’t tell you if it’s the right way, but I used it on my Rpi4 to sync 4 TB of stuff from my Plex drive to a backup and set a script up to have it check/mirror daily. It took a day and a half to copy, and now it syncs in minutes tops when there’s new data.

GreenKnight23@lemmy.world · 6 months ago

yes, it’s the right way to go.

rsync over ssh is the best, and works as long as rsync is installed on both systems.

qjkxbmwvz@startrek.website · 6 months ago

On low end CPUs you can max out the CPU before maxing out network—if you want to get fancy, you can use rsync over an unencrypted remote shell like rsh, but I would only do this if the computers were directly connected to each other by one Ethernet cable.

SayCyberOnceMore@feddit.uk · 6 months ago

It depends

rsync is fine, but to clarify a little further…

If you think you’ll stop the transfer and want it to resume (and some data might have changed), then yep, rsync is best.

But, if you’re just doing a 1-off bulk transfer in a single run, then you could use other tools like xcopy / scp or - if you’ve mounted the remote NAS at a local mount point - just plain old cp

The reason for that is that rsync has to work out what’s at the other end for each file, so it’s doing some back-and-forth communication each time, which, as someone else pointed out, can load the CPU and reduce throughput.

(From memory, I think Raspberry Pi don’t handle large transfers over scp well… I seem to recall a buffer gets saturated and the throughput drops off after a minute or so)

Also, on a local network, there’s probably no point in using encryption or compression options - esp. for photos / videos / music… you’re just loading the CPU again to work out that it can’t compress any further.

ryper@lemmy.ca · 6 months ago

It’s just a one-off transfer, I’m not planning to stop the transfer, and it’s my media library, so nothing should change, but I figured something resumable is a good idea for a transfer that’s going to take 12+ hours, in case there’s an unplanned stop.

SayCyberOnceMore@feddit.uk · 6 months ago

One thing I forgot to mention: rsync has an option to preserve file timestamps, so if that’s important for your files, then that might also be useful… without checking, the other commands probably have that feature, but I don’t recall at the moment.

rsync -Prvt <source> <destination> might be something to try, leave for a minute, stop and retry … that’ll prove it’s all working.

Oh… and make sure you get the source and destination paths correct with a trailing / (or not), otherwise you’ll get all your files copied to an extra subfolder (or not)
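The trailing-slash behaviour is easiest to see in a throwaway directory. A small sketch, with a made-up function name and scratch paths created by `mktemp`:

```shell
# Hypothetical demo: copies one file two ways and lists where it landed.
trailing_slash_demo() {
    local work
    work="$(mktemp -d)"
    mkdir -p "$work/photos" "$work/with_slash" "$work/without_slash"
    echo "img" > "$work/photos/a.jpg"

    # Trailing slash on the source: copy the *contents* of photos.
    rsync -a "$work/photos/" "$work/with_slash/"
    # No trailing slash: copy the directory itself, creating photos/ inside.
    rsync -a "$work/photos" "$work/without_slash/"

    # List the results relative to the scratch directory.
    (cd "$work" && find with_slash without_slash -type f | sort)
}
```

Running `trailing_slash_demo` should list `with_slash/a.jpg` and `without_slash/photos/a.jpg`, the second being the "extra subfolder" case. Only the slash on the source matters here; a slash on the destination makes no difference when the destination directory already exists.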

surph_ninja@lemmy.world · 6 months ago

Use borg/borgmatic for your backups. Use rsync to send your differentials to your secondary & offsite backup storage.

calliope@retrolemmy.com · 6 months ago

Tangentially, I don’t see people talk about rclone a lot, which is like rsync for cloud storage.

It’s awesome for moving things from one provider to another, for example.

David Vasandani@social.coop · 6 months ago

@calliope It’s also great for local or remote backups over ssh, smb, etc.

calliope@retrolemmy.com · 6 months ago

It has been remarkably useful! I keep trying to tell people about it but apparently I am just their main use case or something.

I would have loved it when I was using Samba to share files on my local network decades ago. It’s like a Swiss Army knife!

Eldritch@piefed.world · 6 months ago

It’s fine. But yes, in the Linux space we tend to want to host ourselves, not have to trust some administrator of some cloud we don’t know/trust.

TehNomad@piefed.social · 6 months ago

rclone does support other protocols besides S3. You can also selfhost your own S3 storage.

calliope@retrolemmy.com · edit-2 6 months ago

deleted by creator

Eldritch@piefed.world · 6 months ago

I mention in the Linux space only because it’s what I’m familiar with and didn’t want to make assumptions about groups I’m not familiar with. Unlike you, who’s looking for a way to take umbrage and talk past people. I went to college for IT and have done it for 30 years.

In network and IT planning, the cloud is the wider network outside your own, the part you don’t have mapped, often depicted by a “cloud”. If I have a personal data pool on one of my own networks and need it from another, it may transmit via the “cloud”, but it isn’t IN the cloud. It’s on a personal server. If the server is in your house, and you can point exactly to where your data is, then the rule of thumb is that it is in your house, not the cloud. If it’s hosted on a system you couldn’t directly point to, on a network you have no knowledge of, especially a shared system, then things literally and figuratively are getting cloudier.

That said, marketing, as it often does, appropriates and misuses words based around buzz. And I am not about to admonish hobbyists who use it in the marketing sense. I understand, I get it.

If you host in OSX on Apple Silicon, that’s great. If you host on a 68k Mac or Amiga you’re a fucking mad lad! If you’re hosting under Windows, any TCP port in the storm mate. If you are hosting from a Linux distribution that is not God’s chosen, cool, how is it working out? If you are hosting from BeOS or Haiku, you are a glorious oddball and absolutely my sort of person. And if you are hosting from an appliance that you really don’t know what it’s running, welcome to the hobby. It’s a good starting point. And a lil data in the cloud isn’t a crime. We all have some. But if you can’t easily point to it, can you really know you have it?

calliope@retrolemmy.com · 6 months ago

I’m not reading all that. Sorry for your issue, or I’m happy for you. Whichever you prefer.

Ardent@kbin.earth · 6 months ago

you should. they were polite unlike you. explained the origin of the term and how it was used. explaining that they were aware of how hobbyists have changed the definition etc. it was a decent post. frankly I’m kind of curious why you’re so hateful. but not enough to really care.

Bo7a@lemmy.ca · 6 months ago

Partakes in text-based medium. Refuses to read well written and polite comment that is four whole paragraphs. Proceeds to think they are the intelligent one in the conversation. Are you huffing glue right now?

Landless2029@lemmy.world · 6 months ago

I tried rclone once because I wanted to sync a single folder from documents and freaked out when it looked like it was going to purge all documents except for my targeted folder.

Then I just did it via the portal…

calliope@retrolemmy.com · 6 months ago

rsync can sometimes look similarly scary! I very clearly remember triple-checking what it’s doing.
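One way to make the triple-checking less nerve-racking is to preview first. A rough sketch, assuming a made-up wrapper name `preview_then_sync`; `-n` (`--dry-run`) and `-i` (`--itemize-changes`) are standard rsync options.

```shell
# Hypothetical wrapper: show what a mirroring run would do, then ask.
preview_then_sync() {
    local src="$1" dest="$2" answer
    # -n lists planned transfers and deletions without changing anything;
    # -i prints one line per affected item (deletions show as "*deleting").
    rsync -a --delete -n -i "$src" "$dest" || return 1
    printf 'Apply these changes? [y/N] '
    read -r answer
    if [ "$answer" = "y" ]; then
        rsync -a --delete "$src" "$dest"
    fi
}

# Example call (hypothetical paths):
#   preview_then_sync ~/documents/ /mnt/backup/documents/
```

rclone has a `--dry-run` flag as well, which would have shown the planned purge before anything was touched.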

rclone works amazingly well if you have hundreds of folders or thousands of files and you can’t be bothered to babysit a portal.

i_stole_ur_taco@lemmy.ca · 6 months ago

The thing I hate most about rsync is that I always fumble to get the right syntax and flags.

This is a problem because once it’s working I never have to touch it ever again because it just works and keeps working. There’s not enough time to memorize the usage.

mesa@piefed.social · 6 months ago

I feel this too. I have a couple of “spells” that work wonders in a literal small notebook with other one liners over the years. It’s my spell book lol.

NuXCOM_90Percent@lemmy.zip · 6 months ago

One trick that one of my students taught me a decade or so ago is to actually make an alias to list the useful flags.

Yes, a lot of us think we are smart and set up aliases/functions and have a huge list of them that we never remember or, even worse, ONLY remember. What I noticed her doing was having something like goodman-rsync that would just echo out a list of the most useful flags and what they actually do.

So nine times out of 10 I just want rsync -azvh --progress ${SRC} ${DEST} but when I am doing something funky and am thinking “I vaguely recall how to do this”? dumbman rsync and I get a quick cheat sheet of what flags I have found REALLY useful in the past or even just explaining what azvh actually does without grepping past all the crap I don’t care about in the man page. And I just keep that in the repo of dotfiles I copy to machines I work on regularly.
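A cheat-sheet function along those lines might look like this. The name `rsync_cheatsheet` and the choice of flags are just an example; the descriptions follow the rsync man page.

```shell
# Hypothetical helper; put it in ~/.bashrc or any sourced dotfile.
rsync_cheatsheet() {
    cat <<'EOF'
rsync -azvh --progress SRC DEST
  -a  archive: recurse and preserve permissions, times, symlinks, owner, group
  -z  compress data during transfer
  -v  verbose: list files as they are transferred
  -h  human-readable sizes
  --progress  show per-file transfer progress
  -P  same as --partial --progress (keep partial files, show progress)
  -n  dry run: show what would happen, change nothing
  --delete  remove files from DEST that no longer exist in SRC
EOF
}
```

Since it is only a function printing text, it works on airgapped machines with nothing installed beyond the shell.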

muix@lemmy.sdf.org · 6 months ago

tldr and atuin have been my main way of remembering complex but frequent flag combinations

NuXCOM_90Percent@lemmy.zip · 6 months ago

Yeah. There are a few useful websites I end up at that serve similar purposes.

My usual workflow is that I need to be able to work in an airgapped environment where it is a lot easier to get “my dotfiles” approved than to ask for utility packages like that. Especially since there will inevitably be some jackass who says “You don’t know how to work without google? What are we paying you for?” because they mostly do the same task every day of their life.

And I do find that writing the cheat sheet myself goes a long way towards me actually learning them so I don’t always need it. But I know that is very much how my brain works (I write probably hundreds of pages of notes a year… I look at maybe two pages a year).

JohnAnthony@lemmy.dbzer0.com · 6 months ago

rsync -avzhP gang unite! I knew someone would have posted my standard flags. I used them enough that my brain moved them from RAM to ROM at this point…

oddlyqueer@lemmy.ml · 6 months ago

This is why I still don’t know sed and awk syntax lol. I eventually get the data in the shape I need and then move on, and never imprint how they actually work. Still feel like a script kiddie every time I use them (so once every few years).

solrize@lemmy.ml · 6 months ago

I’ve been using borg because of the backend encryption and because the deduplication and snapshot features are really nice. It could be interesting to have cross-archive deduplication but maybe I can get something like that by reorganizing my backups. I do use rsync for mirroring and organizing downloads, but not really for backups. It’s a synchronization program as the name implies, not really intended for backups.

cmgvd3lw@discuss.tchncs.de · 6 months ago

I think Arch wiki recommends rsync for backups

1984@lemmy.today · 6 months ago

I never thought of it as slow. More like very reliable. I don’t need my data to move fast, I need it to be copied with 100% reliability.

sugar_in_your_tea@sh.itjust.works · 6 months ago

And not waste time copying duplicate data. And for the typical home user, it’s probably no slower than other options.

RestrictedAccount@lemmy.world · 6 months ago

I use syncthing.

Is rsync better?

Syncthing works pretty well for me and my stable of Ubuntu, pi, Mac, and Windows

Taasz/Woof@lemmy.blahaj.zone · 6 months ago

Different tools for different use cases IMO.

But neither do backups.

RestrictedAccount@lemmy.world · 6 months ago

I dunno.

I am using it to keep a real time copy of documents on an offsite server.

Feels like a backup to me.

Taasz/Woof@lemmy.blahaj.zone · 6 months ago

What happens if you accidentally overwrite something important in a document and save it though? If there’s no incremental versioning you can’t recover from that.

RestrictedAccount@lemmy.world · 6 months ago

That is a good point.

In my case, I was trying to address the shortcomings of Apple Time Machine. I use a Mac mini as the server I work from on all my machines. Time Machine does the version management for me.

I just use Syncthing through a VPN to keep an offsite backup of content files (not a complete OS restore) and to keep a copy of critical files on my laptop in case I am away from my home network and need to see a file.

I still need to implement a regular air gapped backup instead of the ad-hoc that I have now.

Encrypt-Keeper@lemmy.world · 6 months ago

I’m not super familiar with Syncthing, but judging by the name I’d say Syncthing is not at all meant for backups.

conartistpanda@lemmy.world · 6 months ago

Syncthing is technically meant to synchronize data across different devices in real time (which I do with my phone), but I also use it to transfer data weekly via Wi-Fi to my old 2013 laptop with a 500GB HDD and Linux Mint (I only boot it to transfer data, and even then I pause the transfers to this device when it’s done transferring stuff). That way I can have larger data backups that wouldn’t fit on my phone, since LocalSend is unreliable for large amounts of data while Syncthing can resume the transfer if anything goes wrong. On top of that, Syncthing also works on Windows and Android out of the box.

WhyJiffie@sh.itjust.works · 6 months ago

it’s for a different purpose. I wouldn’t use syncthing the way I use rsync