They say that if you don’t own your data in three different places, you don’t really own it.
At my organization, we maintain a 6.5 TB filesystem on an IBM Storwize V3700, with the storage presented over Fibre Channel to a Windows Server 2012 R2 box (which doubles as a domain controller). Now we want all of this (well, not all of it, just the important stuff) duplicated to an AWS S3 bucket.
At first I tried establishing a connection to the S3 bucket from another Windows box (an EC2 instance) at AWS. Using the VPN tunnel we have to our HQ, I mapped a share from that instance, linked to the S3 bucket via TntDrive (if I recall the name correctly), back to HQ, and then from the HQ fileserver I experimented with robocopy, and then DeltaCopy…
Eventually I realized the way to get data from our local filesystem to S3 was the AWS CLI — specifically, the AWS CLI installed on a Linux utility server with access to the filesystem over CIFS, not on the Windows fileserver itself.
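For reference, mounting the fileserver's share on the utility server might look something like this (a sketch only — the server name, share name, credentials file, and mount point are placeholder assumptions, not our actual values):

```shell
# Create a mount point and mount the Windows share over CIFS.
# //fileserver/share and the credentials file are placeholders.
sudo mkdir -p /media/cifs-mountpoint
sudo mount -t cifs //fileserver/share /media/cifs-mountpoint \
    -o credentials=/root/.smbcredentials,ro,iocharset=utf8

# Or make it persistent with an /etc/fstab entry:
# //fileserver/share  /media/cifs-mountpoint  cifs  credentials=/root/.smbcredentials,ro  0  0
```

Mounting read-only (`ro`) is a reasonable precaution here, since the backup job only ever needs to read from the share.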
So, the utility server is set up with python-pip, and from there with awscli, with which I can do things like:

aws s3 sync "/media/cifs-mountpoint" s3://bucket/Path
This can then be put into a crontab entry, along with any number of additional folder synchronizations.
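A nightly crontab entry on the utility server could look like this (the schedule, binary path, and log file are illustrative assumptions):

```shell
# m  h  dom mon dow  command
# Sync the mounted share to S3 every night at 02:00, appending output to a log.
0 2 * * * /usr/local/bin/aws s3 sync /media/cifs-mountpoint s3://bucket/Path >> /var/log/s3-sync.log 2>&1
```

Redirecting stdout and stderr to a log file makes it easy to check later which files were transferred and whether any sync failed.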
And there you go: your sort-of rsync over the AWS CLI. I’m also looking into rclone, which may simplify things even further — we’ll see.
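For comparison, the rclone equivalent might look like the following (a sketch, assuming an S3 remote named `s3remote` has already been defined interactively with `rclone config`):

```shell
# Sync the CIFS mount point to the S3 bucket via a preconfigured remote.
# "s3remote" is an assumed remote name, not something rclone provides by default.
rclone sync /media/cifs-mountpoint s3remote:bucket/Path --log-file /var/log/rclone-sync.log
```

One appeal of rclone is that the remote's credentials and region live in its own config file, so the cron entry stays a single short command.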