NAKIVO Community Forum

Backup Copy job to S3 - Different retention settings & slow performance


DWigginsSWE

Recommended Posts

We're using Nakivo 10.8.0. It's provided as a service by our hosting provider - they maintain the console, we have a dedicated transporter, and we colocate a Synology NAS for storage of our backups.

I have a file server backup job that requires about 3.75 TB for a full backup and generates incrementals of a couple hundred gigs a night. I've got the backup job set to run a full every weekend and then incrementals during the week. I have enough space on the NAS to keep about a month's worth of nightly recovery points. The retention settings currently defined on the job will eventually exceed my available space, but I haven't hit the wall yet.

I'd like to start using a backup copy job to S3 to stash extra copies of backups and also build a deeper archive of recovery points. I'd like to keep about 2 weeks' worth of nightly recovery points in S3, then 2 more weeks of weekly recovery points, then 11 more monthly recovery points, and finally a couple of yearly points.
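
For reference, here's how I count the recovery points that scheme would leave sitting in S3. This is just a count of points, not a size estimate (dedup and compression change the space math), and the tier names are mine:

```python
# Rough count of the recovery points my desired S3 retention scheme would keep.
# Numbers come from my description above; tier names are only for illustration.
retention = {
    "daily": 14,    # ~2 weeks of nightly points
    "weekly": 2,    # 2 more weeks kept as weekly points
    "monthly": 11,  # 11 more monthly points
    "yearly": 2,    # "a couple" of yearly points
}

total_points = sum(retention.values())
print(f"Approximate recovery points retained in S3: {total_points}")  # ~29
```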

I built a backup copy job with the retention settings specified, and then ran it. My data copy to S3 ran at a reasonable fraction of the available bandwidth from my provider. However, once the first backup was copied up to S3 in 700+ 5GB chunks, it took an additional 60+ HOURS to "move" the chunks from the transit folder to the folder that contained that first backup. All told, it was 100 hours of "work" for the transporter, and only the first 35ish were spent actually moving data over the internet from our datacenter to S3.
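
For context on what that "move" step presumably involves (this is my assumption - I can't see the product's internals): moving an object within S3 is a server-side copy followed by a delete, which for a 5 GB chunk is roughly one CopyObject call each. A boto3 sketch with made-up bucket and prefix names:

```python
import boto3

s3 = boto3.client("s3")

# Hypothetical names -- I don't know how NAKIVO actually lays out the repository.
BUCKET = "my-backup-bucket"
TRANSIT_PREFIX = "transit/"
DEST_PREFIX = "repository/file-server-backup/"

# List the uploaded chunks in the transit prefix and "move" each one:
# a server-side copy followed by a delete. CopyObject handles source objects
# up to 5 GB; anything larger needs a multipart copy (UploadPartCopy), which
# is more work to orchestrate per object.
paginator = s3.get_paginator("list_objects_v2")
for page in paginator.paginate(Bucket=BUCKET, Prefix=TRANSIT_PREFIX):
    for obj in page.get("Contents", []):
        src_key = obj["Key"]
        dest_key = DEST_PREFIX + src_key[len(TRANSIT_PREFIX):]
        s3.copy_object(
            Bucket=BUCKET,
            Key=dest_key,
            CopySource={"Bucket": BUCKET, "Key": src_key},
        )
        s3.delete_object(Bucket=BUCKET, Key=src_key)
```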

Either I've failed spectacularly at setting up the right job settings (a possibility I'm fully prepared to accept - I just want to know the RIGHT way), or the Nakivo software uses a non-optimal set of API calls when copying to S3, or S3 is just slow. I've read enough case studies and whatnot to believe that S3 can move hundreds of TB/hour if you ask it correctly, so 60 hours to move <5 TB of data seems like something is wrong.
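
To be concrete about what I mean by "asking it correctly": S3's aggregate throughput scales with concurrent requests, so the same server-side copies issued in parallel should finish in a fraction of the sequential time. Whether NAKIVO does (or can be configured to do) anything like this is exactly what I'm asking - the sketch below only illustrates the general technique, not the product's behavior:

```python
from concurrent.futures import ThreadPoolExecutor

import boto3

s3 = boto3.client("s3")
BUCKET = "my-backup-bucket"  # hypothetical, as in the previous sketch


def move_chunk(src_key: str, dest_key: str) -> None:
    """Server-side copy + delete; no data flows back through the transporter."""
    s3.copy_object(Bucket=BUCKET, Key=dest_key,
                   CopySource={"Bucket": BUCKET, "Key": src_key})
    s3.delete_object(Bucket=BUCKET, Key=src_key)


def move_all(keys_to_move):
    """keys_to_move: list of (src_key, dest_key) pairs, e.g. built from a listing."""
    # ~700 chunks moved 16 at a time should take a small fraction of the time
    # the same copies take when issued one after another.
    with ThreadPoolExecutor(max_workers=16) as pool:
        futures = [pool.submit(move_chunk, src, dst) for src, dst in keys_to_move]
        for f in futures:
            f.result()  # re-raise any copy/delete errors
```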

So, my questions are these: what's the best way to set the retention on both my backup job and my backup copy jobs so that I keep what I want to keep, and how do I optimize the backup copy settings to minimize the time spent moving data around? The bandwidth from my datacenter to S3 is what it is, and I know a full backup is going to take a while to push up to Uncle Jeff, but once the pieces are in S3, what can be done to make the "internal to AWS" stuff go faster?



@DWigginsSWE, hello! Your request has been received and forwarded to our Level 2 Support Team.

In the meantime, the best approach would be to generate a support bundle (https://helpcenter.nakivo.com/display/NH/Support+Bundles) and send it to support@nakivo.com so our Technical Support team can investigate your issue in more detail.

Thank you.



Hello, @DWigginsSWE, thank you for your patience during our investigation. To clarify: in the current NAKIVO workflow for S3 repositories, data is indeed first copied to a "transit" folder and then merged into the repository.

However, please note that the time needed to move data out of the "transit" folder depends on the volume of data being transferred, so this step can take considerably longer for large backups.
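
For illustration only, using the figures you reported (700+ chunks of 5 GB, 60+ hours) and assuming the chunks are merged one at a time:

```python
# Back-of-the-envelope using the figures reported in the original post,
# assuming the ~700 x 5 GB chunks are merged sequentially.
chunks = 700
chunk_gb = 5
total_hours = 60

seconds_per_chunk = total_hours * 3600 / chunks           # ~309 s per chunk
effective_mb_per_s = chunk_gb * 1024 / seconds_per_chunk  # ~16.6 MB/s equivalent

print(f"~{seconds_per_chunk / 60:.1f} min per 5 GB chunk, "
      f"~{effective_mb_per_s:.0f} MB/s effective rate")
```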

For now, it's not possible to improve the performance of this last step, as it is performed on the S3 repository side in response to the solution's API requests. Should you require any further assistance or have additional questions, please don't hesitate to reach out by email: support@nakivo.com.
