NAKIVO Community Forum

Manic Mark

Members
  • Posts: 3
  • Days Won: 1
Everything posted by Manic Mark

  1. Hi, I have implemented a workaround. It works for me in the sense that it takes the load off my NAS and moves it to my flash storage. Perhaps it will be of help to someone else, perhaps not. Try this at your own risk.

     I stopped all tenants and the transporter service. I unmounted the existing ext4 filesystem that held all the SaaS repos and remounted it in another location. I upgraded the VA's virtual hardware to the current version (7.0 U2) and then added a new NFS-backed 10 TB VMDK (sda) and a 100 GB flash-backed VMDK (nvme0n1). I installed zfsutils-linux in the VA and created a zpool on the NFS-backed VMDK, with metadata stored on the flash-backed VMDK, mounted in place of the original ext4 filesystem:

     zpool create nkv-saas -o ashift=13 -m /mnt/nkv-saas-repos /dev/sda special /dev/nvme0n1

     I then copied all the SaaS repos to the new filesystem using rsync, and started the transporter service and the tenants (a consolidated sketch of the whole sequence follows these posts). The difference is shown below.

     BEFORE (ext4 on NFS-backed VMDK): continuous 3K-25K IOPS over NFS. "Get used space" for 181.0 GB: 14 min 36 sec 248 ms.
     AFTER (ZFS on NFS-backed VMDK with metadata on flash-backed VMDK): peaks at ~250 NFS IOPS during a backup, little to no load the rest of the time. "Get used space" for 181.9 GB: 1 min 44 sec 278 ms.

     I'm using 23% of the 10 TB VMDK on my NAS and 29% of the 100 GB VMDK on my directly attached flash array, so the amount of flash storage required is trivial.

     I still think that the way Nakivo tracks used space is naive at best. Walking the filesystem every 30 minutes to add up the space used by what could be hundreds of millions of files (I'm already up to 21M files) is a waste of system resources, even more so considering that the repo is a PostgreSQL database and all the files in the repo are managed by Nakivo. A far more efficient approach would be to track the blob sizes in the database and, if needed, measure the sizes of the database and other files.

     -Mark
  2. Hi, I disabled the auto refresh of the repositories, but it did not make any change to the usage. After some digging it looks like the transporter (or PostgreSQL) is repeatedly trying to determine the used and free space in the repository. Looking at all_pg_sql.log, it keeps logging events like this:

     I suspect that "Get used space" is the culprit: it walks the directory structure and adds up the size of all the files in the repo (which contains ~3.3M files in this case). It appears to be triggered at tenant startup (or repo mount) and then 30 minutes after the previous iteration completes (a way to reproduce this timing from the shell is sketched after these posts). I have also tried switching off "system.repository.refresh.backup.size.calculation" under expert settings, but "Get used space" still runs.
  3. I use Nakivo in a multi-tenant configuration (MSP) for Office 365 backups only. I use the onboard transporter of the director VA (4 vCPU and 16 GB RAM) for all tenants. Each tenant has a SaaS repository configured on an NFS-backed VMDK. The backups run overnight and work fine.

     I noticed, however, that even when backups are not running, there is a transporter process that appears to do some housekeeping on the repo. Each thread that does this housekeeping generates ~3,000 IOPS over NFS, and more than one thread can run concurrently. As a result, my storage has a load of 3,000-25,000 IOPS (peaking at ~260 MB/sec) continuously, 24/7. No space or inodes appear to be freed.

     I've only got 8 SaaS repos at the moment. What happens when I have 80? What is this background process doing that requires this much I/O? Is there some way I can tell if it is actually accomplishing anything? Is there a way to limit it to running out of hours? (Some generic ways to watch this load from inside the VA are sketched after these posts.)

     -Mark
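
For reference, here is a consolidated sketch of the sequence described in post 1. It is not an official NAKIVO procedure; the mount point /mnt/nkv-saas-repos and the devices /dev/sda and /dev/nvme0n1 come from the post, while the old ext4 device /dev/sdb1 and the temporary path /mnt/saas-repos-old are made-up placeholders. Adjust everything to your own environment and, as the author says, try it at your own risk.

    # 1. Stop all tenants and the transporter service first (done through the
    #    NAKIVO UI / on the VA; the exact service name is not shown in the post).

    # 2. Move the original ext4 repo filesystem aside.
    #    /dev/sdb1 and /mnt/saas-repos-old are placeholders for the old device
    #    and its new mount location.
    umount /mnt/nkv-saas-repos
    mkdir -p /mnt/saas-repos-old
    mount /dev/sdb1 /mnt/saas-repos-old

    # 3. Install ZFS and create the pool on the NFS-backed VMDK, with a
    #    "special" vdev (metadata) on the flash-backed VMDK, mounted where the
    #    repos used to live. Same command as in the post, with the options in
    #    the documented order.
    apt-get install -y zfsutils-linux
    zpool create -o ashift=13 -m /mnt/nkv-saas-repos nkv-saas /dev/sda special /dev/nvme0n1

    # 4. Copy the SaaS repos across, then restart the transporter and the tenants.
    rsync -aHAX /mnt/saas-repos-old/ /mnt/nkv-saas-repos/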
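
To get a feel for the cost that post 2 attributes to "Get used space", you can time an equivalent walk of the repository directory yourself. This is only an approximation of whatever the transporter actually does, and /mnt/nkv-saas-repos is the repo path assumed above.

    # Roughly reproduce the work "Get used space" appears to do: walk the repo
    # tree and sum up the space used.
    time du -sh /mnt/nkv-saas-repos

    # Count how many files the walk has to stat on every pass.
    find /mnt/nkv-saas-repos -type f | wc -l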
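
For the questions in post 3 about what the background housekeeping is doing, a rough way to watch it from inside the VA is with generic Linux tools (none of these are NAKIVO utilities, and package availability may differ on the appliance):

    # Show only threads that are actually doing I/O, with accumulated totals.
    iotop -o -a

    # Per-thread disk statistics every 5 seconds (sysstat package).
    pidstat -d -t 5

    # NFS client operations and throughput every 5 seconds (nfs-common/nfs-utils).
    nfsiostat 5

    # Cumulative read/write counters for a specific process; replace
    # <transporter-pid> with the PID seen in iotop.
    cat /proc/<transporter-pid>/io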