I have implemented a workaround. It works for me in the sense that it takes the load off my NAS and moves it to my flash storage. Perhaps it will be of help to someone else, perhaps not. Try this at your own risk.
I stopped all tenants and the transporter service. I unmounted the existing ext4 filesystem that held all the SaaS repos and remounted it in another location.
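In shell terms, the sequence looked roughly like this (the transporter unit name, device, and mount points are assumptions; check what your VA actually uses):

```shell
# Sketch only -- service name, device, and mount points are assumed, not Nakivo's real names.
systemctl stop nkv-transporter                # stop the transporter service
umount /mnt/nkv-saas-repos                    # detach the old ext4 filesystem holding the repos
mkdir -p /mnt/nkv-saas-repos-old
mount /dev/sdb1 /mnt/nkv-saas-repos-old       # remount it at a temporary location
```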
I upgraded the VA's virtual hardware to the current version (7.0 U2), then added a new NFS-backed 10TB vmdk (sda) and a 100GB flash-backed vmdk (nvme0n1). I installed zfsutils-linux in the VA and created a zpool on the NFS-backed vmdk with metadata stored on the flash-backed vmdk, mounted in place of the original ext4 filesystem.
zpool create nkv-saas -o ashift=13 -m /mnt/nkv-saas-repos /dev/sda special /dev/nvme0n1
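If you try this, it's worth confirming the special vdev actually took before copying data over; something like:

```shell
zpool status nkv-saas                  # the flash device should appear under its own "special" group
zpool list -v nkv-saas                 # per-vdev capacity, including the flash-backed device
zfs get special_small_blocks nkv-saas  # 0 (the default) means only metadata lands on the special vdev
```

With `special_small_blocks` at its default of 0, only metadata (and dedup tables) go to the special vdev, which is exactly the split described above.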
I then copied all the SaaS repos to the new filesystem using rsync, and started the transporter service and the tenants again.
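The copy itself was plain rsync; a form like the following preserves hard links, ACLs, and extended attributes (the paths here are illustrative, not the VA's actual mount points):

```shell
# Illustrative paths -- substitute the old mount point and the new zfs mountpoint.
rsync -aHAX --info=progress2 /mnt/nkv-saas-repos-old/ /mnt/nkv-saas-repos/
```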
The difference is shown below.
BEFORE (ext4 on an NFS-backed vmdk): continuous 3K-25K IOPS over NFS
Get used space 181.0-GB: 14min. 36sec. 248mls
AFTER (zfs on an NFS-backed vmdk with metadata on a flash-backed vmdk): peaks at ~250 NFS IOPS during a backup, little to no load the rest of the time.
Get used space 181.9-GB: 1min. 44sec. 278mls
I'm using 23% of the 10TB vmdk on my NAS and 29% of the 100GB vmdk on my directly attached flash array, so the amount of flash storage required is trivial.
I still think that the way Nakivo tracks used space is naive at best. Walking the filesystem every 30 minutes to add up the space used by what could be hundreds of millions of files (I'm already up to 21M files) is a waste of system resources. Even more so considering that the repo is a PostgreSQL database and all the files in the repo are managed by Nakivo. A far more efficient approach would be to track the blob sizes in the database and, if needed, measure the sizes of the database and other files.
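I don't know Nakivo's actual schema, but conceptually that query would be cheap; something along these lines (table, column, and database names are invented for illustration):

```shell
# Hypothetical schema -- "blobs"/"blob_size"/"nkv_repo" are invented names, not Nakivo's.
psql -d nkv_repo -c "SELECT pg_size_pretty(sum(blob_size)) FROM blobs;"
# Plus the on-disk size of the database itself, via a built-in Postgres function:
psql -d nkv_repo -c "SELECT pg_size_pretty(pg_database_size('nkv_repo'));"
```

Two index-free aggregate queries like these run in seconds, versus minutes of walking tens of millions of files every half hour.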