At the moment I have my NAS set up as a Proxmox VM with a hardware RAID card handling six 2TB disks. My VMs run on NVMe drives, and the NAS VM handles the data storage, with the RAIDed volume passed directly through to the VM in Proxmox. I am running it as one large ext4 partition. It holds mostly photos, personal docs and a few films. Only I really use it. My desktop and laptop mount it over NFS. I have restic backups running weekly to two external HDDs. It all works pretty well and has for years.
I am now getting ZFS curious. I know I'll need to IT flash the HBA, or get another. I'm guessing it's best to create the zpool in Proxmox and pass that through to the NAS VM? Or would it be better to pass the individual disks through to the VM and manage the zpool from there?
I don't think ECC is any more required for ZFS than for any other file system. But the idea many people have is that if somebody goes to the trouble of using RAID and ZFS, then the data must be important, and so ECC makes sense.
What I have now is one VM that has the array volume passed through and the VM exports certain folders for various purposes to other VMs. So for example, my application server VM has read access to the music folder so I can run Emby. Similar thing for photos and shares out to my other PCs etc. This way I can centrally manage permissions, users etc from that one file server VM. I don't fancy managing all that in Proxmox itself. So maybe I just create the zpool in Proxmox, pass that through to the file server VM and keep the management centralised there.
I did on Proxmox. One thing I didn't know about ZFS: it does a lot of random writes, I believe from logs and journaling. I killed 6 SSDs in 6 months. It's a great system, but consumer SSDs can't handle it.
I have used a consumer SSD for caching on ZFS for over 2 years now and have not had any issues with it. I have a 54 TB pool with tons of reads and writes.
That doesn't sound right. Also, random writes don't kill SSDs. Total writes do, and you can see how much has been written to an SSD in its SMART values. I've used SSDs for swap for years without any of them breaking, with swap heavily used for running VMs and software builds. Their total-bytes-written counters were increasing steadily but haven't reached the limit, and the drives haven't died despite the sustained random write load. One was an Intel MacBook onboard SSD. Another was a random Toshiba OEM NVMe. Another was a Samsung OEM NVMe.
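For anyone who wants to check this on their own drives, here is a minimal sketch of turning the NVMe SMART counter into terabytes written. It assumes the NVMe convention that one "Data Units Written" unit is 1000 × 512 bytes. The counter value and the 600 TBW rating are made-up numbers for illustration, and the function names are my own.

```python
# NVMe SMART reports "Data Units Written" in units of 1000 * 512 bytes.
NVME_DATA_UNIT_BYTES = 512 * 1000


def nvme_terabytes_written(data_units_written: int) -> float:
    """Convert the NVMe 'Data Units Written' counter to decimal terabytes."""
    return data_units_written * NVME_DATA_UNIT_BYTES / 1e12


def endurance_used(data_units_written: int, rated_tbw: float) -> float:
    """Fraction of the vendor's rated TBW consumed so far (1.0 = rating reached)."""
    return nvme_terabytes_written(data_units_written) / rated_tbw


if __name__ == "__main__":
    units = 58_593_750   # hypothetical counter value read from SMART
    rated_tbw = 600.0    # hypothetical endurance rating from a spec sheet
    print(f"{nvme_terabytes_written(units):.1f} TB written")
    print(f"{endurance_used(units, rated_tbw):.1%} of rated endurance used")
```

SATA drives expose total writes through vendor-specific attributes instead, so the conversion factor there depends on the drive.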
ZFS is great, but to take advantage of its positives you need the right drives. Consumer drives get eaten alive, as @scrubbles@poptalk.scrubbles.tech mentioned, and your IO delay will be unbearable. I use Intel enterprise SSDs and have no issues.
Not sure where you're getting that. Been running ZFS for 5 years now on bottom of the barrel consumer drives - shucked drives and old drives. I have used 7 shucked drives total. One has died during a physical move. The remaining 6 are still in use in my primary server. Oh and the speed is superb. The current RAIDz2 composed of the shucked 6 and 2 IronWolfs does 1.3GB/s sequential reads and write IOPS at 4K in the thousands. Oh and this is all happening on USB in 2x 4-bay USB DAS enclosures.
No idea why you're getting downvoted, it's absolutely correct and it's called out in the official Proxmox docs and forums. Proxmox logs and journals directly to the ZFS array regularly, to the point of drive-destroying amounts of writes.
I'm not intending to run Proxmox on it. I have that running on an SSD, or maybe it's an NVMe, I forget. This will just be for data storage, mainly of photos, that one VM will manage and share out to other machines over NFS.
Could this be because it's a RAIDZ-2/3? They will be writing parity as well as data and the usual ZFS checksums. I am running RAID5 at the moment on my HBA card and my limit is definitely the 1Gbit network for file transfers, not the disks. And it's only me that uses this thing, it sits totally idle 90+% of the time.
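As a back-of-envelope check on those numbers, here is a small sketch comparing raw usable capacity for six 2TB disks under single and double parity, plus the theoretical ceiling of a 1Gbit link. It only counts whole disks lost to parity and does not model ZFS metadata, padding or protocol overhead. The function names are my own.

```python
def usable_tb(disks: int, disk_tb: float, parity_disks: int) -> float:
    """Raw usable capacity of one parity group, ignoring filesystem overhead."""
    if parity_disks >= disks:
        raise ValueError("need more disks than parity disks")
    return (disks - parity_disks) * disk_tb


def link_mb_per_s(gigabits: float) -> float:
    """Theoretical ceiling of a network link in MB/s, before protocol overhead."""
    return gigabits * 1e9 / 8 / 1e6


if __name__ == "__main__":
    print("RAID5 / RAIDZ1:", usable_tb(6, 2.0, 1), "TB")
    print("RAID6 / RAIDZ2:", usable_tb(6, 2.0, 2), "TB")
    print("1 Gbit link   :", link_mb_per_s(1.0), "MB/s at most")
```

With the network capped at roughly 125 MB/s, the extra parity work of RAIDZ2 is unlikely to be the bottleneck for file transfers to a single user.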
For ZFS what you want is PLP (power-loss protection) and high DWPD/TBW (drive writes per day / terabytes written). This is what enterprise SSDs provide. Everything you've mentioned so far points to you not needing ZFS so there's nothing to worry about.
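For comparing spec sheets, DWPD and TBW are two ways of stating the same endurance budget over the warranty period. A minimal sketch of the usual conversion follows. The drive sizes and ratings are hypothetical examples, not figures from any particular model.

```python
DAYS_PER_YEAR = 365


def tbw_from_dwpd(dwpd: float, capacity_tb: float, warranty_years: float) -> float:
    """Total terabytes written implied by a drive-writes-per-day rating."""
    return dwpd * capacity_tb * DAYS_PER_YEAR * warranty_years


def dwpd_from_tbw(tbw: float, capacity_tb: float, warranty_years: float) -> float:
    """Drive writes per day implied by a TBW rating."""
    return tbw / (capacity_tb * DAYS_PER_YEAR * warranty_years)


if __name__ == "__main__":
    # Hypothetical enterprise drive: 0.96 TB, 1 DWPD, 5-year warranty.
    print(f"{tbw_from_dwpd(1.0, 0.96, 5):.0f} TBW")
    # Hypothetical consumer drive: 1 TB, 600 TBW, 5-year warranty.
    print(f"{dwpd_from_tbw(600.0, 1.0, 5):.2f} DWPD")
```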
I am looking more into Btrfs for backup because:
I run Linux and not BSD
ZFS requires more RAM
I only have one disk
I want to benefit from snapshots, compression and deduplication.
I use zfs with Proxmox. I have it as a bind mount to Turnkey Fileserver (a default lxc template).
I access everything through NFS (via turnkey Fileserver). Even other VMs just get the NFS added to the fstab file. File transfers happen extremely fast VM to VM, even though it's "network" storage.
This gives me the benefits of zfs, and NFS handles the "what if's", like what if two VMs access the same file at the same time. I don't know exactly what NFS does in that case, but I haven't run into any problems in the past 5+ years.
Another thing that comes to mind is you should make turnkey Fileserver a privileged container, so that file ownership is done through the default user (1000 if I remember correctly). Unprivileged uses wonky UIDs which requires some magic config which you can find in the docs. It works either way, but I chose the privileged route. Others will have different opinions.
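The "wonky UIDs" are, as far as I know, just a fixed offset. The sketch below assumes the default Proxmox mapping for unprivileged containers (IDs 0 to 65535 inside the container shifted to 100000 to 165535 on the host) with no custom `lxc.idmap` entries. The function name is my own.

```python
# Default mapping for unprivileged containers: IDs 0..65535 inside the
# container are shifted to 100000..165535 on the host (assumed defaults).
ID_OFFSET = 100_000
ID_RANGE = 65_536


def host_id(container_id: int) -> int:
    """Return the host UID/GID that a container UID/GID appears as."""
    if not 0 <= container_id < ID_RANGE:
        raise ValueError(f"{container_id} is outside the mapped range")
    return ID_OFFSET + container_id


if __name__ == "__main__":
    # Files owned by UID 1000 inside the container show up on the host
    # (and on a bind-mounted dataset) under the shifted UID.
    print(host_id(1000))
```

So on a bind mount you would either chown the data to the shifted ID on the host or add a custom mapping, which is the "magic config" from the docs.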
Yes we run ZFS. I wouldn't use anything else. It's truly incredible. The only comparable choice is LVMRAID + Btrfs and it still isn't really comparable in ease of use.
Unless you need RAID 5/6, which doesn’t work well on btrfs
Yes. Because they're already using some sort of parity RAID so I assume they'd use RAID in ZFS/Btrfs and as you said, that's not an option for Btrfs. So LVMRAID + Btrfs is the alternative. LVMRAID because it's simpler to use than mdraid + LVM and the implementation is still mdraid under the covers.
Most NAS VMs want you to pass them the raw device so they can manage ZFS themselves. For every other VM, I have the VM running on ZFS storage that Proxmox uses and manages, and it will manage the datasets for backup, snapshots, etc.
It is definitely the way to go. The ability to snapshot a VM or CT before updates alone is worth it.
Both work. Just do not forget to assign fake serial numbers if you are passing disks. IMHO passing disks will be more performant, or maybe just pass the HBA controller if the other disks are on a different controller.
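A rough sketch of what assigning a serial can look like when passing a disk by its `/dev/disk/by-id` path. It only builds the `qm set` command and does not run it. The VM ID, slot, disk path and serial are hypothetical, and the 20-byte limit on `serial=` is my reading of the Proxmox docs, so check `man qm` before relying on it.

```python
from typing import List

MAX_SERIAL_BYTES = 20  # assumed limit for the serial= drive option


def passthrough_cmd(vmid: int, slot: str, disk_by_id: str, serial: str) -> List[str]:
    """Build (but do not run) a qm command attaching a disk with a serial."""
    if len(serial.encode()) > MAX_SERIAL_BYTES:
        raise ValueError("serial must be at most 20 bytes")
    return ["qm", "set", str(vmid), f"--{slot}", f"{disk_by_id},serial={serial}"]


if __name__ == "__main__":
    cmd = passthrough_cmd(
        100,                                 # hypothetical VM ID
        "scsi1",                             # hypothetical free SCSI slot
        "/dev/disk/by-id/ata-EXAMPLE_DISK",  # hypothetical disk path
        "NAS-DISK-01",                       # made-up serial
    )
    print(" ".join(cmd))
```

Giving each disk a distinct serial lets the guest tell the disks apart by ID, which matters once ZFS inside the VM is managing them.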