this post was submitted on 09 Feb 2025
81 points (98.8% liked)

Selfhosted


Hi all!

I will soon acquire a pretty beefy unit compared to my current setup (a 3-node server, each node with 16C, 512 GB RAM and 32 TB of storage).

Currently I run TrueNAS and Proxmox on bare metal and most of my storage is made available to apps via SSHFS or NFS.

I recently started looking for "modern" distributed filesystems and found some interesting S3-like/compatible projects.

To name a few:

  • MinIO
  • SeaweedFS
  • Garage
  • GlusterFS

I like the idea of abstracting the filesystem to allow me to move data around, play with redundancy and balancing, etc.
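For anyone wondering what "S3-compatible" means in practice, here is a minimal sketch (not my actual setup): the same boto3 client works against MinIO, Garage or SeaweedFS by pointing it at the self-hosted endpoint. The endpoint URL, credentials and bucket name below are placeholders.

```python
# Minimal sketch: talking to a self-hosted S3-compatible store with boto3.
# Endpoint, credentials and bucket name are placeholders, not a real setup.
import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="http://storage.lan:9000",    # placeholder: your MinIO/Garage endpoint
    aws_access_key_id="EXAMPLE_KEY",           # placeholder credentials
    aws_secret_access_key="EXAMPLE_SECRET",
)

s3.create_bucket(Bucket="media")               # buckets instead of directory trees
s3.upload_file("movie.mkv", "media", "movies/movie.mkv")

# Objects are addressed by key rather than path, which is what makes
# "moving data around" and rebalancing mostly a server-side concern.
for obj in s3.list_objects_v2(Bucket="media").get("Contents", []):
    print(obj["Key"], obj["Size"])
```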

My most important services are:

  • Plex (Media management/sharing)
  • Stash (Like Plex 🙃)
  • Nextcloud
  • Caddy with Adguard Home and Unbound DNS
  • Most of the Arr suite
  • Git, Wiki, File/Link sharing services

As you can see, there's a lot of downloading/streaming/torrenting of files across services. Smaller services run in a Docker VM on Proxmox.

Currently things are messy due to the organic evolution of my setup, but since I'm upgrading to brand-new metal, I'm looking for suggestions on the pillars.

So far, I am considering setting up a Proxmox cluster with the 3 nodes, hosting VMs for the heavy stuff plus a Docker VM.

How do you see the file storage portion? Should I take a full/partial plunge into S3-compatible object storage? What architecture/tech would be interesting to experiment with?

Or should I stick with tried-and-true, boring solutions like NFS Shares?

Thank you for your suggestions!

[–] [email protected] 5 points 1 day ago

I use Ceph/CephFS myself for my own 671TiB array (382TiB raw used, 252TiB-ish data stored) -- I find it a much more robust and better architected solution than Gluster. It supports distributed block devices (RBD), filesystems (CephFS), and object storage (RGW). NFS is pretty solid though for basic remote mounting filesystems.
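(For the curious: RBD, CephFS and RGW all sit on top of the same RADOS object layer, which you can also talk to directly via the python3-rados bindings. A minimal sketch follows, assuming a reachable cluster; the config path and pool name are placeholders, not the setup described above.)

```python
# Minimal sketch using the librados Python bindings (python3-rados).
# Assumes a reachable Ceph cluster; conffile path and pool name are placeholders.
import rados

cluster = rados.Rados(conffile="/etc/ceph/ceph.conf")  # placeholder config path
cluster.connect()
try:
    ioctx = cluster.open_ioctx("testpool")             # placeholder pool name
    try:
        ioctx.write_full("hello", b"stored as a raw RADOS object")
        print(ioctx.read("hello"))
    finally:
        ioctx.close()
finally:
    cluster.shutdown()
```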

[–] [email protected] 14 points 2 days ago (2 children)

sshfs is somewhat unmaintained; only "high-impact issues" are being addressed: https://github.com/libfuse/sshfs

I would go for NFS.

[–] [email protected] 3 points 2 days ago

And if you need to mount a directory over SSH, I can recommend rclone and its mount subcommand.

[–] [email protected] 2 points 2 days ago (1 children)

But NFS has mediocre snapshotting capabilities (unless his setup also includes >10g nics)

[–] [email protected] 3 points 2 days ago (1 children)

I assume you are referring to Filesystem Snapshotting? For what reason do you want to do that on the client and not on the FS host?

[–] [email protected] 3 points 2 days ago (1 children)

I have my NFS storage mounted via 2.5G and use qcow2 disks. It is slow to snapshot...

Maybe I'm misunderstanding your question?

[–] [email protected] 1 points 1 day ago (1 children)

If I understand you correctly, your server is accessing the VM disk images via an NFS share?

That does not sound efficient at all.

[–] [email protected] 1 points 1 day ago

No other easy option I figured out.
Didn't manage to understand iSCSI in the time I was patient with it, and I was desperate to finish the project and use my stuff.
Thus NFS.

[–] [email protected] 31 points 2 days ago (2 children)

"Boring"? I'd be more interested in what works without causing problems. NFS is bulletproof.

[–] [email protected] 8 points 2 days ago (1 children)

NFS is bulletproof.

For it to be bulletproof, it would help if it came with security built in. Kerberos is a complex mess.

[–] [email protected] 3 points 2 days ago

Yeah, I've ended up setting up VLANs in order to not deal with encryption

[–] [email protected] 8 points 2 days ago (1 children)

You are 100% right, I meant for the homelab as a whole. I do it for self-hosting purposes, but the journey is a hobby of mine.

So exploring more experimental technologies would be a plus for me.

[–] [email protected] 4 points 2 days ago (1 children)

Most of the things you listed require some very specific constraints to even work, let alone work well. If you're working with just a few machines, no storage array or high bandwidth networking, I'd just stick with NFS.

[–] [email protected] 1 points 1 day ago (1 children)

As a recently former HPC/supercomputer dork: NFS scales really well. All this talk of encryption is weird; you normally just do that at the link layer if you're worried about security between systems. That, plus v4 to reduce some metadata chattiness, and you're good to go. I've tried scaling Ceph and S3 for latency on 100/200G links. By far, NFS is easier to scale than all the rest. For a homelab? NFS and call it a day. All the clustering filesystems will make you do a lot more work than just throwing `hard` into your NFS mount options and letting clients block I/O while you reboot, which for home is probably easiest.

[–] [email protected] 1 points 1 day ago

I agree as well. No reason to not use it. If there were better ways to build an alternative, one would exist.

[–] [email protected] 7 points 2 days ago

I'm using Ceph on my Proxmox cluster, but only for the server data; all my Jellyfin media goes onto a separate NAS over NFS, as it doesn't really need the high availability and everything else that comes with Ceph.

It's been working great. You can set everything up through the Proxmox GUI and it'll show up like any other storage for the VMs. You need enterprise-grade NVMe drives for it though, or it'll chew through them in no time. Also a separate network connection for Ceph traffic if you're moving a lot of data.

Very happy with this setup.

[–] [email protected] 15 points 2 days ago (1 children)

I'd only use sshfs if there's no other alternative. Like if you had to copy over a slow internet link and rsync wasn't available.

NFS is fine for local network filesystems. I use it everywhere and it's great. Learn to use autofs, and NFS is just automatic everywhere you need it.

[–] [email protected] 2 points 1 day ago

NFS gives me the best performance. I've tried GlusterFS (not at home, for work), and it was kind of a pain to set up and maintain.

[–] [email protected] 13 points 2 days ago (4 children)

What's wrong with NFS? It is performant and simple.

[–] [email protected] 16 points 2 days ago (1 children)

By default it's unencrypted and unauthenticated, and permissions rely on IDs the client can fake.

May or may not be a problem in practice, one should think about their personal threat model.

Mine are read-only and unauthenticated because they're just media files, but I did add (unneeded) encryption via kTLS because it wasn't too hard to add (I already had a valid certificate to reuse).

[–] [email protected] 7 points 2 days ago (1 children)

NFS is good for hypervisor level storage. If someone compromises the host system you are in trouble.

[–] [email protected] 4 points 2 days ago (1 children)

If someone compromises the host system you are in trouble.

Not only the host. You have to trust every client to behave; as @forbiddenlake already mentioned, NFS relies on IDs that clients can easily fake to pretend they are someone else. Without rolling out all the Kerberos stuff, there really is no security when it comes to NFS.

[–] [email protected] 1 points 2 days ago (1 children)

You misunderstand. The hypervisor is the client; stuff higher in the stack only sees raw storage. (By hypervisors I also mean Docker and Kubernetes.) From a security perspective, you just set an IP allow list.

[–] [email protected] 1 points 2 days ago

Sure, if you have exactly one client that can access the server and you can ensure physical security of the actual network, I suppose it is fine. Still, those are some severe limitations and show how limited the ancient NFS protocol is, even in version 4.

[–] [email protected] 10 points 2 days ago (1 children)

NFS is fine if you can lock it down at the network level, but otherwise it's Not For Security.

[–] [email protected] 4 points 2 days ago* (last edited 2 days ago) (1 children)

NFS + Kerberos?

But everything I read about NFS and the like says the same thing: you deploy it on a dedicated storage LAN, not on your usual networking LAN.

[–] [email protected] 2 points 2 days ago

I tried it once. NFSv4 isn't simple like NFSv3 is. Fewer systems support it too.

[–] [email protected] 1 points 2 days ago

It is a pain to figure out how to give everyone the same user ID. I only have a couple of computers at home, and I've never figured out how to make LDAP work (including for laptops, which might not have network access when I'm on the road). Worse, some systems start user IDs at 1000, some at 1001. NFS is a real mess - but I use it because I haven't found anything better for Unix.

[–] [email protected] 3 points 2 days ago

Gotta agree. Even better if backed by zfs.

[–] [email protected] 10 points 2 days ago (1 children)

Your workload just won't see much difference with any of them, so take your pick.

NFS is old, but if you add security constraints, it works really well. If you want to tune for bandwidth, try iSCSI; bonus points if you get ZFS-over-iSCSI working with a tuned block size. This last one is blazing fast if you have ZFS at each end and you do ZFS snapshots.

Beyond that, you're getting into very tuned SAN things, which people build their careers on; it's a real rabbit hole.

[–] [email protected] 3 points 2 days ago (1 children)

NFS with security does harm performance. For raw throughput it is best to use no encryption. Instead, use physical security.

[–] [email protected] 6 points 2 days ago (1 children)

I don't know what you're on about; I'm talking about segregating with VLANs and a firewall.

If you're encrypting your SAN connection, your architecture is wrong.

[–] [email protected] 1 points 2 days ago (1 children)

That's what I thought you were saying

[–] [email protected] 2 points 2 days ago (1 children)

Oh, OK. I should have elaborated.

Yes, agreed. It's so difficult to secure NFS that it's best to treat it like a local connection and just lock it right down, physically and logically.

When I can, I use iSCSI, but tuned NFS is almost as fast. I have a much higher workload than OP, and I still can't hit a bottleneck.

[–] [email protected] 1 points 2 days ago (1 children)

Have you ever used NFS in a larger production environment? Many companies coming from VMware have expensive SAN systems, and Proxmox doesn't have great support for iSCSI.

[–] [email protected] 2 points 2 days ago (1 children)

Yes, I have. Same security principles in 2005 as today.

Proxmox iSCSI support is fine.

[–] [email protected] 1 points 2 days ago (1 children)

It really isn't.

You can't automatically create new disks with the create-new-VM wizard.

Also, I hope you aren't using the same security principles as in 2005. The landscape has evolved immensely.

[–] [email protected] 1 points 2 days ago

Are you having trouble reading context?

No, I'm not applying 2005 security. I'm saying NFS hasn't evolved much since 2005, so throw it on a dedicated link by itself with no other traffic and call it a day.

Yes, iSCSI allows the use of mounted LUNs as datastores like any other; you just need to use the user-space iSCSI driver and tools so that iscsi-ls is available. Do not use the kernel driver and args. This is documented in many places.

If you're gonna make claims to strangers on the internet, make sure you know what you're talking about first.

[–] [email protected] 5 points 2 days ago (1 children)

If you want to try something that’s quite new and mostly unexplored, look into NVMe over TCP. I really like the concept, but it appears to be too new to be production ready. Might be a good fit for your adventurous endeavors.

[–] [email protected] 3 points 2 days ago (1 children)

This is just a block device over the network; it will not enable the use cases OP is asking for. You will still need a filesystem and a file-serving service on top of it.

[–] [email protected] 1 points 2 days ago

I agree, but it’s clear that OP doesn’t want a real solution, because those apparently are boring. Instead, they want to try something new. NVMe/TCP is something new. And it still allows for having VMs on one system and storage on another, so it’s not entirely off topic.

[–] [email protected] 10 points 2 days ago (2 children)

Gluster is ~~shit~~ really bad; Garage and MinIO are great. If you want something tested and insanely powerful, go with Ceph, it has everything. Garage is fine for smaller installations, but it's very new and not that stable yet.

[–] [email protected] 8 points 2 days ago

Ceph isn't something you want to jump into without research

[–] [email protected] 3 points 2 days ago (1 children)

Darn, Garage is the only one I successfully deployed a test cluster with.

I will dive more carefully into Ceph; the documentation is a bit heavy, but it might be worth the effort...

Thanks.

[–] [email protected] 5 points 2 days ago (1 children)

I had a great experience with Garage at first, but it crapped itself after a month. That was about half a year ago and the problem has since been fixed, but it still left me with a bit of anxiety.

[–] [email protected] 4 points 2 days ago (1 children)

I've used MinIO as the object store on both Lemmy and Mastodon, and in retrospect I wonder why. Unless you have clustered servers and a lot of data to move it's really just adding complexity for the sake of complexity. I find that the bigger gains come from things like creating bonded network channels and sorting out a good balance in the disk layout to keep your I/O in check.

[–] [email protected] 2 points 2 days ago (1 children)

I preach this to people everywhere I go and seldom do they listen. There's no reason for object storage in a non-enterprise environment. Using it in homelabs is just... mostly insane.

[–] [email protected] 1 points 2 days ago

Generally yes, but it can be useful as a learning thing. A lot of my homelab use is for practicing with different techs in a setting where, if it melts down, it's just your own stuff. At work they tend to take offense if you break prod.

[–] [email protected] 4 points 2 days ago (1 children)

I think you will need to have a mix; not everything is S3-compatible.

But I also like S3 quite a lot.
