this post was submitted on 09 Sep 2023

Hello.

My setup is:

  • Lenovo M920q mini PC with Proxmox installed (this doesn't have IPMI, only vPro, and it's annoying me)
  • Fujitsu TX1320 M3 with TrueNAS Core installed - ZFS + RAID1 (this is a low-end "enterprise grade" server, and best thing - it has IPMI).

The Proxmox PC keeps all its CTs and 1 VM on the TrueNAS using iSCSI.

The idea behind my setup was that it felt nice that the TrueNAS would handle all the storage heavy lifting - ZFS, RAID etc., while the Proxmox mini PC would be a "compute-only" node that has a naked Proxmox install with some config.

The problem with that is if the TrueNAS machine loses power or is restarted, the Proxmox CTs/VMs switch their filesystem to read-only and stop responding to requests. This is because the iSCSI connection is interrupted. When the TrueNAS is back online, Proxmox doesn't make any attempt to restart the VMs/CTs - they'd still be broken.

It's annoying to have to VPN in to the Proxmox web UI and wait 15 minutes until all the CTs/VMs have restarted and are functioning again on the "alive" iSCSI connection.

I was wondering what my options are here to remove the dependency chain.

I'm really into the idea of decommissioning the Proxmox node because I'm scared I won't be able to (over VPN) change the power state of the machine if something goes wrong, since it only has vPro and not IPMI like the TrueNAS machine. By doing that, I'd consolidate the storage and the compute into the TrueNAS machine.

Options I can think of:

  1. Decommission the Proxmox node and move all Debian VMs/CTs to TrueNAS BSD jails. Is that even possible? Will all my Debian VMs work in BSD?
  2. Decommission the Proxmox node, switch TrueNAS Core to TrueNAS Scale and move CTs/VMs to TrueNAS Scale's Linux VMs
  3. Keep the Proxmox node and somehow figure out how to get Proxmox to refresh the CTs/VMs on iSCSI connection loss.
  4. Keep the Proxmox PC, but switch it to ESXi, hoping that it handles the iSCSI failure more gracefully
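
For option 3, a rough sketch of what a watchdog on the Proxmox host could look like: poll the iSCSI portal, and when it goes from unreachable to reachable, hard-restart the guests with the `qm` and `pct` CLIs. Everything specific here is an assumption, not something from my setup: the portal address, the guest IDs, and the function names are placeholders. It also assumes the host's iSCSI initiator has re-established its session by the time the guests start, which may need an extra delay in practice.

```python
import socket
import subprocess
import time

# Placeholders (assumptions): replace with the real portal address and guest IDs.
PORTAL_HOST = "192.0.2.10"
PORTAL_PORT = 3260          # standard iSCSI port
VM_IDS = [100]
CT_IDS = [101, 102]


def portal_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to the iSCSI portal succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def recovered(was_up, is_up):
    """True only on the down -> up transition, i.e. storage just came back."""
    return is_up and not was_up


def restart_commands(vm_ids, ct_ids):
    """Build stop/start commands. A hard stop is used because a guest whose
    disk vanished may not respond to a clean shutdown request."""
    cmds = []
    for vmid in vm_ids:
        cmds += [["qm", "stop", str(vmid)], ["qm", "start", str(vmid)]]
    for ctid in ct_ids:
        cmds += [["pct", "stop", str(ctid)], ["pct", "start", str(ctid)]]
    return cmds


def watch(poll_seconds=30):
    """Poll forever and restart all listed guests whenever the portal returns."""
    was_up = portal_reachable(PORTAL_HOST, PORTAL_PORT)
    while True:
        time.sleep(poll_seconds)
        is_up = portal_reachable(PORTAL_HOST, PORTAL_PORT)
        if recovered(was_up, is_up):
            for cmd in restart_commands(VM_IDS, CT_IDS):
                subprocess.run(cmd, check=False)
        was_up = is_up
```

Calling `watch()` from something like a systemd service on the Proxmox node would only automate the manual restart; it does not make the guests survive the outage itself.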

EDIT: I didn't make it clear at first - TrueNAS stores more data than just VMs - documents, Linux ISOs (TM), photos, Syncthing

top 17 comments
[–] [email protected] 9 points 1 year ago* (last edited 1 year ago) (1 children)

Get rid of iSCSI. Instead, use TrueNAS Scale for NAS and use a zvol on TrueNAS to run a VM of Proxmox Backup Server. Run Proxmox on the other box with local VMs and just back up the VMs to Proxmox Backup Server at a rate you are comfortable with (e.g. once a night). Map NFS shares from TrueNAS directly to any Docker containers you are running on your VMs. Map CIFS shares to any Windows VMs, and map NFS shares directly to any Linux things. This is way more resilient, gets local NVMe speeds for the VMs and still keeps the bulk of your files on the NAS, while also not abusing your 1 Gbit Ethernet for VM stuff, just for file transfer (the VM stuff happens at multi-GB speeds on the local NVMe on the Proxmox server).

[–] [email protected] 2 points 1 year ago (1 children)

For a home environment, this is the correct idea

[–] [email protected] 1 points 1 year ago (1 children)

I would argue it's the correct idea up to a fairly decently sized business. Basically anything where you don't have the budget or the need for super fault tolerant systems (i.e. where it's ok to very rarely have a 20 minute to an hour outage in order to save 50k+ of IT hardware costs). You can take the above and go the next step to a high availability Proxmox cluster to further reduce potential downtime before you step into the realm of needing VMware and very expensive, highly available and fast storage as well. It gets even more true when you start messing around with TrueNAS and differential speed vdevs (i.e. build a super fast NVMe one with 10-25 gig networking for some applications, and a cheaper spinning rust one with maybe 10 gig networking for bulk storage). It's also nice that, by using Proxmox Backup Server on a zvol, you can take advantage of all the benefits of both ZFS replication/snapshotting and cloud (jstor/wasabi s3 bucket, another TrueNAS server at a different location) for that zvol as well as the other data you are sharing as datasets.

[–] [email protected] 1 points 1 year ago

I would have assumed that most businesses invested in resilient hardware, but perhaps not. Thanks for the note

[–] [email protected] 5 points 1 year ago (1 children)

2 is the only possible one of those options.

You could also make the current setup more reliable by adding a UPS and/or second storage node for redundancy, so that when one goes down, the other is still available. Presumably TrueNAS supports this.

But nothing is going to help you recover if the iSCSI link is broken. It's up to the host and guest OS to re-establish the link, and to the guest it usually looks like the hard drive has been unplugged, and I don't know any OS that considers that a supported and recoverable condition.

[–] [email protected] 1 points 1 year ago

Thanks for making it clear that an iSCSI power down is in fact one of the more grim scenarios; I couldn't make out how bad a situation it is. In an enterprise environment a SAN being down would require some type of incident report.

UPS - as you suggested - would solve most of my problems to be honest.

[–] [email protected] 5 points 1 year ago* (last edited 1 year ago) (1 children)

What is TrueNAS adding to this arrangement? Generally when people run two different servers at home, they keep the VM drives on the hypervisor and just use the NAS for storing bigger things like media files. Hosting VM drives over iSCSI works in an enterprise environment, but if you can't guarantee uptime for your storage solution then all you're doing is adding failure modes.

It seems to me that your best bet is to go down to one server, which means cutting out either TrueNAS or Proxmox. Both can handle both storage (ZFS included!) and VMs, so ultimately it's a matter of which you like better.

Alternatively, if you're hosting other stuff on your NAS, you could consider keeping both servers but just getting a few SSDs to stick in your Proxmox mini PC to serve VMs. That may or may not be viable for your situation, but it's worth considering.

[–] [email protected] 4 points 1 year ago (2 children)

I edited my post to clarify that TrueNAS keeps more than just VMs. It has photos, documents etc. as well.

Generally when people run two different servers at home, they keep the VM drives on the hypervisor and just use the NAS for storing bigger things like media files

This is simple and makes sense as well. My TrueNAS is only 2 HDDs, which is not ideal for VMs. I could get a larger SSD/M.2 drive for the hypervisor, though the Lenovo M920q supports 1xM.2 and 1x2.5" drive.

Hosting VM drives over iSCSI works in an enterprise environment, but if you can’t guarantee uptime for your storage solution then all you’re doing is adding failure modes.

Well, my whole setup comes from the fact that I wanted to cosplay as an enterprise environment (famous last words for a homelabber). I've been powering the TrueNAS up and down a lot due to some electricity-related construction in my apartment, and it brought out this flaw in my setup. I guess a UPS would be in order, as another poster pointed out.

[–] [email protected] 3 points 1 year ago

In general if you lose your iscsi storage you are hosed.

The way around this is replication, where you write each byte to two locations, and pseudo load balancing, where you have an active and an inactive link. When power on one storage fabric goes down, you flip to the other. iSCSI isn’t really good for this use case.
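
To make the idea concrete, here is a toy sketch of "write to two locations, read from the active one, flip on failure". It is purely illustrative: `Backend` and `MirroredStore` are made-up in-memory stand-ins, not any real storage API, and real replication also has to handle resyncing a node that comes back.

```python
class Backend:
    """Hypothetical in-memory stand-in for one storage node."""

    def __init__(self, name):
        self.name = name
        self.blocks = {}
        self.online = True

    def write(self, key, data):
        if not self.online:
            raise ConnectionError(f"{self.name} is offline")
        self.blocks[key] = data

    def read(self, key):
        if not self.online:
            raise ConnectionError(f"{self.name} is offline")
        return self.blocks[key]


class MirroredStore:
    """Writes go to both backends; reads use the active one and flip on failure."""

    def __init__(self, primary, secondary):
        self.active, self.standby = primary, secondary

    def write(self, key, data):
        # Synchronous replication: the write only counts once both copies exist.
        self.active.write(key, data)
        self.standby.write(key, data)

    def read(self, key):
        try:
            return self.active.read(key)
        except ConnectionError:
            # Fail over: the standby becomes the active path.
            self.active, self.standby = self.standby, self.active
            return self.active.read(key)
```

With this shape, taking the primary offline after a write still lets `read` succeed from the other copy, which is exactly what a single iSCSI target can't give you.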

[–] [email protected] 3 points 1 year ago

Well, my whole setup comes from the fact that I wanted to cosplay as an enterprise environment

I feel that. Experimenting I get, but I'd never have it be my primary vm backing storage. Esp. not on a 1Gb network, no ups, no redundancy.

Someone else suggested local vm storage and a PBS VM on the TrueNas box. I think that's a solid solution to consider.

[–] [email protected] 3 points 1 year ago

VMs in ESXi have the same behavior when iSCSI connection is lost then restored later. Windows with iSCSI drive mounts shows the same behavior in that scenario too.

UPS would be a great addition no matter what option you choose.

[–] [email protected] 1 points 1 year ago* (last edited 1 year ago)

Hey, OP here again.

Here's what I ended up with:

  • upgrading my TrueNAS CORE to TrueNAS SCALE - it was really easy, just upload a 1.3GB update file through the web UI. CORE's apps/plugins are based on BSD jails, whereas SCALE apps are based on Kubernetes/Docker, so I can run any arbitrary Docker container from Dockerhub as I please, rather than being limited to BSD jails

  • migrating all the VMs/LXCs to matching TrueNAS SCALE Applications. So e.g. my hand-made Navidrome LXC was migrated to the TrueNAS SCALE Application. Sometimes there was no equivalent TrueNAS app for what I was using - e.g. Forgejo, so I just ran an arbitrary container from dockerhub.

  • decommissioning the Proxmox mini-pc (Lenovo M920q). I'll sell it later or maybe turn it into a pfSense router.

I installed a custom TrueNAS app repository called Truecharts. It has some apps that the default repo doesn't have, and it also has a nice integration with Ingress (Traefik), which allows you to easily create a reverse proxy using just the GUI.

I have yet to figure out how to set up Let's Encrypt for the services I made available to the Internet. I can no longer do things the Linux way, I must do it the Kubernetes way, so I'm kind of limited. Looks like HTTP01 challenges don't work yet and I'll have to use DNS01.
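
For context on what DNS01 asks for: instead of serving a file over HTTP, the ACME client publishes a TXT record at `_acme-challenge.<domain>` whose value is the unpadded base64url SHA-256 digest of the key authorization (per RFC 8555). A minimal sketch of that computation follows; the function name and the sample key authorization are made up for illustration, and in practice the ACME client does this for you.

```python
import base64
import hashlib


def dns01_record(domain, key_authorization):
    """Return the (name, value) of the TXT record a DNS-01 challenge expects."""
    digest = hashlib.sha256(key_authorization.encode("ascii")).digest()
    # base64url without '=' padding, as ACME requires
    value = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return f"_acme-challenge.{domain}", value


# Hypothetical key authorization: "<token>.<account key thumbprint>"
name, value = dns01_record("example.com", "sample-token.sample-thumbprint")
```

Since validation happens entirely through DNS, it doesn't depend on the ingress being able to answer HTTP on port 80.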

Looking back, I'm happy I consolidated. The hypervisor was idling all the time - so what's the point of having a second machine? Also, the only centralized machine has IPMI, so I have full remote control, and I'll hopefully never have to plug a VGA cable again. Of course, there's no iSCSI fault path anymore, though I'm happy I got to experiment with it.

The downside is as I said - I'm forced to do things the Kubernetes/Docker way, because that's what TrueNAS uses and that's the abstraction layer I'm working on. Docker containers are meant for running things, not for portability. I'm sad that I can't just pack things up in a nice LXC and drag it around wherever I please. Still, I don't think I'll be switching from TrueNAS, so perhaps portability isn't that big of a deal.

I'm also sad that I ... no longer have a hypervisor. Sure, SCALE can do VMs, but perhaps keeping TrueNAS virtualized would give me the best of both worlds.

[–] [email protected] 1 points 1 year ago* (last edited 1 year ago) (2 children)

It'd be nice to have Proxmox and TrueNAS side by side on one machine, but since TrueNAS forums are against the virtualization of TrueNAS (yes I know people do that, but I'm not willing) I'm somewhat stuck with having to have one bare metal machine per appliance.

[–] [email protected] 3 points 1 year ago

I’m sure you’ve heard plenty through the forums, but Truenas virtualized is perfectly fine so long as you’re passing through an HBA directly. It doesn’t affect reliability any, but it doesn’t add any features either.

“Can I virtualize Truenas” is probably the second most popular question after “do I really need ECC ram”

[–] [email protected] 3 points 1 year ago (1 children)

I took my truenas some months ago and installed proxmox on that server instead.

I can't figure out from your post what you need truenas for. Proxmox runs zfs, etc.

I made an lxc container with samba to share the discs in proxmox.

[–] [email protected] 1 points 1 year ago

I edited my post to clarify. TrueNAS also keeps documents, photos, torrents, music. I also use the mount feature so that the music server LXC can access music

[–] [email protected] 1 points 1 year ago* (last edited 1 year ago)

Acronyms, initialisms, abbreviations, contractions, and other phrases which expand to something larger, that I've seen in this thread:

Fewer Letters   More Letters
ESXi            VMWare virtual machine hypervisor
LXC             Linux Containers
NAS             Network-Attached Storage
SAN             Storage Area Network
SSD             Solid State Drive mass storage

5 acronyms in this thread; the most compressed thread commented on today has 9 acronyms.

[Thread #122 for this sub, first seen 9th Sep 2023, 18:55]