this post was submitted on 09 Jan 2024

69 points (98.6% liked)

Selfhosted

39435 readers

3 users here now

A place to share alternatives to popular online services that can be self-hosted without giving up privacy or locking you into a service you don't control.

Rules:

Be civil: we're here to support and learn from one another. Insults won't be tolerated. Flame wars are frowned upon.
No spam posting.
Posts have to be centered around self-hosting. There are other communities for discussing hardware or home computing. If it's not obvious why your post topic revolves around selfhosting, please include details to make it clear.
Don't duplicate the full text of your blog or github here. Just post the link for folks to click.
Submission headline should match the article title (don’t cherry-pick information from the title to fit your agenda).
No trolling.

Resources:

Any issues on the community? Report it using the report flag.

Questions? DM the mods!

founded 1 year ago

MODERATORS

[email protected]

Kubernetes? docker-compose? How should I organize my container services in 2024? (lemmy.world)

submitted 10 months ago by [email protected] to c/[email protected]

46 comments fedilink hide all child comments

Currently, I run Unraid and have all of my services' setup there as docker containers. While this is nice and easy to setup initially, it has some major downsides:

It's fragile. Unraid is prone to bugs/crashes with docker that take down my containers. It's also not resilient so when things break I have to log in and fiddle.
It's mutable. I can't use any infrastructure-as-code tools like terraform, and configuration sort of just exist in the UI. I can't really roll back or recover easily.
It's single-node. Everything is tied to my one big server that runs the NAS, but I'd rather have the NAS as a separate fairly low-power appliance and then have a separate machine to handle things like VMs and containers.

So I'm looking ahead and thinking about what the next iteration of my homelab will look like. While I like unraid for the storage stuff, I'm a little tired of wrangling it into a container orchestrator and hypervisor, and I think this year I'll split that job out to a dedicated machine. I'm comfortable with, and in fact prefer, IaC over fancy UIs and so would love to be able to use terraform or Pulumi or something like that. I would prefer something multi-node, as I want to be able to tie multiple machines together. And I want something that is fault-tolerant, as I host services for friends and family that currently require a lot of manual intervention to fix when they go down.

So the question is: how do you all do this? Kubernetes, docker-compose, Hashicorp Nomad? Do you run k3s, Harvester, or what? I'd love to get an idea of what people are doing and why, so I can get some ideas as to what I might do.

you are viewing a single comment's thread
view the rest of the comments

[–] [email protected] 4 points 10 months ago* (last edited 10 months ago) (15 children)

I see no one else commented my stack, so I suggest:

Nomad for managing containers if you want something high availability. Essentially the same as k8s but much much much simpler to deploy, learn, and maintain. Perfect for homelabs imo. Most of the concepts of Nomad translate well to k8s if you do want to learn it later. It integrates really well with Terraform too if you are also hoping to learn that, but it's not a requirement.

NixOS for managing the bare metal. It's a lot more work to learn than say, Debian, but it is just as stable, and all configuration will be defined as code, down to the bootloader config (no bash scripts!). This makes it super robust. You can also deploy it remotely. Once you grow beyond a handful of nodes it's important to use a config management tool, and Nix has been by far my favourite so far.

If you really want everything to be infra-as-code, you can manage cloud providers via Terraform too.

For networking I use wireguard, and configure it with NixOS. Specifically, I have a mesh network where every node can reach every node without extra hops. This is a requirement if you don't want a single point of failure (hub and spoke) to disconnect your entire cluster.

Everything in my setup is defined 'as-code', immutable, and multi-node (I have 7 machines) which seems to be what you want, from what you say in your post. I'll leave my repo here, and I'm happy to answer questions!

My opinions on the alternatives:

Docker compose is great but doesn't scale if you want high availability (ie, have a container be rescheduled on node failure). If you don't want higher availability, anything more than docker might be overkill.

Ansible and Puppet are alright but are super stateful, and require scripting. If you want immutability you will love Nix/NixOS

k8s works (I use it at work) but is extremely hard to get right, even for well-resourced infra teams. Nomad achieves the same but with the leanings of having come afterwards, and without the history.

[–] [email protected] 0 points 10 months ago (7 children)

Hey, your stack is pretty similar to mine. One thing I recently started testing is Seaweedfs. I saw it listed in your repo too, how are you liking it so far? And do you use it on all of your nodes?

[–] [email protected] 2 points 10 months ago (1 children)

I struggled a bit to get it up and running well, but now I am happy with it. It's not too hard to deploy (at least easier than the alternatives), it has CSI which for me was big, and it has erasure coding. The dev that maintains it (yes, the one dev) is very responsive.

It has trade offs, so depending on your needs, I recommend it. Backing store for stateful workloads like postgres DBs? Absolutely not. Large S3 store (with an option for filesystem mount) for storing lots of files? Yes! In that regard it's good for stuff like Lemmy's pictrs or immich. I use it as my own Google drive. You can easily replicate in your own cluster, or back it up to an external cloud provider. You can mount it via FUSE on your personal machine too.

Feel free to browse through my setup - if you have specific questions I am happy to answer them.

[–] [email protected] 1 points 10 months ago (1 children)

Thanks! I'll do some testing over the weekend and see how it goes.

While I'd love to be able to use it for postgres, I figured that wouldn't work out well so probably won't try it any time soon. I do have several apps that use sqlite databases though, do you think those would have any issues? e.g. trilium, ntfy, ghost

The main downside to most of the distributed/clustered storage that I've tried is they always seem to corrupt sqlite db files due to not supporting locking or some other posix feature. Reading through some older github issues, it looks like that is something the dev of seaweedfs fixed hopefully.

[–] [email protected] 2 points 10 months ago (2 children)

The problem with using seaweedfs to a back your DBs is more on the filesystem than the implementations of POSIX features. When you are writing to a file, and the connection to seaweedfs breaks (container restart, wifi, you name it), then you might end up with a half-written file. If you upload pictures, this is unlikely, but DBs are doing several writes per second usually. So it is more likely one of those gets interrupted. In my case, my grafana sqlite DB would get corrupted every other week.

What I recommend is using DBs natively in your node's filesystem, and backing them up to seaweedfs periodically instead. That way your DBs 'work' but you can get them running again, and the backup is replicated in the distributed filesystem.

[–] [email protected] 1 points 10 months ago (1 children)

What I do right now is I have a rclone sidecar container that uploads files in a directory every few seconds, and I also have another init sidecar that runs before the main application and downloads those files (incl sqlite dbs) to the normal disk. This works okay but feels pretty clunky and can still result in stuff getting corrupted because I'm just backing up the db files and not using any sqlite commands to actually back up the db to another file that isn't in-use first.

How do you handle a job going from one nomad node to another? Or do you pin jobs like grafana to specific hosts?

[–] [email protected] 1 points 10 months ago

Nomad has host volumes - so you can tell it to mount a folder from the machine on the container, and it will only schedule that container on machines that have that folder. So yes, effectively you pin the workload, thus introducing a SPOF - I do not love it but Grafana only supports sqlite and postgres, so making those HA would require failover setups which is a bit much for a homelab :')

For backing up, you can use the sqlite command periodically (do cron job or Nomad periodic job) and then upload the backup to some external, safe storage (could be seaweedfs or S3!). For postgres you can use something like this.

[–] [email protected] 1 points 10 months ago (1 children)

That's an interesting issue. Do you think the problem would be the same for any CSI plugin? I'm thinking of using my NAS as the storage brains of the operation and hooking it up with NFS or something, but would that have issues with stateful stuff like DB's too?

[–] [email protected] 2 points 10 months ago

I have never used NFS, but I think it would fare much better than seaweedfs because it uses Fuse to implement CSI. So for NFS I am sure the protocol would consider half-assed writes

would be the same for any CSI plugin

No, it would depend on the CSI plugin and how it is implemented. Ceph for example I know it has several, and cloud providers offer CSI volumes for their block storage (AWS EBS, GCP PD), and they will all perform differently. See this comment from a seaweedfs issue:

[...] It is always better to run databases on host volumes if you can (or on volumes provided by AWS EBS or similar). But with Seaweedfs especially if you are running postgres with seaweedfs-csi volume be prepared for data corruption. Seaweefs-csi uses FUSE, if anything happens to seaweedfs-csi (Nomad client restart, docker restart, OOM) mount will be lost and data corruption will happen.

Running on CEPH (since CEPH CSI using Kernel driver not FUSE) is acceptable if you fine with low TPS.

I found it was easier to make recoverable, backed up, host volumes than to make DBs run on high availability filesystems like seaweedfs (I admit I have not tried Ceph - the deployment looked a bit complicated/overkill for a homelab).

Postgres and sqlite are just not made for that environment. To run a high-availability DB, it is better to run a distributed DB made for that (think etcd, cassandra) than to run a non-distributed DB on top of a distributed filesystem.

Good luck! :)

load more comments (5 replies)

load more comments (12 replies)