this post was submitted on 20 Jun 2023

Selfhosted


Hi—I have some free Azure credits and would like to use them to host a personal Lemmy instance. I know Lemmy is containerised, but is there a preferred choice for hosting in Azure—AKS, Azure Container Apps, Container Instances? Also, any guidance on appropriate PostgreSQL configuration—I know there are some options around that.

Also, can anyone point me at what resource utilisation will look like for a Lemmy instance? I imagine disk space is more of a concern than compute usage.

[–] [email protected] 2 points 1 year ago* (last edited 1 year ago) (1 children)

Will you only be supporting yourself and maybe a small subset of users? If you don't need your instance to scale, you can (shameless self plug) try my deployment script to get yourself running.

It just uses the recommended Postgres configuration as seen in the deployment files in Lemmy's official repo. It would just be in a Docker volume on disk, so if you had thoughts of scaling in the future, and wanted to use a managed Postgres service, I would not recommend using my script.

I run an instance just for myself, and CPU usage is so low that pretty much anything you can get in the cloud will be good. Disk space is a much more important factor. In terms of just Lemmy-created data, my personal instance has stored about 6.2GB in the 10 days it has been running, and 2.4GB of that is just thumbnails. Note that this does not include other things that consume disk space, such as my Docker images or my Docker build cache, which I clear manually.

So, that is roughly 640MB of new data generated per day. Your experience will vary depending on how many communities you subscribe to, but that's a good rough estimate. Round it up to 700MB per day to have a safer estimate. But remember, this is with Lemmy's current rate of activity. If the number of posts and comments doubles or triples in the future, my storage requirements will likely go up considerably.
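To turn that daily figure into a budget, here is a minimal Python sketch of a straight-line projection. It is only an illustration: the function name `project_storage_gb`, the `growth_factor` parameter, and the 1000MB-per-GB conversion are my own assumptions, and it ignores any pruning or cleanup you might do along the way.

```python
def project_storage_gb(current_gb, daily_mb, days, growth_factor=1.0):
    """Estimate total disk usage in GB after `days` more days.

    Assumes constant daily growth (a simplification) and 1 GB = 1000 MB.
    `growth_factor` scales the daily rate, e.g. 2.0 if activity doubles.
    """
    added_gb = daily_mb * growth_factor * days / 1000
    return current_gb + added_gb


if __name__ == "__main__":
    # Figures from this thread: 6.2GB stored so far, 700MB/day as the safer estimate.
    for days in (30, 90, 365):
        same_rate = project_storage_gb(6.2, 700, days)
        doubled = project_storage_gb(6.2, 700, days, growth_factor=2.0)
        print(f"{days:>3} days: {same_rate:.1f} GB at today's rate, "
              f"{doubled:.1f} GB if activity doubles")
```

With those inputs, a year at today's rate comes out to about 262GB, so the cost of storage is worth checking before picking a disk size.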

I am genuinely not sure what long-term Lemmy maintenance looks like in terms of releasing disk space. I can clear my thumbnail data and be fine, but I wonder what's going to happen with the Postgres database. Is there some way to prune old data out of it to save space? Will my cloud storage costs become so unreasonable in a year that I'll have to stop hosting Lemmy? These are the questions I don't have answers to yet.

If there is something clever you can do to plan ahead and save yourself disk space costs in the future (like, are managed Postgres services cheaper than hosting the database on your own disk?), I'd recommend doing that.


EDIT: Turns out ~90% of my Lemmy data is just for debugging and not needed:

https://github.com/LemmyNet/lemmy/issues/3103#issuecomment-1631643416

[–] r0bbbo 1 points 1 year ago (1 children)

Thanks for the great reply. I'll take a look at your deployment script to see if that fits my needs. I only plan to use the instance for me and a handful of friends. Like you say, data retention is probably my biggest concern, so I'll look at the most sensible way to budget for that in Azure. Are there any numbers available from the major Lemmy instances? The lack of consideration for retention policies seems like a bit of an oversight. I might do some reading to see what the plan is here.

[–] [email protected] 2 points 1 year ago

I'm not sure if the other instances have published their numbers; I can only see what my own Docker volumes look like.

But if it helps you plan, you should know that federation only involves new data. When you set up a new instance, and federate with/subscribe to a community, it will only fetch an initial 20 posts (if that). From that point forward, you will receive a copy of all posts/comments posted to that community, but you will not have anything from before you federated. So you don't have to worry about mirroring the entirety of a community's history - I'd probably be out of disk space 3 times over if that were the case.

There are ways for users to retrieve "old" posts, but it's done on an individual basis, not in bulk.