From what I've heard (take this with a huge grain of salt), the posts themselves shouldn't take up much of your storage. The biggest thing that could take up your storage is images, but they are only stored on the instance that hosts the community they were posted in.
Do you have any suggestions for what the 1.5GB is then?
It's likely the Docker images, and maybe the Docker build cache if they built from source instead of using the Docker Hub image.
I've been up for about a day longer than OP, and my Lemmy data is still under 800MB. OP either included non-Lemmy data in that math, or is subscribed to way more communities than me. My storage usage has been growing much faster today with all the extra activity, but I won't have to worry about storage space for about a month even at this rate.
And that's assuming Lemmy doesn't automatically prune old data. I'm not sure if it does or not. But if it doesn't, I imagine I'll see posts in about 2-3 weeks talking about Lemmy's storage needs and how to manage it as an instance admin.
EDIT: Turns out ~90% of my Lemmy data is just for debugging and not needed:
https://github.com/LemmyNet/lemmy/issues/3103#issuecomment-1631643416
It would be cool to hear back from you guys hosting in about a week or so, to see if it'll just grow linearly, or if it slows down at some point.
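If growth does turn out to be roughly linear, a back-of-the-envelope projection is easy to sketch. This is only a minimal sketch: `days_until_full` is a made-up helper name, and the inputs are whatever you measure on your own instance, not figures from this thread.

```shell
#!/bin/sh
# Estimate how many days remain until a disk fills up,
# ASSUMING storage keeps growing at the same average rate (linear growth).
#   $1 = MB used right now
#   $2 = days the instance has been running
#   $3 = total disk capacity in MB
days_until_full() {
    awk -v used="$1" -v days="$2" -v cap="$3" 'BEGIN {
        if (days <= 0 || used <= 0) { print "n/a"; exit }
        rate = used / days                  # average MB written per day so far
        printf "%d\n", (cap - used) / rate  # whole days left at that rate
    }'
}
```

For example, `days_until_full 1000 2 25000` (1000 MB used after 2 days on a 25000 MB disk, all illustrative numbers) prints 48. If usage levels off over time, the real answer will be longer than this estimate.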
Sure, I'm curious too. I'll keep an eye on the usage!
I used the ansible route to get going. I am subbed to ~150 communities currently. Some of those won't stay, but for now I am subbing to almost anything to see how that affects disk usage. I am interested to see how, or if, it levels off over time and what a week or two out looks like. I expect by then we will all have many more tips for each other as we trial and error our way through.
Here's my current usage:
Ahhhh, image posts are where your usage is going! Makes sense, my instance is just for my account and I don't submit anything. Your postgres size is more or less in line with where mine was at your uptime. I'm using Docker Compose so I'm only considering the size of the volumes in my metrics, not the image sizes or anything.
Yeah, images are where the main bulk of the storage is going. Interestingly, my instance is also just for my account presently and I have not submitted any images until my screenshot above. So these images are just those that are being pulled from other instances. I was under the impression that images were hosted from their respective instance and not saved locally, so I am curious to see how this plays out long term.
Thumbnails are stored locally, I believe.
Confirmed. Investigated that earlier.
If it's only thumbnails, and it only impacts the local instance, I presume you can just set up a cron job to clear out old thumbnails regularly if space is a huge issue.
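A cautious first step for that idea might look like the sketch below. It only lists files older than a given age and deletes nothing, because nothing in this thread confirms that pict-rs copes with files being removed behind its back. `list_old_files` is a hypothetical helper name, and the directory you point it at is your own choice.

```shell
#!/bin/sh
# List (do NOT delete) regular files under a directory that have not been
# modified in more than N days.
# ASSUMPTION: whether these files can be removed safely is unverified,
# so this is a dry run that only prints candidates for review.
#   $1 = directory to scan (e.g. your pictrs volume)
#   $2 = age threshold in days
list_old_files() {
    dir="$1"
    days="$2"
    find "$dir" -type f -mtime +"$days" -print
}
```

Once you trust the output, the same script could be called from a daily cron entry, but check how pict-rs tracks its files before swapping `-print` for anything destructive.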
It's probably from the instance finding other instances and communities and saving them locally. But I don't know too much about how it actually works so I could be wrong. I also heard that they are only stored locally if someone on the instance subscribes to a community, so if that is the case my theory wouldn't make sense.
No worries, alright
Using Oracle's "Always free" instances.
4 vCPU (ARMv8, not sure about the speed), 24 GB RAM, 200 GB flash storage
This is... Free? TIL
It is but Oracle can take it back without justification.
Yeah exactly, I wouldn’t recommend it for anything production grade if you’re not paying
Until everyone starts doing it, Oracle can be...fun to deal with
4vcpu (Ryzen), 8GB RAM, 256gb disk (which will be expanded when it gets to like 60% full). Not too worried about storage unless I get a bunch of image-happy users, text all comes in as json and goes straight to Postgres so it’s not a concern.
How many users do you have? Not starting a server any time soon, just curious. You seem to have one of the bigger ones in this thread. Are you using it privately, or are you public?
Mine is public, yes. Not sure how many active users I have, 28 signups but my sidebar shows 5 monthly active users so far. I imagine this will pick up once people start commenting and posting more.
I selfhost on my own homeserver. At the moment, I've spared it 2 cores, 1GB of RAM and 32GB of NVMe storage mirrored.
1 vCPU, 1GB Ram, 50GB storage using the smallest x86_64 compute instances on Oracle Cloud. Qualifies for always free which is nice while I'm simply testing out a personal server. It's working just fine within those constraints. For now, at least.
Like you, I'm worried about storage. I would like to run it from home, but I live in the woods and my internet isn't reliable enough.
If you run from home I would say to take precautions, like using a cloudflare tunnel.
I'm using a Ramnode VPS since I had some unused credit I wanted to use up. 2 vcpu, 1 GB ram, and 35 GB ssd.
Seems to be working well enough so far, but right now it's just me. If I open up to more users, I might need to upgrade, but we'll cross that bridge when we get there.
Edit: I may have spoken too soon; had to reboot the server due to low memory. Hopefully a swap file will alleviate that a bit, but I might have to upgrade the RAM on this server. We'll see.
Yeah, 1gb is definitely near the lower limit.
Currently a 1CPU/2GB RAM Linode instance for 26 users. Linode's pricing gets insane as you scale up though, so I will definitely be looking elsewhere if I need to scale much bigger. I think I could get away with 1GB of RAM at half the cost right now, but I'm also hosting a Matrix homeserver on this VPS and Synapse is a hungry boy.
It's definitely overkill, but right now I'm hosting it on my nomad cluster. It only has 4cpu/8gb allocated at the moment but will autoscale (vertically) if needed. I already have a separate postgres cluster used for other things so I'm just borrowing that for now too. I haven't tried running multiple instances yet but I'll probably test that out this week to see how/if it works.
I have thought about running my own, following this for the info.
I just spun up my own instance. Trying it out. New to hosting lemmy. How do you guys list the disk usage? Where is lemmy? /Newb
I used the ansible method to get running and I am using the default paths. If you are also using the default paths, you can find your data in /srv/lemmy/<domain>/ . This location will hold your configuration files and your volumes directory. The volumes directory holds postgres (the database), pictrs (your image hosting) and lemmy-ui (the web-ui for lemmy). To see how much disk space you are using:
cd /srv/lemmy/<domain>/
du -hc --max-depth=1 volumes
Replace <domain> with your domain.
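If you want to watch how each volume grows over time rather than take a one-off reading, something along these lines could work. It is a minimal sketch: `volume_usage_csv` is a made-up helper name, and the CSV layout (date, directory name, size in KB) is just one reasonable choice.

```shell
#!/bin/sh
# Print one CSV line per immediate subdirectory of a base directory:
#   date,subdirectory_name,size_in_KB
#   $1 = base directory (e.g. the volumes directory mentioned above)
volume_usage_csv() {
    base="$1"
    today=$(date +%Y-%m-%d)
    for d in "$base"/*/; do
        [ -d "$d" ] || continue                 # skip if the glob matched nothing
        size_kb=$(du -sk "$d" | cut -f1)        # first tab-separated field is the size
        printf '%s,%s,%s\n' "$today" "$(basename "$d")" "$size_kb"
    done
}
```

Appending the output to a file once a day (for example `volume_usage_csv volumes >> usage.csv`, where `usage.csv` is an arbitrary file name) gives you a simple history to compare after a week or two.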
Thanks! Not so bad apparently.
8.0K volumes/lemmy-ui
271M volumes/postgres
424M volumes/pictrs
If you are using docker just look at the volumes of the containers of the server, the UI and the two databases (one for stuff, one for pictures)
I have it running on my microk8s single node cluster. It's a dual xeon (40 cores total) with unfortunately only 64gb ram. The motherboard's max. I got a das with 72tb storage, currently in btrfs mirrored. Hoping btrfs raid-like configs become more usable in the future. I was using zfs but I always ran into issues.
I am running mine on an old Optiplex that I bought to run pihole and plex. It seems to be working fairly well, although I am still trying to understand how all of this works.
- 1 vCPU 2.9ghz
- 1 GB DDR4 Memory
- 25 GB NVMe/SSD Storage
~5 USD a month. Working great for personal use and, I'd imagine, a handful of users. Hosted in a data center that is very close to me.
Also fwiw: 4 days of lemmy. I am subbed to a bunch of stuff. I've only uploaded like three pictures to my instance... All that space is thumbnails from other instances.
692M ./postgres
8.0K ./lemmy-ui
499M ./pictrs
1.2G .
1.2G total
There's my current disk usage. I've gone wild subscribing to just about every community I come across to see how the storage adds up. Right now I've got ~150 communities subbed. We'll see how it goes and when I'll need to expand the storage.
Not too bad... How long has your instance been up? Next thing I want to investigate is the postgres database itself. On my to-do list.
Been running ~22 hours at this point
It's currently running on a proxmox VM on my R720. Probably gonna shift it to a VPS at some point.
Stack of Pi’s and a J4125 for streaming works well for my usecase
shower thoughts... and still on my first cup of coffee, so more just musing than anything...
if storage is the concern, I wonder if the lemmy roadmap might one day include an option to use cloud based storage?
azure storage at .06/gb per month is likely cheaper and more redundant than local storage, even if you factor in calls to the blob, which could be lowered via caching.
cloud storage might one day lead to an option for smaller self hosters to opt into a shared blob instance where the cost is shared.
in this scenario, security to ensure the cloud blob couldn't be deleted would need to be thought through (maybe splitting the password among multiple admins, with each having one part of the whole?) but it might be one way to encourage more self hosting for the compute side of things.
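To put the per-GB figure quoted above into perspective, here is the arithmetic as a tiny sketch. The price is the commenter's number, not a verified quote, `monthly_storage_cost` is a made-up helper name, and request or egress charges are deliberately left out.

```shell
#!/bin/sh
# Monthly storage cost = stored GB * price per GB per month.
# ASSUMPTION: flat per-GB pricing only; API calls and bandwidth are ignored.
#   $1 = GB stored
#   $2 = price per GB per month (e.g. 0.06, as quoted in the comment above)
monthly_storage_cost() {
    awk -v gb="$1" -v price="$2" 'BEGIN { printf "%.2f\n", gb * price }'
}
```

With those assumptions, `monthly_storage_cost 100 0.06` prints 6.00, so 100 GB of images would come to about 6 per month in whatever currency the price is quoted in.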
If I self-hosted my own Lemmy on my home server, just for myself, and uploaded images to it, another instance would cache an image when one of its users viewed it. If I later deleted the image to free up server space, would it still show for someone else on that instance because it was previously cached?
Wondering if rolling storage is possible eventually, where an archive of posts older than 2 years is performed and data deleted.
I'm running it on an LXC container that lives on a proxmox cluster.
2 vCPU at 2.6Ghz. 2GB of RAM (it's LXC so I can allocate more if needed...) and 40GB of SSD-backed CEPH storage. I actually just upped this to 150GB because I can see the velocity of data I'm storing for this. I have about 2 more TB of storage on the CEPH cluster before I need to order a few more SSDs.
I have terrible internet, but I do have a static address. And they're installing fiber in my neighborhood right now. So that will change soon too.
Based on what I've seen thus far, I suspect I can handle about a hundred users on this without much issue.
If I can eventually get it to work, I'll be hosting my personal Lemmy instance on a Hetzner CX21 VPS (2 CPU, 4GB RAM). Unfortunately after about a dozen attempts, with many variables tweaked to try to narrow down the issue, I still get nothing but a spinning button when I try to login after installation. I find it so odd that some people have no problem getting it up-and-running, and it simply won't work for others.
Hetzner VPS, 2 vCPU, 8GB RAM & 80GB SSD (fully intended to set up a mail server on that server at one point, hence the 8GB RAM)
Good luck getting through spam filters
We have our instance running on a colo server. I am likely going to rebuild ansible to use a custom pict-rs docker container which offloads images to object storage so I don't need to store media locally on said server.
That is a good idea. I did a quick and dirty rclone mount for the pict-rs storage, partly because it was fast and partly because I was curious how it would hold up.