this post was submitted on 14 Mar 2024
94 points (100.0% liked)

Blahaj Lemmy Meta

2312 readers
19 users here now

Blåhaj Lemmy is a Lemmy instance attached to blahaj.zone. This is a group for questions or discussions relevant to either instance.

founded 2 years ago

In the last 24 hours you've likely noticed that we've had some performance issues on Blåhaj Lemmy.

The initial issue occurred as a result of our hosting provider having technical problems. We use Hetzner, which provides hosting for approximately a third of the fediverse, so there was widespread chaos above and beyond us.

As of Lemmy 0.19.x, messages queue rather than being silently dropped when an instance is down, so once Hetzner resolved their issues, we had a large backlog of jobs to process. Whilst we were working through the queues, we were operational but laggy, and our messages were an hour or more behind. These queues aren't just posts and replies; they also include votes, so there can be a large volume of them. Each one needs to be remotely verified with the sending instance as we process it, so geographical latency also plays a part.
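To get a feel for why per-message remote verification hurts, here is a back-of-the-envelope sketch. The round-trip time and backlog size are illustrative assumptions, not Blåhaj Lemmy's actual figures:

```python
# Back-of-the-envelope: serial processing where each queued activity needs a
# remote verification round-trip before the next one can start.
# All numbers below are assumed for illustration.
rtt_seconds = 0.15                      # assumed round-trip to the sending instance
max_per_second = 1 / rtt_seconds        # serial throughput ceiling
backlog = 500_000                       # assumed queued activities (posts, replies, votes)
hours_to_drain = backlog / max_per_second / 3600
print(f"{max_per_second:.1f} msg/s -> {hours_to_drain:.1f} h to drain {backlog:,} items")
```

Even a modest 150 ms round-trip caps a serial queue at under seven messages per second, which is why a vote-heavy backlog takes hours rather than minutes to clear.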

As you can see from the graph, we are finally through the majority of the queues.

The exception is lemmy.world. Unfortunately, the Lemmy platform processes incoming messages from each remote instance sequentially (think of it as one serial queue per remote instance), which means Blåhaj Lemmy can't start processing a second lemmy.world message until we've finished processing the first.

Due to the size of lemmy.world, it is sending us new queue items almost as fast as our instance can process them, so the queue is coming down, but slowly. In practical terms, this means that lemmy.world communities are going to be several hours behind for the next few days.
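The "coming down, but slowly" behaviour is classic near-saturated queueing: drain time is the backlog divided by the gap between service rate and arrival rate. A minimal sketch, with made-up rates rather than real lemmy.world numbers:

```python
# Why a near-saturated serial queue drains slowly: the backlog shrinks only at
# the *difference* between the two rates. Rates below are assumed, not measured.
service_rate = 10.0    # activities/s we can process from one instance (assumed)
arrival_rate = 9.5     # activities/s that instance keeps sending (assumed)
backlog = 100_000      # assumed items already queued
drain_seconds = backlog / (service_rate - arrival_rate)
print(f"{drain_seconds / 3600:.1f} hours to clear the backlog")
```

Note that halving the arrival rate (or doubling throughput) would cut the drain time by an order of magnitude, because it is the narrow margin between the rates that dominates.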

For those who are interested, there is a detailed technical breakdown of a similar problem currently being experienced by reddthat, which explores the impact of sequential processing and geographical latency.

[–] [email protected] 9 points 8 months ago (1 children)

There are multiple queues, but each inbound instance queue is serial. At the moment it's not possible to split the incoming jobs from one instance across several concurrent queues, but one hopes it will be in the future. That would bypass the need for rate limiting.

[–] [email protected] 3 points 8 months ago

Interesting, gotcha. I'm not a backend dev, so I assume you have more info than I do on all this; just spitballing.