this post was submitted on 03 Apr 2024
27 points (100.0% liked)

sh.itjust.works Main Community

Home of the sh.itjust.works instance.


It might not just be lemmy.world, but several times recently I've made posts (usually to !helldivers2.lemmy.ca) that I check throughout the day. Then like 8 hours after I've made the post, I'll see comments appear from people on lemmy.world that are several hours old.

It seems like a queue can build up and then suddenly get flushed.

I see it happening the other way as well. This morning I made a post from my sh.itjust.works account and then switched over to a lemmy.world account and can't see it. So the delay is going both ways.

top 18 comments
[–] [email protected] 13 points 5 months ago (1 children)

We are currently seeing a backlog in outbound federation towards lemmy.world.
Really, this graph says it all... 📈

I've restarted the containers, hoping it starts catching up.

[–] [email protected] 7 points 5 months ago (1 children)

Oh yeah, that looks pretty telling. Also, just wanted to say thanks for how awesome you guys are on the admin team. The work you guys do and how responsive you are is freaking impressive. Thanks!

[–] [email protected] 5 points 5 months ago (2 children)

Sadly that line keeps going up.
I don't have much of an update, but it's really specific to outbound federation, and only with lemmy.world.
i.e., inbound works fine, and outbound to other instances works fine.

[–] [email protected] 2 points 5 months ago (1 children)

Hmm sounds like it's a problem on the lemmy.world side then.

[–] [email protected] 3 points 5 months ago

Not necessarily.
Thing is, after a few consecutive failures, the next retry gets pushed further out, so it isn't scheduled until a while from now (later tonight).
This retry spacing at least explains why the backlog hasn't gone down after the restart.
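
To illustrate the retry spacing described above, here is a minimal sketch of a generic exponential backoff. The doubling base, the one-day cap, and the function name are assumptions for illustration only, not Lemmy's actual schedule.

```python
def retry_delay_seconds(consecutive_failures: int,
                        base: float = 2.0,
                        cap: float = 86400.0) -> float:
    """Return the wait before the next attempt.

    Hypothetical schedule: base ** failures seconds, capped at `cap`.
    """
    if consecutive_failures <= 0:
        return 0.0  # no failures yet, so send immediately
    return min(base ** consecutive_failures, cap)


# Show how quickly the gap between attempts grows.
for failures in (1, 5, 10, 15):
    delay = retry_delay_seconds(failures)
    print(f"{failures:2d} failures -> wait {delay:8.0f} s ({delay / 3600:.2f} h)")
```

With doubling like this, a handful of failures only costs seconds, but around 15 consecutive failures the next attempt is hours away, which would match a queue that sits untouched until "later tonight" even after a restart.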

[–] testAccount 1 points 5 months ago
[–] [email protected] 7 points 5 months ago* (last edited 5 months ago) (1 children)

I'm experiencing the same and only noticed it when posting to lemmy.world communities. I'd navigate to their instance directly and see both comments that hadn't federated yet and my own comment missing. I haven't observed this on other instances though, which leads me to believe this might be just a lemmy.world problem?

Edit- have an example here (open in browser) https://lemmy.world/post/13836315 https://sh.itjust.works/post/17236782

Note that my comment is still missing from their instance nearly 24h later.

[–] [email protected] 7 points 5 months ago (1 children)

I think I saw somewhere that lemmy.world has grown way bigger than most other instances - probably because the name is less confusing to new users. I'm wondering if they're running into scaling issues as a result.

[–] [email protected] 5 points 5 months ago* (last edited 5 months ago) (1 children)

Some of the instance admins have been poking at this lagging federation issue for a few weeks now, trying to figure it out.

The Reddthat admin noticed that Lemmy's federation process can't seem to meet demand in some cases. Reddthat has had trouble staying current with lemmy.world due to the network latency between Europe and Australia: https://sh.itjust.works/comment/9807807. I understand that lemmy.nz has seen this, too.

Another thing that has been noticed is spamming of actions from kbin instances. It looks like some process gets stuck in a loop on the kbin side. https://lemmy.world/comment/8961882

I'm sure there are other contributing problems that still aren't well understood. This software is a work in progress, after all.

[–] [email protected] 2 points 5 months ago

These posts are enlightening. I think reddthat is right: Lemmy will need to start batching messages and sending them at longer intervals soon, because the current approach is pretty unsustainable. What an interesting scaling issue to encounter.
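
A rough back-of-the-envelope sketch of why latency and batching matter here. It assumes activities are sent one request at a time, each waiting for the previous response, so the round-trip time caps the send rate. The round-trip time, arrival rate, and function names are all hypothetical numbers picked for illustration, not measurements from any instance.

```python
def max_send_rate(round_trip_seconds: float, batch_size: int = 1) -> float:
    """Upper bound on activities/second when each request must finish
    before the next one starts."""
    if round_trip_seconds <= 0:
        raise ValueError("round_trip_seconds must be positive")
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return batch_size / round_trip_seconds


def backlog_growth(arrival_rate: float, round_trip_seconds: float,
                   batch_size: int = 1) -> float:
    """Activities/second by which the queue grows (0.0 if the sender keeps up)."""
    return max(0.0, arrival_rate - max_send_rate(round_trip_seconds, batch_size))


RTT = 0.25      # hypothetical round trip in seconds
ARRIVALS = 6.0  # hypothetical new activities per second

print(max_send_rate(RTT))                            # 4.0, one activity per request
print(backlog_growth(ARRIVALS, RTT))                 # 2.0, queue keeps growing
print(max_send_rate(RTT, batch_size=10))             # 40.0, ten activities per request
print(backlog_growth(ARRIVALS, RTT, batch_size=10))  # 0.0, sender keeps up
```

Under these assumptions the sender can never exceed 1 / RTT activities per second no matter how fast either server is, which is why a far-away instance can fall behind a very busy one, and why batching several activities per request would raise the ceiling.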

[–] [email protected] 5 points 5 months ago (1 children)

I may be celebrating early and jinxing myself in the process, but the line is going down.

[–] [email protected] 2 points 5 months ago

Lol the way that's dropping I'd say a celebration is in order!

[–] [email protected] 4 points 5 months ago (1 children)

This does happen from time to time where the federation queue gets overwhelmed, usually with Lemmy.world for whatever reason. It appears that our outbound federation queue is really backed up right now.

@[email protected] just restarted the service so it may be working better now.

See our Matrix chat for more details.

[–] [email protected] 4 points 5 months ago

Awesome, you guys rock!

[–] [email protected] 4 points 5 months ago (1 children)

Dbzer0 recently had issues with long federation queues with big instances, due to delays in database access caused by geographic distance between frontend and backend. Maybe this is caused by something similar?

[–] [email protected] 3 points 5 months ago

Interesting. So by having the database and front end servers in different data centers, they couldn't keep up with all the requests from the big instances. Sounds like upgrading to Lemmy 0.19.3 changed how the federation queues worked, where it wasn't a problem before. I wonder if something similar could be going on here.

[–] [email protected] 2 points 5 months ago (1 children)

I only have a delay when I'm using a VPN.

[–] [email protected] 2 points 5 months ago

Interesting. I'm not using a VPN in my case.