this post was submitted on 23 Aug 2023

Lemmy Server Performance


lemmy_server uses the Diesel ORM, which automatically generates SQL statements. There were serious performance problems in June and July 2023 preventing Lemmy from scaling. Topics include caching, PostgreSQL extensions for troubleshooting, Client/Server Code/SQL Data/server operator apps/server operator API (performance and storage monitoring), etc.

founded 1 year ago

Lemmy is highly unusual in its stance of not using Redis, Memcached, dragonfly... something. And look at all the CPU cores and RAM being used for what this week is reported as 57K active users across more than 1,200 instance servers.

Why no Redis, Memcached, dragonfly? These are staples of scaling an API.

Anyway, Reddit too started with PostgreSQL and was open source.

MONDAY, MAY 17, 2010

http://highscalability.com/blog/2010/5/17/7-lessons-learned-while-building-reddit-to-270-million-page.html

"and growing Reddit to 7.5 million users per month"

Lesson 5: Memcache
The essence of this lesson is: memcache everything.

They store everything in memcache:

1. Database data
2. Session data
3. Rendered pages
4. Memoizing (remembering previously calculated results of) internal functions
5. Rate-limiting user actions and crawlers
6. Storing precomputed listings/pages
7. Global locking
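To make the rate-limiting item concrete, here is a minimal Python sketch of a fixed-window counter of the kind a cache with expiring keys can support. It is only an illustration under stated assumptions, not Reddit's or Lemmy's code: the class name `FixedWindowRateLimiter`, the `allow` method, and the injectable clock are all made up for this example, and a plain dict stands in for memcached.

```python
import time
from typing import Callable, Dict, Tuple


class FixedWindowRateLimiter:
    """Fixed-window counter; a dict stands in for expiring cache keys (hypothetical)."""

    def __init__(self, limit: int, window_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so windows can be tested without sleeping
        # key -> (window start time, requests counted in that window)
        self._counters: Dict[str, Tuple[float, int]] = {}

    def allow(self, key: str) -> bool:
        """Return True if this request fits in the current window for key."""
        now = self.clock()
        window_start, count = self._counters.get(key, (now, 0))
        if now - window_start >= self.window_seconds:
            window_start, count = now, 0  # window expired: start a fresh one
        if count >= self.limit:
            self._counters[key] = (window_start, count)
            return False  # over the limit until the window rolls over
        self._counters[key] = (window_start, count + 1)
        return True
```

With a real cache server, the same idea would usually rely on an atomic increment plus a key expiry rather than a Python dict, so that all API processes share one counter.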

They now store more data in Memcachedb than in Postgres. It's like memcache but stores to disk. Very fast. All queries are generated by the same piece of control and are cached in memcached. Change-password links and their associated state are cached for 20 minutes or so. Same for captchas. It is used for links they don't want to store forever.
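The "cached for 20 minutes or so" idea can be sketched as a cache-aside lookup with a time-to-live. This is a rough Python illustration, assuming an in-process dict in place of memcached; `TTLCache`, `get_or_compute`, and the injectable clock are hypothetical names, not anything from Reddit's or Lemmy's code.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TTLCache:
    """In-process stand-in for a memcached-style cache with expiry (hypothetical)."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock  # injectable so expiry can be tested without sleeping
        self._store: Dict[str, Tuple[float, Any]] = {}  # key -> (expires_at, value)

    def get_or_compute(self, key: str, compute: Callable[[], Any]) -> Any:
        """Return the cached value for key, or call compute() and cache its result."""
        now = self.clock()
        entry = self._store.get(key)
        if entry is not None:
            expires_at, value = entry
            if now < expires_at:
                return value  # cache hit: the expensive call is skipped
        value = compute()  # cache miss or expired entry: do the real work
        self._store[key] = (now + self.ttl_seconds, value)
        return value
```

A caller would wrap a slow database query in `compute`, for example `cache.get_or_compute("post:123", lambda: run_query(123))`, where `run_query` is whatever function actually hits PostgreSQL (also hypothetical here).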

They built memoization into their framework. Results that are calculated are also cached: normalized pages, listings, everything.
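For readers unfamiliar with memoization built into a framework, a minimal Python sketch might look like the decorator below. It is an assumption-laden illustration rather than Reddit's implementation: the `memoize` decorator, the module-level `_CACHE` dict (standing in for memcached), and the `render_listing` function are all invented for this example.

```python
import functools
from typing import Any, Callable, Dict, Hashable, Tuple

# A plain dict stands in for memcached here (assumption for illustration).
_CACHE: Dict[Tuple[str, Tuple[Hashable, ...]], Any] = {}


def memoize(func: Callable[..., Any]) -> Callable[..., Any]:
    """Cache results of func, keyed by its qualified name and positional arguments."""

    @functools.wraps(func)
    def wrapper(*args: Hashable) -> Any:
        key = (func.__qualname__, args)
        if key not in _CACHE:
            _CACHE[key] = func(*args)  # computed once, reused afterwards
        return _CACHE[key]

    return wrapper


@memoize
def render_listing(community: str, page: int) -> str:
    """Hypothetical expensive function, e.g. building a listing page."""
    return f"<listing community={community} page={page}>"
```

The sketch never invalidates entries, so a real deployment would still need an expiry or an explicit purge when the underlying posts or comments change.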

[email protected] 2 points 1 year ago* (last edited 1 year ago)

There is some kind of social dynamic in Lemmy's 4.5 years of development under which foundational tools like Redis, Memcached, dragonfly were avoided, even when people were beating a path to Lemmy's door in May 2023. Lemmy servers crashed, one after the other. I don't think there will ever be any statistics on all the times the pages didn't load and the 4-year-old Lemmy app's SQL statements couldn't cope with the comment and post growth for even 20,000 users.

Sure, people will leave Reddit and Twitter/X and Threads and whatever again later in 2023 and in 2024. But it's still a pretty odd social situation that May 2023 came along and performance problems were holding the project back... and Redis, Memcached, dragonfly were not put on the table as routine tools of the trade.

On Monday, May 17, 2010, Reddit spelled out all the performance and scaling problems they had in an open presentation. The source code had been open since June 18, 2008... long before this May 17 presentation.

If anything Reddit should have turned itself into a non-profit organization and kept selling reddit awards after ChatGPT came on the scene. The 2 month move to "charging for the API" was the wrong direction.

With Twitter and Elon Musk's X, it just seems popular to make things worse: crashing servers, broken features, wild changes.

Clickbait news and anti-science popularity still seem to keep growing. Reddit or Lemmy users could replace headlines with sincere and earnest descriptions of news and information... but it seems Reddit and Elon Musk are in some kind of agreement that audiences don't really want that. In April 2017, Wikipedia founder Jimmy Wales tried to take on clickbait and fake news, but few have cared, and clickbait headlines are still all over Reddit and Lemmy in 2023. I don't get the appeal and attraction of junk clickbait all the time. But it's hard to ignore the upvotes it gets on both Reddit and Lemmy... you can watch it every day.

Lemmy seems determined not to optimize SQL statements or add scaling tools like Redis, Memcached, dragonfly to the platform. The crashes have hit multiple sites since May, and the SQL problems and the need for caching are still ongoing in August.

The trends at Reddit and Twitter and the ongoing favoring of clickbait seem intertwined.