This is weird. Account deletion should be handled by a JOIN at lookup time, so comments/posts only display if the account is active. No mass updates, pipelines or otherwise.
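To make the JOIN idea concrete, here is a minimal sketch using SQLite from Python's standard library. The table and column names (`person`, `comment`, `deleted`) and the helper functions are made up for illustration and are not Lemmy's actual schema or code. The point is only that deleting an account touches one row, and the read query hides that account's comments.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Build a throwaway in-memory database with a hypothetical schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE person (
            id      INTEGER PRIMARY KEY,
            name    TEXT NOT NULL,
            deleted INTEGER NOT NULL DEFAULT 0
        );
        CREATE TABLE comment (
            id         INTEGER PRIMARY KEY,
            creator_id INTEGER NOT NULL REFERENCES person(id),
            content    TEXT NOT NULL
        );
        INSERT INTO person (id, name) VALUES (1, 'alice'), (2, 'bob');
        INSERT INTO comment (id, creator_id, content) VALUES
            (1, 1, 'first'), (2, 2, 'second'), (3, 1, 'third');
        """
    )
    return conn


def visible_comments(conn: sqlite3.Connection) -> list:
    """Return comments whose creator account is still active."""
    return conn.execute(
        """
        SELECT c.id, c.content
        FROM comment AS c
        JOIN person AS p ON p.id = c.creator_id
        WHERE p.deleted = 0
        ORDER BY c.id
        """
    ).fetchall()


def delete_account(conn: sqlite3.Connection, person_id: int) -> int:
    """Flag the account as deleted; returns the number of rows written."""
    cur = conn.execute("UPDATE person SET deleted = 1 WHERE id = ?", (person_id,))
    return cur.rowcount
```

Deleting an account here writes a single row in `person`, no matter how many comments that account has. The comment rows stay on disk untouched, so this does not give the overwrite behavior discussed elsewhere in the thread.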
Lemmy Server Performance
lemmy_server uses the Diesel ORM that automatically generates SQL statements. There are serious performance problems in June and July 2023 preventing Lemmy from scaling. Topics include caching, PostgreSQL extensions for troubleshooting, Client/Server Code/SQL Data/server operator apps/server operator API (performance and storage monitoring), etc.
AccountDelete has a marketed feature to overwrite all post and comment content.
Hmm ok, false sense of security there since another advertised feature is the open API (meaning no restrictions on scraping bots so there will definitely be archives of deleted posts), but whatever.
How does this sound: encrypt the comments in the db using a random key stored in the account row. Then at account deletion, overwrite that key, so the comments can no longer be decrypted. Maybe there is a way to purge those comments altogether during the next VACUUM. No idea how often that happens though.
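A rough sketch of what that per-account key idea could look like, in Python with only the standard library. Everything here is hypothetical: the `comment_key` field, the helper names, and especially the cipher, which is a toy HMAC-SHA256 keystream XOR used only so the example runs without extra packages. A real implementation would use a vetted authenticated cipher such as AES-GCM from a maintained crypto library.

```python
import hashlib
import hmac
import secrets

NONCE_LEN = 16  # bytes of random nonce stored in front of each ciphertext


def new_account_key() -> bytes:
    """Random per-account key; assumed to live in the account row."""
    return secrets.token_bytes(32)


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream: HMAC-SHA256 over nonce + counter. Illustration only."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        block = hmac.new(key, nonce + counter.to_bytes(8, "big"), hashlib.sha256)
        out += block.digest()
        counter += 1
    return bytes(out[:length])


def encrypt(key: bytes, plaintext: bytes) -> bytes:
    """Return nonce + XOR-encrypted body."""
    nonce = secrets.token_bytes(NONCE_LEN)
    stream = _keystream(key, nonce, len(plaintext))
    return nonce + bytes(a ^ b for a, b in zip(plaintext, stream))


def decrypt(key: bytes, blob: bytes) -> bytes:
    """Reverse of encrypt; a wrong key yields unrelated bytes."""
    nonce, body = blob[:NONCE_LEN], blob[NONCE_LEN:]
    stream = _keystream(key, nonce, len(body))
    return bytes(a ^ b for a, b in zip(body, stream))


def shred_key(account: dict) -> None:
    """Account deletion: overwrite the key so stored comments become unreadable."""
    account["comment_key"] = secrets.token_bytes(32)
```

Usage would be along these lines: encrypt each comment with `account["comment_key"]` on write, decrypt on read, and call `shred_key(account)` on deletion. The deletion then writes one row instead of every comment, at the cost of a decrypt on every read.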
Question... user ban, are moderators doing the remove data option? The API seems to allow it on a testing server install, but I don't know what moderators are actually doing in the field.
That ban removal is another potentially high-I/O operation. There are accounts that copy postings en masse off Hacker News, Reddit, etc., and I could see banning one of those accounts triggering a lot of PostgreSQL activity.
I'm inclined to encourage that we bite the bullet while the data is still relatively small and change the deleted/removed fields into a unified status field. Enum, integer, or character? I think I've seen code that says = true and = 't'.
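For discussion, one possible shape for a unified status value, sketched in Python. The names (`ContentStatus`, `to_status`, `from_status`, `is_visible`) and the bit layout are assumptions for illustration, not anything in the Lemmy codebase. The two existing booleans are packed into one small integer so that both states survive the conversion.

```python
from enum import IntEnum


class ContentStatus(IntEnum):
    """Hypothetical unified status: bit 0 = deleted by creator, bit 1 = removed by moderator."""
    ACTIVE = 0
    DELETED = 1
    REMOVED = 2
    DELETED_AND_REMOVED = 3


def to_status(deleted: bool, removed: bool) -> ContentStatus:
    """Pack the two existing boolean fields into a single status value."""
    return ContentStatus(int(deleted) | (int(removed) << 1))


def from_status(status: ContentStatus) -> tuple:
    """Unpack a status back into (deleted, removed) booleans."""
    return bool(status & 1), bool(status & 2)


def is_visible(status: ContentStatus) -> bool:
    """Content shows only when it is neither deleted nor removed."""
    return status == ContentStatus.ACTIVE
```

With this layout a visibility filter becomes a single comparison against 0, and the round trip through `from_status` gives back the original pair of booleans, which would matter for any migration that has to be reversible.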
EDIT: I created a new posting regarding consolidating some of these fields that yield the same results.
And have some timestamps of deletion, even if that's off on another table. Need to think this through some more.
I think the true / 't' thing is just postgres and how it handles boolean fields.
If they're not all boolean, then yes we should fix.
I did perform some tests yesterday. DeleteAccount does NOT delete the communities created by the account. If the deleted account was the only moderator, the community will be left with zero moderators. That person's posts and comments are overwritten, but a post created in the community by an account other than the deleted one should still function (not sure how deep the testing went on that, still enhancing testing).