Hey folks!
I made a short post last night explaining why image uploads had been disabled. This was in the middle of the night for me, so I did not have time to go into a lot of detail, but I'm writing a more detailed post now to clear up where we are now and where we plan to go.
What's the problem?
As shared by the lemmy.world team, over the past few days, some people have been spamming one of their communities with CSAM images. Lemmy has been attacked in various ways before, but this is clearly on a whole new level of depravity, as it's first and foremost an attack on actual victims of child abuse, in addition to being an attack on the users and admins on Lemmy.
What's the solution?
I am putting together a plan, both for the short term and for the longer term, to combat and prevent such content from ever reaching lemm.ee servers.
For the immediate future, I am taking the following steps:
1) Image uploads are completely disabled for all users
This is a drastic measure, and I am aware that it's the opposite of what many of our users have been hoping, but at the moment, we simply don't have the necessary tools to safely handle uploaded images.
2) All images which have federated in from other instances will be deleted from our servers, without any exception
At this point, we have millions of such images, and I am planning to just indiscriminately purge all of them. Posts from other instances will not be broken after the deletion, the deleted images will simply be loaded directly from other instances.
3) I will apply a small patch to the Lemmy backend running on lemm.ee to prevent images from other instances from being downloaded to our servers
Lemmy has always loaded some images directly from other servers, while saving other images locally to serve directly. I am eliminating the second option for the time being, forcing all images uploaded on external instances to always be loaded from those servers. This will somewhat increase the amount of servers which users will fetch images from when opening lemm.ee, which certainly has downsides, but I believe this is preferable to opening up our servers to potentially illegal content.
For the longer term, I have some further ideas:
4) Invite-based registrations
I believe that one of the best ways to effectively combat spam and malicious users is to implement an invite system on Lemmy. I have wanted to work on such a system ever since I first set up this instance, but real life and other things have been getting in the way, so I haven't had a chance. However, with the current situation, I believe this feature is more important then ever, and I'm very hopeful I will be able to make time to work on it very soon.
My idea would be to grant our users a few invites, which would replenish every month if used. An invite will be required to sign up on lemm.ee after that point. The system will keep track of the invite hierarchy, and in extreme cases (such as spambot sign-ups), inviters may be held responsible for rule breaking users they have invited.
While this will certainly create a barrier of entry to signing up on lemm.ee, we are already one of the biggest instances, and I think at this point, such a barrier will do more good than harm.
5) Account requirements for specific activities
This is something that many admins and mods have been discussing for a while now, and I believe it would be an important feature for lemm.ee as well. Essentially, I would like to limit certain activities to users which meet specific requirements (maybe account age, amount of comments, etc). These activities might include things like image uploads, community creation, perhaps even private messages.
This could in theory limit creation of new accounts just to break rules (or laws).
6) Automated ML based NSFW scanning for all uploaded images
I think it makes sense to apply automatic scanning on all images before we save them on our servers, and if it's flagged as NSFW, then we don't accept the upload. While machine learning is not 100% accurate and will produce false positives, I believe this is a trade-off that we simply need to accept at this point. Not only will this help against any potential CSAM, it will also help us better enforce our "no pornography" rule.
This would potentially also allow us to resume caching images from other instances, which will improve both performance and privacy on lemm.ee.
With all of the above in place, I believe we will be able to re-enable image uploads with a much higher degree of safety. Of course, most of these ideas come with some significant downsides, but please keep in mind that users posting CSAM present an existential threat to Lemmy (in addition to just being absolutely morally disgusting and actively harmful to the victims of the abuse). If the choice is between having a Lemmy instance with some restrictions, or not having a Lemmy instance at all, then I think the restrictions are the better option.
I also would appreciate your patience in this matter, as all of the long term plans require additional development, and while this is currently a high priority issue for all Lemmy admins, we are all still volunteers and do not have the freedom to dedicate huge amounts of hours to working on new features.
As always, your feedback and thoughts are appreciated, so please feel free to leave a comment if you disagree with any of the plans or if you have any suggestions on how to improve them.
I think the images should never be cached from other instances in the first place, that is a huge oversight in pictrs since not only does it have the potential to cache unwanted content but also causes the images hosted to rapidly accumulate which isn't ideal as it increases storage requirements which is unfair to people who want to self-host a personal instance. Hosting a personal instance should not have monstrous storage requirements or serious liability risk due to caching all images automatically, it should only cache what is uploaded to the Instance like profiles and banners, and posts that include images from the Instance.
I have reservations about allowing fully-invite based registrations on lemmy instances. While I do think it might be good to have invites as a way for users to skip filling out an application I don't really like the idea of requiring them like Tildes does, makes it feel like an elitist exclusive club of sorts having to beg for an invite from users. I don't think it should be an alternative to application-based registration, but rather a supplement to it, if someone can get an invite from users that's great but if not they should still be able to write an application to join, this could be extensive and also lower priority since you could get invites but should still be an option available.
Account requirements really depends on what they are and what they restrict (also who on the instance is allowed to impose restrictions). For example on instances with downvotes enabled I think score/upvote requirements are a bad idea since it essentially means that people who disagree are locked out, like on Reddit with karma restrictions, I do not support this, it creates an echo-chamber where unpopular opinions. It'll also lead to upvote farming if there are negatives due to having a lower score.
Comment or post requirements would just lead to post or comment farming similar to vote farming, though it's not as bad as score-requirements since people making posts and comments naturally (whether they are liked or not) can't be taken away by other people based on opinions (only if they break the rules and get posts removed, which isn't even remotely similar since they broke the rules).
Limiting image uploading is a fair requirement in my opinion since uploads can be particularly harmful if the uploads are malicious, and also uploads aren't really needed since people can externally host almost all their images without the need for uploads.
When it comes to DMs and restrictions around them I feel like that should be up to individual users to decide to allow private communication from certain users or not, or even to allow DMs at all, this shouldn't be something globally applied to people, maybe it could be a default in User settings and have a requirement set by the Admins but people should be able to turn it off if they don't care or want to accept messages from new users, I know I certainly will, I hate being nannied when it comes to who's allowed to send me messages, IMO Annoying or uncomfortable DMs are a fact of life and I prefer to deal with issues when they happen rather than block anyone who's a new user that might want to talk to me, it's one of the things I hated that Reddit does without giving me the option to opt out and receive messages from everyone.
I think having a Machine-Learning based system to identify Malicious images is actually a pretty good idea going forward, I know how some people feel about AI and Machine-Learning but I think it's probably our best defense considering that none of us want to see it, it might have False positives but I'd rather than than to allow CSAM to live here. Ultimately the choice is have ML scanning or Disable pictrs here, I think ML is the better option because people are going to want to have Avatars and without pictrs that isn't possible (unless Lemmy adds support to the UI for externally hosted Avatars and Banners).