It has been a while since our last update, but it's about time to address the elephant in the room: downtimes. Lemmy.World has been having multiple downtimes a day for quite a while now. And we want to take the time to address some of the concerns and misconceptions that have been spread in chatrooms, memes and various comments in Lemmy communities.
So let's go over some of these misconceptions together.
"Lemmy.World is too big and that is bad for the fediverse".
While one thing is true, we are the biggest Lemmy instance, we are far from the biggest in the Fediverse. If you want actual numbers you can have a look here: https://fedidb.org/network
The entire Lemmy fediverse is still in its infancy, and even though we don't like to compare ourselves to Reddit, it gives you something comparable. The total number of Lemmy users on all instances combined is currently 444,876, which is still nothing compared to a medium-sized subreddit. There are some points that can be made that it is better to spread the load of users and communities across other instances, but let us make it clear that this is not a technical problem.
And even in a decentralised system, there will always be bigger and smaller blocks within; such would be the nature of any platform looking to be shaped by its members.
"Lemmy.World should close down registrations"
Lemmy.World is being linked in a number of Reddit subreddits and in Lemmy apps. Imagine if new users land here and they have no way to sign up. We have to assume that most new users have no information on how the Fediverse works and making them read a full page of what's what would scare a lot of those people off. They probably wouldn't even take the time to read why registrations would be closed, move on and not join the Fediverse at all. What we want to do, however, is inform the users before they sign up, without closing registrations. The option is already built into Lemmy but only available on Lemmy.ml - so a ticket was created with the development team to make these available to other instance Admins. Here is the post on Lemmy Github.
Which brings us to the third point:
"Lemmy.World can not handle the load, that's why the server is down all the time"
This is simply not true. There are no financial issues to upgrade the hardware, should that be required; but that is not the solution to this problem.
The problem is that for a couple of hours every day we are under a DDoS attack. It's a never-ending game of whack-a-mole where we close one attack vector and they'll start using another one. Without going into too much detail and exposing too much, there are some very 'expensive' SQL queries in Lemmy - actions or features that take seconds instead of milliseconds to execute. And by executing them by the thousand every minute you can overload the database server.
So who is attacking us?
One thing that is clear is that those responsible for these attacks know the ins and outs of Lemmy. They know which database requests are the most taxing and they are always quick to find another as soon as we close one off. That's one of the only things we know for sure about our attackers. Being the biggest instance and having defederated with a couple of instances has made us a target.
"Why do they need another sysop who works for free"
Everyone involved with LW works as a volunteer. The money that is donated goes to operational costs only - so hardware and infrastructure. And while we understand that working as a volunteer is not for everyone, nobody is forcing anyone to do anything. As a volunteer you decide how much of your free time you are willing to spend on this project, a service that is also being provided for free.
We will leave this thread pinned locally for a while and we will try to reply to genuine questions or concerns as soon as we can.
What I find most ridiculous about people claiming lemmy.world is too big and therefore bad for the Fediverse is simply... Have you people wondered why it got so big?
During the crucial first weeks of the Reddit migration, the single time period with the most chance of bringing new users, pretty much all larger Lemmy instances closed their registrations - they couldn't handle the influx. Other big ones decided to immediately defederate everybody, they were afraid of having to moderate content. And a few did remain open and federated, but they were also extremely niche and focused on their own political side of the spectrum.
Lemmy.world however remained open, remained with active admins that helped the first moderators, and kept upgrading the server at a very fast rate - you might forget it now, but Lemmy was massively slow and frustrating and then a new Lemmy.world update would drop and it would feel like a different website.
So yeah, "bad for the Fediverse" for being the only instance that kept up with the demand at the most necessary time.
Have you guys contacted law enforcement? It may surprise you. A startup I worked for had the same issue and contacted the FBI. They were able to quickly (within hours) find the person doing it despite him using VPNs and other tools for OpSec.
In all seriousness, we all appreciate your work. These are the growing pains that are to be expected, and your hard work and transparency (and writing it up at a level that even I can understand) is welcome.
I'm a data engineer with 20+ years of experience in SQL and various databases, and I do performance tuning on a daily basis. How can I help? Please message me if you think you can use me. I'd be very happy to help where I can!
Besides the actual developers of lemmy, none has done more for the lemmiverse than the maintainers of lemmy.world.
When the Reddit shitstorm started and other leading servers shut down user registration, you guys held the ship steady and didn’t flinch from the sudden flood of new users. Discovering new bottle-necks in lemmy code, helping to resolve them and deploying hot fixes. All in super fast reaction time.
About “lemmy.world shouldn’t be largest server” crap - it’s good for lemmy that one server is the easy entry point to lemmy. This is where the “mainstream” communities could/should be and new users will have an easier landing. Having dedicated servers with their own communities (like Star Trek, piracy, etc) is great but it’s not mandatory for all communities.
Thanks for being so transparent with us. Lemmy really does feel like home now to me. I wish the maintainers all the best as they continue to fight the forces of evil.
I couldn't care less. You provide a great forum at no charge to me. I thank you for your contribution to discourse, communication with the community, and look forward to the growth of lemmy.world
usually my reaction when a website I visit daily goes down is to probably visit that website less or think the backend team behind it is lazy. but when lemmy.world goes down or is under attack, I sympathize and just open it when it's back up. y'all prove that you're hardworking by providing clear communication and explanation on what's happening everytime. shout out lemmy team, you deserve the world!!
Reddit was down a lot too, and they stuck ads in my face. It’s not like I have a pacemaker that needs Lemmy.world to be up in order to function. Keep up the good work and I hope whoever is behind the attacks steps on a Lego.
I have to wonder why expensive SQL queries in Lemmy operations even exist. As Lemmy scales, won't those queries get executed more often just as part of normal operation? That would say to me that the Lemmy software needs optimization. Otherwise there will be scaling issues even if the attacks stop.
If you think it might help I've got a bit of a hack I've used in the past to cache a sql database in a compressed ramdisk using zram and bcache. Imagine stuffing a 50G DB into 20G of memory.
It won't fix the inefficient SQL queries but it would make it so frequently accessed tables get cached in a ram disk cutting query time significantly.
This might be enough to reduce the impact of these attacks until queries can be optimized.
This assumes your database isn't running on something like RDS though.
I'm sure they don't want to reveal too much but I'm curious if the attackers were authenticated. If not it seems reasonable to rate limit anonymous users.
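To make the idea concrete, here is a minimal sketch of what rate limiting only anonymous traffic could look like. It is not how Lemmy does it; the class name, the per-IP keying and the limits are all assumptions made up for illustration.

```python
import time
from collections import defaultdict, deque


class AnonymousRateLimiter:
    """Sliding-window limiter applied only to unauthenticated requests.

    Hypothetical helper for illustration; not part of Lemmy.
    """

    def __init__(self, max_requests, window_seconds, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock
        # client IP -> timestamps of that client's recent anonymous requests
        self._hits = defaultdict(deque)

    def allow(self, client_ip, authenticated):
        if authenticated:
            # Logged-in users bypass this limiter; they could be limited per account elsewhere.
            return True
        now = self.clock()
        hits = self._hits[client_ip]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True
```

Something like `AnonymousRateLimiter(max_requests=30, window_seconds=60)` would let each anonymous IP make 30 requests a minute. It would not help much against an attack spread over many IP addresses, which may well be the case here.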
I found that LMAO/Angled (guy who was angry about being banned for community name squatting) has a YouTube that does techy stuff, he's always in the back of my mind as someone who could be contributing to the DDoS, total speculation though but the threat of "ruining your site" and then coming back to spam the trending communities with spam makes me suspicious lol
I’m imagining spez is sending his flying monkeys and they’ve been trying to shut it all down. Doesn’t matter that you’re smaller than Reddit, Egos like spez’s can’t take even a minor rumble. Just look at how he has to ‘win’ against all his own users. Should tell you all you need to know on his motives.
A fantastic job is being done by you folks - obviously in the face of adversity. Given the amount of users on the instance is at a critical point, would it not be possible to 'move' accounts off it onto other less populated instances ?
Keep up the great work folks - I sympathise for ya.
Thank you for your time & efforts in maintaining this platform. I (and many others I'm sure) have great respect for the work you do in trying to combat this menace. The community is completely behind you and appreciates the value of this resource.
Any de-federated instance doesn't have the money or resources to start DDOS attacks. You know who does? Large corporations who feel attacked at the very existence of large platforms such as lemmy.world.
Who do we know with those resources, funding, knowledge of software (in general, as well as able to place specific people to learn about certain FOSS projects that have their code available), and the desire to spend such resources?
You know it's Reddit Co, we know it's Reddit Co. They know they're doing it too.
Fuck Spez and his bullshit army. I hope they can sleep well in their suburban McMansions while they sell out their future.
It's really annoying that it's down but I've found another instance to use when I'm not able to use this one. I hope you're able to stop these losers at some point. It's very frustrating what's happening but at the same time Lemmy is young and I think and hope it will be optimised so that it won't be an issue in the future.
Stay strong fellow lemmies, we're going to get through this. For those of you who are very annoyed now: make a new account at some other instance. I've already got 3 accounts across 3 different instances. Check what instance to join here: https://join-lemmy.org/instances
There are quite a few InfoSec people here. While I have never held an official InfoSec job I do have a degree. However, my degree is debatable about whether it actually educates me as intended.
Point being there are a lot of people that have more knowledge than me as well as experience but I want to learn. As someone who is always listening to security podcasts like Hacking Humans or Darknet Diaries, naked hacking, or even InfoSec journalism around popular ongoing issues in the world like Click Here. I always want to learn and get experience.
I currently work in IT for a hospital. Is there any way to help with this kind of thing to learn and build on knowledge to help? To volunteer time to potentially see what is going on?
The downtime is causing an issue with posting content from other instances - I've seen this a handful of times from kbin. I post something to a lemmy.world community, and kbin thinks it's there, but lemmy.world doesn't see it. But, the delete request seems to need to go through lemmy.world, which doesn't agree that the content exists. So my profile is filled with posts people on kbin can see, but no one else can, and I can't delete them. Is there any kind of catch-up mechanic for instances to try to agree on what content should be present if content was altered during downtime? I can see this becoming a lot more confusing as people look at a community from multiple different instances and see different content, not realizing this is unintended behavior.
The biggest misconception I've seen on Reddit and elsewhere is that you need an account on every single instance if you want to interact with content on that instance, and it's not supposed to be true but while this bug continues, it kind of is true.
People should stick with the instance otherwise you're just encouraging those tankies and nazis to use DDOS attacks again to bring down instances that defederate with them, don't let them know that they're successful. This opportunistic concern trolling around lemmy.world's downtime needs to stop. As the admins said, sooner or later "small" instances would have 100k users and would start having these issues all at once if it weren't for lemmy.world experiencing them first hand. Some DB optimizations were pushed to Lemmy thanks to lemmy.world.
Would it be possible to have the error page when you are being attacked/there is an outage point to some other lemmy instances to go to?
I think that could be a big help if there is an issue when a new user tries to check out .world for the first time. They will at least have a link to click on to check out what lemmy is like on another instance and maybe sign up there too.
I've been waiting for this response from you guys. You have been a fantastic admin team so far. I still don't agree with some of the de-federating, but overall you guys truly show you care about this instance and the lemmy fediverse as a whole.
I know I won't be wavering because of butt hurt idiots in other instances. I will hold my ground and stick to Lemmy.World.
Keep it up and I hope that in due time, you guys can get the DDOS attacks under control.
Ah no, sorry, while I sympathise with your technical issues, the rest of your post is disingenuous at best.
Lemmy.world being too big is bad for Lemmy as a product/software/"brand" etc - your downtime, being the instance most people link to, is a LOT of people's first impression and when it spends time being down, people associate THAT downtime with Lemmy, and not the hundreds of other instances that don't have downtime.
The issue isn't even about you being the biggest instance, it's the absolute imbalance in both users and communities on one instance and you willingly allowing it to continue. If you genuinely cared about Lemmy, you would close registrations now.
You have enough "technical" people to build your own instance from the source code with that change for the banner built in (and you could go ahead and submit the PR/Issue anyway), but you haven't - instead placing the blame on the developers. Hell, you only made the PR 5 hours ago after weeks of other admins asking you to close the instance.
You could even make the simple change of having the sign up link lead to join-lemmy instead, but for whatever reason you want to continue to be the biggest instance and don't care about the wider lemmy ecosystem and the effect that it has.
To my understanding Datadog is not FOSS. Would you guys consider using a FOSS alternative for monitoring the status of lemmy.world such as Uptime Kuma? That way your whole stack is closer to being FOSS.
Is Lemmy not throttling requests to APIs based on how computationally expensive they are? Or is it that many IP addresses are hitting those APIs and are within the throttling limits?
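To illustrate what I mean by cost-based throttling, here is a rough sketch of a token bucket where each request spends tokens in proportion to how expensive it is. The endpoint names and cost numbers are invented for the example and are not taken from Lemmy's API or its actual rate limiter.

```python
import time

# Hypothetical relative costs per endpoint; real values would come from measured query times.
ENDPOINT_COST = {"get_post": 1, "list_comments": 5, "search": 20}


class CostBucket:
    """Token bucket for one client where heavier endpoints spend more tokens."""

    def __init__(self, capacity, refill_per_second, clock=time.monotonic):
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self.clock = clock
        self.tokens = float(capacity)
        self.last = clock()

    def try_spend(self, endpoint):
        now = self.clock()
        # Refill according to elapsed time, never exceeding the bucket capacity.
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_second)
        self.last = now
        cost = ENDPOINT_COST.get(endpoint, 1)  # unknown endpoints count as cheap
        if cost > self.tokens:
            return False
        self.tokens -= cost
        return True
```

With one bucket per client, a client that stays under a plain request-count limit can still be cut off quickly if every one of its requests is a heavy one. If the load comes from many addresses that each stay within their own budget, this alone would not be enough.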
Sounds like you are the victims of a hackathon more so than a single person upset about drama. defcon is tomorrow, maybe some group/feds/soon-to-be-fed will have a demonstration and talks about ActivityPub
The conversation gets a bit scrambled/broken up by disruptive/toxic people but this is a comment chain on lemmy.ml two weeks ago about SQL issues and challenges in getting the Lemmy Dev team to address them that might be worth reading:
I think it would be good to not close registration and if once a month or something there could be a post by admins about migrating to smaller instances (this is made easy with the LASIM tool) so new users can easily sign up with no hurdle but we also prevent too much centralization.
I have nothing bad to say about Lemmy.world, but I do recommend that people move away from it in order to better decentralize Lemmy. Here is some useful information for people wanting to move instances.
For a list of instances, along with stats for those instances:
The fun thing about the Fediverse is that when this goes down the other instances stay up, so whoever is doing the attacks isn't really doing much except prompting people to create accounts on multiple instances. Which makes the numbers look really big.
When I learned about the whole fediverse thing, I want to join but was hesitant due to many instances. But I realized that lemmy.world is the largest Lemmy instance with a HUGE margin so I just signed up. Thank you for keeping this place alive and kicking!
Thanks for the update and keep up the good work! It seems like reddit went down a few times a week regularly for years. I have to think that some state sponsored actors are responsible for some of this. I’m sure that some topics being discussed here are not in line with the values of many regimes.
Thank you so much for explaining the reason for the downtimes. I just thought it was some temporary issue caused by unforeseen popularity. Knowing it was malicious does make me more understanding of how difficult this must be. I will continue to be patient. I am sadly not good enough with anything other than basic powershell scripts and learning proprietary software configurations. 10 years of software support does that to a guy. I'll still check if there is anything I can do to help. I do want this project to succeed.
Are you guys using a load balancer at all? How about a tool like CrowdSec?
I use that and the nginx Bad Bot Blocker to stop malicious shits on the sites I operate (medium-large e-commerce) to great success. We used to get scraped heavily by competitors but now they get the middle finger.
Thank you to the admins for all of your hard work maintaining Lemmy.World through the downtime. A lot of us are already so comfortable here that we rush to the Discord server to check in when it's down.
Point being, the members of Lemmy.World are really grateful to the admins, the mods, and fellow Lemmings who have been posting interesting content and participating in deep discussions!
I know there’s so much work being put into this site but there’s just too many outages and it’s unreliable. Maybe I’ll check back in a few months to see if it’s improved but it’s not working as a Reddit replacement which is what I hoped. Really sucks nefarious actors keep targeting the site.
I think I initially signed up on your instance and then figured it out, signed up for a more local instance but then figured I made a mistake and ended up where I am.
Thank you again for being available to let me through the door. Once I figured out that there's lots of doors, it was much better.
Lemmy.world will always be a special place and you and anyone who volunteers for work here is fuckin awesome. Thanks again ♥️
Thanks for all your amazing work! I know just enough about SQL to know I know next to nothing, but could someone intelligent explain how databases are publicly accessible for anyone to be able to make queries?
It's a shame you are having to go through this but it was bound to happen to an instance sooner or later. It's better that it happen to a large instance with the time, talent, and money to work through the challenges because this kind of consistent attack would bury any smaller instance pretty quickly.
With that said is there anything that users can do to help?
It's difficult to fix and not without changes in the code.
Most solutions involve fixing those heavy SQL queries: tuning them, caching them in Redis or memcached, or refactoring the whole process from scratch.
Thinking on the DDoS part, implement short circuits so that reaching those queries requires following a session pattern. It doesn't stop it but you force those script kiddies to make real connections. If they are anonymous then all the heavy queries should be cached due to lack of custom vars. If not, it's a matter of identifying users and banning them automatically.
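A minimal sketch of the anonymous caching part, assuming a simple in-process cache with a time-to-live instead of Redis or memcached. `TTLCache`, `fetch_listing` and `run_query` are hypothetical names for illustration and not Lemmy code.

```python
import time


class TTLCache:
    """Tiny in-process cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        self._store = {}  # key -> (expires_at, value)

    def get_or_compute(self, key, compute):
        now = self.clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # still fresh: skip the expensive query
        value = compute()
        self._store[key] = (now + self.ttl_seconds, value)
        return value


def fetch_listing(cache, run_query, community, sort, user_id=None):
    """Serve anonymous listings from the cache; logged-in users hit the database."""
    if user_id is not None:
        # Personalised results depend on the user, so they are not cached in this sketch.
        return run_query(community, sort, user_id)
    # Anonymous requests share one cache entry per (community, sort) pair.
    return cache.get_or_compute((community, sort), lambda: run_query(community, sort, None))
```

Since anonymous requests carry no per-user variables, a thousand identical requests inside the TTL turn into one database query. The price is that anonymous visitors see listings that can be up to `ttl_seconds` old.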
Ah, this is a much needed response! I switched servers but lemmy.world is still my go-to server, it was down so much that I had to try alternative ones! Good to know it's not a load issue!
What about that "show context" button in our inboxes? It's super annoying getting replies and not being able to see what the context was, all I get is that 'bad gateway' error or whatever.
Is there any update on the instances that were unintentionally defederated from lemmy.world? I know that one of the fanaticus.social admins was trying to get that sorted out.
Thank you for addressing the registration issue. Informing new users to consider alternatives due to the size relieves the issue I had. And it is a valid counterpoint.
There are some points that can be made that it is better to spread the load of users and communities across other instances
Out of curiosity, what’s the relative overhead of those two services (hosting user accounts vs hosting communities)? If the aim were to distribute the overhead over multiple instances (as a general goal, not just a solution to the DDoS attacks), is it more important to distribute users or communities?
Thanks for the update, if there's ways we can help please mention them. Be it about know-how, be it financial, be it about our behaviour when interacting with the server, be it about general knowledge we could provide.
I'm actually curious as IDK why no CF or DDOS-Guard is used to block massive DDoS?
Additionally, I'm not that knowledgeable but having some Cache that serves a bit of stable data for SQL queries might be better than the server being overloaded and having downtime.