First of all, I would like to thank the Lemmy.world team and the 2 admins of other servers @stanford@discuss.as200950.com and @sunaurus@lemm.ee for their help! We did some thorough troubleshooting to get this working!
The upgrade
The upgrade itself isn't too hard. Create a backup, and then change the image names in the docker-compose.yml and restart.
But, like the first 2 tries, after a few minutes the site started getting slow until it stopped responding. Then the troubleshooting started.
The solutions
Previously I had noticed that the lemmy container could reach around 1500% CPU usage, and above that the site got slow. That is weird, because the server has 64 threads, so 6400% should be the max.
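To put those numbers side by side, here is a small sketch of the arithmetic. It assumes the usual container convention that 100% equals one fully busy thread; the function name `cpu_utilisation` is made up for illustration.

```python
def cpu_utilisation(observed_percent: float, threads: int) -> tuple[int, float]:
    """Return the theoretical maximum CPU percentage and the fraction in use.

    Assumes 100% corresponds to one fully busy hardware thread.
    """
    max_percent = threads * 100
    return max_percent, observed_percent / max_percent


# Figures from the post: about 1500% observed on a 64-thread server.
max_percent, fraction = cpu_utilisation(1500, 64)
print(max_percent, f"{fraction:.1%}")
```

So the single container was slowing down while using less than a quarter of the machine, which suggests the limit was inside the process rather than in the hardware.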
So we tried what @sunaurus@lemm.ee had suggested before: we created extra lemmy containers (and extra lemmy-ui containers) to spread the load, and used nginx to load balance between them.
Et voilà. That seems to work.
Also, as he suggested, we start the lemmy containers with the scheduler disabled, and run 1 extra lemmy container with the scheduler enabled that is not used for anything else.
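The layout described above can be sketched roughly as follows. This is only a toy model, not the actual lemmy.world configuration: the container names and the count of three are hypothetical, and round-robin is assumed because it is nginx's default balancing method (the post does not say which method was used).

```python
from dataclasses import dataclass
from itertools import cycle


@dataclass(frozen=True)
class Backend:
    name: str                # hypothetical container name
    scheduler_enabled: bool  # only one container runs the scheduled tasks


def build_pool(n_web: int) -> list[Backend]:
    """Create n_web request-serving containers plus one scheduler-only container."""
    web = [Backend(f"lemmy-{i}", scheduler_enabled=False) for i in range(1, n_web + 1)]
    return web + [Backend("lemmy-scheduler", scheduler_enabled=True)]


def request_router(pool: list[Backend]):
    """Round-robin over the containers that serve traffic.

    The scheduler container is deliberately left out of the rotation.
    """
    return cycle([b for b in pool if not b.scheduler_enabled])


pool = build_pool(3)  # 3 is an arbitrary illustrative count
router = request_router(pool)
served = [next(router).name for _ in range(6)]
print(served)
```

Each incoming request goes to the next container in the rotation, while the scheduler container never receives user traffic.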
There will be room for improvement, and probably new bugs, but we're very happy lemmy.world is now at 0.18.1-rc. This fixes a lot of bugs.
Good work upgrading! I can't imagine it being too easy with a big instance.
I had issues with comments not federating to my own instance before this update (showing 0 for hours). Opening this up now showed most of them right away, if not all. Hopefully that means 0.18.1 fixed a fair few issues people had with federation.
This is caused by an issue in the latest RC of the Lemmy UI.
It's already been reported, and ruud will probably decide how to deal with it tomorrow.
The current workaround
Make sure you are on the main page (https://lemmy.world) and not looking at any posts or something like that before hitting the login button.
If you encounter other issues, please make sure to clear the browser cache. The latest upgrade also made changes to the API, which can cause issues with the cached version of the website.
A bit off topic, but does anyone else hate how when you click on a post and then go back, the page auto-resets to the top? Wish it would remember how far down you scrolled and return to that point.
Huge thanks to the lemmy.world team over the last couple of days to scale and maintain the instance! There's a link for donating on the sidebar for lemmy.world - just a couple bucks a month can help us support this instance!
So some strange behaviour: When I pressed the upvote arrows in 0.17.4, it'd immediately show this in the UI. Right now, it does not. The response appears quite slow. Is this a function of 0.18.1-rc or a function of the traffic of the Reddit-fugees?
I had a strange bug today where I wasn't able to upvote comments. So I cleared out my website data like the website suggested and I started having problems logging in. It would log in but then when I refreshed it wasn't logged in anymore. It stopped after a while but then when I clicked on an old tab when I refreshed I was logged out again. So, the log in issue must be something to do with how iOS Safari handles web cache on tabs.
I'm not sure if this has been said, but when I open Lemmy in the browser, my account will sometimes be someone else's.
I don't know if it's a bug, but I've seen it happen three times so far, and it even happened again a few minutes ago.
It's like I logged into someone else's account. I've seen three other usernames so far.
A few minutes ago it said my account was Professor -?-?-?- with that account's profile picture shown too.
It only does that for half a second before it returns to my account.
I'm just making sure this is said because I don't want to one day accidentally log into someone else's account.
I'm one of the many who have had trouble logging in, and this issue is surely underreported, as those affected generally aren't able to report it. It also seems like I'm not able to upvote or downvote. I'll update with any more issues that I come across, but I only just now became able to log in after a long wait and several different browsers.
Edit: it seems like I can successfully upvote/downvote, but the updated vote count and my blue/red arrow only show after refreshing the page. Thanks for all the work you put into this instance btw
Thanks a bunch for your hard work, Ruud and other admin folks! It's so damn GOOD to be able to use Jerboa again!
Also, it's really nice to see the breakdown of your work. It helps a lot in understanding what you go through, and maybe even if there's anything we can help with. Keep it up!
I may not be a user on your instance, but either way, thanks for the upgrade. I was noticing a lot of issues with federation from lemmy.world, and it seems like this upgrade more-or-less fixed them.
I'm just running a tiny, single-user instance, but I want you to know that I appreciate the work you're putting in! I run large-scale infra as my day job, so I understand how challenging this sudden influx of users (and federated servers!) is.
I'd like to know more about the exact container topology you have, since I may try something similar on my instance as well.
Is it something like this?
I am having some issues logging in to my lemmy.world account atm, just a heads up. I'm sure you folks are slammed right now, thanks for all the work you're doing!
Real challenging this morning posting and commenting. Circle of death waiting for something to post. Then getting multiple posts if it does go through.
To everyone having a login problem, it seems that resetting the password solves the issue! Maybe this means that the upgrade corrupted the stored hashes somehow?
Thanks for the hard work on the upgrade! Much appreciated although I'm only using the web version (not needing specific client apps). Lemmy World feels quite 'snappy' when browsing now.
Not gonna lie. This service is way too slow and riddled with way too many bugs. Posting comments is a huge chore as half the time there is some “error”. Comments don’t load half the time.
No one is going to switch from Reddit to this service if it’s going to always be this miserable of an experience. I’d rather bang my head against the wall.
Thank you from England for all the hard work AND for giving such interesting details, especially as it will encourage others to set up their own instances, and help them cross similar hurdles!!
So, basically I can't see any content from lemmy.world; I'm commenting right now from another instance. When I log into my lemmy.world account it's just empty, zero content. Any solutions?
I really appreciate the transparency in this post. There's enough information for me to feel like I kind of know what's going on, and I can go dig into it deeper if I feel like it. This is a breath of fresh air from what I'm used to, thanks so much!
Login problem is fixed for me, yay! Back on Jerboa and here on the browser! Thanks for your hard work and for putting up with me, lol.
I'm getting network errors that aren't allowing me to actually view content on Jerboa right now, but at this point I'm assuming it's a Jerboa thing and not a problem with the instance.
Have you considered running your Lemmy instance on more than a single machine? If it is possible to run two lemmy containers anyway (ie, lemmy is not a singleton), why not run them on separate machines? With load balancing you could achieve a more stable experience.
It might be cheaper to have many mediocre machines rather than a single powerful one too, as well as more sustainable long-term (vertical vs horizontal scaling).
The downside would be that the set-up would be less obvious than with Docker compose and you would probably need to get into k8s/k3s/nomad territory in order to orchestrate a proper fleet.
Nice, really liking the update!
Some questions about development for the fediverse:
Is the code for running Lemmy written by one person or a small core team?
Is there any decision making process as to which features will be worked on in the next release or which bugs to prioritize?
In theory what would happen if the original developers started making changes that other people don't agree with? Would we get a fork then where servers have to choose to adopt it or not?
0.18 looks a lot better. Far better use of screen real estate on PCs.
Lag is still very prevalent though. Page loading, upvote delay. It's frustrating.
Live comments (like on new Reddit) do not seem to be working on 0.18, so I have to manually refresh the page each time. That also resets the comment sort to Hot, causing further annoyance.
Is there an issue with the API? (The API wrapper lemmy-js-client doesn't work on login.) I tried it yesterday but not yet today. I will test it when I can :)
Awesome! Loading issues are still the bane of Lemmy's existence though, or at least they are in my experience with Lemmy. Everything just loads so slowly. Sorting is still broken as well. Communities that I KNOW are active just show as blank for me no matter what I sort by.
Edit: still seeing some performance issues. Needs more troubleshooting.
Federation overhead is putting a lot of load on servers: one task is created for every single post, comment, and vote in a RAM-only queue. Pending changes: https://github.com/LemmyNet/lemmy/pull/3466
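A toy model of that pattern might look like the sketch below. It is not Lemmy's actual code (Lemmy is written in Rust); the class name and the activity counts are invented purely to show how quickly a one-task-per-activity queue held only in memory can grow.

```python
from collections import deque


class InMemoryQueue:
    """Hypothetical RAM-only task queue: nothing is persisted to disk."""

    def __init__(self) -> None:
        self._tasks: deque[tuple[str, int]] = deque()

    def enqueue(self, kind: str, activity_id: int) -> None:
        # One task per activity, no batching.
        self._tasks.append((kind, activity_id))

    def __len__(self) -> int:
        return len(self._tasks)


queue = InMemoryQueue()
activities = {"post": 10, "comment": 200, "vote": 5000}  # made-up counts
for kind, count in activities.items():
    for activity_id in range(count):
        queue.enqueue(kind, activity_id)

pending = len(queue)
print(pending)
```

Votes dominate the queue in this made-up mix, and because everything lives in memory, any tasks still pending would be lost if the process restarted.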
Thanks for the update. I especially like the transparency on not only the “upgrade” itself but also the potential issues encountered, together with the solutions. Seems rare nowadays, or I’m just seeing less and less people doing this.
Tried to log in, but nothing happened except that a "?" was added to the link. Tried deleting data, cookies, etc., but the problem still persists. Commenting from another instance.
Half the time when I comment, it just spins. :(
Edit: Apparently when I comment it posts, but just shows spinning until I manually refresh. Must be on my end.
It's faster now, and we finally have buttons for rich text features! Congratulations!
Update: upvotes are a bit broken and weird right now, I need to refresh every time to see that I upvoted. But that's really the only issue I see right now.
I was having trouble earlier but now able to log in just fine on browser. Voting on posts doesn't seem to be working for me on desktop or apps. In apps I keep seeing error notices about votes not going through and desktop browser (Firefox) doesn't work but there's no notification there. Anyone else? Maybe everything needs a little time to sync up.
You know, this is a nice post because now I understand what was happening to me as a user. Thanks for confirming that I am not insane! Well, maybe I'm insane, but what I was trying to do and couldn't was real, not something I was doing wrong. Also, thanks for updating the stuff that makes it work.
Browser still not working for me. The interface loads but there's no content. Also can't login on browser, after entering user and password and clicking login nothing happens.
obviously not critical, but it looks like there's a small sidebar bug (or feature?) that puts the pic near the instance name if it is the first thing in its description?
Running so many Lemmy instances against the same database doesn't cause race conditions? I wonder why that "just worked" so easily, usually load balancing DB-backed apps is a whole beast on its own.
Thanks for the hard work! I had an issue in the first minutes where every time I logged in I got logged in with a different stranger's account. Now that doesn't happen, but I can't log in haha.
Thank you for letting us know what's happening behind the scenes.
I'm a sysadmin myself, and I really love reading stories about scaling up servers where it actually works!
Once again. Thank you.
Love the update, all back up and running again :)
I joined this morning after discovering this awesome Apollo replacement and was so disappointed that it was down already! I understand that the sudden surge must be huge. Looking forward to seeing the data on how many users Lemmy gained!
Honestly praying this is the solution we all want and need!
Does browsing in Incognito/Private mode open up new bugs, or does the refreshing thing follow the same principle? I should stay logged in, but for some reason, after this update, whenever I open a new private tab from the tab I'm logged in on, it shows me as not logged in.
Thank you for all the work. Do you have a need or plans for community help at all, outside of content moderation? Not quite sure how I could help, but I do software for a living.
Question @ruud@lemmy.world: why update to the release candidate? Just want to help with testing? Or was there some re-added feature (i.e. captcha) that had you quick on the trigger?
Thanks very much for your time and effort Ruud, it's much appreciated!
Now, after you've put the kids to bed, grab yourself a beer and put your feet up!
Thank you so much for doing this! Having an instance this big really made the difference for leaving Reddit. I really missed Jerboa and am glad to have it back as a client.
Kinda makes sense that multiple containers might scale better. The actual processes within the container may have some limitations in terms of how well they thread etc.
Congrats on figuring it out! I'm just wading into docker in a professional capacity so I admit some of it feels like magic to my traditional developer brain but glad it worked out.