I made this based on the gripe about some of the silent failures with federation. Might help users choose other servers. Might help admins troubleshoot. Open to comments and criticisms!
Oooohhh ... Nice!! I'm repeatedly impressed at how many hackers are going ahead and just getting some stuff done here!!
Questions/thoughts:
What instance is used as a reference for the delay? One you self-host (lemmy.management)?
Sooo ... what's the deal with lemmy.ml ... that seems to have gone beyond lag and is basically falling over ... seems like the devs have neglected their own instance's health?
What's that Redash? Is it a plotly thing or some other product that just uses their graphing library? How have you found it?
What instance is used as a reference for the delay? One you self-host (lemmy.management)?
Yes, lemmy.management. It deliberately keeps its community subscriptions as broad as possible (via automation). This doesn't correct for network lag, but the idea was to capture the "federation" lag. There's no code I'm aware of that allows admins to prioritize outbound federation traffic. I could be wrong though.
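For readers wondering what "federation lag" could mean in practice, here is a minimal sketch. It assumes each federated item carries an ISO 8601 `published` timestamp and that the reference instance records when the item arrived; the function name and the way the dashboard actually computes its numbers are assumptions, not something stated in this thread.

```python
from datetime import datetime, timezone


def federation_lag_seconds(published_iso: str, received_at: datetime) -> float:
    """Seconds between publication on the origin instance and local arrival.

    Hypothetical helper: as noted above, this does not separate network
    lag from federation lag.
    """
    # Older Python versions do not accept a trailing "Z", so normalise it.
    published = datetime.fromisoformat(published_iso.replace("Z", "+00:00"))
    # Assumption: timestamps without an offset are treated as UTC.
    if published.tzinfo is None:
        published = published.replace(tzinfo=timezone.utc)
    return (received_at - published).total_seconds()


if __name__ == "__main__":
    arrived = datetime(2023, 7, 1, 12, 0, 30, tzinfo=timezone.utc)
    print(federation_lag_seconds("2023-07-01T12:00:00Z", arrived))
```

A negative result would point at clock skew between the two servers rather than at federation itself.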
Sooo … what’s the deal with lemmy.ml … that seems to have gone beyond lag and is basically falling over … seems like the devs have neglected their own instance’s health?
I just collect the data.
What’s that Redash? Is it a plotly thing or some other product that just uses their graphing library? How have you found it?
https://redash.io I don't remember how I found it. Probably an "awesome" list on GitHub.
Not the person you’re replying to, but I didn’t find it awful on mobile. The zoom by dragging worked well, as did the double tap to view the whole dataset.
For a quick browse I wasn’t frustrated at all and found the information I wanted in a short amount of time!
Fixed! The regex was not getting content from < 0.18.0 instances. Thanks!
EDIT: I was wrong. It was something else in feddit.de's messages that I THOUGHT was a version thing, but it must be a localization thing. A string in the JSON was breaking a regex. Regardless, it's fixed.
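As a general illustration of why a string inside JSON can break a regex, here is a minimal sketch that reads a field with a JSON parser instead. The message shape and the `extract_field` helper are hypothetical; this is not the collector's actual code.

```python
import json


def extract_field(raw: str, key: str):
    """Return the top-level value for `key`, or None if `raw` is unusable.

    A parser copes with escaped quotes and non-ASCII text inside string
    values, which are the usual things that trip up a hand-written regex.
    """
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(payload, dict):
        return None
    return payload.get(key)


if __name__ == "__main__":
    # Hypothetical message with escaped quotes and escaped non-ASCII characters.
    raw = '{"name": "Gr\\u00fc\\u00dfe \\"Welt\\"", "published": "2023-07-01T12:00:00Z"}'
    print(extract_field(raw, "published"))
```

Returning `None` for malformed input keeps one bad message from stopping the whole collection run.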
I’m expecting that JSON parsing is a huge overhead with the fediverse. I work on a SaaS that needs to do all its internal processing in under 10 ms, and serializing/deserializing ends up being a sizable chunk of server time. I saw a 40% reduction in runtime using simdjson for deserializing, and there is a Rust crate for it, but I haven’t had time to look over the Lemmy code.
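To get a rough feel for how much time serializing and deserializing can take, a minimal timing sketch follows. It uses Python's standard `json` module purely for illustration, so the numbers say nothing about Lemmy's Rust code or about simdjson; the sample payload and the function name are made up.

```python
import json
import time


def time_json_roundtrip(payload: dict, iterations: int = 1000) -> dict:
    """Average microseconds per json.dumps and per json.loads call."""
    encoded = json.dumps(payload)

    start = time.perf_counter()
    for _ in range(iterations):
        json.dumps(payload)
    serialize_s = time.perf_counter() - start

    start = time.perf_counter()
    for _ in range(iterations):
        json.loads(encoded)
    deserialize_s = time.perf_counter() - start

    return {
        "serialize_us": serialize_s / iterations * 1e6,
        "deserialize_us": deserialize_s / iterations * 1e6,
    }


if __name__ == "__main__":
    # Hypothetical payload, loosely shaped like a federated post.
    sample = {"type": "Create", "object": {"content": "x" * 2000, "tags": list(range(50))}}
    print(time_json_roundtrip(sample))
```

A micro-benchmark like this only shows the cost of parsing in isolation; a flamegraph from a loaded instance is still needed to see what share of request time it really takes.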
Can anyone with an overloaded instance get on their command line and gather a decent flamegraph so the performance folks can aim optimizations in the right direction?
It'll be interesting to see how this changes through the day! I know .world tends to slow down later in the day when the US contingent is getting going.
This is awesome! Hopefully it'll help spread the load among instances. Definitely going to use this to see which instance to move to (and which to avoid)
This is really cool! Would it be possible to grab this data as JSON, CSV, or some other equivalent format? I'm working on my own Lemmy client, and I think it would be very helpful to be able to display this.