Gradually over the last decade, Reddit went from merely embarrassing but occasionally amusing, to actively harmful, to—mainly by accident—essential. As the platform that swallowed niche message boards, it became home to numerous small communities of surprisingly helpful enthusiasts, and grew into a ...
The internet’s best resources are almost universally volunteer run and donation based, like Wikipedia and The Internet Archive. Every time a great resource is accidentally created by a for-profit company, it is eventually destroyed, like Flickr and Google Reader. Reddit could be what Usenet was supposed to be, a hub of internet-wide discussion on every topic imaginable, if it wasn’t also a private company forced to come up with a credible plan to make hosting discussions sound in any way like a profitable venture.
There are maybe 4 or so 'crawlers', and everyone else buys access to whatever part of that data the crawler operators are willing to sell.
Running a crawler at the current size and complexity of the internet is expensive and complicated. Then there is sifting and sorting the data into a reasonably searchable format, and then there is the quality problem, etc.
It's much easier to license data access from a provider (usually Bing or Google, or both) and just offer some added features on top, like no tracking, a different result UI, custom filtering values passed to Bing's or Google's APIs that make up your own "secret sauce", etc.
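To make the "added features on top" idea concrete, here is a minimal sketch in Python of what a licensed-results search front end might do after it gets results back from its provider. Everything in it is an assumption for illustration: the `Result` type, the `rerank` function, and the example domains are made up, and no real Bing or Google API is called.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass
class Result:
    url: str
    title: str
    upstream_rank: int  # position assigned by the licensed provider (1 = best)


def rerank(results, blocked_domains=frozenset(), boosted_domains=frozenset()):
    """Hypothetical 'secret sauce': drop blocked domains, then put boosted ones first.

    Results within the same group keep the upstream provider's order.
    """
    def domain(r):
        return urlparse(r.url).netloc.lower()

    kept = [r for r in results if domain(r) not in blocked_domains]
    # False sorts before True, so boosted results come ahead of the rest.
    return sorted(kept, key=lambda r: (domain(r) not in boosted_domains, r.upstream_rank))


# Stand-in for whatever the upstream provider returned (made-up data).
upstream = [
    Result("https://spam.example/page", "Ten best widgets", 1),
    Result("https://news.example/story", "Widget news", 2),
    Result("https://forum.example/thread", "Which widget should I buy?", 3),
]

for r in rerank(upstream, blocked_domains={"spam.example"}, boosted_domains={"forum.example"}):
    print(r.upstream_rank, r.url)
```

The point is only that the expensive part (crawling and indexing) stays with the provider, while the front end's own contribution can be as small as a filter and a sort.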
Unrelated to the article, but I'd never heard of Defector before, and I spent the last hour crawling through it and reading a couple of articles. Seems like a fantastic little site (and apparently worker owned?). I don't care much about most of the sports articles, but the other stuff is great. Thanks for posting!
My opinion: the knowledge is not going to be "lost", because niche communities will probably crawl their own data or reddit wikis/FAQs before they retreat to another platform, Lemmy or not. However, that also means you can't just google the phrase and put site:reddit in the search term to quickly filter your results.
Yes, it sucks that in the future, when you want to search for something, it will take more time to reach the same result or user base, but it will still exist somewhere. It's like those emulator or homebrew communities that get chased/DMCAed around the internet: they still exist, and you just need to spend more time to find them.
It's not just spending more time though. If they splinter out into the fediverse, that's not too bad, but the major downside of independent forums was that you needed to register an account for every single niche and obscure site, many of which had restrictions and weird requirements for registration, posting, and participation, and generally had far less reach than reddit.
Reddit is just one account for everything. Technically, the fediverse can be this, but then the pitfall is the volatility of instances. What happens when an operator decides they can't manage it anymore? Or their situation changes and they can't afford to? Or they pass away? Or any number of scenarios? Sure, you can just re-register on another instance, but whatever information had accumulated on that instance is now blackholed. It's just gone.
Reddit won't likely go out completely any time soon, and the wealth of existing knowledge will continue to be reachable, but it will become continually less useful for new queries. Now there's an empty space.
If Lemmy wants to fill that space, its volatility needs to be addressed. I've mentioned this before, but I think the simplest way to address this would be to implement mirror instances, with the sole purpose of being a real-time redundancy for other instances in case they go down.
I am under the impression that the info you post is shared with other servers and thus also exists somewhere else. It makes sense that your login or handle would be gone, but the communities/posts too? There should be some cached content on other servers as well. (Say I sub to technology at beehaw: it has to send info in batches to lemmy.ca, so when I read, I am reading from the local cached version instead of pulling data from beehaw directly.)
Also, any serious enough instance would likely have regular backups and multiple admins to prevent that.
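Here is a toy sketch in Python of the caching behavior I'm describing, just to show why a copy could outlive the home instance. It is not Lemmy's actual data model or the ActivityPub protocol: the `Instance` class and its `subscribe`, `publish`, and `read` methods are made up for illustration, and the instance names are only labels.

```python
class Instance:
    """Toy model of a federated server; not Lemmy's real implementation."""

    def __init__(self, name):
        self.name = name
        self.online = True
        self.posts = {}        # community -> list of post texts stored locally
        self.subscribers = {}  # community -> set of remote Instance objects

    def subscribe(self, remote, community):
        # A remote instance asks to receive future posts for this community.
        self.subscribers.setdefault(community, set()).add(remote)

    def publish(self, community, text):
        # Store the post at home, then push a copy to every subscribed instance.
        self.posts.setdefault(community, []).append(text)
        for remote in self.subscribers.get(community, ()):
            remote.posts.setdefault(community, []).append(text)

    def read(self, community):
        # Reads are served from this instance's own storage only.
        if not self.online:
            raise ConnectionError(f"{self.name} is offline")
        return list(self.posts.get(community, []))


beehaw = Instance("beehaw")
lemmy_ca = Instance("lemmy.ca")

beehaw.subscribe(lemmy_ca, "technology")
beehaw.publish("technology", "Post made while beehaw was up")

beehaw.online = False  # the home instance disappears
print(lemmy_ca.read("technology"))  # the pushed copy is still readable locally
```

Note that in this toy model only posts published after the subscription get copied, so the remote cache is partial. Whether and how real instances backfill older content is something I'm not sure about.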
niche communities will probably crawl their own data or reddit wikis/FAQs before they retreat to another platform,
Except I've seen a bunch of people saying they deleted their reddit history before they deleted their account, because they didn't want reddit to be able to keep any value from their past contributions.
Now, I don't know how many valuable comments would be impacted by that, but it is a concern for the same reasons already discussed.
I would likely delete my posts as well, but I will also download my posts and comments to archive somewhere. It's very interesting to search for something and then have Google send you a reddit link to your own older post. I don't have that much impact anyway, it's just for my personal amusement. (Like sharing my old Monster Hunter achievements, etc., nothing serious.)
Communities can archive their data and wikis/FAQs, which is nice, and hopefully they'll have better SEO than astroturfed articles about the same subject (hahahaha jk jk we know that's impossible, as Google seemingly has no interest in prioritizing organic content anymore).
But the alternatives don't offer anything even remotely close to /top?t=all. This feature of reddit is the single greatest thing that has ever happened to the concept of "getting into a hobby", and just like that it's gone.
Even if it weren't a nonprofit, it would be better run as a worker cooperative where the employees of Reddit made the decisions. Instead, it's the capricious whims of Mr. u/spez.