Automation down
  • My bad. The bot had crashed and I don't have any monitoring set up at the moment.

    The mentions did work, but unfortunately liftoff doesn't explicitly send notifications for mentions; you have to check the account manually.

  • r/plexsubs
  • Good news, everyone! The plexsubs subreddit was banned, and this caused the lemmit bot to get stuck in an infinite loop.

    Wait, that's not good news at all!

    Well, at least the bug has now been fixed.

  • Upcoming plans: auto-suggest alternatives and minimum post karma level.
  • It's probably technically feasible, because I control both the bot and the server. But it would require some hacky stuff (I don't know any Rust, the language Lemmy is built in). And besides, it would feel rather cheaty. It wouldn't just mess with how it would appear on this server, but on other servers as well - they probably wouldn't take kindly to that kind of manipulation. Nor should they.

    So ehm.. No.

  • Live now: Minimum upvotes and ratio amount.

    As discussed here, I have implemented a minimum level of upvotes that a post needs to have on reddit, as well as a minimum ratio of upvotes to downvotes.

    Right now I have those configured to require at least 5 upvotes, and more upvotes than downvotes (a ratio of 0.51). At first glance this already seems to be a great improvement. There might be some tweaking later.

    As a side note, I have now switched from using the reddit RSS feed to using the JSON feed. This was required in order to get easy access to the upvote/ratio properties. So there might be some new and interesting bugs introduced because of that. It's a brave new world.
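
    For the curious, the filter boils down to something like the sketch below. This is not the bot's actual code: the function names and sample data are made up for illustration, and I'm assuming the JSON listing exposes the vote count as "ups" and the ratio as "upvote_ratio" under data.children[].data. The thresholds are the ones mentioned above.

```python
import json

MIN_UPVOTES = 5    # "at least 5 upvotes"
MIN_RATIO = 0.51   # "more upvotes than downvotes"


def passes_filter(post: dict, min_upvotes: int = MIN_UPVOTES,
                  min_ratio: float = MIN_RATIO) -> bool:
    """Return True when a post has enough upvotes and a high enough ratio."""
    # "ups" and "upvote_ratio" are the assumed field names in the JSON feed.
    enough_votes = post.get("ups", 0) >= min_upvotes
    good_ratio = post.get("upvote_ratio", 0.0) >= min_ratio
    return enough_votes and good_ratio


def filter_listing(raw_json: str) -> list[dict]:
    """Parse a listing document and keep only the posts that pass the filter."""
    listing = json.loads(raw_json)
    posts = [child["data"] for child in listing["data"]["children"]]
    return [post for post in posts if passes_filter(post)]


# Made-up sample data in the assumed listing shape.
SAMPLE = json.dumps({"data": {"children": [
    {"data": {"title": "popular", "ups": 120, "upvote_ratio": 0.97}},
    {"data": {"title": "too few votes", "ups": 3, "upvote_ratio": 1.0}},
    {"data": {"title": "controversial", "ups": 40, "upvote_ratio": 0.48}},
]}})

kept_titles = [post["title"] for post in filter_listing(SAMPLE)]
```

    With the sample above, only the first post survives: the second has too few upvotes and the third has more downvotes than upvotes.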

    Needless to say, the first thing I'll do after releasing this, is plop down on the couch with a beer, and hope this doesn't crash. Fingers crossed!

  • r/Superstonk
  • Because it already exists, you dolt.

  • r/Superstonk
  • !Superstonk@lemmit.online is already a thing brah.

  • ADMIN: Please stop reporting OnlyFans models as spam just because they are OF models
  • Personally I'd be fine with allowing it in bios only. If people want to see more, they'll check out the bio, and see the link there. In other cases someone will just be like "... Nice." without feeling advertised to.

    In the end, it's all about the rules the community itself puts up. Personally, I get more enjoyment out of fewer "real" (imperfect/amateur) out-of-love quality, than more perfect/fitgirl for-profit quantity. But I'm aware this is generally a minority opinion.

  • Upcoming plans: auto-suggest alternatives and minimum post karma level.

    I'd like to hear some feedback on this, or approach vectors.

    Right now the bot is rather spammy. I was hoping that by using Reddit's HOT feed, it would have some level of quality control (I know, right?). Unfortunately, it seems that in most cases, it will just return anything that's new. The downside of this is that a lot of garbage gets through, and the bot spends a lot of time scraping the underlying page to get the details.

    I propose to only archive reddit posts that have a karma score of 5 or higher. In case of subs that hide the karma scores of posts for a certain time, they'd have to be at least 2 hours old, so that the Reddit moderators can weed out garbage on our behalf.
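
    To make the proposal concrete, here is a rough sketch of the rule, not an actual implementation. The function and parameter names are hypothetical; the 5 karma and 2 hour values are the ones proposed above. How the bot would learn that a score is hidden is left open here.

```python
from datetime import datetime, timedelta, timezone

MIN_SCORE = 5                             # proposed minimum karma
MIN_AGE_WHEN_HIDDEN = timedelta(hours=2)  # proposed waiting time for hidden scores


def should_archive(score: int, score_hidden: bool,
                   created: datetime, now: datetime) -> bool:
    """Decide whether a post is worth archiving under the proposed rule."""
    if score_hidden:
        # The score is not visible yet, so fall back on the post's age and
        # let the Reddit moderators weed out the garbage in the meantime.
        return now - created >= MIN_AGE_WHEN_HIDDEN
    return score >= MIN_SCORE


now = datetime(2023, 7, 10, 12, 0, tzinfo=timezone.utc)
fresh = now - timedelta(minutes=30)
old = now - timedelta(hours=3)

visible_good = should_archive(12, False, fresh, now)  # enough karma
visible_bad = should_archive(2, False, old, now)      # too little karma
hidden_fresh = should_archive(0, True, fresh, now)    # hidden and too young
hidden_old = should_archive(0, True, old, now)        # hidden but old enough
```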

    Do you folks have any thoughts on this?

    Secondly, I want to put sticky comments on each community, with links to native Lemmy communities that cover the same subject. For this I would need some kind of API, or a master list of... oh, I see sub.rehab has just the thing I need. So expect that somewhere this week :).

  • r/unixporn
  • Thanks, added as a sticky in the lemmit community.

    Ideally I want to have this done automatically.

  • r/AskReddit
  • thank you!

    Besides the fact that this community already exists, I think all of the ask-* reddits are terrible contenders for being replicated here.

  • (Done) Hold on to your horses, upgrading to 18.1

    See you on the other side!

    ***

    So the update is done, but the bot was offline for 6 hours, and needed to catch up.

    Unfortunately, another update slipped through, which switched the default feed from www.reddit.com to old.reddit.com, which has the side effect of changing all the urls in the posts as well. On one hand this is great, because new reddit sucks. On the other hand, this is terrible, because for every post the bot encounters, it checks if it already exists on lemmit... based on the url.

    So for every post the bot encountered, it went like "old.reddit.com/r/blabla/123? Haven't seen that one yet, there's a www.reddit.com/r/blabla/123, but that must be something completely different, let's post it again!"

    This also meant that the bot took over a minute and a half to update each community because it takes a couple of seconds per post. When I went to bed last night, I figured it was just posting a lot of content because it had so much catching up to do. But this morning I figured something was off because it still hadn't caught up.

    Anyway, the fix is out now. Sorry for all the duplicates. I need coffee now.
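
    One way to make a url-based duplicate check survive a host switch like this is to compare canonical urls instead of raw ones. The sketch below is an illustration of that idea, not the fix that was actually deployed; the function names and the choice of old.reddit.com as the canonical host are assumptions. It only rewrites the hostname, because a plain string replace of "www" would also touch any "www" that happens to show up elsewhere in the url.

```python
from urllib.parse import urlsplit, urlunsplit

REDDIT_HOSTS = {"www.reddit.com", "old.reddit.com", "reddit.com"}
CANONICAL_HOST = "old.reddit.com"  # assumed canonical form for comparisons


def canonical_url(url: str) -> str:
    """Rewrite only the hostname of a reddit url, leaving the path untouched."""
    parts = urlsplit(url)
    if parts.netloc.lower() in REDDIT_HOSTS:
        parts = parts._replace(netloc=CANONICAL_HOST)
    return urlunsplit(parts)


def already_posted(url: str, seen_urls: set[str]) -> bool:
    """Check a new url against previously stored urls, ignoring the host variant."""
    seen = {canonical_url(seen_url) for seen_url in seen_urls}
    return canonical_url(url) in seen


# Urls stored before the switch still use www.reddit.com.
stored = {"https://www.reddit.com/r/blabla/123"}
is_duplicate = already_posted("https://old.reddit.com/r/blabla/123", stored)

# A path that happens to contain "www" is left alone.
tricky = canonical_url("https://www.reddit.com/r/blabla/comments/abwww1/title/")
```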

  • Smart syncing, defederation and server growth

    > ChatGPT, write a post for the stuff that I have in my head and want to get out as an update.

    Hmm. No brain implant yet. Guess I'll have to write this the hard way.

    Syncing update

    It has been an eventful week. I successfully deployed the initial version of smarter content syncing, and have made some adjustments to the algorithm since then. Most notably, communities with only 1 subscriber (the bot) will no longer receive updates, and communities with fewer than 5 subscribers or with a low posting frequency will only be updated twice a day. Furthermore, for the highest update priority (every 10 minutes), a community must have a minimum of 50 subscribers. Implementation details can be found in the decide_interval() method over here.
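
    As a rough illustration of the rules above (this is a sketch, not the real decide_interval()): the 1, 5 and 50 subscriber cut-offs and the 10 and 720 minute intervals come from this post, while the function name, the posts-per-day thresholds and the cut-offs for the in-between tiers are made up.

```python
from typing import Optional

LOW_POSTS_PER_DAY = 1.0    # hypothetical meaning of "low posting frequency"
BUSY_POSTS_PER_DAY = 10.0  # hypothetical extra requirement for the top tier


def sketch_interval(subscribers: int, posts_per_day: float) -> Optional[int]:
    """Return the sync interval in minutes, or None when syncing is skipped."""
    if subscribers <= 1:
        # Only the bot is subscribed: no updates at all.
        return None
    if subscribers < 5 or posts_per_day < LOW_POSTS_PER_DAY:
        # Twice a day.
        return 720
    if subscribers >= 50 and posts_per_day >= BUSY_POSTS_PER_DAY:
        # Highest priority needs at least 50 subscribers.
        return 10
    # The cut-offs below are invented for the sake of the example.
    if subscribers >= 20:
        return 30
    if subscribers >= 10:
        return 60
    return 120


intervals = {
    "bot only": sketch_interval(1, 100),
    "tiny": sketch_interval(3, 100),
    "quiet": sketch_interval(60, 0.2),
    "busy": sketch_interval(60, 50),
    "medium": sketch_interval(25, 5),
}
```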

    Being a developer is fun

    Meanwhile... Damnit, bot is stuck again.

    2023-07-08 10:13:39,945 - utils.syncer - INFO - Scraping subreddit: bustynaturals. Last time 2:30:48 ago, interval 120 minutes
    2023-07-08 10:13:40,653 - utils.syncer - INFO - 'latina bodies are the best' at https://www.reddit.com/r/BustyNaturals/comments/14twww8/latina_bodies_are_the_best/ updated: 2023-07-08 07:14:13+00:00
    2023-07-08 10:13:45,324 - utils.syncer - ERROR - Error trying to retrieve post details, try again in a bit; Couldn't retrieve post detail page
    2023-07-08 10:13:46,333 - utils.syncer - INFO - Scraping subreddit: bustynaturals. Last time 2:30:54 ago, interval 120 minutes
    2023-07-08 10:13:48,581 - utils.syncer - INFO - 'latina bodies are the best' at https://www.reddit.com/r/BustyNaturals/comments/14twww8/latina_bodies_are_the_best/ updated: 2023-07-08 07:14:13+00:00
    2023-07-08 10:13:51,227 - utils.syncer - ERROR - Error trying to retrieve post details, try again in a bit; Couldn't retrieve post detail page

    ... 1 bugfix and deployment later:

    2023-07-08 10:46:42,836 - utils.syncer - INFO - Scraping subreddit: bustynaturals. Last time 3:03:51 ago, interval 120 minutes
    2023-07-08 10:46:43,573 - utils.syncer - INFO - 'latina bodies are the best' at https://www.reddit.com/r/BustyNaturals/comments/14twww8/latina_bodies_are_the_best/ updated: 2023-07-08 07:14:13+00:00
    2023-07-08 10:46:48,327 - utils.syncer - ERROR - Couldn't find post on https://old.reddit.com/r/BustyNaturals/comments/14told8/latina_bodies_are_the_best/, skipping.

    Defederation

    Meanwhile, the folks at https://lemmy.world reached out to me to tell me they're defederating Lemmit. They are not fond of the high volume of posts made by the bot, and the fact that there are now (quick check) 462 communities on this server all being moderated by a single person. They have already received a couple of complaints about spam, and it didn't help that some requests for NSFW subreddits were not marked as NSFW. Occasionally, those subreddits had explicit thumbnails that appeared in the 'All feed' without warning.

    I had a good talk with the LemmyWorld admin, wherein they explained their point of view, and I explained mine. I understand their decision to disassociate from Lemmit, and appreciate their attempt to contact me. Other instances, like Beehaw and some smaller ones, have also reached the same decision.

    This does mean that you will no longer be able to get new community updates on those servers. So make sure to check the blocked instances list on your home server if you were subscribed to Lemmit. At the same time I have removed all the subscriptions of users from those servers, in order to not affect the sync priority mentioned above. This does mean that if LemmyWorld, Beehaw, etc ever decide to connect to Lemmit again (however unlikely), you will need to un- and re-subscribe from there.

    Meanwhile, I've added a feature in the bot that will remove request posts for NSFW subreddits, if the post itself is not marked for NSFW. This should prevent explicit thumbnails showing up where they are not wanted.

    Server growth

    Last night I got an alert from my server monitoring that the disk is 80% full. Unfortunately, the disk is only 60 GB, so that doesn't leave much room for expansion. On the bright side, a good chunk of that is from Lemmy's very verbose logging (like, 4 GB a day, which gets cleaned up daily), so it should last throughout the weekend if I tune that down. Furthermore, most of the storage growth is from pictrs, the image upload part of Lemmy, and that can utilize an S3 bucket, rather than using the VM's storage like it is now. Using an S3 bucket offers a cost-efficient solution for expanding storage. Initial estimates indicate a monthly cost of around $5 for 1000 GB of storage, which should be sufficient for a while \*fingers crossed\*.

    In the early days of Lemmit (literally, as the server is less than a month old) image uploads were limited to a default setting, which was something around 40 megabytes. That did add up quickly (thanks to half-minute porn gifs), and so I had to limit the max filesize to 1 MB, and later 0.5 MB. Once the server has switched to S3 storage, I can probably up that limit a little, although not too much.

    Finally, Lemmy v0.18.1 has been released, and it contains even more performance boosts compared to v0.18.0, so if there's time left this weekend (and I can verify the Lemmit Bot is compatible), I will probably perform the upgrade.

  • Proposed rules update, 8 July 2023
  • Cheers, both all three of you. We're off to a beautiful federated future.

  • Proposed rules update, 8 July 2023
  • Congrats on reaching this set of sane rules. The efforts of creating an admin community behind the scenes are really starting to show off.

    Request for clarification for uhmm, a friend of mine: When someone creates their own instance, with blackjack and hookers, and one of your users subscribes to a community there, it will synchronise part of that content to lemmynsfw. What will you do then?

    I'd like to remind you that some beautiful maniacs can be quite reasonable ;)

  • r/UkraineWarVideoReport
  • Yeah, I've upped the limit on this server, so it should come through now if you retry.

  • Any chance of making posts link to old.reddit.com instead of www.reddit.com?
  • Could you give an example post of what you mean? Every post starts with "The original was posted on /r/blabla", in which the latter links to the original old.reddit.com link. Will that work for you?

  • /r/BestOfRedditorUpdates

    You know, on account of me upping that one setting in the admin which I should have thought of long ago.

  • r/UkraineWarVideoReport
  • That could work, but it would be terrible for discoverability. In the meantime, I put up a feature request at Lemmy. I'm not a fan of pushing my problems upstream, but in this case it would actually be the easiest solution - as far as I can see (and I have 0 experience with Rust) they only need to adjust the validation regex, because the database already allows for it. That is - as long as the ActivityPub protocol allows for it.

    If they deny it, I could try something with name mapping, but you'd either end up with something that is unreadable, or something with a high collision chance. Neither option is very appealing. For now I'm just going to wait and see.
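
    To show what that trade-off looks like, here is a small sketch of the two name mapping options. Nothing like this exists in the bot; the function names are hypothetical, and the 20 character limit is the one Lemmy enforces according to this thread. Truncating stays readable but collides easily, while hashing avoids collisions in practice but is unreadable.

```python
import hashlib

MAX_NAME_LENGTH = 20  # Lemmy's community name limit as described in this thread


def fits(name: str) -> bool:
    """Return True when the subreddit name can be used as-is."""
    return len(name) <= MAX_NAME_LENGTH


def truncated_name(name: str) -> str:
    """Readable option: cut the name off at the limit (collides easily)."""
    return name[:MAX_NAME_LENGTH]


def hashed_name(name: str) -> str:
    """Unreadable option: use the start of a SHA-256 digest of the name."""
    digest = hashlib.sha256(name.lower().encode("utf-8")).hexdigest()
    return digest[:MAX_NAME_LENGTH]


real = "bestofredditorupdates"          # 21 characters, one too many
hypothetical = "bestofredditorupdated"  # made-up name that differs in the last letter

same_after_truncation = truncated_name(real) == truncated_name(hypothetical)
same_after_hashing = hashed_name(real) == hashed_name(hypothetical)
```

    Here same_after_truncation is True (both end up as "bestofredditorupdate"), while the hashed names differ but tell you nothing about the subreddit they came from.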

  • /r/bestofredditorupdates
  • I have considered some technical solutions, and I agree that this sub would be an excellent candidate for archiving. For now I have made a feature request at Lemmy because, let's face it, that would solve several problems.

    If they aren't up for it, I could try and fix it some other way, but ideally it would be fixed if they would just allow for 1 more character than they do now.

  • r/UkraineWarVideoReport
  • Unfortunately, Lemmy cannot handle community names over 20 characters, so this won't be possible.

  • Update 30-06-2023: smarter content syncing

    Okay, this one took me a bit longer than I planned (mostly due to sql fun and trying to use integers as minutes, WEEEE!).

    Backdrop: Last week I disabled the mirroring of a couple of subreddits to the database, because they were initially requested but then nobody subscribed to them. At the same time, the bot was just crawling in a loop, starting at todayilearned, ending at latestsubreddit. As more subreddits were requested, this loop took longer and longer (21 minutes before I rolled out this update). This wasn't sustainable.

    So here's the new situation. The more popular a community is, the more often it will be updated. In this case popular means a mixture between number of subscribers and the amount of posts it receives per day (Link to relevant snippet of source code).

    In short, the most popular subs will be synced every 10 minutes, the next tiers every 30, 60 and 120 minutes, and communities with either no posts per day or no subscribers (other than the bot) will only be synced every 12 hours. I hope this will hit a good distribution of updates vs popularity, but it will most likely be refined at some point in the future.

    Speaking of distribution, we now have over 300 communities on this server 🥳, and their update intervals are spread out as such:

    • Every 10 minutes: 22
    • Every 30 minutes: 39
    • Every 60 minutes: 55
    • Every 120 minutes: 143
    • Every 720 minutes: 44

    With this update running live (I started typing after I deployed it, and it has now gotten through the backlog of 'abandoned' subs), I'm going to step back from feature development for a few days. Any bugs that cause the bot to crash will of course continue to be addressed.

    Have a blast!

  • r/unixsocks
  • ...

    of course this exists.

    (I'm not complaining)

  • r/AskReddit
  • 👍 Fair enough. I just want to prevent people requesting things, deciding it's not what they wanted, and then have the bot keep it up to date for nothing.

  • r/AskReddit
  • Normally it does, see https://lemmit.online/comment/490 Not sure why it didn't here though :(

  • r/AskReddit
  • askreddit is already being archived.

    Question is: why would you want to? You'll only get the questions, not the actual answers (see the FAQ in !about@lemmit.online).

  • Server VM upgraded / Upcoming bot plans

    Before, it was running on the cheapest model (1 core / 1GB mem / 30GB storage) at $12/month. The machine was running pretty low on memory, causing it to start swapping, which in turn caused the cpu to get too busy, and everything to slow down.

    Now it has a whopping 2GB of memory, and things seem to have calmed down - cpu is back to around 10-15% usage, and swap is down to 0. Happy times all around.

    Because of the amount of subs being archived, it now takes about 15 minutes between updates for each sub (was 18 before I updated the VM).

    I'm planning to build some kind of scoring system, based on the amount of posts per subreddit (per day?), and amount of subscribers on the lemmy community. That way communities with few subscribers or that don't see many posts per day will only be updated once per hour.

    At the same time, I feel that subs like AskReddit, OutOfTheLoop and other "question-based" subreddits shouldn't be archived by Lemmit. In my opinion those kinds of posts are useless without the answers, but please let me know if you disagree.

  • Bug fixes 24-06-2023

    • Fixed a bug where posts would not be submitted because the title didn't contain long enough words.
    • Fixed a bug where posts would not be submitted because the url was too long.
    • Fixed a bug where posts would not be submitted when it was linking to a /user subreddit.
    • Fixed a bug where the bot would think Every Post Everywhere was a subreddit request, and would reply to it.
    • Fixed a bug where the bot would crash without recovering whenever something went wrong during new subreddit requests.

    A fruitful day all in all, I'd say.

  • /r/bestofredditorupdates
  • Bad bot. Deploying a fix right now for this, apologies for the spam.

    The bad news is that I now know why it cannot clone this subreddit - the name is too long. That's going to take some time to fix, I'm afraid.

    At any rate, "bestof" subreddits don't work very well at the moment anyway, since they do not yet retrieve the underlying message.

  • Please don't tell me

    That the replies-everywhere-bug was just because I forgot to include a variable in the bot deployment? 🤦

  • Frequently Asked Questions / What is Lemmit?

    In the short time since this instance and bot launched, I've been seeing the same questions resurface multiple times. This is totally understandable, since the concept of a Fediverse is still new to most (myself included), and this server is not like the others.

    Q: What is Lemmit?

    A: Lemmit is a Lemmy instance specifically designed for archiving Reddit content. Users can request new subreddits to be included in the archiving process by posting in the !requests@lemmit.online community. It is powered by an open source python bot, which periodically checks the request list, adds new requests to the queue, and continuously monitors the Hot feed of those subs for new posts to cross-post here.

    Q: Does it synchronize comments?

    A: No, that would be impossible. Considering there are thousands of posts already on Lemmit, many of them having at least several hundred comments on Reddit, often buried in deep layers, it simply wouldn't be feasible to index those for more than a few posts, let alone keep them up to date.

    Unfortunately, this means that archiving certain subreddits, such as Ask Historians/Men/Women/Hyperintelligentshadesofthecolourblue-type subs, is going to be rather pointless.

    Q: Can it send comments back to Reddit?

    A: No, it cannot. The purpose is to help bootstrap the Lemmy platform, not to serve as a bridge between the two networks. Also, see the answer about synchronizing comments.

    Q: Can I request any subreddit?

    A: Technically, yes. However, as the list of subs grows, the time it takes to update all of them will also increase. I do not have strict guidelines in place for this, so I'm relying on your common sense (hoooo boy). At some point, I will probably have to either stop accepting new requests or disable scraping for very low-traffic communities.

    Q: Does this use the API? Will it keep working after July 1st?

    A: Nope, it uses a combination of the public feed and scraping old.reddit.com. So, as long as those are still available, it will continue working. And even if they close those sources, there will probably be new ways to achieve the same effect. "Content, eh, finds a way."

    Q: This is spam, can you stop?

    A: First of all, I apologise for the inconvenience. All you have to do is block @bot@lemmit.online, and none of its posts will ever show up on your instance. If you don't want anyone else on your server to be exposed to this bot/instance, you should convince your admin to defederate from lemmit.online. Since there are no other users on here, there will be no harm done.

    Obviously I could stop, because running this server and software is only ever going to cost me time and money. But for the reasons listed above, I still think this server is a useful addition to the lemmyverse at this time. But I'm looking forward to the day where I can turn the bot off because it's no longer needed.

    Q: What started this?

    A: Okay, nobody asked this, but I'm going to tell you anyway. After Reddit made it clear that they are effectively killing third-party apps and implementing plenty of other anti-end user decisions, I realized that I would either have to accept not being able to access my time-wasting content or have to do so in a rather uncomfortable way (either through the official app or old.reddit.com for as long as they'll allow it to exist).

    Being a stubborn developer, naturally, I chose option C: Have my own Reddit. With blackjack, and hookers. This way, I would still be able to access my beloved content without being beholden to Reddit's mood swings and abusive relationship tendencies.

    Besides that, I also know that Content is King. So in order to counter the network effect (No users because no content, No content because no users), I figured it would be better to have some inorganic content to bootstrap the adoption of Lemmy.

    Q: Are NSFW subreddits allowed?

    A: Absolutely. Like I said: Blackjack and hookers.

    Q: My request isn't picked up by the bot!

    A: That isn't a question. But yeah, the process isn't flawless yet. I'm trying to iron out all the bugs as I encounter them. In the meantime, feel free to re-request the subreddit by making a second post. No harm done.

    Q: No new posts are showing up at all on Lemmit

    A: If no posts are appearing on the Lemmit Frontpage (sorted by NEW), it's possible that the bot has crashed or is stuck on something. Since no software is flawless, this sometimes happens. I usually fix this as soon as I'm aware, and I'm happy to say that these kinds of fatal errors are becoming less and less frequent. However, they may still occur, and as a human with needs of sleep and other responsibilities, I'm not always able to fix them immediately.

    Q: Posts aren't showing up on my instance, what's up?

    A: Due to the spammy nature of the bot, some server admins choose to block this server, and that is completely understandable. So first of all, make sure to check the instances link in the footer of your home server. If Lemmit is on the Blocked Instances list, you're out of luck.

    When you have verified that Lemmit is not blocked on your instance, try unsubscribing, waiting a little, and then re-subscribing. That tends to fix things.

  • Today I Fucked Up by not doing DNS registration properly.

    Long story short: I messed up with the domain registration for this instance, and never replied to a mandatory email. The domain name (lemmit.online) got put in suspension, causing disconnects all over the fediverse.

    I fixed it as soon as I found out, but it will probably take a few more hours for the issues to be fully fixed.

    So ehm. Whoops. Hope this explains and fixes the federation issues we've been having today.

  • Bug fixes 21-06-2023

    Most importantly that the bot no longer crashes (and does nothing all night while I sleep 😛) when trying to create a community that has already been requested.

    Furthermore mostly making the code prettier and adding tests.

  • /r/mildlyinteresting
    old.reddit.com For photos that are, you know, mildly interesting • r/mildlyinteresting

    Try again, you lazy bot.

  • Bug fixes 19-06-2023

    Fixed a couple of bugs today:

    • Nasty one that made the bot get stuck in an infinite loop when trying to add a post by a deleted user, which kept the bot offline for most of last night.
    • Another creative one where posting certain links would actually work, but the lemmy gateway would respond with a timeout. It only happens on certain links, but consistently. This would make the bot think the post was unsuccessful, so it would try to post again the next time, causing a duplicate post each time (technically it was a cross post to itself... which is interesting in a whole new way).

    TLDR: right now there is a workaround in place that assumes a timeout post to lemmit was actually successful. This might cause it to drop posts in the future, but seeing that the server is barely breaking a sweat at this time, it should be good until a better fix is implemented.
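
    In code, the workaround amounts to something like the sketch below. It is an illustration, not the bot's real code: send stands in for whatever function performs the HTTP request and returns a status code, and the built-in TimeoutError stands in for the timeout exception of whichever HTTP library is used.

```python
from typing import Callable


def submit_post(send: Callable[[], int]) -> bool:
    """Return True when the post should be treated as submitted."""
    try:
        status = send()
    except TimeoutError:
        # Workaround: the gateway timed out, but the post most likely went
        # through anyway, so treat it as a success and don't retry later.
        return True
    return 200 <= status < 300


def send_ok() -> int:
    return 200


def send_server_error() -> int:
    return 500


def send_timeout() -> int:
    raise TimeoutError("gateway timed out")


results = [submit_post(send_ok), submit_post(send_server_error), submit_post(send_timeout)]
```

    The trade-off is the one mentioned above: if a timeout ever means the post really was not stored, it gets dropped silently instead of retried.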

    Also got some great feedback from users, which I added to the TODO.

    0

    cross-posted from: https://lemmit.online/post/177

    > I have created some software that is capable of synchronising posts from Reddit to Lemmy. It's still a little rough around the edges, but it works as follows:
    >
    > People can request new subreddits to be mirrored on !requests@lemmit.online. A bot (open source) will monitor the threads there, and if it finds a new request for a subreddit, it will make a new community on the Lemmit server, and add it to its monitored list. It will then make periodic checks to see if any new posts (it doesn't copy any comments) have been posted on reddit, and copy those over.
    >
    > Users can then subscribe to those communities from their own lemmy instance, and from there federation will pick it up. Or at least, that's the theory. At the moment, federation is not working awesomely, and that is where my lack of fediverse knowledge comes in. Maybe it needs more time, or something is not set up properly - I don't know.
    >
    > Furthermore: registrations on this server are closed. The point of this service is not to become a community on its own, but to deliver, ehh, "original" content to all the rest of the Fediverse while it's going through a ramp-up phase. Besides, the instance is running on a pretty small vps, and I'd rather have this thing manage itself. There is a !about@lemmit.online community for further questions about the project itself though, in case people want to discuss it further.
    >
    > So ehm... Let me know what you think :)

    0

    I have created some software that is capable of synchronising posts from Reddit to Lemmy. It's still a little rough around the edges, but it works as follows:

    People can request new subreddits to be mirrored on !requests@lemmit.online. A bot (open source) will monitor the threads there, and if it finds a new request for a subreddit, it will make a new community on the Lemmit server, and add it to its monitored list. It will then make periodic checks to see if any new posts (it doesn't copy any comments) have been posted on reddit, and copy those over.

    Users can then subscribe to those communities from their own lemmy instance, and from there federation will pick it up. Or at least, that's the theory. At the moment, federation is not working awesomely, and that is where my lack of fediverse knowledge comes in. Maybe it needs more time, or something is not set up properly - I don't know.

    Furthermore: registrations on this server are closed. The point of this service is not to become a community on its own, but to deliver, ehh, "original" content to all the rest of the Fediverse while it's going through a ramp-up phase. Besides, the instance is running on a pretty small vps, and I'd rather have this thing manage itself. There is a !about@lemmit.online community for further questions about the project itself though, in case people want to discuss it further.

    So ehm... Let me know what you think :)

    23
    admin admin @lemmit.online