PR
Posts
33
Comments
726
Joined
2 yr. ago

  • I use Headscale, but Tailscale is a great service and what I generally recommend to strangers who want to approximate my setup. The tradeoffs are pretty straightforward:

    • Tailscale is going to have better uptime than any single-machine Headscale setup, though not better uptime than the single-machine services I use it to access... so not a big deal to me either way.
    • Tailscale doesn't require you to wrestle with certs or the networking setup required to do NAT traversal. And they do it well, you don't have to wonder whether you've screwed something up that's degrading NAT traversal only in certain conditions. It just works. That said, I've been through the wringer already on these topics so Headscale is not painful for me.
    • Headscale is self-hosted, for better and worse.
    • In the default config (and in any reasonable user-friendly, non professional config), Tailscale can inject a node into your network. They don't and won't. They can't sniff your traffic without adding a node to your tailnet. But they do have the technical capability to join a node to your tailnet without your consent... their policy to not do that protects you... but their technology doesn't. This isn't some surveillance power grab though, it's a risk that's essential to the service they provide... which is determining what nodes can join your tailnet. IMO, the tailscale security architecture is strong. I'd have no qualms about trusting them with my network.
    • Beyond 3 users, Tailscale costs money... about $6 US in that geography. It's a pretty reasonable cost for the service, and proportional in the grand scheme of what most self-hosters spend on their setups annually. IMO, it's good value and I wouldn't feel bad paying it.

    Tailscale is great, and there's no compelling reason that should prevent most self-hosters that want it from using it. I use Headscale because I can and I'm comfortable doing so... But they're both awesome options.

  • I replied to the parent comment here to say that governments HAVE set up CSAM detection services. I linked a review of them in my original comment.

    • They've set them up through commercial partnerships with technology companies... but that's no accident. CSAM fighting orgs don't have the tech reach of a major tech company so they ask for help there.
    • Those partnerships are limited to major/successful orgs... which makes it hard to participate as an OSS dev. But again, that's on-purpose as the same access that would empower OSS devs to improve detection would enable CSAM producers to improve evasion. Secrecy is useful in this race, even if it has a high cost.

    Plus with the flurry of hugely privacy-invading or anti-encryption legislation that shows up every few months under the guise of "protecting the children online", it seems like that should be a top priority for them, right?! Right...?

    This seems like inflammatory bait but I'll bite once.

    • Improving CSAM detection is absolutely a top priority of these orgs, and in the last 10y the detection tools they've created with partners have gone from scanning zero images to scanning hundreds of millions or billions of images annually. It's a fairly massive success story even if it's nowhere near perfect.
    • Building global internet infrastructure to scan all/most images posted to the internet is itself hugely privacy invading even if it's for a good cause. Nothing prevents law-makers from coopting such infrastructure for less noble goals once it's been created. Lemmy is in desperate need of help here, and CSAM detection tools are necessary in some form, but they are also very much scary scary privacy invading tools that are subject to "think of the children" abuse.
  • I'm not sure I follow the suggestion.

    • NCMEC, the US-based organization tasked with fighting CSAM, has already partnered with a list of groups to develop CSAM detection tools. I've already linked to an overview of the resulting toolsets in my original comment.
    • The datasets used to develop these tools are private, but that's not an oversight. The datasets are... well... full of CSAM. Distributing them openly and without restriction would be contrary to NCMEC's mission and to US law, so they limit the downside by partnering only with serious/capable partners who are able to commit to investing significant resources to developing and long-term maintaining detection tools, and who can sign onerous legal paperwork promising to handle appropriately the access they must be given to otherwise illegal material to do so.
    • CSAM detection tools are necessarily a cat and mouse game of CSAM producers attempting to evade detection vs detection experts trying to improve detection. In such a race, secrecy is a useful... if costly... tool. But as a result, NCMEC requires a certain amount of secrecy from their partners about how the detection tools work and who can run them in what circumstances. The goal of this secrecy is to prevent CSAM producers from developing test suites that allow them to repeatedly test image manipulation strategies that retain visual fidelity but thwart detection techniques.

    All of which is to say...

    ... seems like law enforcement would have such a data set and seems they should of course allow tools to be trained on it. seems but who knows? might be worth finding out.)

    Law enforcement DOES have datasets, and DOES allow tools to be trained on them... I've linked the resulting tools. They do NOT allow randos direct access to the data or tools, which is a necessary precaution to prevent attackers from winning the circumvention race. A Red Hat or Mozilla scale organization might be able to partner with NCMEC or another organization to become a detection tooling partner, but db0, sunaurus, or the Lemmy devs likely cannot without the support of a large technology org with a proven track record of delivering and maintaining successful/impactful technology products. This has the big downside of making a true open-source detection tool more or less impossible... but that's a well-understood tradeoff that CSAM-fighting orgs are not likely to change, as the same access that would empower OSS devs would empower CSAM producers. I'm not sure there's anything more to find out in this regard.

  • I haven't been moderated a lot, but I believe the user gets no indication they've been moderated unless the mod replies to them or DMs them to tell them.

    I agree that auto-notification would be beneficial. Despite the easy availability of the modlog, this kind of question is pretty common. Not everyone knows it exists or how to search it.

  • It's worth considering some commercially developed options as well: https://prostasia.org/blog/csam-filtering-options-compared/

    The Cloudflare tool in particular is freely and widely available: https://blog.cloudflare.com/the-csam-scanning-tool/

    I am no expert, but I'm quite skeptical of db0's tool:

    • It repurposes a library designed for preventing the creation of synthetic CSAM using stable diffusion. This library is typically used in conjunction with prompt scanning and other inputs into the generation process. When run outside its normal context on non-AI images, it will lack all this input context, which I speculate reduces its effectiveness relative to the conditions under which it's tested and developed.
    • AI techniques live and die by the quality of the dataset used to train them. There is not and cannot be an open-source test dataset of CSAM upon which to train such a tool. One can attempt workarounds, such as training separate classifiers and then looking for coexisting features: youth (trained from dataset A using non-sexualized images including children) and sexuality (trained separately from dataset B using images containing only adult performers)... but the efficacy of open source solutions is going to be hamstrung by the inability to train, test, and assess effectiveness of the open tools. Developers of major commercial CSAM scanners are better able to partner with NCMEC and other groups fighting CSAM to assess the effectiveness of their tools.

    I'm no expert, but my belief is that open tools are likely to be hamstrung permanently compared to the tools developed by big companies and the most effective solutions for Lemmy must integrate big company tools (or gov/nonprofit tools if they exist).

    PS: Really impressed by your response plan. I hope the Lemmy world admins are watching this post, I know you all communicate and collaborate. Disabling image uploads is, I think, a very effective temporary response until detection and response tooling can be improved.

  • My money is also on IO. Outside of CPU and RAM, it's the most likely resource to get saturated (especially if using rotational magnetic disks rather than an SSD, magnetic disks are going to be the performance limiter by a lot for many workloads), and also the one that OP said nothing about, suggesting it's a blind spot for them.

    In addition to the excellent command-line approaches suggested above, I recommend installing netdata on the box as it will show you a very comprehensive set of performance metrics without having to learn to collect each one on the CLI. A downside is that it will use RAM proportional to the data retention period, which if you're swapping hard will be an issue. But even a few hours of data can be very useful and with 16gb of ram I feel like any swapping is likely to be a gross misconfiguration rather than true memory demand... and once that's sorted dedicating a gig or two to observability will be a good investment.
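
    If you'd rather not install anything first, here's a rough sketch of checking disk saturation by hand. It assumes Linux and the standard /proc/diskstats layout, where the 10th stats field is cumulative milliseconds the device spent doing I/O. The function names are mine, not from any tool. Sampling twice and dividing the delta by the elapsed time gives roughly what iostat reports as %util.

```python
import time


def parse_busy_ms(diskstats_text):
    """Map device name -> cumulative milliseconds spent doing I/O."""
    busy = {}
    for line in diskstats_text.splitlines():
        fields = line.split()
        if len(fields) < 14:
            continue  # skip blank or unexpectedly short lines
        # fields[2] is the device name, fields[12] is "time spent doing I/Os (ms)"
        busy[fields[2]] = int(fields[12])
    return busy


def utilization(before, after, elapsed_s):
    """Percent of the interval each device was busy, from two samples."""
    elapsed_ms = elapsed_s * 1000.0
    return {
        dev: 100.0 * (after[dev] - before[dev]) / elapsed_ms
        for dev in after
        if dev in before
    }


def sample(interval_s=1.0, path="/proc/diskstats"):
    """Read the stats file twice, interval_s apart, and compute utilization."""
    with open(path) as f:
        first = parse_busy_ms(f.read())
    time.sleep(interval_s)
    with open(path) as f:
        second = parse_busy_ms(f.read())
    return utilization(first, second, interval_s)
```

    Calling sample() on the box and looking for a device that sits near 100% while things feel slow would point pretty strongly at IO. On rotational disks that number pegs much earlier than on an SSD.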

  • I kind of want to settle on this read, and the thing that jumped out at me was also his weirdly small head. The trophy skulls everywhere look regular size, though. If this is a head-shrinking bit, that seems incongruous.

    I don't have other ideas though, beyond "fat man on his way to get eaten"... which I guess is maybe enough. Sometimes they're that simple. His head looks so weird though, I want it to mean something.

    Edit: I think Lux has it. He's shrunken his own head to make it an undesirable trophy.

    • Does this happen all the time or intermittently?
    • If you retry does it work later?
    • Can you link a post where it happens?

    I just tested and it looks ok to me, but world is flaky sometimes. It could just be random timeouts from slowness that would self-resolve if you retry.

  • Tailscale is out, unfortunately. Because the server also runs Plex and I need to use it with Chromecast on remote access...

    I rather suspect you already understand this, but for anyone following along... Tailscale can be combined with other networking techniques as well. So one could:

    • Access Plex from a Chromecast on your home network using your physical IP, and on your tailnet using the overlay IP.
    • Or one could have some services exposed publicly and others exposed on the tailnet. So Immich could be on the tailnet while Plex is exposed differently.

    It's not an all or nothing proposition, but of course the more networking components you have the more complicated everything gets. If one can simplify, it's often well worth doing so.
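
    To make the "physical IP vs overlay IP" distinction concrete, here's a small sketch that labels an address by which path it belongs to. It assumes the tailnet hands out IPv4 addresses from the 100.64.0.0/10 CGNAT range (which is what Tailscale uses by default), and it ignores IPv6 entirely. Some ISPs also use that range for carrier-grade NAT, so treat the "tailnet" label as a guess rather than proof.

```python
import ipaddress

# Assumption: tailnet IPv4 addresses come from the shared CGNAT range.
TAILNET_V4 = ipaddress.ip_network("100.64.0.0/10")


def classify(addr):
    """Label an IP address as 'tailnet', 'lan', or 'public'."""
    ip = ipaddress.ip_address(addr)
    if ip.version == 4 and ip in TAILNET_V4:
        return "tailnet"
    if ip.is_private:
        # Covers RFC 1918 ranges (and also loopback/link-local), i.e. not internet-routable.
        return "lan"
    return "public"
```

    So a Chromecast talking to Plex would show up with a "lan" address, while your phone reaching the same server remotely would show up as "tailnet".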

    Good luck, however you approach it.

  • So for something like Jellyfin that you are sharing to multiple people you would suggest a VPS running a reverse proxy instead of using DDNS and port forwarding to expose your home IP?

    I run my Jellyfin on Tailscale and don't expose it directly to the internet. This limits remote access to my own devices, or the devices of those I'm willing to help install and configure tailscale on. I don't really trust Jellyfin on the public internet though. It's both a bit buggy, which doesn't bode well for security posture... and also a misconfiguration that exposes your content could generate a lot of copyright liability even if it's all legitimately licensed since you're not allowed to redistribute it.

    But if you do want it publicly accessible there isn't a huge difference between a VPS proxying and a dynamic DNS setup. I have a VPS and like it, but there's nothing I do with it that couldn't be done with Cloudflare tunnel or dyndns.

    What VPS would you recommend? I would prefer to self host, but if that is too large of a security concern I think there is a real argument for a VPS.

    I use Linode, or what used to be Linode before it was acquired by Akamai. Vultr and DigitalOcean are probably what I'd look to if I got dissatisfied. There's a lot of good options available. I don't see a VPS proxy as a security improvement over Cloudflare tunnel or dyndns though. Tailscale is the security improvement that matters to me, by removing public internet access to a service entirely, while letting me continue to use it from my devices.

  • Do I need to set up NGINX on a VPS (or similar cloud based server) to send the queries to my home box?

    A proxy on a VPS is one way to do this, but not the only way and not necessarily the best one... depending on your goals.

    • You can also use port-forwarding and dyndns to just expose the port off your home-ip. If your ISP is sucky, this may not work though.
    • You can also use Cloudflare's free tunneling product, which is basically a hosted proxy that acts like a super port-forward that bypasses sucky ISP restrictions.
    • If you want to access Immich yourself from your own devices but don't need to make it available to (many) others on devices you don't control, I like and use tailscale the best. The advantage of tailscale is that Immich remains on a private network, not directly scannable from the internet. If there's a preauth exploit published and you don't pay attention to update promptly, scanners WILL exploit your Immich instance with internet-exposed techniques... whereas tailscale allows you to access services that internet scanners cannot connect to, which is a nice safety net.
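
    Whichever of these you pick, it's worth checking what is actually reachable from outside. A minimal sketch, using only the Python standard library: try to open a TCP connection and see whether it succeeds. Run it from a machine that is NOT on your home network or tailnet (a VPS, a phone hotspot) to approximate what an internet scanner would see. The function name is just illustrative.

```python
import socket


def port_is_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable hostnames.
        return False
```

    With port-forwarding or a proxy you'd expect True for the exposed port. With a tailscale-only setup you'd expect False from the public internet and True from a device on the tailnet, which is exactly the safety net described above.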

    Do I need to purchase a domain (randomblahblah.xyz) to use as the main access route from outside my house?

    Not for tailscale, and I don't think for Cloudflare tunnel. Yes for a VPS proxy.

    I've run a VPS for a long while and use multiple techniques for different services.

    • Some services I run directly on the VPS because it's simple and I want them to be truly publicly accessible.
    • Other services I run on a bigger server at home and proxy through the VPS because although I want them to be publicly accessible, they require more resources than my VPS has available. When I get around to installing Immich, there's a decent chance it will go into this category.
    • Still other services, I run wherever and attach them to my tailnet. These I access myself on my own devices (or maybe invite a handful of trusted people into my tailnet), but aren't visible to the public internet. If I decide not to use immich's shared gallery features (and so don't need it publicly accessible) or decide I don't trust it security-wise... it will go here instead of the proxy-by-vps category.
    1. ...create a sidebar with some contents... At least some of these communities have empty sidebars.
    2. Every community needs enough moderators. A single-mod community is not "enough" for a healthy community because things can blow up when you're asleep or away, even in a community that was previously inactive. If a community member reaches out to offer to join a single-mod team... that contact warrants a response from the existing mod. Not necessarily to immediately accept the offer, but at least to discuss the possibility of extra mod coverage.
    3. It's just not at all true that if others aren't posting there's no moderation work that could be done. Mods of inactive communities can jumpstart them by soliciting feedback on proposed rules, advertising them elsewhere, making scheduled discussion posts, and more. Some of these things can be done by a "regular" community member as well, but if community members try to include mods in discussions about how best to promote the community and the mods ignore them... that's a sign that the community is abandoned.
    4. If a mod is notified that their community is about to get reassigned and they don't respond... the community is definitely abandoned.

    All of which is to say, there are lots of ways to detect abandoned communities when post volume is low, and the process I highlighted is the standard way to request a takeover.

  • I use k8s at work and have built a k8s cluster in my homelab... but I did not like it. I tore it down, and am currently using podman, and don't think I would go back to k8s (though I would definitely use docker as an alternative to podman and would probably even recommend it over podman for beginners even though I've settled on podman for myself).

    1. K8s itself is quite resource-consuming, especially on ram. My homelab is built on old/junk hardware from retired workstations. I don't want the kubelet itself sucking up half my ram. Things like k3s help with this considerably, but that's not quite precisely k8s either. If I'm going to start trimming off the parts of k8s I don't need, I end up going all the way to single-node podman/docker... not the halfway point that is k3s.
    2. If you don't use hostNetworking, the k8s model where traffic routes only within the cluster except for egress is all pure overhead. It's totally necessary when you have a thousand engineers slinging services around your cluster, but there's no benefit to this level of rigor in service management in a homelab. Here again, the networking in podman/docker is more straightforward and maps better to the stuff I want to do in my homelab.
    3. Podman accepts a subset of k8s resource-yaml as a docker-compose-like config interface. This lets me use my familiarity with k8s configs in my podman setup.

    Overall, the simplicity and lightweight resource consumption of podman/docker are what I value at home. The extra layers of abstraction and constraints k8s employs are valuable at work, where we have a lot of machines and a lot of people that must coordinate effectively... but I don't have those problems at home and the overhead (compute overhead, conceptual overhead, and config overhead) of k8s' solutions to them is annoying there.
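
    For anyone curious what point 3 looks like in practice, here's a rough sketch that renders a bare-bones k8s Pod manifest of the kind podman can consume via `podman kube play`. The pod name, image, and ports are placeholders I made up for illustration, and which fields podman honors depends on your podman version, so check its docs before relying on anything beyond the basics.

```python
# Minimal Pod manifest template. Only uses core v1 Pod fields.
POD_TEMPLATE = """\
apiVersion: v1
kind: Pod
metadata:
  name: {name}
spec:
  containers:
    - name: {name}
      image: {image}
      ports:
        - containerPort: {container_port}
          hostPort: {host_port}
"""


def render_pod(name, image, container_port, host_port):
    """Fill in the template and return the manifest as a YAML string."""
    return POD_TEMPLATE.format(
        name=name,
        image=image,
        container_port=container_port,
        host_port=host_port,
    )


# Hypothetical example values, not a recommendation.
manifest = render_pod("web", "docker.io/library/nginx:latest", 80, 8080)
```

    Write that string to a file such as pod.yaml and run `podman kube play pod.yaml`. The nice part is that the same file shape is what you'd already be writing at work, minus all the cluster-level machinery.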

  • The more normal transfer path is to offer to take over a specific community or communities by:

    1. Reaching out to the existing mod and asking to be added to the mod team.
    2. Documenting their lack of response after a few days or a week.
    3. Documenting the failure to abide by Lemmy world moderation guidelines: https://lemmy.world/post/424735 by linking spam or off-topic posts and to communities that lack rules/useful-sidebar-content, etc.
    4. Posting this info in !moderators@lemmy.world and offering to take over moderation.

    This is better than mass deletion because it keeps whatever small list of existing subscribers and post content intact across the transition. For moderation, Lemmy world admins will get notified of reports and can address anything that violates instance rules.

  • I wanted to plug one of them over USB, but it seems that docker just doesn't like to have volumes on external drives. AFAIK docker starts before the drive is fully mounted, preventing it from doing so. I couldn't find any reliable way to work around this (but I'm open to suggestions!).

    You haven't said what operating-system you're using, how your mount was configured, or how you're starting docker or your containers. An external drive is the normal way to do this, though, and I do it on Linux with ZFS drives and docker-compose auto-starting the containers and it works fine.
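
    If it really is a startup race, one crude workaround is to wait for the mount before bringing the containers up. A minimal sketch, assuming Linux and that you start your containers from a script you control. The function name and defaults are mine. On a systemd box, ordering the docker unit after the mount (for example with RequiresMountsFor=) is probably the cleaner fix, but I can't say whether that applies without knowing your setup.

```python
import os
import time


def wait_for_mount(path, timeout_s=60.0, interval_s=1.0, is_mounted=os.path.ismount):
    """Poll until path is a mount point; return False if timeout_s elapses first."""
    deadline = time.monotonic() + timeout_s
    while True:
        if is_mounted(path):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)
```

    You'd call wait_for_mount on the external drive's mount point and only run your docker-compose up step if it returns True. That also stops docker from silently creating the volume directory on the root disk when the drive is missing.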