So, I’ve started my own Lemmy instance. The main issue is that right now, I am the only user, which makes it pretty easy for anyone to see what kinds of communities I visited, or am subscribed to. Is there any way to automate creating a number of accounts and subscribing them to random communities?
You could disable web interface access to block easy scraping. Unauthenticated users only need a few ActivityPub routes with very specific Content-Types to make federation work.
You can put the web UI and Lemmy API behind some kind of auth screen (you can use Caddy or Apache+OpenID to block access to URLs in your proxy unless the user is authenticated, for example) but that would break most apps. You could also whitelist your personal IP range or require a VPN for the frontend.
Your comment history will be visible to other servers so you’ll probably spread information that way. I can think of workarounds but they require patching the Lemmy source code. You could probably patch the Lemmy code to pick a random username for each comment to block other servers from tracking your comment history as easily (though server admins can still get all the comments for your domain, of course). Alternatively, you could implement a 4chan-style “everyone is anonymous” system where all accounts turn into @anonymous@yourserver.tld after posting by faking the data that gets rendered to the frontend. If you allow multiple people on your server, you’d all appear (and get banned/moderated) as one single user, but probably without breaking functionality (because the local database can still keep track of who actually owns what posts).
I think hiding the web UI and Lemmy API would probably block most scrapers. You can also mess with scrapers (feed the web UI fake data when an unauthenticated user queries it) if you really want. Your post history cached on other servers will be your biggest privacy challenge.
AFAIK post history is always public, like Reddit. I’m mainly concerned about the subscription list.
In that case simply blocking access to the Lemmy API (not the federation API) from outside IP addresses should be sufficient.
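To make the "Lemmy API vs. federation API" split a bit more concrete, here is a rough sketch of the decision a reverse proxy would have to make for each request. It is only an illustration, not a drop-in config: the VPN range, the discovery path prefixes and the function names are my own assumptions, and the real routes depend on your Lemmy version and proxy setup. The two ActivityPub media types are the standard ones.

```python
import ipaddress

# Media types used by ActivityPub traffic between servers.
ACTIVITYPUB_TYPES = ("application/activity+json", "application/ld+json")
# Assumed discovery paths that remote servers fetch without special headers.
DISCOVERY_PREFIXES = ("/.well-known/", "/nodeinfo/")
# Hypothetical VPN / home range that may use the web UI and the Lemmy API.
TRUSTED_NETWORKS = [ipaddress.ip_network("10.8.0.0/24")]


def is_trusted(client_ip):
    """True if the client address is inside one of the trusted ranges."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in net for net in TRUSTED_NETWORKS)


def is_federation_request(method, path, headers):
    """True for requests that look like server-to-server traffic.

    `headers` is a plain dict with canonical header names as keys.
    """
    if path.startswith("/api/"):
        return False  # the client API is never treated as federation
    if path.startswith(DISCOVERY_PREFIXES):
        return True
    # Deliveries (POST) declare the type of their body, fetches (GET) ask for it.
    name = "Content-Type" if method == "POST" else "Accept"
    value = headers.get(name, "").lower()
    return any(media_type in value for media_type in ACTIVITYPUB_TYPES)


def allow(client_ip, method, path, headers):
    """Let trusted clients do anything and everyone else only federate."""
    return is_trusted(client_ip) or is_federation_request(method, path, headers)
```

With this, a scraper hitting /api/v3/... or the HTML frontend from an outside address is refused, while a remote server fetching an actor or delivering to an inbox still gets through.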
Running one of the various subscriber bots would make your own subscriptions simply be part of the noise on the server. The downside, of course, is that now you have a pile of noise to sift through. I left one running for about a week online and ended up with around 2000 communities subscribed.
How does that make it easy for others to know the communities you visit / subscribed to?
Only communities a user subscribes to get federated over.
clicks username
views comments you’ve posted
walla my egyptian friend
… okay? But if I subscribe to every lemmynsfw community, but never post to them… you’d have no idea.
With your own instance, looking at the instance list will show them all to anyone.
If you are the only user on an instance, your subscriptions are the only ones federating over into the server’s All feed. For example, even if you haven’t posted in all of these communities, is this not essentially your personal list of subscriptions?
That’s precisely the issue I’m talking about
This isn’t entirely true. I sometimes see communities I’m not subscribed to pop up on my server. I think it has to do with making the server look up a community by clicking on a link to it? I think one of the frontends I’ve tried prefetches links or something, because I’ve had to get rid of a few NSFW communities in my server that appeared without me subscribing to them.
Most of the communities in the public list are indeed the ones a server’s sole user is subscribed to for personal instances.
Oh that’s interesting. I guess when your instance creates a local copy of the post, it would also add the corresponding community to the list to match.
…walla?
Voila
ooooh of course. I should’ve just tried saying it out loud haha
Maybe by monitoring federation data, or seeing which communities have been fetched?
I know that if you’re the first person in an instance to look at a community, it won’t load right away. However I’m not sure how someone would monitor that (or why they would want to)
If there’s only one user that instance’s “all” feed will be indistinguishable from the user’s subscription feed.
(unless you do some community seeding)
Why do you need to automate it and do multiple decoy accounts? Can’t you just make a single account and use it to subscribe to a bunch of the biggest communities?
If your concern is about your instance’s publicly visible /instances list, can’t you just make it private? Or even make the entire web interface private? You’re the boss, after all.
I’m afraid making your instance private disables federation.
As for making the web interface private, while it would prevent the average Joe from seeing federated communities, you could still do it through the API, which you have to keep public if you want to use alternative and mobile clients without a VPN.
Why not just manipulate the API before it is delivered then?
You can probably make the list private by blocking the specific API endpoint or rewriting the JSON output in your reverse proxy. You could patch the Lemmy source code as well, of course.
External servers will still be able to see what remote sites you visit, but there’s no reason you couldn’t at least block the scrapers by messing with the JSON API. ActivityPub doesn’t mandate anything about these lists so federation should still work even if you disable 90% of the Lemmy API.
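A minimal sketch of the "rewrite the JSON output" idea, assuming the instance list comes back as an object under a "federated_instances" key with one list per category (linked, blocked and so on). That shape is an assumption on my part, so check a real response from your Lemmy version first. The function name is made up, and wiring it into your proxy is left out.

```python
import json


def scrub_instance_list(body):
    """Return the response body with every instance list emptied.

    Assumes a body shaped like {"federated_instances": {"linked": [...], ...}};
    anything that does not match that shape is passed through unchanged.
    """
    data = json.loads(body)
    lists = data.get("federated_instances")
    if isinstance(lists, dict):
        for key in list(lists):
            if isinstance(lists[key], list):
                lists[key] = []  # keep the key so clients still parse the reply
    return json.dumps(data).encode("utf-8")
```

Keeping the keys and only emptying the lists means apps that expect the field should still work; they just see an instance that appears to federate with nobody.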
People were talking about a script that would go out and subscribe to a bunch of communities. As long as that’s good enough, you should be able to operate under that umbrella?
I did this on my instance. You create a new user and give the script those credentials, it goes out and subs to all the trending communities across the various instances so now my instance has a big mish-mash of communities federated, not just the ones I originally subscribed to on my personal user.
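For anyone curious what a script like that boils down to, here is a rough sketch. It is not the script I used. It assumes you already have a login token (JWT) for the decoy user, that the candidate communities are already known to your instance and have local ids, and that your Lemmy version accepts the token as a Bearer header on /api/v3/community/follow (older versions expect an "auth" field in the request body instead). All function names are made up.

```python
import json
import random
import urllib.request


def pick_decoys(candidate_ids, followed_ids, limit, rng=random):
    """Pick up to `limit` random communities the decoy user does not follow yet."""
    pool = sorted(set(candidate_ids) - set(followed_ids))
    return rng.sample(pool, min(limit, len(pool)))


def build_follow_request(base_url, community_id, jwt):
    """Build (but do not send) the follow request for one community."""
    payload = json.dumps({"community_id": community_id, "follow": True})
    return urllib.request.Request(
        f"{base_url}/api/v3/community/follow",
        data=payload.encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {jwt}",
        },
        method="POST",
    )


def seed(base_url, jwt, candidate_ids, followed_ids, limit=50):
    """Subscribe the decoy user to a random batch of communities."""
    for community_id in pick_decoys(candidate_ids, followed_ids, limit):
        request = build_follow_request(base_url, community_id, jwt)
        with urllib.request.urlopen(request) as response:
            response.read()  # result ignored; HTTP errors raise an exception
```

Running seed() from cron with a small limit would spread the subscriptions out over time instead of following everything at once.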
That’s exactly what I need. Could you share the script you use?
I finally found the script, sorry for the massive delay.
https://github.com/Fmstrat/lcs
It’s named “Lemmy Community Seeder” which is probably why I couldn’t find it anywhere, I wasn’t searching for those terms.
I can’t seem to find it in a Google search now so I’ll take a look at my server in case I ran the script there and saved a copy. There appear to be a lot of similar tools now to assist with people moving over from Reddit but this script was really quick and handy.