UPDATE (4/27): There were two issues affecting federation between Ani.Social and Lemmy.World.
The first was an outgoing activity that got stuck in the queue. This was resolved by manually marking that activity as successful.
The second was a subsea fiber cut between Europe and Asia, which caused incoming activities from Lemmy.World to lag behind.
The original plan was to move Ani.Social closer to Lemmy.World this weekend, but with the activity queue dropping rapidly over the past two days, it’s unlikely this will push through. We will continue to monitor the situation, however.
Again, thanks to @wjs018 for informing me about this and for keeping others up to date in this post’s comments. Thank you MrKaplan from Lemmy.World for helping us resolve the issue!
There are ongoing issues with incoming and outgoing activities between Ani.Social and Lemmy.World. Posts, comments, and votes on the two instances are out of sync with each other.
The exact cause is to be determined. The instance may experience downtime and interruptions until further notice.
In the meantime, we suggest using an account on an instance that is federated with both Ani.Social and Lemmy.World.
Thanks to @wjs018 for informing me about this issue. Thanks also to MrKaplan from the Lemmy.World team for continuously helping us resolve this issue.
The outbound federation issue to lemmy.world has been resolved at this point (thanks hitagi!), but inbound federation from lemmy.world continues to be a problem. I don’t really have any insight into solutions for that, as it seems it might be due to physical constraints similar to what has been plaguing the AU/NZ servers.
Just a heads up to moderators of ani.social communities though. Because the world version of any community here is going to be several hours delayed (currently ~8 hours), it might make sense to have an account on world that you appoint as a mod of your community. This lets you take moderator actions on content in the world version of your community before it federates over, instead of having to wait hours for the spam (or whatever else) to reach ani.social before you can remove it.
Tagging @MentalEdge@sopuli.xyz to let you know since you probably have some of the more active communities on the instance.
Ah… I get it.
I couldn’t understand why I kept duplicating Kadath’s posts. I was positive I was checking carefully enough to be sure chapters hadn’t been posted yet, but then I’d come back later and find that I’d doubled one of her posts.
Because her account is on .world.
I expect there’ll turn out to be some duplicates in the latest batch too.
So I guess when I see that something I really expect to have already been posted hasn’t been (like just a bit ago with Shiretto Sugee Koto Iteru Gal), I need to just let it go, since the odds are that it has already been posted and it’s just that .world hasn’t let us know yet…
Yeah, it’s a weird situation. I am not removing duplicates of posts that haven’t federated over yet; until posts federate to ani.social, no third-party instances see them. The situation is growing worse too. The current delay is up to ~12 hours.
Edit: Currently improving, back down to ~8.5 hours.
Wouldn’t be the first time lemmy.world has had federation issues. I’m honestly surprised how quickly Lemmy ran into scaling problems of this nature.
The federation delays seem to be getting better over the past day or so. So, whatever you are doing seems to be working. It seems like the PR to engineer out a lot of the latency sensitivity of Lemmy federation is still a WIP, so this might be something that comes and goes as the network topology and routing changes day to day. Almost like federation weather.
Alright, I have been doing some poking around the grafana dashboard and noticed that about 20k activities/hour (~6 per second) seems to be the limit that ani.social can process coming in from lemmy.world. Whenever the activity peak on world goes over that (generally EU afternoon/NA morning), we start to lag a bit. Then, after the peak has subsided, we catch up.
All this really seems like it puts a pretty hard limit on how big the fediverse could actually grow without federation becoming completely impossible. I was reading up on efforts that reddthat has undertaken to improve federation from world (since they are in AUS). Their EU-based proxy seems to have worked well, but even with batching like this, federation is always going to involve a lot of bandwidth and message passing between servers, which just might not scale past a certain point. Anyway, I am off topic.
In any case, the lag seems like it will be coming and going with a bit of regularity, kind of like fediverse tides.
The latency limit is caused by the activity queue that was introduced in v0.19.
Servers can only talk as fast as the round-trip time allows, because Lemmy instances now keep track that each event actually does get federated, and in the right order.
That last point means each event only gets sent once acknowledgement of the previous one is received, creating a hard limit on how many events can be communicated, depending on ping: a mere two per second at a round-trip latency of 500 ms.
This serial process will obviously need to be parallelized. But that’s difficult.
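To put rough numbers on that, here is a back-of-the-envelope sketch (not Lemmy’s actual code; the round-trip time and channel counts are assumptions for illustration):

```python
import math

# A strictly serial send-and-acknowledge loop can push at most one event
# per round trip, so a single channel tops out at 1 / RTT.
def max_events_per_second(rtt_seconds: float, parallel_channels: int = 1) -> float:
    return parallel_channels / rtt_seconds

# Serial sending with an assumed ~500 ms round trip (e.g. Europe <-> Asia):
print(max_events_per_second(0.5))        # 2.0 events/s, i.e. ~7,200/hour

# Peak load from lemmy.world mentioned above was roughly 20k activities/hour:
peak_per_second = 20_000 / 3600          # ~5.6 events/s
print(round(peak_per_second, 1))

# Hypothetical: how many parallel ordered streams would be needed to keep up
# at that latency, if the serial loop were split up?
print(math.ceil(peak_per_second * 0.5))  # 3 channels at 500 ms RTT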
I was wondering what was happening, so I ended up just creating an account here. Especially since I probably spend most of my time either here or on programming.dev.
Thank you for signing up! I’m still continuously trying to work around this issue.
Oh no worries. I’ve really enjoyed the time spent here, so you must be doing something right! :D
Random thought: what would happen if there were an ani.social cache server in the same datacenter as Lemmy.world?
That’s what Reddthat is doing, from my understanding, but their project doesn’t have enough documentation for me to figure out how to use it. I could move ani.social to Europe or the US, but I’d prefer to have the server closer to me (and to avoid centralizing all Lemmy instances in two regions).
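For anyone curious what such a relay could look like, here is a minimal sketch of the idea (this is not Reddthat’s actual proxy; the inbox URL, port, and behaviour are assumptions, and it omits HTTP signature handling, retries, and ordering). The relay would sit in the same region as lemmy.world, acknowledge inbox POSTs immediately so the sender’s serial loop is no longer limited by the long round trip, and forward the queued activities across the slow link in the background:

```python
import queue
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

UPSTREAM_INBOX = "https://ani.social/inbox"  # assumed destination inbox
pending: "queue.Queue[bytes]" = queue.Queue()

class RelayHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Accept the activity locally and acknowledge right away,
        # so the sender is not waiting on the Europe<->Asia round trip.
        length = int(self.headers.get("Content-Length", 0))
        pending.put(self.rfile.read(length))
        self.send_response(202)
        self.end_headers()

def forward_worker():
    # The slow hop to ani.social happens off the sender's critical path.
    while True:
        body = pending.get()
        request = urllib.request.Request(
            UPSTREAM_INBOX,
            data=body,
            headers={"Content-Type": "application/activity+json"},
        )
        urllib.request.urlopen(request)

threading.Thread(target=forward_worker, daemon=True).start()
HTTPServer(("", 8080), RelayHandler).serve_forever()
```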
I think the Lemmy devs are working on it but it looks like a fix isn’t coming until 0.19.5.
In that case, I might just move ani.social already (likely tomorrow), at least temporarily until a fix comes.
So, what happens when the queue drops?
When the queue drops, it means Ani.Social is catching up with activities sent from Lemmy.World. Ideally we want it to be at 0.
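Roughly speaking, the backlog only drains when ani.social processes activities faster than lemmy.world sends them. A toy model with made-up hourly send rates (only the ~20k/hour processing ceiling comes from the dashboard observation earlier in the thread):

```python
# Toy model: the queue only drains when the processing rate exceeds the send rate.
PROCESS_RATE = 20_000  # activities/hour, the ceiling observed on the dashboard

# Hypothetical hourly send rates from lemmy.world over a day (peak around EU afternoon)
send_rates = [12_000] * 8 + [26_000] * 6 + [15_000] * 10

backlog = 0
for hour, sent in enumerate(send_rates):
    backlog = max(0, backlog + sent - PROCESS_RATE)
    print(f"hour {hour:2d}: backlog {backlog:6d} activities")

# The backlog builds up during the 26k/hour peak and drains back to 0 afterwards,
# which is the "catching up" pattern described above.
```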
I know this is an old thread and the problem was resolved after the migration, but I just wanted to update that the PR to address the latency sensitivity of federation was merged. I saw nutomic on matrix say that they are actively using it in production on lemmy.ml in order to test it. So, hopefully it will be part of the next Lemmy version.
That’s great! Good to see there’s progress on this issue. Not sure how long it will take for Lemmy.World to adopt it, but I’m sure they’ll catch up. They’re always a version behind because they’re a big instance.