Current Fediverse Implementation

From my understanding, the prominent fediverse implementations do fan-out on write: the origin instance pushes each new post to other instances.

In other words, if user A on instance A makes post A, instance A will write or sync post A to all instances that host followers of user A. User B on instance B then reads post A from instance B, not from instance A.
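As a rough illustration of that fan-out on write, here is a minimal in-memory sketch. The `Instance` class and its method names are hypothetical and only model the behaviour described above; they are not taken from any real fediverse server.

```python
from collections import defaultdict


class Instance:
    """Hypothetical in-memory stand-in for a fediverse server."""

    def __init__(self, name):
        self.name = name
        self.store = []  # posts held locally (stands in for the database)
        # author -> set of remote instances that host followers of that author
        self.remote_followers = defaultdict(set)

    def follow(self, author, follower_instance):
        # Record that some user on follower_instance follows `author`.
        self.remote_followers[author].add(follower_instance)

    def publish(self, author, text):
        post = {"author": author, "text": text}
        self.store.append(post)
        # Fan-out on write: one write per instance that has followers.
        for instance in self.remote_followers[author]:
            instance.store.append(post)
        return post


instance_a = Instance("A")
instance_b = Instance("B")
instance_a.follow("userA", instance_b)  # user B on instance B follows user A
instance_a.publish("userA", "post A")
```

After `publish`, instance B holds its own copy of post A, so user B's reads never touch instance A.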

Why this is Done

From my understanding, this prevents the case where post A goes viral, everyone wants to read it, and instance A’s database gets overwhelmed with reads. It also serves to replicate content.

My Question: Why not rely on static files instead of database reads / writes to propagate content?

Instead of the above, each user’s posts would be published as a static file. Anyone who follows user A fetches that file to get user A’s posts.

Reading this file would be a lot less resource-intensive than a database read, and serving it through a CDN would be even better.
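A minimal sketch of the static-file idea, assuming one JSON file per user with the newest post first. The file layout, field names, and the `publish_post` / `read_feed` helpers are assumptions made for illustration, not an existing format.

```python
import json
import os
import tempfile
from pathlib import Path


def publish_post(feed_path: Path, text: str, created_at: str) -> None:
    """Add a post to the user's static feed file, newest first."""
    posts = json.loads(feed_path.read_text()) if feed_path.exists() else []
    posts.insert(0, {"text": text, "created_at": created_at})
    # Write to a temporary file and rename it into place, so a reader
    # never sees a half-written feed.
    fd, tmp_name = tempfile.mkstemp(dir=feed_path.parent)
    with os.fdopen(fd, "w") as tmp_file:
        json.dump(posts, tmp_file)
    os.replace(tmp_name, feed_path)


def read_feed(feed_path: Path) -> list:
    """What a follower (or a CDN in front of the file) would read."""
    return json.loads(feed_path.read_text())


demo_dir = Path(tempfile.mkdtemp())
feed = demo_dir / "userA.json"
publish_post(feed, "post A", "2024-01-01T00:00:00Z")
publish_post(feed, "second post", "2024-01-02T00:00:00Z")
```

Reading is then a plain file read that any static web server or CDN could serve, with no database query on the read path.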

This introduces one issue: when post A is made, the static file must be updated. Read-after-write may be slower than with a database, and with a CDN you have to deal with invalidations and TTLs, which makes posts a lot less “real time”.
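The staleness trade-off can be shown with a toy cache. This is only a sketch: `TTLCache` is a hypothetical stand-in for a CDN edge, the 60-second TTL is an arbitrary assumption, and time is passed in explicitly to keep the example deterministic.

```python
class TTLCache:
    """Hypothetical stand-in for a CDN edge cache with a fixed TTL."""

    def __init__(self, origin_fetch, ttl_seconds):
        self.origin_fetch = origin_fetch
        self.ttl = ttl_seconds
        self.entries = {}  # url -> (fetched_at, body)

    def get(self, url, now):
        cached = self.entries.get(url)
        if cached and now - cached[0] < self.ttl:
            return cached[1]  # still within TTL: may be stale
        body = self.origin_fetch(url)  # expired or missing: go to origin
        self.entries[url] = (now, body)
        return body

    def invalidate(self, url):
        # Explicit purge, as an origin might request after a new post.
        self.entries.pop(url, None)


origin = {"/userA.json": ["post A"]}
cdn = TTLCache(lambda url: list(origin[url]), ttl_seconds=60)

first = cdn.get("/userA.json", now=0)      # fetched from origin
origin["/userA.json"].insert(0, "post B")  # user A publishes again
stale = cdn.get("/userA.json", now=30)     # TTL not expired: old copy
fresh = cdn.get("/userA.json", now=61)     # TTL expired: new copy
```

Calling `cdn.invalidate("/userA.json")` right after publishing would shorten the delay, at the cost of one purge request per post.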

But the benefit is that hosting a fediverse server becomes more accessible and cheaper, and it could scale better. It could also potentially resolve the federation woes of posts not reaching other instances.

What are your thoughts on this? Please be kind and constructive. I do not claim to be an expert, and this is simply an idea that came to mind.

    • matcha_addict@lemy.lol (OP) · 17 hours ago

      Oh my bad, I can explain that.

      Before I do, one benefit of this method is that your timeline is entirely up to your client. Your instance becomes primarily tasked with making your posts available, and clients have the freedom of implementing the reading and news feed / timeline formation.

      Hence, there are a few ways to do this. The best one is probably a mix of those.

      Naive approach: fetch posts and build news feed when user requests it

      This is not a good approach, but I mention it first because it’ll make explaining the next one easier.

      • User opens app or website, thereby requesting their timeline / news feed
      • server fetches list of user’s subscriptions and followees
      • for each followee or subscription, server fetches their content via their static file wherever they are hosted
      • server performs whatever filtering and ordering of content they want
      • user sees the result
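The steps above can be sketched roughly as follows. The feed URLs, the `id` / `created_at` fields, and the `fetch_feed` callable are assumptions for illustration; a real server would fetch over HTTP instead of reading a dictionary.

```python
def build_timeline(user, subscriptions, fetch_feed, limit=20):
    """Naive approach: fetch every followee's feed at request time."""
    posts = []
    for feed_url in subscriptions[user]:
        posts.extend(fetch_feed(feed_url))  # one request per followee
    # Ordering is up to the server or client; newest first here.
    posts.sort(key=lambda post: post["created_at"], reverse=True)
    return posts[:limit]


# In-memory stand-in for the remote static files (hypothetical URLs).
feeds = {
    "https://a.example/userA.json": [
        {"id": "a1", "text": "post A", "created_at": "2024-01-01T10:00:00Z"},
    ],
    "https://c.example/userC.json": [
        {"id": "c1", "text": "newer", "created_at": "2024-01-01T12:00:00Z"},
        {"id": "c0", "text": "older", "created_at": "2024-01-01T08:00:00Z"},
    ],
}
subscriptions = {"userB": list(feeds)}

timeline = build_timeline("userB", subscriptions, feeds.__getitem__)
```

The loop makes the cost visible: the user waits for one fetch per subscription before seeing anything.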

      Cons: loading time for the user may be long. Depending on how many subscriptions they have, it could be several seconds, and P90 latency may even reach double-digit seconds.

      Better approach: pre-build user’s timeline periodically.

      Think of a periodic job (hourly, every 10 minutes, etc.) that fetches posts in the same manner as described above, but in advance rather than when the user requests it.

      Pros:

      • fast loading time compared to previous solution
      • when the job runs, if users on the same instance share a followee or subscription, we don’t have to query it twice (This benefit already exists on current fediverse implementations)

      Cons: posts aren’t real-time, delayed by the batch job frequency.
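A minimal sketch of one run of such a batch job, including the shared-followee saving. The function name, feed URLs, and post fields are hypothetical, and scheduling (cron or similar) is left out.

```python
def prebuild_timelines(subscriptions, fetch_feed):
    """One batch run: fetch each distinct feed once, then build every timeline."""
    distinct_feeds = {url for urls in subscriptions.values() for url in urls}
    fetched = {url: fetch_feed(url) for url in distinct_feeds}
    timelines = {}
    for user, urls in subscriptions.items():
        posts = [post for url in urls for post in fetched[url]]
        timelines[user] = sorted(
            posts, key=lambda post: post["created_at"], reverse=True
        )
    return timelines


feeds = {
    "https://a.example/userA.json": [
        {"id": "a1", "created_at": "2024-01-01T10:00:00Z"},
    ],
    "https://c.example/userC.json": [
        {"id": "c1", "created_at": "2024-01-01T12:00:00Z"},
    ],
}
subscriptions = {
    "userB": ["https://a.example/userA.json", "https://c.example/userC.json"],
    "userD": ["https://a.example/userA.json"],  # shares a followee with userB
}

fetch_log = []


def logging_fetch(url):
    fetch_log.append(url)  # record each fetch to show the deduplication
    return feeds[url]


timelines = prebuild_timelines(subscriptions, logging_fetch)
```

Two users with three subscriptions between them cost only two fetches here, because user A's feed is fetched once and reused.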

      Best approach: hybrid

      In this approach, we primarily use the second method to achieve fast loading time. But to get more up-to-date content, we also fetch the latest posts in the background at the same time, and interleave or add them as the user scrolls.

      This way we get both fast initial load times and recent posts.
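One possible shape for the hybrid, as a sketch: serve the pre-built timeline immediately, fetch the latest posts on a background thread, and merge them in when they arrive. The `merge_fresh` and `fetch_latest` names and the post fields are assumptions for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def merge_fresh(cached, fresh):
    """Add posts not already in the cached timeline, newest first."""
    seen = {post["id"] for post in cached}
    new_posts = [post for post in fresh if post["id"] not in seen]
    return sorted(
        new_posts + cached, key=lambda post: post["created_at"], reverse=True
    )


# Timeline produced earlier by the periodic job.
cached_timeline = [
    {"id": "a1", "created_at": "2024-01-01T10:00:00Z"},
    {"id": "c0", "created_at": "2024-01-01T08:00:00Z"},
]


def fetch_latest():
    # Stand-in for re-fetching followees' static files right now.
    return [
        {"id": "c1", "created_at": "2024-01-01T12:00:00Z"},
        {"id": "a1", "created_at": "2024-01-01T10:00:00Z"},  # already cached
    ]


with ThreadPoolExecutor(max_workers=1) as pool:
    pending = pool.submit(fetch_latest)  # background fetch starts
    initial_view = cached_timeline       # shown to the user immediately
    merged_view = merge_fresh(cached_timeline, pending.result())
```

Deduplicating by `id` matters because the background fetch will usually return posts the batch job already collected.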

      Surely there are other good approaches. As I said in the beginning, clients have the freedom to implement this however they like.