join-lemmy.org regularly crawls all active Lemmy instances to keep the instance list updated. It also collects data from all Lemmy communities. This data is now publicly available in the following Git repository:

https://github.com/LemmyNet/lemmy-statistics

Check the readme for details about the available data. Interestingly, the numbers differ noticeably from those reported by other websites:

                      join-lemmy.org  fediverse.observer  fedidb.com
Monthly Active Users  42,170          36,336              50,063
Instances             512             376                 446

Here are some ideas for what to do with the data:

  • Recreate the Lemmymap, graphically showing the connections or defederations between instances.
  • Render graphs, which could be added directly to join-lemmy.org (#532).
  • Investigate what is causing the different numbers shown above.
  • Run various types of analysis, like this one done by @malsadev.
  • Build a tool to help users discover interesting and relevant communities.
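
As a possible starting point for investigating the differing numbers, the sketch below loads a gzip-compressed JSON file and compares its instance domains against a second list. The layout of the data files is not described in this post, so the flat list-of-objects shape and the `domain` key are assumptions; check the readme for the real schema. `load_domains` and `compare` are hypothetical helper names.

```python
import gzip
import json


def load_domains(path, key="domain"):
    """Read a gzip-compressed JSON array of objects and return the set of `key` values.

    Assumption: the file holds a JSON array of objects that each carry a
    `domain` field. Adjust `key` (or the traversal) to match the real schema.
    """
    with gzip.open(path, "rt", encoding="utf-8") as f:
        records = json.load(f)
    # Lowercase so that "Lemmy.cafe" and "lemmy.cafe" count as the same instance.
    return {r[key].lower() for r in records if key in r}


def compare(ours, theirs):
    """Return the domains found only in `ours`, only in `theirs`, and in both."""
    return {
        "only_ours": sorted(ours - theirs),
        "only_theirs": sorted(theirs - ours),
        "shared": sorted(ours & theirs),
    }
```

Calling `compare(load_domains("instances/full.json.gz"), other_domains)` with a set of domains gathered from another site would then show which instances each side is missing, provided the assumed schema holds.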

Nutomic@lemmy.ml (OP) · 1 day ago

    Good job! I wonder why some of these are missing from fediverse.observer. There is an "add instance" page, and entering aussie.zone, for example, says it already exists, but the search doesn't find it. lemmy.cafe and fosscad.io are included in the statistics (file instances/full.json.gz in the Git repo), but are not currently shown on the website.

    Thanks to your comment, I realized that the official site only shows instances that require a registration application, as there was concern that others would be overrun by spam bots. I have now taken a look, and instances with a captcha are actually fine. Here you can see all that are newly listed on join-lemmy.org.

    The instance list is sorted by monthly active users with slight randomization. As the instances you mention are among the larger ones, it is expected that they show up near the top. Is there a better sort method that you would suggest?
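
To make "sorted by monthly active users with slight randomization" concrete, here is one way such a sort could look. This is an illustrative sketch, not the actual join-lemmy.org implementation: the multiplicative jitter, the `jitter` parameter, the function name `sort_with_jitter`, and the flat `users_active_month` key are all assumptions.

```python
import random


def sort_with_jitter(instances, jitter=0.1, rng=None):
    """Sort instances by monthly active users, descending, with a small perturbation.

    Each user count is multiplied by a factor drawn uniformly from
    [1 - jitter, 1 + jitter], so instances of similar size can swap places
    while much larger instances still rank higher.
    """
    if rng is None:
        rng = random.Random()
    scored = [
        (inst["users_active_month"] * rng.uniform(1 - jitter, 1 + jitter), inst)
        for inst in instances
    ]
    # Sort on the jittered score only; the instance dicts are never compared.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [inst for _, inst in scored]
```

With `jitter=0` this reduces to a plain descending sort. With `jitter=0.1`, two instances can trade places only when their user counts are within roughly 20% of each other, which keeps large instances near the top while rotating similarly sized ones.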