tofu@lemmy.nocturnal.garden to Selfhosted@lemmy.world · English · 11 days ago

Guarding My Git Forge Against AI Scrapers

vulpinecitrus.info

A summary of the techniques in place to protect my git forge

Cross posted from: https://lemmy.nocturnal.garden/post/407947

  • Lemmchen@feddit.org · English · 2 upvotes · 3 hours ago

    https://her.esy.fun/posts/0031-how-i-protect-my-forgejo-instance-from-ai-web-crawlers/index.html

    Also: https://iocaine.madhouse-project.org/

  • IanTwenty@piefed.social · English · 21 upvotes · 11 days ago

    Self-hosting anything that is deemed “content” openly on the web in 2025 is a battle of attrition between you and forces who are able to buy tens of thousands of proxies to ruin your service for data they can resell.

    This is depressing. Profoundly depressing.

    Sigh

  • xyro@lemmy.ca · English · 11 upvotes · 11 days ago

    Good read. I use crowdsec to block most of the traffic it considers malicious (which tends to overlap with scrapers), but I should look into Locaine to feed garbage to the remaining ones instead of throttling them.

    • JustTesting@lemmy.hogru.ch · English · 6 upvotes · edited 10 days ago

      it’s iocaine not Locaine, tripped me up at first as well.

  • magic_smoke@lemmy.blahaj.zone · English · 11 upvotes · 11 days ago

    I would like to know how well iocaine's spanking-new redirection module works compared to Anubis.

    If nothing else, to see if throwing Anubis in front of iocaine is still a worthwhile idea.

  • Selfhoster1728@infosec.pub · English · 3 upvotes · 10 days ago

    Build your own captcha; with prebuilt solutions there's just no way to be sure it's human traffic :(

  • Possibly linux@lemmy.zip · English · 3 upvotes, 3 downvotes · 11 days ago

    Honestly, we need a POW system built into HTTP.

    • DaGeek247@fedia.io · 5 upvotes · 10 days ago

      POW built into the web spec would be hell. Making every single device in the world do that extra bit of work would noticeably affect energy use across the planet.

    • tofu@lemmy.nocturnal.garden (OP) · English · 3 upvotes · 11 days ago

      A what? Not prisoners of war I guess

      • billhead@sh.itjust.works · English · 4 upvotes · 11 days ago

        I’m guessing Proof of Work.

        • tofu@lemmy.nocturnal.garden (OP) · English · 2 upvotes · 11 days ago

          Oh right. The article author isn’t a fan of that. I guess it’s fine while it works, but I’m not too optimistic about how long it will.

          • DaGeek247@fedia.io · 2 upvotes · 10 days ago

            Yeah. The corporations with money are always going to beat the casual users without it when it comes to processing capability.

            There are smarter ways to discourage the big companies from taking pictures of your house than by adding speed bumps to your driveway.
