this post was submitted on 08 Apr 2026
13 points (100.0% liked)

Forgejo


My instance is getting pummeled by scrapers crawling nonsense, like issue and pull request searches with every possible combination of labels.

Everything's coming from a shitload of different residential IPs at a very fast cadence.

There's just not that much content on my instance to warrant this traffic. At this rate, all of it could be scraped in a minute or two if the traffic were legitimate.

[–] Kissaki@programming.dev 10 points 4 days ago (2 children)

Possibly AI company crawlers. When they came up there was a lot of bad publicity and reports of actively malicious and toxic crawling behavior, including ban evasion.

You can think about locking some URL paths behind valid login sessions, or use a proof-of-work proxy guard.

Anubis is the popular tool for that. I've seen maybe three alternatives, one of which is from Cloudflare.
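
For anyone wondering what a proof-of-work guard asks of the client, here is a minimal hashcash-style sketch in Python. It is not Anubis's implementation; the function names, the choice of SHA-256, and the `challenge:nonce` format are assumptions made for illustration. The server hands out a challenge, the client has to find a nonce whose hash has enough leading zero bits, and the server checks the answer with a single hash.

```python
import hashlib


def leading_zero_bits(digest: bytes) -> int:
    """Count how many zero bits the digest starts with."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()
        break
    return bits


def verify(challenge: str, nonce: int, difficulty_bits: int) -> bool:
    """Server side: one hash is enough to check a submitted nonce."""
    digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
    return leading_zero_bits(digest) >= difficulty_bits


def solve(challenge: str, difficulty_bits: int) -> int:
    """Client side: brute-force nonces until one passes verification."""
    nonce = 0
    while not verify(challenge, nonce, difficulty_bits):
        nonce += 1
    return nonce


if __name__ == "__main__":
    # "demo-challenge" is a hypothetical value; a real guard would issue
    # a fresh random challenge per visitor.
    nonce = solve("demo-challenge", 12)
    print(nonce, verify("demo-challenge", nonce, 12))
```

Each extra difficulty bit roughly doubles the client's expected work while verification stays at one hash. That asymmetry is why the cost is barely noticeable for a single visitor but adds up for a crawler requesting thousands of URLs.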

See also this related ticket on Codeberg (a Forgejo instance): https://codeberg.org/forgejo/discussions/issues/319

If you search, you can find various blog posts about these issues. Not just with Forgejo.
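
The login-gating idea mentioned above could look roughly like the following Python sketch, which decides whether a request should be bounced to the login page. Everything specific here is an assumption: the `/issues` and `/pulls` path suffixes, the query parameter names, and the `/alice/repo` example paths are illustrative guesses, so check your own access logs for what the scrapers actually request before copying any of it into a reverse proxy rule.

```python
from urllib.parse import parse_qs, urlsplit

# Assumed list/search pages that the scrapers enumerate.
GATED_SUFFIXES = ("/issues", "/pulls")
# Assumed filter parameters; adjust to what shows up in your logs.
GATED_PARAMS = {"labels", "q", "state", "sort"}


def requires_login(url: str, logged_in: bool) -> bool:
    """Return True if an anonymous request for a filtered list should be blocked."""
    if logged_in:
        return False
    parts = urlsplit(url)
    if not parts.path.rstrip("/").endswith(GATED_SUFFIXES):
        return False
    params = parse_qs(parts.query, keep_blank_values=True)
    # Plain list pages stay public; only filtered variants are gated.
    return any(name in GATED_PARAMS for name in params)


if __name__ == "__main__":
    print(requires_login("/alice/repo/issues?labels=1,2", logged_in=False))
    print(requires_login("/alice/repo/issues", logged_in=False))
```

Gating only the filtered variants keeps the plain issue list and the repository itself readable for anonymous visitors while cutting off the endless filter permutations.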

[–] treadful@lemmy.zip 6 points 4 days ago

> Possibly AI company crawlers. When they came up there was a lot of bad publicity and reports of actively malicious and toxic crawling behavior, including ban evasion.

That was kind of what I was thinking, but if that's true, they're wasting so much bandwidth and compute. Going through every combination of issue labels does not get them any useful code to hoover up. They could've just cloned my repos and been done with it.
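
To put a rough number on the waste: if each label can independently be part of a filter or not, the number of distinct filter URLs doubles with every label. The Python sketch below is only back-of-the-envelope arithmetic under that assumption; `other_variants` is a hypothetical multiplier for things like open/closed state or sort order, not a measured value.

```python
def label_filter_urls(num_labels: int, other_variants: int = 1) -> int:
    """Count the label subsets (2**n) times any other filter variants."""
    return (2 ** num_labels) * other_variants


if __name__ == "__main__":
    print(label_filter_urls(10))      # 1024
    print(label_filter_urls(20))      # 1048576
    print(label_filter_urls(20, 2))   # 2097152
```

So a repo with a modest 20 labels already has over a million label combinations per list page, none of which contain anything a single clone would not.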

> You can think about locking some URL paths behind valid login sessions, or use a proof-of-work proxy guard.
>
> Anubis is the popular tool for that. I've seen maybe three alternatives, one of which is from Cloudflare.

Really don't want to use Cloudflare, but Anubis is interesting. If I can't shake these bots, maybe I'll consider this. Thanks.

[–] Eezyville@sh.itjust.works 1 points 3 days ago (1 children)

If you think it's AI then maybe you can get another AI to write bad code and poison their training data.

[–] Kissaki@programming.dev 2 points 3 days ago (1 children)

There's a tool for that too, but I don't have the link or name at hand.