this post was submitted on 30 May 2025
406 points (99.0% liked)

Programmer Humor

Welcome to Programmer Humor!

This is a place where you can post jokes, memes, humor, etc. related to programming!

For sharing awful code, there's also Programming Horror.

[–] HappyFrog@lemmy.blahaj.zone 46 points 3 weeks ago (1 children)

As long as the scrapers follow robots.txt
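
For anyone wondering what "following robots.txt" looks like in practice, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt content, the `example-scraper` user agent, the `make_checker` helper, and the example.com URLs are all made up for illustration; a real scraper would fetch the file from the site it is crawling.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, inlined so the sketch needs no network access.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def make_checker(robots_txt: str, user_agent: str):
    """Return a function that reports whether user_agent may fetch a URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())

    def allowed(url: str) -> bool:
        return parser.can_fetch(user_agent, url)

    return allowed


# "example-scraper" is an assumed user agent name, not a real crawler.
allowed = make_checker(ROBOTS_TXT, "example-scraper")

print(allowed("https://example.com/index.html"))         # True
print(allowed("https://example.com/private/data.html"))  # False
```

A polite scraper would call `allowed(url)` before every request and skip anything that comes back `False`. Nothing enforces this; it only works if the scraper chooses to check.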

[–] Jankatarch@lemmy.world 37 points 3 weeks ago (2 children)

It's equivalent to "the code."

[–] dejected_warp_core@lemmy.world 2 points 2 weeks ago

It really should be "parley.txt".

[–] TropicalDingdong@lemmy.world 26 points 3 weeks ago

beautiful soup

[–] mspencer712@programming.dev 15 points 3 weeks ago (1 children)

I feel like there should be a third box with Wall Street raider types, for scrapers that use Selenium browser automation.

I don’t think it’s entirely unblockable - adsense seems to know to only serve unmonetized PSA ads - but I think it’s very difficult to discriminate between “this is a real browser controlled by an end user” and “this is a real browser being controlled by automated test software”.

[–] erytau@programming.dev 5 points 3 weeks ago (1 children)

Fourth panel as well, with those bots collecting data for AI training that don't respect your robots.txt, change user agents, and overload your servers.

[–] dejected_warp_core@lemmy.world 1 points 2 weeks ago

War boys from Fury Road?

[–] Kojichan@lemmy.world 2 points 2 weeks ago

I saw a Python scraper in my server logs earlier today. Strangest thing to see.

[–] shiroininja@lemmy.world 1 points 3 weeks ago

Love me some Scrapy spiders