Post AwWoFBKvggNkZeIkFM by [email protected]
Post #AwWc3adEcEMC6JZNHk by [email protected]
0 likes, 0 repeats
Tried to use /robots.txt to tell bots to stay out. The bots' response: "…
Post #AwWcJDCTV9iM3vVBbc by [email protected]
0 likes, 0 repeats
@nixCraft Fascists don't follow rules either. I'm wondering whether ther…
Post #AwWcT7qQGiUsLcul7I by [email protected]
0 likes, 0 repeats
@nixCraft Maybe feed them a zip bomb, if they go for a Disallowed file? https://…
Post #AwWcruItkXwHak3Bke by [email protected]
0 likes, 0 repeats
@nixCraft if you have a crawler honey trap, including it in robots.txt would en…
Post #AwWe47ffyYzcTl5qXA by [email protected]
0 likes, 1 repeats
@nixCraft If this worked, Stack Overflow would be dead by tomorrow. I wouldn…
Post #AwWf2EZVRtZlFts2r2 by [email protected]
0 likes, 0 repeats
@nixCraft I wonder when they started doing it. Looks like it's a recent thi…
Post #AwWghvN3KMc5p0UNVo by [email protected]
0 likes, 0 repeats
@nixCraft @Matti_Vuori It has always been useless. So, nothing has changed.
Post #AwWgj6Ji7EJc3mUEwC by [email protected]
0 likes, 0 repeats
@nixCraft “The solution to surveillance is pollution”. it’s the uniquene…
Post #AwWhGb7oanuvALtZFw by [email protected]
0 likes, 0 repeats
@nixCraft robots.txt is like asking a bully not to bully you 🙃
Post #AwWijGqVAYYuch27nc by [email protected]
0 likes, 0 repeats
@nixCraft I had to use Anubis and IPFire to block LLM scrapers. Works for the m…
Post #AwWjVfMTK8gOYb418S by [email protected]
0 likes, 0 repeats
@oe_simon 404 for the better! ☝️🏾 @nixCraft
Post #AwWnSpigks1ndAMO3M by [email protected]
0 likes, 1 repeats
@nixCraft Man, if only we could have a far more accurate version of the CAPTCHA…
Post #AwWnq9bwxvpFz0KQIy by [email protected]
0 likes, 1 repeats
@heartshadows @m @nixCraft We can beat 'em at their own game, host the site insi…
Post #AwWoFBKvggNkZeIkFM by [email protected]
0 likes, 0 repeats
@nixCraft put a trap in place like disallow /list_of_politicians_that_received_…
Post #AwWp8DkmP2PVzsdZFA by [email protected]
0 likes, 0 repeats
@nixCraft Obeying your rules is optional? Well, guess what?
Post #AwWpaN3DQf0QwNgQkK by [email protected]
0 likes, 0 repeats
@adipoeserPursch @nixCraft Probably not. A 404 signals: something here is not in …
Post #AwWpauzVEyEqCiCwgC by [email protected]
0 likes, 0 repeats
@heartshadows @nixCraft The EU text and data mining exemption gives content pro…
Post #AwWpd3QKrI8Mw9mEpU by [email protected]
0 likes, 0 repeats
@nixCraft actually, the lesser-known sitemap directive encourages crawling, fro…
Post #AwWr0uQO4K0bvljwQa by [email protected]
0 likes, 0 repeats
@oe_simon No. There is nothing here, so nothing can be scraped... @nixCraft
Post #AwWsJx3QcEPXXie3mq by [email protected]
0 likes, 0 repeats
@nixCraft can we just start adding ridiculous terms of service to robots.txt so…
Post #AwYBW1n6vGRPzoRczw by [email protected]
0 likes, 1 repeats
@nixCraft that is how SO’s robots.txt looks today. It was different a few mon…
Post #AwYruB8O25Wlzwaw64 by [email protected]
0 likes, 0 repeats
@nixCraft major search engines have other ways to crawl content. One of them is…
You are viewing proxied material from pleroma.anduin.net. The copyright of proxied material belongs to its original authors. Any comments or complaints in relation to proxied material should be directed to the original authors of the content concerned. Please see the disclaimer for more details.