Post Au1gNRWoSL7HyQV1Ky by [email protected] | |
More posts by [email protected] | |
Post #Au15BOOoZETYUS9ISu by [email protected] | |
0 likes, 4 repeats | |
I noticed that a *lot* of the crawlers/bots we see on www.bbc.co.uk & www.b… | |
Post #Au15BOY20xqex2Sepc by [email protected] | |
0 likes, 1 repeats | |
@tdp_org Fantastic! Are you at liberty to say how your classifier(s) are implem… | |
Post #Au16GJNWlHi2JwSLz6 by [email protected] | |
0 likes, 0 repeats | |
@tdp_org hot damn, that's a big difference | |
Post #Au16GJTuNYoUdjRRvk by [email protected] | |
0 likes, 0 repeats | |
@gsuberland Definitely highlights how spammy the web is, eh? Wild West++ 🤣 | |
Post #Au16GusfKTbMdEJF9k by [email protected] | |
0 likes, 0 repeats | |
@tdp_org Next step: immediate IP block. I have strong feelings towards the comp… | |
Post #Au16Guzku7GyzDcuCu by [email protected] | |
0 likes, 0 repeats | |
@ross Bad news: Many of these are plain old smartphones where the flashlight ap… | |
Post #Au16H8vf68JiCZoplo by [email protected] | |
0 likes, 0 repeats | |
@tdp_org does the filter just match known crawler user agents against their kno… | |
Post #Au16H91gjj8aVGdeAC by [email protected] | |
0 likes, 0 repeats | |
@kfh Yeah, more or less. It identifies the crawler/bot via the user-agent strin… | |
Post #Au16HfD94fJeK4FFSq by [email protected] | |
0 likes, 0 repeats | |
@piegames six hour IP block with a message to uninstall malware from your devic… | |
Post #Au16HuSwMDrZdOkZfM by [email protected] | |
0 likes, 0 repeats | |
@davidgerard ⬆️ | |
Post #Au16JT6d3GeTMI3crA by [email protected] | |
0 likes, 0 repeats | |
@tdp_org i assume you are aware that Meta also provides official guidance on ho… | |
Post #Au16JTF8XdSPmg2Q7M by [email protected] | |
0 likes, 0 repeats | |
@slink Yep, that's what I followed for them | |
Post #Au16K6hrokc0B4mPoW by [email protected] | |
0 likes, 0 repeats | |
@tdp_org i won't describe the mitigation we applied on rationalwiki, but we… | |
Post #Au16Llz7pacMUQe9Vw by [email protected] | |
0 likes, 0 repeats | |
@tdp_org do you do anything different based on if a bot is known or not? | |
Post #Au16Lm6DPEHyqPxoZ6 by [email protected] | |
0 likes, 0 repeats | |
@q We (in the distribution team) don't - these data are informational right… | |
Post #Au16M6DKcTrhEHRmQy by [email protected] | |
0 likes, 0 repeats | |
@dngrs yep - same botnet model LWN saw a while ago https://lwn.net/Articles/100… | |
Post #Au16MztlaPUNvoiwiG by [email protected] | |
0 likes, 0 repeats | |
@tdp_org @gsuberland that's entirely consistent with what I've seen recentl… | |
Post #Au16OC0k2P2PmsYYrI by [email protected] | |
0 likes, 0 repeats | |
@tdp_org You could poison the well with https://come-from.mad-scientist.club/@al… | |
Post #Au19MxvBElXnf63sgK by [email protected] | |
0 likes, 0 repeats | |
@tdp_org This is interesting information and certainly something I will conside… | |
Post #Au19RkwstcNasO8w4W by [email protected] | |
0 likes, 0 repeats | |
@tdp_org nice. | |
Post #Au19T60I6FQCSzNKPA by [email protected] | |
0 likes, 0 repeats | |
@tdp_org I can see this escalating, next is that they use semi random user-agent… | |
Post #Au19UozGL1B0GmiW9o by [email protected] | |
0 likes, 0 repeats | |
@kasperd I feel like the definition of “legitimate crawler” is getting hard… | |
Post #Au1FWQqfefwni2qjjc by [email protected] | |
0 likes, 0 repeats | |
@tdp_org take a look at https://datatracker.ietf.org/doc/html/draft-meunier-web… | |
Post #Au1FWwlSzYnYhAf248 by [email protected] | |
0 likes, 0 repeats | |
@tdp_org Is the bottom line zero or is it some other fixed number? Looks crazy h… | |
Post #Au1FZX4oxHLhtWwkRk by [email protected] | |
0 likes, 0 repeats | |
@jalict Yeah the bottom line is zero - I can't bear non-zero-based graphs … | |
Post #Au1FZXAUcBt0B7bHHs by [email protected] | |
0 likes, 0 repeats | |
@tdp_org Ah, so you are still serving all the crawlers? | |
Post #Au1FaHWkFn1BxSMhMG by [email protected] | |
0 likes, 0 repeats | |
@paco @tdp_org Where to draw the line is certainly a tricky question. As long as… | |
Post #Au1FcSKh38ao3AlToe by [email protected] | |
0 likes, 0 repeats | |
@paco @kasperd @tdp_org a little OT but FWIW the magic search URL param to prev… | |
Post #Au1FdVtLIrvlNSTZLc by [email protected] | |
0 likes, 0 repeats | |
@SimmerVigor Oh nice! From the title and ToC that looks very much like the sort… | |
Post #Au1FdqkXkmN04FWL32 by [email protected] | |
0 likes, 0 repeats | |
@ross @piegames I don't even bother with the time window. Permablock. I have… | |
Post #Au1FermVP4NBYZvILw by [email protected] | |
0 likes, 0 repeats | |
@jalict Those which aren't "blocked" by robots.txt or blocked bec… | |
Post #Au1KJjXtKOUXhgpW6a by [email protected] | |
0 likes, 0 repeats | |
@piegames @ross From a personal experience, on Android this is trivially easy a… | |
Post #Au1KKXBGmm6be0d9mq by [email protected] | |
0 likes, 0 repeats | |
@tdp_org So crawlers definitely not identifying themselves appropriately, if I … | |
Post #Au1KKXHeP3D3xncFjU by [email protected] | |
0 likes, 0 repeats | |
@gimulnautti Yeah. Someone is spoofing the bot user agents. Not sure why, maybe… | |
Post #Au1KMtPr5QWVBWZrn6 by [email protected] | |
0 likes, 0 repeats | |
@[email protected] i only run my small personal website but i was plannin… | |
Post #Au1KQjBv2mn6dDt66y by [email protected] | |
0 likes, 0 repeats | |
@tdp_org It doesn't seem odd at all, if the product of your company requires … | |
Post #Au1KQjJ0cQSizDClA8 by [email protected] | |
0 likes, 0 repeats | |
@gimulnautti @tdp_org It might not actually be spoofing. Big Tech has started in… | |
Post #Au1KQjQSAkPvMIghlY by [email protected] | |
0 likes, 0 repeats | |
@ck @tdp_org omg, when you thought it couldn't get worse… 🤦 | |
Post #Au1KRiFa1mxs0F4Nmq by [email protected] | |
0 likes, 0 repeats | |
@tdp_org What tools do you use for ASN validation by the way? | |
Post #Au1KRwQ1LldPwg3v0K by [email protected] | |
0 likes, 0 repeats | |
@tdp_org So what happens to requests from something that identify themselves as… | |
Post #Au1KTB5oHIDhkICMgy by [email protected] | |
0 likes, 0 repeats | |
@losttourist No, this is just stats/reporting (at the moment at least) | |
Post #Au1KVlWFoeRTIdnk7k by [email protected] | |
0 likes, 0 repeats | |
@ck @gimulnautti @tdp_org this is the worst -- have you read about this practic… | |
Post #Au1RGxtGEneH0GS6Ii by [email protected] | |
0 likes, 0 repeats | |
@edsu @gimulnautti @tdp_org I believe @jwildeboer has blogged about this. | |
Post #Au1RHfOySlZ9x4cXCa by [email protected] | |
0 likes, 0 repeats | |
@ck @gimulnautti Yeah i have heard of that too - never underestimate how shady … | |
Post #Au1RJiQ6DEvXtL7j5U by [email protected] | |
0 likes, 0 repeats | |
@piegames @ross This is a link I saved, there are probably many similar blogs o… | |
Post #Au1RJss7LvHUJ8JF9E by [email protected] | |
0 likes, 0 repeats | |
@ck @gimulnautti @tdp_org aw Jesus F Christ | |
Post #Au1RKSurLCrg63rwQ4 by [email protected] | |
0 likes, 0 repeats | |
@tdp_org that's a writeup a lot of people would be interested in I think. | |
Post #Au1RKdblaX81G21Lge by [email protected] | |
0 likes, 0 repeats | |
@ross @piegames it's sad, but it's about time people are made accountab… | |
Post #Au1RKdjZ7XMneDfZqK by [email protected] | |
0 likes, 0 repeats | |
@bilboed No. Don't push systemic issues onto individual responsibility. Bei… | |
Post #Au1RKdpwjoTFy0efmy by [email protected] | |
0 likes, 0 repeats | |
@piegames Good luck convincing app stores of that. They won't until there… | |
Post #Au1RNy286WcNTRwJNY by [email protected] | |
0 likes, 0 repeats | |
@ck @gimulnautti @tdp_org @jwildeboer ah thanks for the pointer, found this one… | |
Post #Au1RQ2Lj1oLdRI5lh2 by [email protected] | |
0 likes, 0 repeats | |
@tdp_org ASN = Autonomous System Number in case this helps anyone besides me. A… | |
Post #Au1RUFH8uYlQgexIHo by [email protected] | |
0 likes, 0 repeats | |
@tdp_org What I don't understand is, I had expected the improved detection … | |
Post #Au1TCmOdEKKW6aG2Sm by [email protected] | |
0 likes, 1 repeats | |
@itgrrl yeah. I heard about that. I haven't used Google to search in years. N… | |
Post #Au1agA0vCOE9Qm7mEK by [email protected] | |
0 likes, 0 repeats | |
@paco OMG that really deserves to be the name of a no-slop search engine. FuckF… | |
Post #Au1b6PeXfxefSoAtN2 by [email protected] | |
0 likes, 0 repeats | |
@tdp_org nice! I never ran the stats like that, but yes, checking forward/rever… | |
Post #Au1gFPhZMZolzZhZxo by [email protected] | |
0 likes, 0 repeats | |
@tdp_org @SimmerVigor Not being well-mannered is definitely a problem. One of m… | |
Post #Au1gNRWoSL7HyQV1Ky by [email protected] | |
0 likes, 0 repeats | |
@ross @piegames It doesn't matter. They only scrape a few files per unique … | |
Post #Au1gOdrG6FROVGdOTo by [email protected] | |
0 likes, 0 repeats | |
@tdp_org ohhhh god yes. I love it when I make bad line on graph go down sharply | |
Post #Au1gT6GePpkLlwdZNQ by [email protected] | |
0 likes, 0 repeats | |
@bertkoor @tdp_org "Known bot" as in "bot of known identity"… | |
Post #Au1j1pSH7FOXfmlJGi by [email protected] | |
0 likes, 0 repeats | |
@ross @piegames a zip bomb is probably more effective. | |
Post #Au1j4QghelbDj9kbbs by [email protected] | |
0 likes, 0 repeats | |
@kasperd @tdp_org Is this really the current state? Pretending to be Googlebot … | |
Post #Au1j4QmNJg8W0kP8S0 by [email protected] | |
0 likes, 0 repeats | |
@feld @tdp_org There is probably a combination of both. I wasn't the one maki… | |
Post #Au1jASt2OTgqHiOjlA by [email protected] | |
0 likes, 0 repeats | |
@paco @itgrrl @tdp_org There are two reasons that's not a viable alternative … | |
Post #Au1o9UjyW0yfW1W2i0 by [email protected] | |
0 likes, 0 repeats | |
@tdp_org @gimulnautti Anecdotally, I think special cases allowing or treating g… | |
Post #Au2B9r9BVR4txDRsie by [email protected] | |
0 likes, 1 repeats | |
So what do you use, @kasperd? And what is the mistrust of Microsoft _hosting_? … | |
Post #Au7Bfwf9VR767LVCHw by [email protected] | |
0 likes, 0 repeats | |
@itgrrl @paco @kasperd @tdp_org ooh thanks! I knew about udm=14 but never even … | |
Post #Au7BfwoMxAUCZvoYee by [email protected] | |
0 likes, 1 repeats | |
@DrHyde @paco @kasperd @tdp_org | |
Post #Au8RcDALzx9IZGIWpM by [email protected] | |
0 likes, 0 repeats | |
@paco @kasperd @tdp_org the shitty results and AI slop aren't intended to k… | |
Post #Au8RcDIrUJxEzeHK5Y by [email protected] | |
0 likes, 0 repeats | |
@DrHyde I am inverting the logic to make a point. Let me try to be clearer. Ever… | |