This may sound like a weird thing to do, but I realised that many crawlers and bots are somehow still able to get past my Anubis. I presume they have gotten smarter and are capable of using JavaScript.
To counter this, I want to chain my Anubis instance with an Iocaine setup, so that traffic flows:
Internet > nginx reverse proxy > Anubis > Iocaine > my site/app
My hope is that two different filtering mechanisms (one of which will actively poison and waste the bot's resources) will protect my system better.
I thought I’d ask before actually trying out something like this.
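For what it's worth, a chain like that could look something like the nginx config below. The ports are only examples (Anubis and Iocaine each have their own defaults; substitute whatever yours actually bind to), and the `TARGET` note reflects how Anubis is pointed at its upstream via environment variable:

```nginx
# nginx only ever talks to Anubis; example ports, not canonical ones.
server {
    listen 443 ssl;
    server_name example.com;

    location / {
        proxy_pass http://127.0.0.1:8923;   # Anubis
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}

# Anubis's upstream is then set to Iocaine, e.g.:
#   TARGET=http://127.0.0.1:42069
# and Iocaine forwards traffic it considers legitimate on to the real app.
```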
Does Anubis actually catch anything?
Have you tried fucking with the status codes?
There is a great DEF CON talk about that:
So you could, e.g., return a 401 and still serve the page. Most automated systems will probably discard the body once they see an "unauthorized" status.
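As a minimal sketch of that trick (stdlib only, names are my own — this is the general idea, not anyone's production setup): the server lies about the status code but ships the real page anyway, on the theory that naive scrapers bail on non-200 responses while browsers still render the body.

```python
# Sketch: send HTTP 401 but include the normal page as the body.
from http.server import BaseHTTPRequestHandler, HTTPServer

PAGE = b"<html><body><h1>Hello, humans</h1></body></html>"

class MisleadingStatusHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(401)  # claim "unauthorized"...
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(PAGE)))
        self.end_headers()
        self.wfile.write(PAGE)   # ...but serve the real page regardless

def run(port: int = 8000) -> None:
    """Serve forever on localhost; for demonstration only."""
    HTTPServer(("127.0.0.1", port), MisleadingStatusHandler).serve_forever()
```

Whether any given bot actually gives up on a 401 is an empirical question; some frameworks (and some legitimate tools) will follow the body anyway, so test before relying on it.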
Context:
https://en.wikipedia.org/wiki/Anubis_(software)
Anubis is an open source software program that adds a proof of work challenge to websites before users can access them in order to deter web scraping. It has been adopted mainly by Git forges and free and open-source software projects.[4][5]
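The proof-of-work idea is the usual hashcash shape: the client must find a nonce whose hash meets a difficulty target before it gets through. A generic sketch (this is the concept, not Anubis's exact challenge format or API — function names here are illustrative):

```python
# Hashcash-style proof of work: find a nonce such that
# sha256(challenge + nonce) starts with `difficulty` zero hex digits.
import hashlib

def solve(challenge: str, difficulty: int) -> int:
    """Brute-force a nonce satisfying the difficulty target (client side)."""
    target = "0" * difficulty
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()
        if digest.startswith(target):
            return nonce
        nonce += 1

def verify(challenge: str, nonce: int, difficulty: int) -> bool:
    """One cheap hash check (server side) — verification costs almost nothing."""
    digest = hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()
    return digest.startswith("0" * difficulty)
```

The asymmetry is the point: solving costs the client many hash attempts, verifying costs the server one, which is what makes it expensive for bulk scrapers while remaining a brief one-time wait for a human's browser.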
Iocaine is a defense mechanism against unwanted scrapers, sitting between upstream resources and the fronting reverse proxy.
Iocaine expects you to already know how to detect the bots. If they can get past Anubis, do you have another detection process?