Anubis and caddy-docker-proxy
As part of my work infrastructure I run an instance of the Comprehensive Knowledge Archive Network (CKAN) as a data center. Things were running along nicely for quite some time before I was alerted to the site being down yesterday afternoon.
Investigating, I found the server was getting hammered by requests from rotating IPs. It wasn’t so bad in the grand scheme of things (approximately 60 requests/second), but it was still enough to break access to the site.
Why can’t we have nice things?
I brought the site down while I figured out how to mitigate this. No need to incur egress traffic costs from Google Cloud. Investigating a little further showed me that a large share of the IPs were coming from Brazil.
When something like this happens I usually turn to the pixls-admin Matrix channel where I can ping my favorite computer nerds, @darix and mica. Not surprisingly, moments after lamenting my problem they piped up with a suggestion to look at Anubis.
Anubis
On its homepage, Anubis states that it will:
Weigh the soul of incoming HTTP requests using proof-of-work to stop AI crawlers
Well, this sounds like exactly what I need! The basic idea is that a request to my website first hits Anubis, which presents a challenge to the client; a service worker on the client then has to compute a SHA-256 hash to solve it. If the solution checks out, Anubis sets a cookie that vouches for the (presumably valid) browser for a while (a week by default).
Subsequent requests to the site don’t need to re-validate if this cookie is set.
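To make the proof-of-work idea concrete, here is a minimal hashcash-style sketch in Python. It is not Anubis's actual implementation: the function names `solve` and `verify`, the challenge string, and the "leading zero hex digits" difficulty rule are all assumptions chosen for illustration. The point is the asymmetry: the client has to try many nonces, while the server checks the answer with a single hash.

```python
import hashlib
from itertools import count


def solve(challenge: str, difficulty: int) -> int:
    """Client side: search for a nonce whose SHA-256 digest of
    challenge + nonce starts with `difficulty` zero hex digits."""
    prefix = "0" * difficulty
    for nonce in count():
        digest = hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()
        if digest.startswith(prefix):
            return nonce


def verify(challenge: str, nonce: int, difficulty: int) -> bool:
    """Server side: one hash is enough to confirm the client did the work."""
    digest = hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()
    return digest.startswith("0" * difficulty)


if __name__ == "__main__":
    # Hypothetical challenge value; a real server would issue a fresh one per client.
    challenge = "example-challenge"
    nonce = solve(challenge, difficulty=4)
    print("nonce:", nonce, "valid:", verify(challenge, nonce, 4))
```

Each extra zero digit multiplies the expected client work by 16, which is cheap for one human visitor but adds up quickly for a crawler making thousands of requests.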
All of the services on my server run as Docker containers, and I happen to use the awesome caddy-docker-proxy to do all of my proxying. This means that all I had to do to use Anubis was fire up the containers and tell Caddy where to route things. This is a bit different from a normal Caddy deployment, because caddy-docker-proxy builds its Caddyfile from labels in the service compose files.
For instance, with caddy-docker-proxy handling all of my traffic, I just need to include a labels section in the compose file for my CKAN instance to define the routing:
    ...
    labels:
      caddy: data.disl.edu
      caddy.reverse_proxy: "{{upstreams 5000}}"
    ...
When this service starts up, caddy-docker-proxy reads the labels section and adds it to an in-memory Caddyfile automatically.
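As a rough mental model of that label-to-Caddyfile step, here is a toy Python sketch. It is not caddy-docker-proxy's real code: the function `labels_to_caddyfile` is hypothetical, it only handles single-level `caddy.<directive>` labels, and the container IP passed in is a made-up value standing in for whatever Docker assigns.

```python
import re


def labels_to_caddyfile(labels: dict[str, str], upstream_ip: str) -> str:
    """Render simple caddy-docker-proxy style labels as a Caddyfile site block.

    Toy version: only the `caddy` site label and one level of
    `caddy.<directive>` labels are supported.
    """
    lines = [f"{labels['caddy']} {{"]
    for key, value in labels.items():
        if not key.startswith("caddy."):
            continue  # skip the site label itself and unrelated labels
        directive = key.removeprefix("caddy.")
        # Expand the {{upstreams PORT}} template to "IP:PORT".
        value = re.sub(
            r"\{\{\s*upstreams\s+(\d+)\s*\}\}",
            lambda m: f"{upstream_ip}:{m.group(1)}",
            value,
        )
        lines.append(f"    {directive} {value}")
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    labels = {
        "caddy": "data.disl.edu",
        "caddy.reverse_proxy": "{{upstreams 5000}}",
    }
    # 172.18.0.5 is a hypothetical container address used only for this demo.
    print(labels_to_caddyfile(labels, "172.18.0.5"))
```

The useful takeaway is that each label key is a path into the Caddyfile, so editing labels on a container is equivalent to editing the site block by hand.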
Adding Anubis
For convenience I just added the services for anubis and httpdebug into my compose file for caddy-docker-proxy. (This is pretty much the same as the docs for using it with the plain caddy container.)
  ...
  anubis:
    image: ghcr.io/techarohq/anubis:latest
    pull_policy: always
    environment:
      BIND: ":3000"
      TARGET: "http://ckan:5000"
    networks:
      - caddy
  httpdebug:
    image: ghcr.io/xe/x/httpdebug
    pull_policy: always
  ...
I set the TARGET environment variable to point to my CKAN container, which is where Anubis forwards requests once they pass the challenge. I also added the anubis container to my caddy network.
Over on the CKAN container side of things, I just had to modify my labels to use anubis and forward some headers (again, per the anubis docs for a plain caddy container):
services:
  ckan:
    ...
    labels:
      caddy: data.disl.edu
      caddy.reverse_proxy: http://anubis:3000
      caddy.reverse_proxy.header_up_1: "X-Real-Ip {remote_host}"
      caddy.reverse_proxy.header_up_2: "X-Http-Version {http.request.proto}"
    ...
Restart the containers and let Anubis do its thing!
Checking my docker logs for my caddy compose file shows all of the bot requests now being DENIED:
anubis | {"time":"2025-05-08T22:13:30.989738613Z","level":"INFO","source":{"function":"github.com/TecharoHQ/anubis/lib.(*Server).checkRules","file":"github.com/TecharoHQ/anubis/lib/anubis.go","line":164},"msg":"explicit deny","user_agent":"Opera/9.47.(X11; Linux x86_64; quz-PE) Presto/2.9.173 Version/10.00","accept_language":"en","priority":"","x-forwarded-for":"45.234.232.24","x-real-ip":"45.234.232.24","check_result":{"name":"bot/deny-aggressive-brazilian-scrapers","rule":"DENY"}}
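If you want a quick tally of which rules are doing the denying, log lines like the one above are easy to summarize. The following is a small sketch, assuming the log format shown here (a `service | ` prefix followed by one JSON object with `msg` and `check_result.name` fields); the function name `count_denies` is my own, and the sample line is a shortened copy of the entry above.

```python
import json
from collections import Counter


def count_denies(log_lines):
    """Count 'explicit deny' log entries per matched rule name."""
    counts = Counter()
    for line in log_lines:
        # docker compose prefixes each line with "service | "; keep the JSON part.
        _, _, payload = line.partition("{")
        if not payload:
            continue  # no JSON object on this line
        try:
            entry = json.loads("{" + payload)
        except json.JSONDecodeError:
            continue  # ignore lines that are not valid JSON
        if entry.get("msg") == "explicit deny":
            rule = entry.get("check_result", {}).get("name", "unknown")
            counts[rule] += 1
    return counts


if __name__ == "__main__":
    # Shortened sample based on the log entry shown above.
    sample = (
        'anubis | {"level":"INFO","msg":"explicit deny",'
        '"x-real-ip":"45.234.232.24",'
        '"check_result":{"name":"bot/deny-aggressive-brazilian-scrapers","rule":"DENY"}}'
    )
    for rule, n in count_denies([sample]).most_common():
        print(n, rule)
```

In practice you could feed it the saved output of `docker compose logs anubis` by passing an open file, since any iterable of lines works.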
Yay! And just like that I was able to connect to, and use, my site again.
References
- caddy-docker-proxy: https://github.com/lucaslorentz/caddy-docker-proxy
- anubis: https://github.com/TecharoHQ/anubis