Pat David Avatar

Anubis and caddy-docker-proxy

As part of my work infrastructure I run an instance of the Comprehensive Knowledge Archive Network (CKAN) as a data center. Things were running along nicely for quite some time before I was alerted to the site being down yesterday afternoon.

Investigating, I found the server was getting hammered by requests from rotating IPs. It wasn’t so bad in the grand scheme of things (approximately 60 requests/second), but it was still enough to break access to the site.

Why can’t we have nice things?

I brought the site down while I figured out how to mitigate this. No need to incur the egress traffic costs from Google Cloud. Investigating a little further showed me that a large share of the IPs were coming from Brazil.

When something like this happens I usually turn to the pixls-admin Matrix channel where I can ping my favorite computer nerds, @darix and mica. Not surprisingly, moments after lamenting my problem they piped up with a suggestion to look at Anubis.

Anubis

On its homepage, Anubis states that it will:

Weigh the soul of incoming HTTP requests using proof-of-work to stop AI crawlers

Well, this sounds like exactly what I need! The basic idea is that a request to my website hits Anubis first, which presents a challenge to the client. A service worker on the client then has to compute a SHA-256 hash that satisfies the challenge. If the client succeeds, Anubis validates the result and sets a cookie that marks the (presumably legitimate) browser as verified for a while (a week by default).

Subsequent requests to the site don’t need to re-validate if this cookie is set.
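
To make the proof-of-work idea a bit more concrete, here is a minimal sketch in Python. It is not Anubis's actual code: the function names, the challenge string, and the rule that a hash must start with a given number of zero hex digits are all assumptions made for illustration. The point is the asymmetry: the client has to grind through many hashes, while the server only needs one to check the answer.

```python
import hashlib
from itertools import count


def solve(challenge: str, difficulty: int) -> tuple[int, str]:
    """Client side: try nonces until the hash starts with `difficulty` zero hex digits."""
    prefix = "0" * difficulty
    for nonce in count():
        digest = hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()
        if digest.startswith(prefix):
            return nonce, digest


def verify(challenge: str, nonce: int, digest: str, difficulty: int) -> bool:
    """Server side: recompute a single hash and check it against the claim."""
    expected = hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()
    return expected == digest and digest.startswith("0" * difficulty)


if __name__ == "__main__":
    # Hypothetical challenge value; a real server would issue one per client.
    challenge = "example-challenge"
    nonce, digest = solve(challenge, difficulty=4)
    print(nonce, digest, verify(challenge, nonce, digest, difficulty=4))
```

Each extra zero digit makes the client's search roughly 16 times longer on average, which is cheap for one visitor and expensive for a scraper making thousands of requests.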

All of the services on my server run as Docker containers, and I happen to use the awesome caddy-docker-proxy to do all of my proxying. This means that all I had to do to use Anubis was fire up the containers and tell caddy where to route things. This is a bit different from a normal caddy deployment because caddy-docker-proxy builds its Caddyfile from labels in the service compose files.

For instance, with caddy-docker-proxy handling all of my traffic, I just need to include a labels section in the compose file for my CKAN instance to define the routing:

...
  labels:
    caddy: data.disl.edu
    caddy.reverse_proxy: "{{upstreams 5000}}"
...

When this service starts up, caddy-docker-proxy reads the labels section and adds it to an in-memory Caddyfile automatically.

Adding Anubis

For convenience I just added the services for anubis and httpdebug into my compose file for caddy-docker-proxy. (This is pretty much the same as the docs for using it with the plain caddy container.)

...
  anubis:
    image: ghcr.io/techarohq/anubis:latest
    pull_policy: always
    environment:
      BIND: ":3000"
      TARGET: "http://ckan:5000"
    networks:
      - caddy

  httpdebug:
    image: ghcr.io/xe/x/httpdebug
    pull_policy: always
...

I did set the TARGET environment variable to point to my CKAN container for when a valid request comes in. I also added the anubis container to my caddy network.

Over on the CKAN container side of things, I just had to modify my labels to use anubis and forward some headers (again, per the anubis docs for a plain caddy container):

services:
  ckan:
    ...
    labels:
      caddy: data.disl.edu
      caddy.reverse_proxy: http://anubis:3000
      caddy.reverse_proxy.header_up_1: "X-Real-Ip {remote_host}"
      caddy.reverse_proxy.header_up_2: "X-Http-Version {http.request.proto}"
    ...
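
If it helps to picture what those labels turn into, here is a rough Python sketch that renders them as a Caddyfile block. It is a deliberately simplified model and not caddy-docker-proxy's real implementation: `render_caddyfile` is a made-up name, it only handles the handful of label shapes used in this post, and it ignores templates like `{{upstreams 5000}}`. It also assumes the header name `X-Http-Version`.

```python
import re


def render_caddyfile(labels: dict[str, str]) -> str:
    """Render a small subset of caddy-docker-proxy style labels as a Caddyfile block."""
    site = labels["caddy"]
    directives: dict[str, dict] = {}
    for key, value in labels.items():
        parts = key.split(".")
        if len(parts) == 2:
            # caddy.<directive>: the value becomes the directive's arguments
            directives.setdefault(parts[1], {"args": "", "subs": []})["args"] = value
        elif len(parts) == 3:
            # caddy.<directive>.<subdirective>_N: the _N suffix only keeps
            # otherwise identical label keys distinct, so drop it here
            name = re.sub(r"_\d+$", "", parts[2])
            entry = directives.setdefault(parts[1], {"args": "", "subs": []})
            entry["subs"].append(f"{name} {value}")

    lines = [f"{site} {{"]
    for name, d in directives.items():
        head = f"    {name} {d['args']}".rstrip()
        if d["subs"]:
            lines.append(head + " {")
            lines.extend(f"        {sub}" for sub in d["subs"])
            lines.append("    }")
        else:
            lines.append(head)
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_caddyfile({
        "caddy": "data.disl.edu",
        "caddy.reverse_proxy": "http://anubis:3000",
        "caddy.reverse_proxy.header_up_1": "X-Real-Ip {remote_host}",
        "caddy.reverse_proxy.header_up_2": "X-Http-Version {http.request.proto}",
    }))
```

The printed block is a site entry for data.disl.edu with a reverse_proxy to http://anubis:3000 and two header_up lines nested inside it, which is the same shape as the plain Caddyfile example in the Anubis docs.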

Restart the containers and let Anubis do its thing!

Checking my docker logs for my caddy compose file shows all of the bot requests now being DENIED:

anubis     | {"time":"2025-05-08T22:13:30.989738613Z","level":"INFO","source":{"function":"github.com/TecharoHQ/anubis/lib.(*Server).checkRules","file":"github.com/TecharoHQ/anubis/lib/anubis.go","line":164},"msg":"explicit deny","user_agent":"Opera/9.47.(X11; Linux x86_64; quz-PE) Presto/2.9.173 Version/10.00","accept_language":"en","priority":"","x-forwarded-for":"45.234.232.24","x-real-ip":"45.234.232.24","check_result":{"name":"bot/deny-aggressive-brazilian-scrapers","rule":"DENY"}}

Yay! And just like that I was able to connect to, and use, my site again.
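
If you want a quick count of which rules are doing the denying, log lines like the one above are easy to tally. Here is a small Python sketch that assumes each line looks like `service | {json}`, as in my compose logs; `parse_line` and `tally_denies` are names I made up, and the only fields it relies on are the ones visible in that sample line.

```python
import json
from collections import Counter
from typing import Iterable, Optional

# The log line from above, verbatim.
SAMPLE = 'anubis     | {"time":"2025-05-08T22:13:30.989738613Z","level":"INFO","source":{"function":"github.com/TecharoHQ/anubis/lib.(*Server).checkRules","file":"github.com/TecharoHQ/anubis/lib/anubis.go","line":164},"msg":"explicit deny","user_agent":"Opera/9.47.(X11; Linux x86_64; quz-PE) Presto/2.9.173 Version/10.00","accept_language":"en","priority":"","x-forwarded-for":"45.234.232.24","x-real-ip":"45.234.232.24","check_result":{"name":"bot/deny-aggressive-brazilian-scrapers","rule":"DENY"}}'


def parse_line(line: str) -> Optional[dict]:
    """Split 'service | {json}' on the first pipe and decode the JSON part."""
    _, sep, payload = line.partition("|")
    if not sep:
        return None
    try:
        entry = json.loads(payload.strip())
    except json.JSONDecodeError:
        return None  # not every log line is JSON
    return entry if isinstance(entry, dict) else None


def tally_denies(lines: Iterable[str]) -> Counter:
    """Count 'explicit deny' entries per rule name."""
    counts: Counter = Counter()
    for line in lines:
        entry = parse_line(line)
        if entry and entry.get("msg") == "explicit deny":
            counts[entry["check_result"]["name"]] += 1
    return counts


if __name__ == "__main__":
    for rule, n in tally_denies([SAMPLE]).most_common():
        print(n, rule)
```

Swapping the `[SAMPLE]` list for the lines of a saved log file should give a per-rule breakdown, assuming the rest of the log follows the same format.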


Filed under: infrastructure, server, caddy, anubis
