toot.wales is one of the many independent Mastodon servers you can use to participate in the fediverse.
We are the Open Social network for Wales and the Welsh, at home and abroad! The independent social network for Wales, powered by Mastodon!

#anubis

6 posts · 6 participants · 0 posts today
Replied in thread

@fanf Sure that does make sense. I'll try to verify jmeter indeed doesn't reuse connections (I already have debug logging in place that should tell me).

If that's really the reason, I guess the sane thing to do is to add a hint to the docs to just disable TLS for very busy sites. The intended use case for #swad is operation behind #nginx to serve its "auth_request". I don't intend to implement HTTP/2 or beyond, and it would be pretty pointless here anyway: nginx defaults to HTTP/1.0 for proxy requests and can be configured to use HTTP/1.1 instead, but *still* doesn't reuse connections by default. My experiments so far to enable that weren't successful; maybe I haven't fully understood it yet. Using TLS behind nginx would make sense from a "defense in depth" point of view, but it's probably impractical once your load exceeds a certain threshold.

For background on how I arrived here: I observed stupid #AI #scraper #bots clogging my DSL connection by downloading gigabytes of build logs produced by my #poudriere. They're not secret in any way, and having a simple way to share them is great for community bug hunting, but this had to stop. I had a simple C library doing a fully portable reactor event loop on top of select (so, not really scalable), and some very limited HTTP/1.1 server code from experiments with Tor hidden services ... so I put that together to add some web-form + cookies auth to my private nginx to lock out the bots. Later, I added a "guest login" doing the same "proof of work" stuff known from #anubis, and then I suddenly had the idea to make my little service (which already solved the problem perfectly for myself) suitable for large-scale installations. So, added kqueue, epoll etc. support, added a "multi-reactor with acceptor-connector" design, etc .... and now I'm a bit frustrated that enabling TLS spoils all the performance 🙈
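For readers unfamiliar with the "proof of work" idea mentioned above, here is a minimal hashcash-style sketch in Python. It is a generic illustration only, not the actual scheme used by swad or Anubis: the function names, the challenge format and the difficulty value are all assumptions made for this example. The server hands out a random challenge, the client has to find a nonce whose SHA-256 hash starts with enough zero bits, and the server checks the answer with a single hash.

```python
import hashlib
import secrets


def leading_zero_bits(digest: bytes) -> int:
    """Count how many zero bits the digest starts with."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        # 8 - bit_length() is the number of leading zero bits in this byte.
        bits += 8 - byte.bit_length()
        break
    return bits


def new_challenge() -> str:
    """Server side: create a random challenge (hypothetical format)."""
    return secrets.token_hex(16)


def verify(challenge: str, nonce: int, difficulty_bits: int) -> bool:
    """Server side: one hash is enough to check a claimed solution."""
    digest = hashlib.sha256(f"{challenge}{nonce}".encode()).digest()
    return leading_zero_bits(digest) >= difficulty_bits


def solve(challenge: str, difficulty_bits: int) -> int:
    """Client side: brute-force nonces until one satisfies the difficulty."""
    nonce = 0
    while not verify(challenge, nonce, difficulty_bits):
        nonce += 1
    return nonce


if __name__ == "__main__":
    challenge = new_challenge()
    # 16 bits means roughly 65,000 hashes on average for the client.
    nonce = solve(challenge, 16)
    print(challenge, nonce, verify(challenge, nonce, 16))
```

The asymmetry is the point: solving costs the client thousands of hashes, while verifying costs the server one, which is what makes mass scraping expensive without burdening the site much.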

New #blog post: Deploying #anubis to protect against #AI scrapers

I noticed that some of my services have been being hammered by requests - although the requests come from a range of IPs, there's clearly some level of coordination between them.

Some of the services they're hitting are dynamically generated, so they're burning CPU cycles for no good reason.

I decided to put Anubis in place to protect my resources from being wasted - this post describes how.

bentasker.co.uk/posts/blog/the

www.bentasker.co.uk · Deploying Anubis to protect against AI scrapers

We know all too well how hard it can be to handle the ever-increasing number of bad actors scraping your websites. Therefore, we decided to sponsor #Anubis¹ as a tool that helps users to block or at least slow down these bad actors.

We are currently evaluating how we could integrate Anubis or a similar solution into our stack and make it available as an uberspace command and backend.

¹ github.com/TecharoHQ/anubis
#BadRobots #AIBots

GitHub - TecharoHQ/anubis: Weighs the soul of incoming HTTP requests to stop AI crawlers

Anubis is a great project for keeping AI crawlers away from your own website. Of course, it would be even better if the bots were already blocked at the firewall, to save even more resources.

For that, you would need a dynamic list of the IP addresses of the current bots.

Crowdsec does offer something like that. There is even a list you can subscribe to, but it costs money (or, for community projects, it is free on request).

What are the drawbacks?

Just released: #swad 0.11 -- the session-less swad is done!

Swad is the "Simple Web Authentication Daemon". It adds cookie/form #authentication to your reverse #proxy and is designed to work with #nginx' "auth_request". Several modules for checking credentials are included, one of which requires solving a crypto challenge like #Anubis does, to allow "bot-safe" guest logins. Swad is written in pure #C, compiles to a small (200-300kiB) binary, has minimal dependencies (zlib, OpenSSL/LibreSSL and optionally libpam) and *should* work on many #POSIX-like systems (#FreeBSD tested a lot, #Linux and #illumos also tested).

This release is the first one not to require a server-side session (which consumes a significant amount of RAM on really busy sites); instead, signed JSON Web Tokens are now implemented. For now, they are signed using HMAC-SHA256 with a random key generated at startup. A future direction could be support for asymmetric keys (RSA, Ed25519), which could open up new possibilities like having your reverse proxy pass the signed token to a backend application, which could then verify it, but still not forge it.
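To illustrate the session-less idea, here is a minimal Python sketch of a token signed with HMAC-SHA256 using a random key created at startup. It is not swad's code (swad is written in C); the function names, the claim names and the expiry handling are assumptions made for this example. It only shows why the server needs no per-user state: everything it must know is in the token, and the signature proves the server issued it.

```python
import base64
import hashlib
import hmac
import json
import secrets
import time
from typing import Optional

# Random key generated at startup: tokens become invalid after a restart.
SECRET_KEY = secrets.token_bytes(32)


def b64url(data: bytes) -> str:
    """Base64url without padding, as used in JWTs."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(text: str) -> bytes:
    """Restore the stripped padding before decoding."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_token(claims: dict, key: bytes = SECRET_KEY) -> str:
    """Build header.payload.signature with an HS256 signature."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        b64url(json.dumps(part, separators=(",", ":")).encode())
        for part in (header, claims)
    )
    sig = hmac.new(key, signing_input.encode("ascii"), hashlib.sha256).digest()
    return f"{signing_input}.{b64url(sig)}"


def verify_token(token: str, key: bytes = SECRET_KEY) -> Optional[dict]:
    """Return the claims if signature and expiry check out, else None.

    This sketch always verifies with HMAC-SHA256 and ignores the header's
    "alg" field, so a forged header cannot switch the algorithm.
    """
    try:
        header_b64, payload_b64, sig_b64 = token.split(".")
        signing_input = f"{header_b64}.{payload_b64}".encode("ascii")
        expected = hmac.new(key, signing_input, hashlib.sha256).digest()
        # Constant-time comparison avoids leaking how many bytes matched.
        if not hmac.compare_digest(expected, b64url_decode(sig_b64)):
            return None
        claims = json.loads(b64url_decode(payload_b64))
    except (ValueError, TypeError):
        return None
    if not isinstance(claims, dict) or claims.get("exp", 0) < time.time():
        return None
    return claims


if __name__ == "__main__":
    token = sign_token({"sub": "guest", "exp": int(time.time()) + 3600})
    print(verify_token(token))
```

With a symmetric key like this, anyone who can verify a token can also forge one, which is why the asymmetric-key direction mentioned above would be needed before a backend application could verify tokens without being able to create them.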

Read more, grab the latest .tar.xz, build and install it ... here: 😎

github.com/Zirias/swad

GitHub - Zirias/swad: Simple Web Authentication Daemon

[New Note] - That’s the 3rd time today that I had to reach out to someone because they deployed an anti-AI service like #anubis or #goaway. I have nothing against these people, who are just trying to protect their services / assets from these shitty bots scraping the web like they own it (when normal people do it, they get sued though)…

That’s what the AI slop is giving us right now: shitty AI services that most people don’t care about (and the ones who do should be a lot more careful about their results) and that are a pain in the a** for everyone else: people trying to run small services / websites and those who try to follow them outside of proprietary jails…

I really hate the state the web is in now… NO thank you, big tech sh*t heads.

https://bacardi55.io/notes/20250520-1908/

cc @bacardi55

bacardi55.io | Bacardi55's Web Cave