April 14, 2026 · 3 min read

How Bot-Driven Search Queries Took Down Our MariaDB (And How We Fixed It)

A production incident on a high-traffic Drupal news site where uncached search queries from bots overwhelmed the database — the diagnosis, the immediate fix, and the long-term remediation.

It started with a slow site. Then a very slow site. Then no site.

The alert came in during peak traffic hours on a high-volume Drupal news portal serving several million pageviews a month. PHP-FPM workers were maxing out, MariaDB connections were saturating, and the server load average had climbed past 40 on an 8-core machine. The site was effectively down.

This is the story of what caused it, how we diagnosed it, and what we changed permanently as a result.

The Symptoms

The first thing I checked was SHOW PROCESSLIST in MariaDB. What I saw was a wall of structurally identical queries, all running simultaneously:

SELECT * FROM search_index
WHERE word LIKE '%ipl%'
AND sid IN (SELECT nid FROM node_field_data WHERE status = 1)
ORDER BY score DESC
LIMIT 20 OFFSET 0;

Hundreds of them. All different search terms, all hitting the search_index table simultaneously, none of them cached. Response times on individual queries ranged from 800ms to 4 seconds depending on the term. With enough concurrent requests, the connection pool saturated and the site ground to a halt.
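Grouping the running statements makes the duplication obvious at a glance instead of scrolling through the raw processlist. A minimal sketch, assuming the processlist Info column has been dumped one statement per line to a file — the file name and sample contents below are illustrative, not the real incident data:

```shell
# Fabricated sample of a dumped processlist Info column.
cat > /tmp/processlist.txt <<'EOF'
SELECT * FROM search_index WHERE word LIKE '%ipl%' ORDER BY score DESC
SELECT * FROM search_index WHERE word LIKE '%cup%' ORDER BY score DESC
SELECT * FROM search_index WHERE word LIKE '%ipl%' ORDER BY score DESC
EOF

# Count how many copies of each statement are running, busiest first.
awk '{count[$0]++} END {for (q in count) printf "%d  %s\n", count[q], q}' \
  /tmp/processlist.txt | sort -rn
```

The same idea works against information_schema.processlist if you prefer to aggregate inside MariaDB itself.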

Diagnosing the Source

The next step was identifying who was making these requests. A quick scan of Nginx access logs made it obvious:

grep "search" /var/log/nginx/access.log | awk '{print $1}' | sort | uniq -c | sort -rn | head -20

The top offenders were IP ranges associated with known crawlers — not Googlebot, not Bingbot, but aggressive scrapers and content aggregators cycling through search queries systematically. One IP had made over 12,000 search requests in a single hour.
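Confirming that the traffic was automated also meant checking the user agents behind the top IPs, since scrapers often rotate search terms while keeping a distinctive agent string. A sketch of the same log-mining approach — the log excerpt, IP, and agent name below are made up for illustration:

```shell
# Fabricated access-log excerpt in Nginx combined log format.
cat > /tmp/access_sample.log <<'EOF'
203.0.113.7 - - [14/Apr/2026:10:01:01 +0000] "GET /search?q=ipl HTTP/1.1" 200 512 "-" "ScraperBot/2.1"
203.0.113.7 - - [14/Apr/2026:10:01:02 +0000] "GET /search?q=cup HTTP/1.1" 200 498 "-" "ScraperBot/2.1"
198.51.100.9 - - [14/Apr/2026:10:01:03 +0000] "GET /search?q=news HTTP/1.1" 200 733 "-" "Mozilla/5.0"
EOF

# For one suspect IP, list the distinct user agents it presented.
grep '^203.0.113.7 ' /tmp/access_sample.log \
  | awk -F'"' '{print $6}' | sort | uniq -c | sort -rn
```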

The core problem was that Drupal's search results were not being cached at all. Every query — regardless of how often it was repeated — went straight to the database.

What We Changed

The remediation had three layers.

Layer 1: Cloudflare WAF Rules

At the edge, we wrote a Cloudflare firewall rule to challenge or block requests to /search from clients that Cloudflare's bot management did not score as likely human, combined with a rate limit of 10 requests per minute per IP:

(http.request.uri.path contains "/search" and not cf.bot_management.score > 30)

We also enabled Bot Fight Mode and tuned the sensitivity so that legitimate users were unaffected while scrapers were served a JavaScript challenge before being allowed through. This alone dropped the search traffic volume by roughly 85% within an hour.
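For defense in depth, the same rate limit can also be enforced at the origin, so the site stays protected if the edge rules are ever bypassed. A sketch using Nginx's limit_req module, mirroring the 10 requests per minute per IP limit — the zone name, burst value, and try_files target are illustrative choices, not our exact config:

```nginx
# In the http block: a 10 requests/minute/IP bucket keyed on client address.
limit_req_zone $binary_remote_addr zone=search_rl:10m rate=10r/m;

server {
    # ...

    location /search {
        # Allow a small burst, reject the rest with 429 instead of queueing.
        limit_req zone=search_rl burst=5 nodelay;
        limit_req_status 429;

        # Hand the request to Drupal as usual (placeholder rewrite).
        try_files $uri /index.php?$query_string;
    }
}
```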

Layer 2: Drupal Search Caching

Drupal core's search module does not cache results across requests, so the same query re-executes against the database every time it is submitted. We addressed this at two levels.

First, we implemented a custom module that wraps search results in a cache bin with a reasonable TTL:

/**
 * Implements hook_search_results_alter().
 *
 * Serves repeated searches from the 'search' cache bin with a short TTL
 * instead of re-running the database query for every request.
 */
function mymodule_search_results_alter(&$results, $keys) {
  $cid = 'search_results:' . md5($keys);
  $cache = \Drupal::cache('search')->get($cid);
  if ($cache) {
    // Cache hit: reuse the stored result set and skip the rebuild.
    $results = $cache->data;
    return;
  }
  // Cache miss: store the freshly built results for five minutes.
  // The third argument to set() is an absolute expiry timestamp.
  \Drupal::cache('search')->set($cid, $results, time() + 300);
}

Second, we moved the search cache bin to Redis so it survived across PHP-FPM worker restarts and could be shared across multiple instances if needed.
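Pointing a single cache bin at Redis is a settings-level change when the Redis contrib module is installed. A sketch of what the relevant settings.php lines might look like — the host and PhpRedis interface are assumptions for a typical single-server setup:

```php
<?php
// settings.php — assumes the Redis contrib module is installed and enabled.

// Connect via the PhpRedis extension to a local Redis instance.
$settings['redis.connection']['interface'] = 'PhpRedis';
$settings['redis.connection']['host'] = '127.0.0.1';

// Route only the 'search' bin to Redis; other bins keep their backends.
$settings['cache']['bins']['search'] = 'cache.backend.redis';
```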

Layer 3: Database Query Optimisation

Even with caching and edge rules in place, we needed the underlying queries to be fast for legitimate uncached searches. The search_index table was missing a composite index that Drupal's search queries needed:

ALTER TABLE search_index ADD INDEX word_sid (word, sid);

This brought individual search query response times from 800ms–4s down to under 40ms.

The Result

Within two hours of applying all three layers, server load dropped from 40+ to under 2. MariaDB connections normalised. No further incidents of this type in the months following.

The Broader Lesson

The incident exposed a gap that is surprisingly common on high-traffic Drupal sites: the assumption that because page caching is in place, all database load is covered. It is not. Any dynamic endpoint — search, views with exposed filters, user-specific content — bypasses the page cache entirely and hits the database directly.

If your Drupal site has any form of public search, the checklist should be: rate-limit at the edge, cache results at the application layer, and ensure the underlying queries have proper indexes. Any one of these alone is insufficient. All three together make the site resilient.
