DevHeads.net

Blocking particular URL/file patterns

apache 2.4.39
linux 4.12.14-lp151.28.7-default x86_64

Our site has been beset with numerous search engine queries for URLs that
have *never* existed on the site. They have the form:

/condalia1398.xml.gz
/heling348628-h1819-746-be2dochmiacal-97a2-/6a465d7hll78i1/

where the digits are randomly changed. The search bots of Google and
Bing are the most prevalent, producing 1000s of 404s per day. Not a
particular CPU burden, to be sure. Annoying nevertheless.

The following blocks the bots, but it blocks legitimate requests as well:
# googlebot
deny from 66.249.0.0/16
# bingbot
deny from 157.55.0.0/16
deny from 40.77.167.0/24
deny from 207.46.13.0/24

Is there a way to write a filter that blocks the above URL patterns
without generating a 404 response?
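
One possible approach, offered as a sketch rather than a tested answer: mod_alias can return 410 (Gone) for any request whose path matches a regular expression, so the bogus URLs get a 410 instead of a 404. The two regular expressions below are assumptions generalized from the two sample URLs above and would need checking against the real access log. The first one would also catch a legitimate file such as /sitemap1.xml.gz if one existed at the site root.

```apache
# Sketch only: both patterns are guesses derived from the two sample URLs.
# Intended for the server or virtual host configuration; requires mod_alias.

# Matches paths like /condalia1398.xml.gz (letters, digits, then .xml.gz)
RedirectMatch gone "^/[a-z]+[0-9]+\.xml\.gz$"

# Matches paths like /heling348628-h1819-746-be2dochmiacal-97a2-/6a465d7hll78i1/
RedirectMatch gone "^/[a-z]+[0-9]+-[a-z0-9-]+-/[a-z0-9]+/$"
```

A 410 tells a crawler that the resource is permanently gone. Whether Google or Bing then stop requesting these URLs is not guaranteed. Writing 403 in place of gone would refuse the requests instead.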

Comments

Re: Blocking particular URL/file patterns

By LuKreme at 07/02/2019 - 19:36

On 2 Jul 2019, at 14:16, James Moe <jimoe@sohnen-moe.com.INVALID> wrote:

Have you looked into robots.txt? And a sitemap?