mirror of
https://github.com/searxng/searxng.git
synced 2025-07-22 20:59:16 +02:00
[mod] limiter -> botdetection: modularization and documentation
In order to be able to meet the outstanding requirements, the implementation is
modularized and supplemented with documentation.

This patch does not contain any functional change, except that it fixes issue #2455.

----

Activate the limiter in the settings.yml and simulate a bot request with::

    curl -H 'Accept-Language: de-DE,en-US;q=0.7,en;q=0.3' \
         -H 'Accept: text/html' -H 'User-Agent: xyz' \
         -H 'Accept-Encoding: gzip' \
         'http://127.0.0.1:8888/search?q=foo'

In the log::

    DEBUG searx.botdetection.link_token : missing ping for this request: .....

Since ``BURST_MAX_SUSPICIOUS = 2``, you can repeat the query above two times
before you get a "Too Many Requests" response.

Closes: https://github.com/searxng/searxng/issues/2455
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
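For readers who prefer to script the test, the curl call above can be sketched in Python using only the standard library. The URL and headers are copied from the commit message; the helper names (`build_request`, `fetch_status`, `main`) are made up for this illustration, and the exact request on which the limiter starts answering 429 is not asserted here.

```python
import urllib.error
import urllib.request

# Same URL and headers as the curl command in the commit message.
URL = "http://127.0.0.1:8888/search?q=foo"
HEADERS = {
    "Accept-Language": "de-DE,en-US;q=0.7,en;q=0.3",
    "Accept": "text/html",
    "User-Agent": "xyz",
    "Accept-Encoding": "gzip",
}


def build_request(url: str = URL) -> urllib.request.Request:
    """Build the GET request without sending it."""
    return urllib.request.Request(url, headers=HEADERS)


def fetch_status(req: urllib.request.Request) -> int:
    """Send the request and return the HTTP status code."""
    try:
        with urllib.request.urlopen(req, timeout=5) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        # urlopen raises on 4xx/5xx; the status code is carried by the exception.
        return exc.code


def main(attempts: int = 3) -> None:
    """Repeat the query and print the status code of each attempt."""
    for attempt in range(1, attempts + 1):
        print(attempt, fetch_status(build_request()))
```

Calling `main()` against a local instance with the limiter enabled should show the status code change once the burst limit described above is exceeded.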
parent 5226044c13
commit 1ec325adcc
15 changed files with 541 additions and 161 deletions
79
searx/botdetection/limiter.py
Normal file
@@ -0,0 +1,79 @@
# SPDX-License-Identifier: AGPL-3.0-or-later
# lint: pylint
""".. _limiter src:

Limiter
=======

.. sidebar:: info

   The limiter requires a :ref:`Redis <settings redis>` database.

Bot protection / IP rate limitation.  The intention of rate limitation is to
limit suspicious requests from an IP.  The motivation behind this is the fact
that SearXNG passes through requests from bots and is thus classified as a bot
itself.  As a result, the SearXNG engine then receives a CAPTCHA or is blocked
by the search engine (the origin) in some other way.

To avoid being blocked, requests from bots to SearXNG must be blocked as well;
this is the task of the limiter.  To perform this task, the limiter uses the
methods from :py:obj:`searx.botdetection`.

To enable the limiter, activate:

.. code:: yaml

   server:
     ...
     limiter: true  # rate limit the number of requests on the instance, block some bots

and set the Redis connection URL.  Check the value; it depends on your Redis DB
(see :ref:`settings redis`), for example:

.. code:: yaml

   redis:
     url: unix:///usr/local/searxng-redis/run/redis.sock?db=0

"""

from typing import Optional, Tuple
import flask

from searx.botdetection import (
    http_accept,
    http_accept_encoding,
    http_accept_language,
    http_connection,
    http_user_agent,
    ip_limit,
)


def filter_request(request: flask.Request) -> Optional[Tuple[int, str]]:

    # The health check endpoint is exempt from all checks.
    if request.path == '/healthz':
        return None

    # Checks applied to every other path.
    for func in [
        http_user_agent,
    ]:
        val = func.filter_request(request)
        if val is not None:
            return val

    # Additional checks applied to search requests only.
    if request.path == '/search':

        for func in [
            http_accept,
            http_accept_encoding,
            http_accept_language,
            http_connection,
            http_user_agent,
            ip_limit,
        ]:
            val = func.filter_request(request)
            # The first filter that returns a value ends the chain.
            if val is not None:
                return val

    return None
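The control flow of `filter_request` above (skip `/healthz`, run one set of checks on every path, run further checks on `/search`, stop at the first result that is not `None`) can be tried in isolation with a minimal sketch. Everything below is hypothetical: `FakeRequest` stands in for `flask.Request`, and the two check functions are placeholders that do not reproduce the logic of the `searx.botdetection` modules.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple

FilterResult = Optional[Tuple[int, str]]


@dataclass
class FakeRequest:
    """Hypothetical stand-in for flask.Request: only path and headers."""
    path: str
    headers: Dict[str, str] = field(default_factory=dict)


def reject_empty_user_agent(request: FakeRequest) -> FilterResult:
    # Placeholder check, NOT the logic of searx.botdetection.http_user_agent.
    if not request.headers.get("User-Agent"):
        return 429, "missing User-Agent"
    return None


def require_accept_language(request: FakeRequest) -> FilterResult:
    # Placeholder check, NOT the logic of searx.botdetection.http_accept_language.
    if "Accept-Language" not in request.headers:
        return 429, "missing Accept-Language"
    return None


# Checks for every path except /healthz, and extra checks for /search.
ALWAYS: List[Callable[[FakeRequest], FilterResult]] = [reject_empty_user_agent]
SEARCH_ONLY: List[Callable[[FakeRequest], FilterResult]] = [require_accept_language]


def run_filters(request: FakeRequest) -> FilterResult:
    """Return None to let the request pass, else the first filter result."""
    if request.path == "/healthz":
        return None
    checks = list(ALWAYS)
    if request.path == "/search":
        checks += SEARCH_ONLY
    for check in checks:
        val = check(request)
        if val is not None:
            return val
    return None
```

Keeping each check as a function that returns either `None` or a result tuple means new checks can be added to a list without touching the loop.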