[mod] engines: migration of the individual cache solutions to EngineCache

The EngineCache class replaces the individual cache solutions that the engines
previously implemented on their own.  The following engine modules are migrated:

- demo_offline.py
- duckduckgo.py
- radio_browser.py
- soundcloud.py
- startpage.py
- wolframalpha_api.py
- wolframalpha_noapi.py
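
The pattern shared by the migrated engines can be sketched as follows.  This is
a minimal sketch, not the SearXNG implementation: ``InMemoryCache`` is a
hypothetical in-memory stand-in that only mimics the ``get(key, default)`` and
``set(key, value, expire)`` calls visible in the diff of this commit, and
``fetch_servers`` is a made-up placeholder for an expensive lookup.

```python
import time


class InMemoryCache:
    """Hypothetical stand-in for EngineCache (no SQLite, no encryption)."""

    def __init__(self, name):
        self.name = name
        self._data = {}  # key -> (value, absolute expiry timestamp)

    def get(self, key, default=None):
        item = self._data.get(key)
        if item is None:
            return default
        value, expires_at = item
        if time.time() >= expires_at:
            # Entry is stale: drop it and behave as if it was never stored.
            del self._data[key]
            return default
        return value

    def set(self, key, value, expire):
        # ``expire`` is a lifetime in seconds, counted from now.
        self._data[key] = (value, time.time() + expire)


CACHE = InMemoryCache("radio_browser")
calls = []  # records how often the slow lookup actually ran


def fetch_servers():
    """Placeholder for a slow lookup (the host name is made up)."""
    calls.append(1)
    return ["https://example.invalid"]


def server_list():
    servers = CACHE.get("servers", [])
    if servers:
        return servers  # cache hit: skip the slow lookup
    servers = fetch_servers()
    CACHE.set(key="servers", value=servers, expire=60 * 60 * 24)
    return servers


first = server_list()
second = server_list()
```

The second call is served from the cache, so the slow lookup runs only once
until the entry expires.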

Search term to test most of the modified engines::

    !ddg !rb !sc !sp !wa test

    !ddg !rb !sc !sp !wa foo

To inspect the DB, enter the developer environment and run the command that
shows the cache state::

    $ ./manage pyenv.cmd bash --norc --noprofile
    (py3) python -m searx.enginelib cache state

    cache tables and key/values
    ===========================
    [demo_offline        ] 2025-04-22 11:32:50 count        --> (int) 4
    [startpage           ] 2025-04-22 12:32:30 SC_CODE      --> (str) fSOBnhEMlDfE20
    [duckduckgo          ] 2025-04-22 12:32:31 4dff493e.... --> (str) 4-128634958369380006627592672385352473325
    [duckduckgo          ] 2025-04-22 12:40:06 3e2583e2.... --> (str) 4-263126175288871260472289814259666848451
    [radio_browser       ] 2025-04-23 11:33:08 servers      --> (list) ['https://de2.api.radio-browser.info',  ...]
    [soundcloud          ] 2025-04-29 11:40:06 guest_client_id --> (str) EjkRJG0BLNEZquRiPZYdNtJdyGtTuHdp
    [wolframalpha        ] 2025-04-22 12:40:06 code         --> (str) 5aa79f86205ad26188e0e26e28fb7ae7
    number of tables: 6
    number of key/value pairs: 7

In the "cache tables and key/values" section, each row shows the table name
(the engine name) first, then the calculated expiry date, and finally the key
and its value.

About duckduckgo: The *vqd code* of ddg depends on the query term, so the key
is a hash value of the query term (to avoid storing the raw query term).
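
The idea of keying by a hash instead of the raw query can be sketched like
this.  It is only an illustration: ``cache_key`` is a hypothetical helper, and
plain SHA-256 is an assumption; the hash actually used for the duckduckgo keys
may differ (for example, it may be keyed with a secret).

```python
import hashlib


def cache_key(query: str) -> str:
    """Derive a cache key from the query without storing the query itself."""
    return hashlib.sha256(query.encode("utf-8")).hexdigest()


key_test = cache_key("test")
key_foo = cache_key("foo")
```

The same query always yields the same key, so a cached value can be found
again, while the key itself does not contain the query text.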

In the "properties of ENGINES_CACHE" section, all properties of the SQLiteAppl /
ExpireCache are shown together with their last modification date::

    properties of ENGINES_CACHE
    ===========================
    [last modified: 2025-04-22 11:32:27] DB_SCHEMA           : 1
    [last modified: 2025-04-22 11:32:27] LAST_MAINTENANCE    :
    [last modified: 2025-04-22 11:32:27] crypt_hash          : ca612e3566fdfd7cf7efe2b1c9349f461158d07cb78a3750e5c5be686aa8ebdc
    [last modified: 2025-04-22 11:32:30] CACHE-TABLE--demo_offline: demo_offline
    [last modified: 2025-04-22 11:32:30] CACHE-TABLE--startpage: startpage
    [last modified: 2025-04-22 11:32:31] CACHE-TABLE--duckduckgo: duckduckgo
    [last modified: 2025-04-22 11:33:08] CACHE-TABLE--radio_browser: radio_browser
    [last modified: 2025-04-22 11:40:06] CACHE-TABLE--soundcloud: soundcloud
    [last modified: 2025-04-22 11:40:06] CACHE-TABLE--wolframalpha: wolframalpha

These properties describe the state of the ExpireCache and control its
behavior.  For example, the maintenance interval is derived from the last
modification date of the LAST_MAINTENANCE property, and the hash value of the
password is used to detect whether the password has changed (in that case the
DB entries can no longer be decrypted and the entire cache must be discarded).
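
The two checks described above can be sketched as follows.  This is a
simplified illustration under stated assumptions, not the ExpireCache code:
the function names, the ``MAINTENANCE_PERIOD`` value and the use of plain
SHA-256 for the password hash are all hypothetical.

```python
import hashlib
import time

MAINTENANCE_PERIOD = 60 * 60  # hypothetical interval in seconds


def password_hash(password: str) -> str:
    """Hash of the password (SHA-256 is an assumption for this sketch)."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


def password_changed(properties: dict, password: str) -> bool:
    """True if a stored hash exists and does not match the current password."""
    stored = properties.get("crypt_hash")
    return stored is not None and stored != password_hash(password)


def maintenance_due(last_modified: float, now: float) -> bool:
    """True if the last maintenance run is at least one period old."""
    return now - last_modified >= MAINTENANCE_PERIOD


properties = {"crypt_hash": password_hash("old secret")}
now = time.time()
```

When ``password_changed`` reports a mismatch, the cached entries would have to
be discarded, since they can no longer be decrypted.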

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
Markus Heiser 2025-04-21 14:17:49 +02:00 committed by Markus Heiser
parent 4cbfba9d7b
commit bdfe1c2a15
7 changed files with 169 additions and 103 deletions


@@ -5,7 +5,9 @@
https://de1.api.radio-browser.info/#Advanced_station_search
"""
from __future__ import annotations
import typing
import random
import socket
from urllib.parse import urlencode
@@ -13,9 +15,15 @@ import babel
from flask_babel import gettext
from searx.network import get
from searx.enginelib import EngineCache
from searx.enginelib.traits import EngineTraits
from searx.locales import language_tag
if typing.TYPE_CHECKING:
    import logging
    logger = logging.getLogger()
traits: EngineTraits
about = {
@@ -52,11 +60,24 @@ none filters are applied. Valid filters are:
"""
servers = []
CACHE: EngineCache
"""Persistent (SQLite) key/value cache that deletes its values after ``expire``
seconds."""

def init(_):
    # see https://api.radio-browser.info
    global CACHE  # pylint: disable=global-statement
    CACHE = EngineCache("radio_browser")
    server_list()

def server_list() -> list[str]:
    servers = CACHE.get("servers", [])
    if servers:
        return servers
    # hint: can take up to 40sec!
    ips = socket.getaddrinfo("all.api.radio-browser.info", 80, 0, 0, socket.IPPROTO_TCP)
    for ip_tuple in ips:
        _ip: str = ip_tuple[4][0]  # type: ignore
@@ -65,8 +86,22 @@ def init(_):
        if srv not in servers:
            servers.append(srv)
    # update server list once in 24h
    CACHE.set(key="servers", value=servers, expire=60 * 60 * 24)
    return servers

def request(query, params):
    servers = server_list()
    if not servers:
        logger.error("Fetched server list is empty!")
        params["url"] = None
        return
    server = random.choice(servers)
    args = {
        'name': query,
        'order': 'votes',
@@ -87,8 +122,7 @@ def request(query, params):
if countrycode in traits.custom['countrycodes']: # type: ignore
args['countrycode'] = countrycode
-    params['url'] = f"{random.choice(servers)}/json/stations/search?{urlencode(args)}"
-    return params
+    params['url'] = f"{server}/json/stations/search?{urlencode(args)}"
def response(resp):
@@ -154,8 +188,9 @@ def fetch_traits(engine_traits: EngineTraits):
babel_reg_list = get_global("territory_languages").keys()
-    language_list = get(f'{servers[0]}/json/languages').json()  # type: ignore
-    country_list = get(f'{servers[0]}/json/countries').json()  # type: ignore
+    server = server_list()[0]
+    language_list = get(f'{server}/json/languages').json()  # type: ignore
+    country_list = get(f'{server}/json/countries').json()  # type: ignore
for lang in language_list: