The previous implementation for determining the description of an engine did not
take into account that the UI languages can also have a region tag and/or a
script tag:
el-GR: Ελληνικά, Ελλάδα (Greek, Greece)
fa-IR: فارسی, ایران (Persian, Iran)
nb-NO: Norsk bokmål, Norge (Norwegian bokmål, Norway)
nl-BE: Nederlands, België (Dutch, Belgium)
pt-BR: Português, Brasil (Portuguese, Brazil)
zh-HK: 中文, 中國香港特別行政區 (Chinese, Hong Kong SAR China)
zh-Hans-CN: 中文, 中国 (Chinese, China)
zh-Hant-TW: 中文, 台灣 (Chinese, Taiwan)
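
To illustrate the tag shapes listed above, here is a minimal sketch that splits
a UI language tag into language, script and region. The helper name
`split_locale_tag` is hypothetical and is not the code from this patch; it
assumes tags of the form `lang`, `lang-REGION` or `lang-Script-REGION`.

```python
def split_locale_tag(tag: str) -> dict:
    """Split a tag such as 'zh-Hans-CN' into language, script and region.

    Hypothetical helper for illustration only.
    """
    parts = tag.replace("_", "-").split("-")
    result = {"language": parts[0].lower(), "script": None, "region": None}
    for part in parts[1:]:
        if len(part) == 4 and part.isalpha():
            # four letters: script subtag, e.g. Hans / Hant
            result["script"] = part.title()
        elif (len(part) == 2 and part.isalpha()) or (len(part) == 3 and part.isdigit()):
            # two letters or three digits: region subtag, e.g. CN / 419
            result["region"] = part.upper()
    return result


print(split_locale_tag("zh-Hans-CN"))
print(split_locale_tag("pt-BR"))
```

A lookup that only handles the bare language part would miss the script and
region of tags like `zh-Hant-TW`.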
Closes: https://github.com/searxng/searxng/issues/4842
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
Sometimes (e.g. when ddg does not have a result item) there is no content and
the engine will fail with an IndexError:
* Error: IndexError
* Percentage: 10
* Parameters: `()`
* File name: `searx/engines/duckduckgo.py:375`
* Function: `response`
* Code: `item["content"] = extract_text(eval_xpath(div_result, './/a[contains(@class, "result__snippet")]')[0])`
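
The failure mode is indexing `[0]` into an empty XPath result. A minimal sketch
of the guard, using a plain list as a stand-in for the XPath result; the helper
`first_or_none` is hypothetical and not necessarily how the engine was fixed:

```python
def first_or_none(items):
    """Return the first element, or None when the sequence is empty."""
    return items[0] if items else None


# What an XPath query yields when no snippet element matches.
snippets = []

node = first_or_none(snippets)
# Fall back to an empty content string instead of raising IndexError.
content = node.strip() if node is not None else ""
```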
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
Changes:
- Add trailing slash to base URL to prevent potential redirects
- Remove advanced search syntax filtering (such queries no longer guarantee a
  CAPTCHA)
- Correct pagination offset calculation: page 2 now starts at offset 10, and
  page n (n > 2) at 10 + (n-2)*15, replacing the previous broken
  20 + (n-2)*50 calculation that caused CAPTCHAs
- Restructure request parameter building to better match a real request
- "kt" cookie is no longer an empty string if the language/region is "all"
- Group related parameter assignments together
- Add header logging to debugging output
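
The corrected offset rule from the list above, as a small sketch. The function
name `result_offset` is hypothetical, and treating page 1 as offset 0 is an
assumption (the commit only specifies page 2 and later):

```python
def result_offset(pageno: int) -> int:
    """Offset of the first result on page `pageno` (1-based)."""
    if pageno < 1:
        raise ValueError("pageno must be >= 1")
    if pageno == 1:
        # Assumption: the first page is requested without an offset.
        return 0
    # Page 2 starts at 10, each later page advances by 15.
    return 10 + (pageno - 2) * 15


for page in (2, 3, 4):
    print(page, result_offset(page))
```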
Related:
- https://github.com/searxng/searxng/issues/4824
There is currently no known working z-library, and all known URLs are dead [1]. To avoid
interrupting automated updates, a connection error to a z-library is treated as
a *known error*, and the old properties of the z-library are retained.
[1] https://github.com/searxng/searxng/issues/3610
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
In the previous implementation, all databases were loaded into memory when
importing the searx.data package, regardless of whether they were ever needed.
Independently of that, it is an antipattern to load entire databases into memory
when importing a package or module; databases should be loaded when needed.
Lazy loading is a first step toward improving memory usage and also improves
performance when setting up the runtime environment. Building on this,
subsequent PRs will be able to further optimize memory behavior, e.g., by using
a real database application such as the one already available via
searx.cache.ExpireCache.
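
A minimal sketch of the lazy-loading idea, assuming a loader callable that
reads one database. The class `LazyData` and its attributes are hypothetical
and do not reflect the actual layout of the searx.data package:

```python
from functools import cached_property


class LazyData:
    """Defer loading a database until it is first accessed."""

    def __init__(self, loader):
        self._loader = loader
        self.load_count = 0  # only here to make the behaviour observable

    @cached_property
    def data(self):
        # Runs on first access only; the result is cached on the instance.
        self.load_count += 1
        return self._loader()


# Creating the object is cheap: nothing is loaded at "import time".
currencies = LazyData(lambda: {"EUR": "Euro"})
loads_before = currencies.load_count

# The first access triggers the load, later accesses reuse the cached value.
name = currencies.data["EUR"]
name_again = currencies.data["EUR"]
```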
Related:
- https://github.com/searxng/searxng/discussions/1892
- https://github.com/searxng/searxng/pull/3458
- https://github.com/searxng/searxng/pull/4650
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
- Apparently the API now requires an `X-Pinterest-PWS-Handler` header in order
  to function properly (extracted from their web UI)
- The other `X-Pinterest` headers are added in case they become mandatory
  too
Closes: https://github.com/searxng/searxng/issues/4812
An icons category makes sense because it makes it possible to quickly search
for free SVG icons for websites and other designs with a short `!icons` query.
Icons don't fit well into the normal images category because they are quite a
special type of image.
The global variable CACHE is not initialized when DDG images or DDG videos
import the get_vqd() function (please remember: the engine modules are imported
using the importlib method and not via the `import` keyword).
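
One common way to avoid depending on initialization order is to create the
cache on first use. The sketch below illustrates that pattern only; the names
`get_cache` and `lookup`, and the dict used as a cache, are hypothetical
stand-ins and not the engine's actual code:

```python
_CACHE = None  # not initialized at import time


def get_cache():
    """Return the module-level cache, creating it on first use."""
    global _CACHE
    if _CACHE is None:
        _CACHE = {}  # stand-in for the real cache object
    return _CACHE


def lookup(key, fetch):
    """Return the cached value for `key`, calling `fetch` only on a miss."""
    cache = get_cache()
    if key not in cache:
        cache[key] = fetch(key)
    return cache[key]
```

Callers that import `lookup` from another module never touch `_CACHE`
directly, so it does not matter which module was set up first.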
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
container.yml will run after integration.yml completes successfully, and only on the master branch.
Style changes, cleanup, and improved CI integration by sharing a cache between
all workflows.
* Podman is now supported for building the container images (Docker also received a refactor, merging both build and buildx).
* Container images are now built by Buildah instead of Docker BuildKit.
* Container images are tested before release.
* "Modern" (amd64 & arm64) and "legacy" (armv7) architectures are split into separate Dockerfiles, allowing future optimizations.