Apparently the layout of https://annas-archive.org has changed, making changes necessary.
The issue has been reported in #5146, see there for more details.
- closes#5146
The plugin uses the ``GeoLocation`` class, which is already implemented in the
context of weather forecasts, to determine the time zone. The ``DateTime`` class
is used for the localized display of date and time.
Co-authored-by: Markus Heiser <markus.heiser@darmarit.de>
TL;DR; For all the issues that comes with HTTP POST I recommend instance
maintainers to switch to GET and lock the property in the preferences:
```yaml
server:
method: GET
preferences:
lock:
- method
```
We don't want this in the defaults of the SearXNG distributions for the pros vs
cons listed in this discussion:
- https://github.com/searxng/searxng/pull/3619
- Adds a new engine `searx/engines/openalex.py` that integrates the OpenAlex
Works API to return scientific paper results using the `paper.html` template.
- Uses the official API (no auth required); supports OpenAlex polite pool via `mailto`.
- Adds developer docs at `docs/dev/engines/online/openalex.rst`.
OpenAlex API reference: https://docs.openalex.org/how-to-use-the-api/api-overview
To avoid an `unsafe-inline` in the CSP header, the JS code must be moved to the
client side [1].
The `<script>` tag at the end of the HTML originates from the old implementation
of the JS client. Since PR-5073 [2] was merged, the `type` is now `module`, and
the tag must be moved to the beginning of the HTML.
> We need to inline this "JS is enabled?" thing to prevent layout shifts and
> temporary "no JS enabled" visuals as ESM scripts loads and evals everything
> deferred from initial DOM render [3]
That's true in theory, but in practice, this effect is unnoticeable because it's
masked by another effect (which we can't avoid): If we load the page with a
severely throttled connection, the HTML (result list) takes a long time to
load. Then the CSS is loaded, which also takes longer. Until the CSS has loaded,
there's no layout. A layout shift is therefore largely determined by the loading
of the HTML and CSS itself.
The running times of the ESM script can be neglected compared to the loading
times of HTML & CSS.
[1] https://github.com/searxng/searxng-docker/pull/424#issuecomment-3199494256
[2] https://github.com/searxng/searxng/pull/5073
[3] https://github.com/searxng/searxng-docker/pull/424#issuecomment-3199622504
- the previous CDN icon list url no longer works
- a list of all icons is mirrored to the JSDelivr CDN however
- there's no reason to set the engine to inactive now that we use public CDNs
This patch adds GitHub Code Search [1] engine to allow querying the codebases.
Template code.html is changed to allow passthrough of strip and highlighting
options.
Engine Searchcode is adjusted to pass filename and not rely on hardcoded
extensions.
GitHub search code API does not return the exact code line indices, this
implementation assigns the code arbitrary numbers starting from 1
(effectively relabeling the code).
The API allows for unauth calls, and the default engine settings default to
that, although the calls are heavily rate limited.
The 'text' lexer is the default pygments lexer when parsing fails.
[1] https://docs.github.com/en/rest/search/search?apiVersion=2022-11-28#search-code
Co-authored-by: Markus Heiser <markus.heiser@darmarIT.de>
pyrightconfig.json :
for the paths searx, searxng_extra and tests, individual rules were
defined (for example, in test fewer / different rules are needed than in the
searx package
searx/engines/__builtins__.pyi :
The builtin types that are added to the global namespace of a module by the
intended monkey patching of the engine modules / replaces the previous
filtering of the stdout using grep.
test.pyright_modified (utils/lib_sxng_test.sh) :
static type check of local modified files not yet commited
make test :
prerequisite 'test.pyright' has been replaced by 'test.pyright_modified'
searx/engines/__init__.py, searx/enginelib/__init__.py :
First, minimal typifications that were considered necessary.
TypeScript is a superset of JavaScript, converting the entire theme to
TypeScript allows us to receive much more feedback on possible issues made in
package updates or our own typos, furthermore, it allows to transpile properly
to lower specs. This PR couldn't be done in smaller commits, a lot of work
needed to make everything *work properly*:
- A browser baseline has been set that requires minimum **Chromium 93, Firefox
92 and Safari 15** (proper visuals/operation on older browser versions is not
guaranteed)
- LightningCSS now handles minification and prefix creation for CSS.
- All hardcoded polyfills and support for previous browser baseline versions
have been removed.
- Convert codebase to TypeScript.
- Convert IIFE to ESM, handling globals with IIFE is cumbersome, ESM is the
standard for virtually any use of JS nowadays.
- Vite now builds the theme without the need for `vite-plugin-static-copy`.
- `searxng.ready` now accepts an array of conditions for the callback to be
executed.
- Replace `leaflet` with `ol` as there were some issues with proper Vite
bundling.
- Merged `head` with `main` script, as head was too small now.
- Add `assertElement` to properly check the existence of critical DOM elements.
- `searxng.on` renamed to `searxng.listen` with some handling improvements.
FIxes publishedDate format in reuters engine to encompass ISO 8601 times both with and without milliseconds.
Why is this change important?
Previously, the engine would sometimes fail saying:
2025-08-12 21:13:23,091 ERROR:searx.engines.reuters: exception : time data '2024-04-15T19:08:30.833Z' does not match format '%Y-%m-%dT%H:%M:%SZ'
Traceback (most recent call last):
...
File "/usr/local/searxng/searx/engines/reuters.py", line 87, in response
publishedDate=datetime.strptime(result["display_time"], "%Y-%m-%dT%H:%M:%SZ"),
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
...
Note that most queries seem to work with Reuters, but there are some results that have the additional milliseconds and fail. Regardless, the change is backwards compatible as both the formats (with and without the ms) should now parse correctly.
Add Baidu Captcha detection to reduce `JSONDecodeError` error
Baidu will redirect to `wappass.baidu.com` and return a captcha challenge.
Current behavior will get the data from `wappass.baidu.com` then return a
`json.decoder.JSONDecodeError` error.
The HTTP X-Forwarded-Proto (XFP) request header is a *de-facto* standard header
for identifying the protocol (HTTP or HTTPS) that a client used to connect to a
proxy or load balancer.[1]
The ``X-Scheme`` header was added 10 years ago, why ``X-Scheme`` was used back
then and not ``X-Forwarded-Proto``, nobody knows today / possibly because
``X-Forwarded-Proto`` wasn't a *de-facto* standard back then.
[1] https://developer.mozilla.org/en-US/docs/Web/HTTP/Reference/Headers/X-Forwarded-Proto
[2] 6ef7c3276
Sometimes, there's only an `adaptivestreaming` field in `streams`, which
is usually an m3u8 file. That's however not supported by the video player
of any browser, so we can't use and must build a different url instead.
WhiteNoise requires all headers to be strings, however it's common to use
other primitive types (e.g. numbers) in the header, e.g. `X-XSS-Protection: 0`.
Thus, we must convert all types of values (i.e. numbers) to strings.
- closes https://github.com/searxng/searxng/issues/5091
Building the container currently does not work properly.
When rebuilding several times with `make container`, `version_frozen.py`
is recreated, which wouldn't be an issue if the file’s timestamp was constant.
Now, when creating `version_frozen.py`, it will have the same timestamp as the
commit when it was created. (`version_frozen.py` is moved to a dedicated layer).
Reusing "builder" cache when building "dist" could be slow
(CD reports 2 seconds, but locally I've seen it take up to 10 seconds),
so the Dockerfile is now split and we save a couple steps
by importing the "builder" image directly.
The last changes made it possible to remove the layer cache in "builder",
since the overhead is now greater than building the layers from scratch.
Until now, all "dist" layers were squashed into a single layer,
which in most cases is a good idea
(except for storage/delivery pricing/overhead), but in our case,
since we manage the entire pipeline, we can ignore this
and share layers between builds.
This means (for example) that if we change files unrelated to the container
in several consecutive commits (documentation changes), we don't have to push
the entire image to registry, but only the different layers
(`version_frozen.py` in this example).
The same applies when pulling, as only the layers that have changed
compared to the local layers will be downloaded (that's the theory,
we'll see if this works as expected or if we need to tweak something else).
- not 100% sure about the condition code mapping, there are no real matches for most of the codes from Apple WeatherKit to the weather codes we have in SearXNG
- related: https://github.com/searxng/searxng/issues/4885
* [fix] CI task "update_engine_traits.py" fails
To catch all problems with an HTTP request, the more general class
``httpx.HTTPError`` must be caught, for your test use::
$ ./manage dev.env
$ python ./searxng_extra/update/update_engine_traits.py
Closes: https://github.com/searxng/searxng/issues/5068
* [data] update searx.data - update_engine_traits.py