[fix] update wikidata units - remove URL prefix from Q-name

Sometimes the URL prefix switches from http to https. This patch hardens the
code that removes the URL prefix from the wikidata Q-name. The issue was
reported in [1].
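
The difference between the two approaches can be sketched as follows. This is
an illustration only, not the project's code: the helper names
strip_by_replace and strip_by_rsplit and the example Q-name are hypothetical.

```python
# Hypothetical helpers comparing the old and the new way to get the Q-name.
URL_PREFIX = "https://www.wikidata.org/entity/"


def strip_by_replace(uri):
    # Old approach: only works when the URI starts with the exact prefix.
    return uri.replace(URL_PREFIX, '')


def strip_by_rsplit(uri):
    # New approach: take everything after the last '/', whatever the scheme.
    return uri.rsplit('/', 1)[1]


for uri in (
    "https://www.wikidata.org/entity/Q11573",
    "http://www.wikidata.org/entity/Q11573",
):
    print(strip_by_replace(uri), strip_by_rsplit(uri))
```

Note that rsplit('/', 1)[1] raises an IndexError on a string without a slash
(including the empty string), which is why a possibly empty value has to be
checked before it is split.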

[1] https://github.com/searxng/searxng/pull/3437#issuecomment-2082121730

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
Markus Heiser 2024-05-01 18:25:22 +02:00 committed by Markus Heiser
parent c8d0b6529b
commit 11fe88bb40
2 changed files with 138 additions and 6 deletions


@@ -51,16 +51,18 @@ WHERE
 ORDER BY ?item DESC(?rank) ?symbol
 """
 _wikidata_url = "https://www.wikidata.org/entity/"
 def get_data():
     results = collections.OrderedDict()
     response = wikidata.send_wikidata_query(SARQL_REQUEST)
     for unit in response['results']['bindings']:
-        name = unit['item']['value'].replace(_wikidata_url, '')
         symbol = unit['symbol']['value']
-        si_name = unit.get('tosiUnit', {}).get('value', '').replace(_wikidata_url, '')
+        name = unit['item']['value'].rsplit('/', 1)[1]
+        si_name = unit.get('tosiUnit', {}).get('value', '')
+        if si_name:
+            si_name = si_name.rsplit('/', 1)[1]
         to_si_factor = unit.get('tosi', {}).get('value', '')
         if name not in results:
             # ignore duplicate: always use the first one