fix: skip empty tokens in SearchOptions.Tokens() (#8261)
Some checks are pending
/ release (push) Waiting to run
testing-integration / test-unit (push) Waiting to run
testing-integration / test-sqlite (push) Waiting to run
testing / frontend-checks (push) Waiting to run
testing / backend-checks (push) Waiting to run
testing / test-unit (push) Blocked by required conditions
testing / test-e2e (push) Blocked by required conditions
testing / test-remote-cacher (redis) (push) Blocked by required conditions
testing / test-remote-cacher (valkey) (push) Blocked by required conditions
testing / test-remote-cacher (garnet) (push) Blocked by required conditions
testing / test-remote-cacher (redict) (push) Blocked by required conditions
testing / test-mysql (push) Blocked by required conditions
testing / test-pgsql (push) Blocked by required conditions
testing / test-sqlite (push) Blocked by required conditions
testing / security-check (push) Blocked by required conditions

Query string tokenizer could return a list containing empty tokens when the query string was `\` or `"` (probably in other scenarios as well).

This seems undesirable and is what triggered #8260, but I'm posting this separately from that fix in case I'm wrong. Feel free to reject if so.

The actual change in behavior is that now searching for `\` or `"` behaves the same as if the query were empty (the bleve/elastic code checks that the tokenizer actually returned, anything rather than just query being non-empty).

### Tests

- I added test coverage for Go changes...
  - [x] in their respective `*_test.go` for unit tests.

### Documentation

- [ ] I created a pull request [to the documentation](https://codeberg.org/forgejo/docs) to explain to Forgejo users how to use this change.
- [x] I did not document these changes and I do not expect someone else to do it.

### Release notes

- [x] I do not want this change to show in the release notes.
- [ ] I want the title to show in the release notes with a link to this pull request.
- [ ] I want the content of the `release-notes/<pull request number>.md` to be be used for the release notes instead of the title.

Reviewed-on: https://codeberg.org/forgejo/forgejo/pulls/8261
Reviewed-by: Shiny Nematoda <snematoda@noreply.codeberg.org>
Co-authored-by: Danko Aleksejevs <danko@very.lv>
Co-committed-by: Danko Aleksejevs <danko@very.lv>
This commit is contained in:
Danko Aleksejevs 2025-07-03 11:29:04 +02:00 committed by Earl Warren
parent bc2e4942fc
commit 4935e6e1a3
5 changed files with 146 additions and 19 deletions

View file

@ -45,12 +45,9 @@ func (t *Tokenizer) next() (tk Token, err error) {
// skip all leading white space
for {
if r, _, err = t.in.ReadRune(); err == nil && r == ' ' {
//nolint:staticcheck,wastedassign // SA4006 the variable is used after the loop
r, _, err = t.in.ReadRune()
continue
if r, _, err = t.in.ReadRune(); err != nil || r != ' ' {
break
}
break
}
if err != nil {
return tk, err
@ -107,11 +104,17 @@ nextEnd:
// Tokenize the keyword
func (o *SearchOptions) Tokens() (tokens []Token, err error) {
if o.Keyword == "" {
return nil, nil
}
in := strings.NewReader(o.Keyword)
it := Tokenizer{in: in}
for token, err := it.next(); err == nil; token, err = it.next() {
tokens = append(tokens, token)
if token.Term != "" {
tokens = append(tokens, token)
}
}
if err != nil && err != io.EOF {
return nil, err