Feature "Autocomplete did-you-mean completion suggesters"

From dataspects::Wiki
C0950167937





The body of a _search?typed_keys request takes a suggest section on the same level as query and _source.

Story

TypeAhead and Search are run against tokens:

  • in specific fields of specific predicates ("PredicateFieldsFacet", currently specified in typeaheadAndSearchThesePredicates)
  • of specific documents ("DocumentsFacet", currently specified in es.search(taquery, index_alias))

The suggest section contains multiple named suggesters which specify:

  1. the text to check for suggestions (or it can be taken from the global suggest text field),
  2. which field to check and
  3. how to check the field (term, phrase, completion/context).

The suggest section and the query are independent of each other.

The search result will contain a suggest section on the same level as hits. It holds the results for each named suggester, specifying:

  1. the text that was checked and
  2. an array of suggested scored options.
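
The request and response shapes described above can be sketched in Python. This is a minimal illustration, not the dataspects implementation: the suggester names did_you_mean_term and did_you_mean_phrase, the helper functions and the sample response are all hypothetical. It assumes that, because of typed_keys, each suggester name comes back prefixed with its type (for example term#did_you_mean_term).

```python
from typing import Any, Dict, List


def build_suggest_body(text: str, field: str) -> Dict[str, Any]:
    """Build a search body whose suggest section sits next to query and _source."""
    return {
        "_source": {"includes": [field]},
        "query": {"match": {field: text}},
        "suggest": {
            # Global suggest text, used by suggesters that define no text of their own.
            "text": text,
            # Hypothetical suggester names; each one names the field and the check type.
            "did_you_mean_term": {"term": {"field": field}},
            "did_you_mean_phrase": {"text": text, "phrase": {"field": field}},
        },
    }


def best_options(response: Dict[str, Any]) -> Dict[str, List[str]]:
    """Map each named suggester to its option texts, highest score first."""
    result: Dict[str, List[str]] = {}
    for name, entries in response.get("suggest", {}).items():
        # With typed_keys the key looks like "<type>#<name>"; keep only the name.
        plain_name = name.split("#", 1)[-1]
        options = [opt for entry in entries for opt in entry["options"]]
        options.sort(key=lambda opt: opt["score"], reverse=True)
        result[plain_name] = [opt["text"] for opt in options]
    return result


# Hand-written sample response (made-up values) in the shape described above.
sample_response = {
    "hits": {"hits": []},
    "suggest": {
        "term#did_you_mean_term": [
            {"text": "dokcer", "offset": 0, "length": 6,
             "options": [{"text": "docker", "score": 0.8, "freq": 12}]},
        ],
        "phrase#did_you_mean_phrase": [
            {"text": "dokcer tokne", "offset": 0, "length": 12,
             "options": [{"text": "docker tokne", "score": 0.02},
                         {"text": "docker token", "score": 0.05}]},
        ],
    },
}

suggestions = best_options(sample_response)
```

Since the suggest section and the query are independent, the same suggest dictionary could be sent with any query, or with none.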

Did-you-mean/Spell-correction

Completion/Autocomplete/Typeahead

HasEntityTitle

For "dock tok", autocomplete should match "Delete Docker authentication token".

 {
   "_source": {
     "includes": [
       "HasEntityTitle"
     ],
     "excludes": []
   },
   "query": {
     "match": {
       "HasEntityTitle.completion": {
         "query": "dock tok",
         "operator": "and"
       }
     }
   },
   "highlight": {
     "fields": {
       "HasEntityTitle.completion": {}
     }
   }
 }
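
Why a match query with operator "and" can find the title from the fragments "dock tok" depends on how the HasEntityTitle.completion subfield is analyzed, which this page does not show. The following pure-Python sketch assumes an edge n-gram style index analyzer (every prefix of every word is indexed) and a plain lowercasing search analyzer. It only simulates the matching logic; all function names are hypothetical.

```python
import re
from typing import List, Set


def tokenize(text: str) -> List[str]:
    """Lowercase and split on non-alphanumeric characters."""
    return [token for token in re.split(r"[^0-9a-z]+", text.lower()) if token]


def edge_ngrams(token: str, min_gram: int = 1, max_gram: int = 20) -> Set[str]:
    """Return the prefixes of token with lengths min_gram..max_gram."""
    longest = min(len(token), max_gram)
    return {token[:n] for n in range(min_gram, longest + 1)}


def index_terms(title: str) -> Set[str]:
    """Simulate index time: collect the edge n-grams of every word in the title."""
    terms: Set[str] = set()
    for token in tokenize(title):
        terms |= edge_ngrams(token)
    return terms


def matches_all(query: str, title: str) -> bool:
    """Simulate operator "and": every query token must be an indexed term."""
    indexed = index_terms(title)
    return all(token in indexed for token in tokenize(query))


title = "Delete Docker authentication token"
hit = matches_all("dock tok", title)       # both fragments are word prefixes
miss = matches_all("dock tik", title)      # "tik" is not a prefix of any word
```

Under these assumptions word order does not matter, so "tok dock" matches as well. With the default operator "or", a single matching fragment would be enough, which is usually too permissive for typeahead.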

Keyword

HasEntityType

Predicate

In dataspects, predicates are CamelCased. For example, for "CanBeDeveloped", autocomplete should match case-insensitively on "can", "dev", "can dev" and "dev can".

 {
   "size": 0,
   "query": {
     "match": {
       "predicate.completion": {
         "query": "can be",
         "operator": "and"
       }
     }
   },
   "highlight": {
     "fields": {
       "predicate.completion": {}
     }
   },
   "aggs": {
     "entityTypes": {
       "terms": {
         "field": "predicate.keyword"
       }
     }
   }
 }
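
The CamelCase requirement can be illustrated without a cluster. The sketch below assumes that the predicate.completion analysis splits a predicate at its capital letters, lowercases the parts, and matches each typed fragment as a prefix of some part; the real analyzer configuration is not shown on this page, and the function names are hypothetical.

```python
import re
from typing import List


def split_camel_case(predicate: str) -> List[str]:
    """Split a CamelCased predicate into lowercase words."""
    parts = re.findall(r"[A-Z][a-z0-9]*|[a-z0-9]+", predicate)
    return [part.lower() for part in parts]


def predicate_matches(query: str, predicate: str) -> bool:
    """Return True if every typed fragment is a prefix of some predicate word."""
    words = split_camel_case(predicate)
    fragments = query.lower().split()
    return all(
        any(word.startswith(fragment) for word in words)
        for fragment in fragments
    )


words = split_camel_case("CanBeDeveloped")
results = {
    query: predicate_matches(query, "CanBeDeveloped")
    for query in ["can", "dev", "can dev", "dev can", "CAN BE", "cat"]
}
```

In this sketch every fragment has to match (mirroring operator "and"), while the order of the fragments is ignored, which is what lets "dev can" match.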

Where should the completion text come from?

  • Past user queries (problematic)
  • Text of the content being searched (better)

Completions based on the text of the content being searched

  1. Which fields?
  2. How to analyze the text?
    • Preserve readability
    • Support phrase suggestions
  3. Construct completion search
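
The last step, constructing the completion search, could look roughly like the helper below. It is a sketch that reproduces the shape of the HasEntityTitle query shown earlier on this page; the function name and its parameters are hypothetical, and it assumes that each searchable field has a ".completion" subfield.

```python
import json
from typing import Any, Dict, List, Optional


def build_completion_search(
    field: str,
    text: str,
    source_fields: Optional[List[str]] = None,
) -> Dict[str, Any]:
    """Build a match query with highlighting against the field's completion subfield."""
    completion_field = f"{field}.completion"  # assumed subfield naming
    body: Dict[str, Any] = {
        "query": {
            "match": {
                # operator "and": every typed fragment has to match
                completion_field: {"query": text, "operator": "and"}
            }
        },
        # Highlight the matched part so the suggestion stays readable.
        "highlight": {"fields": {completion_field: {}}},
    }
    if source_fields is not None:
        # Restrict the returned document to the fields needed for display.
        body["_source"] = {"includes": source_fields, "excludes": []}
    return body


taquery = build_completion_search(
    "HasEntityTitle", "dock tok", source_fields=["HasEntityTitle"]
)
print(json.dumps(taquery, indent=2))
```

The resulting dictionary can then be sent to the index alias with whatever client call the application uses, such as the es.search(taquery, index_alias) call mentioned in the Story section.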