Consolidate all access to document values through SearchLookup #66256

@romseygeek

Followup to #62511

The various Fetch subphases need to access a number of document-specific values: stored fields, docvalue fields, _source, script outputs. This is currently done via a mix of access to a SearchLookup object (for docvalue fields, _source and script) and a special fields visitor for stored fields. The stored fields logic also handles loading source, and is dealt with specially in FetchPhase. Nested documents also handle loading source differently, and some of this logic is in FetchPhase. We then have value fetchers, which are built by passing a QueryShardContext (which refers to a SearchLookup) and then used by passing a SourceLookup. This all makes things very difficult to reason about, as responsibility for loading data, caching across multiple calls, and advancing to new documents is split across several different classes.

I'd like to try consolidating all access to document values so that it goes through SearchLookup.

  • StoredFieldsLookup should be able to load more than one field at a time (currently we do everything via a SingleFieldVisitor) and should make use of the fast stored fields reader where possible
  • SourceLookup should use StoredFieldsLookup, again to reduce the number of calls to IndexReader#document
  • SourceLookup should be aware of nested mappings and automatically return the correct source when positioned on nested or root documents
  • Adding stored fields to a search hit should be done via a dedicated fetch sub phase that reads data from SearchLookup
  • Positioning of the search lookup should be done by whatever owns it - probably the root collector in the query phase, or the FetchPhase main loop in the fetch phase.
  • ValueFetchers should load data via a passed-in SearchLookup and not have to worry about setting reader contexts.

I think organising things in this way will make reasoning about how data is fetched and which classes have responsibility for what much easier.
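
To make the proposed split of responsibilities concrete, here is a minimal, self-contained sketch. It does not use the real Elasticsearch classes: `DocLookupSketch`, `FakeReader`, `Lookup`, `setDocument` and `ValueFetcher` as written here are hypothetical stand-ins, and the in-memory reader only imitates `IndexReader#document` so that calls can be counted. The idea it illustrates is that the owner positions the lookup once per document, stored fields and source share a single cached load, and value fetchers only ever see the lookup.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch only: these are NOT the real Elasticsearch types.
final class DocLookupSketch {

    /** Stand-in for a reader; counts how often a document is loaded. */
    static final class FakeReader {
        private final List<Map<String, Object>> docs;
        int documentCalls = 0;

        FakeReader(List<Map<String, Object>> docs) {
            this.docs = docs;
        }

        Map<String, Object> document(int docId) {
            documentCalls++;
            return docs.get(docId);
        }
    }

    /** Single entry point for per-document values, positioned by its owner. */
    static final class Lookup {
        private final FakeReader reader;
        private int docId = -1;
        private Map<String, Object> cached; // all stored fields of the current doc

        Lookup(FakeReader reader) {
            this.reader = reader;
        }

        /** Called only by the owner (e.g. the fetch loop); resets per-doc caches. */
        void setDocument(int docId) {
            if (this.docId != docId) {
                this.docId = docId;
                this.cached = null;
            }
        }

        private Map<String, Object> load() {
            if (docId < 0) {
                throw new IllegalStateException("lookup is not positioned on a document");
            }
            if (cached == null) {
                cached = reader.document(docId); // at most one reader call per document
            }
            return cached;
        }

        /** Returns the requested stored fields that exist on the current document. */
        Map<String, Object> storedFields(List<String> names) {
            Map<String, Object> all = load();
            Map<String, Object> out = new LinkedHashMap<>();
            for (String name : names) {
                if (all.containsKey(name)) {
                    out.put(name, all.get(name));
                }
            }
            return out;
        }

        /** Source is served from the same cached load as the stored fields. */
        Object source() {
            return load().get("_source");
        }
    }

    /** A fetcher receives the lookup and never deals with reader contexts. */
    interface ValueFetcher {
        Object fetch(Lookup lookup);
    }

    /** Imitates a fetch loop over two documents; returns the reader call count. */
    static int demo() {
        FakeReader reader = new FakeReader(List.of(
            Map.of("title", "a", "tag", "x", "_source", "src0"),
            Map.of("title", "b", "_source", "src1")));
        Lookup lookup = new Lookup(reader);
        ValueFetcher titleFetcher = l -> l.storedFields(List.of("title")).get("title");

        for (int doc = 0; doc < 2; doc++) {
            lookup.setDocument(doc);                      // positioning happens here only
            lookup.storedFields(List.of("title", "tag")); // multi-field load
            lookup.source();                              // reuses the cached load
            titleFetcher.fetch(lookup);                   // fetcher just reads
        }
        return reader.documentCalls;
    }

    public static void main(String[] args) {
        System.out.println("reader calls: " + demo());
    }
}
```

In this sketch the loop touches each document three times (stored fields, source, value fetcher) but the reader is consulted once per document, which is the kind of saving the bullets on `StoredFieldsLookup` and `SourceLookup` are after. Nested-document handling is left out.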
