Phase 0 - Inception
Phase 1 - Mappings
- time_series_dimension mapping parameter to fields
- dimension mapping parameter to time_series_dimension #78012 @csoulios
Phase 2 - Ingest
Dimension-based tsid generator
- _tsid field to time_series indices #80276 @csoulios
- _tsid #81382 @csoulios (prototype)
- _tsid field #81998 @csoulios
Routing
- BulkOperation. Maybe we can make this simpler.
- ids query on time series index #81436 @csoulios
- _id for tsid (TSDB: Support GET and DELETE and doc versioning #82633)
- _id is automatically generated (TSDB: improve document description on error #84903, TSDB: Add dimensions and timestamp to parse errors #84962)
- _tsid and @timestamp (TSDB: Expand _id on version conflict #84957)
- @timestamp component of the _id from little endian to big endian. That should mean there are more common prefixes. TSDB: shrink _id inverted index #85008 cuts the size of the inverted index for _id by 37%. That's not a lot of the index in total, but it sure does feel good for such a small change.
- _id in RecoverySourceHandlerTests.java and EngineTests.java Test time series id in RecoverySourceHandlerTests #84996, Use tsdb's id in Engine tests #85055
- @timestamp or dimensions in reindex TSDB: Initial reindex fix #86647 + Reindex support for TSDB creating indices #86704
- _id with the security create_doc privilege. Can a user with create_doc (only) ingest new TSDB docs? Does create_doc prevent a user from overwriting an existing TSDB doc? (create_doc relies on the OpType of the IndexRequest, which is automatically set to CREATE for docs with auto-generated ids) TSDB: Test create_doc permission #86638
Handling Time Boundaries
- start_time, end_time index settings) @weizijun
- index.time_series.start_time and index.time_series.end_time index settings) that don't match with the @timestamp range in a search request. Skip backing indices with a disjoint range on @timestamp field. #85162 (@martijnvg)
Other tasks
- index_mode setting isn't good enough. It requires additional config to be specified (time_series_dimension attribute in mappings and index.routing_path as index settings) elsewhere and it doesn't allow the data stream tsdb features (routing based on @timestamp field) to be enabled without enabling the index level tsdb features.
- index.mode setting is set to time_series.
- index.routing_path index setting if not defined in composable index template that creates a tsdb data stream. All mapped fields of type keyword and time_series_dimension enabled will be included in the generated index.routing_path index setting. Auto generate index.routing_path from mapping #86790 (@martijnvg)
- [ ] The index.routing_path index setting generation doesn't kick in when index.mode and dimension fields are defined in component templates. (@martijnvg).
Phase 2.1 Ingest follow ups
- [ ] Build the _id from dimension values
- [ ] Investigate moving timestamp to the front of the _id to automatically get an optimization on _id searches. Not sure if worth it - but possible. #84928 could be an alternative
- _id in lucene stored fields. We could regenerate it from the _source or from doc values for the @timestamp and the _tsid. That'd save some bytes per document.
- IndexRequest#autoGeneratId? It's a bit spooky where it is but I don't like it any other place.
- _update_by_query when modifying the dimensions or @timestamp
- _id and assert that it matches the _id from the primary. Should we? Probably. Let's make sure.
- [ ] Document best practices for using dimensions-based ID generator including how to use this with component templates
Phase 3.1 QL storage API (Postponed)
- [ ] Reimplement QL storage API for TSDB database (depends on completion of Phase 2 and 3.2) (Postponed)
Phase 3.2 - Search MVP
Plans for time series support in the _search API are superseded by plans for this in ES|QL.
- [ ] Aggregation results filtering
- [ ] Retrieve the last value for a time series metric within a parent bucket
- [ ] Add a new histogram field subtype to support Prometheus-style histograms
- [ ] TSDB indices could speed up cardinality aggregations on dimension fields #85523
- [ ] Should the _tsid agg return doc_counts by default?
- [ ] Shortcut aggs for TSDB #90423
Phase 3.3 - Rollup / Downsampling
- TimeSeriesIndexSearcher and compute rollups docs and add them to the rollup index
- aggregate_metric_double fields #90029 @csoulios
- fixed_interval vs calendar_interval
- time_zone
- date_histogram resolution
- aggregate_metric_double fields as their own field type instead of double #87849 @csoulios
Phase 3.4 - TSID aggs (superseded by tsdb in ES|QL)
~~ - [ ] Update min, max, sum, avg pipeline aggs for intermediate result filtering optimization ~~
~~ - [ ] Sliding window aggregation ~~
~~ - [ ] A way to filter to windows within the sliding window. Like "measurements take in the last 30 seconds of the window". ~~
~~ - [ ] Open transform issue for newly added time series aggs ~~
~~ - [ ] Benchmarks for the tsid agg ~~
Phase 3.5 - Downsampling follow ups
Phase 4.0 - Compression
- _source @nik9000 Synthetic Source #86603
Phase 5.0 - Follow-ups and Nice-to-have-s
- _id's values. Right now it's a String that we encode with Uid.encodeId. That was reasonable. Maybe it still is. But it feels complex, especially for tsdb, whose _id is always some bytes. And encoding it also wastes a byte about 1/128 of the time. It's a common prefix byte so this is probably not really an issue. But still. This is a big change but it'd make ES easier to read. Probably wouldn't really improve the storage though.
- index.routing_path), the source needs to be parsed on the coordinating node. However, if an ingest pipeline is executed, the source of the document will be parsed a second time. Ideally the routing values should be extracted when ingest is performed, similar to how the @timestamp field is already retrieved from a document during pipeline execution.
- Instant. The format being used is: strict_date_optional_time_nanos||strict_date_optional_time||epoch_millis. This is to allow the regular date format, the date nanos format, and epoch millis defined as a string. We can optimise the date parsing if we know the exact format being used. For example, if a data stream had a parameter that indicates the exact date format, we could optimise parsing by using only one of strict_date_optional_time_nanos, strict_date_optional_time or epoch_millis.
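To illustrate the date parsing item above, here is a minimal Python sketch of why knowing the exact format helps. It is not Elasticsearch's date parser: `parse_iso` and `parse_epoch_millis` are hypothetical stand-ins for strict_date_optional_time and epoch_millis (the nanos variant is left out), and `parse_any` and `parse_known` are names made up for this example. The sketch only shows that a fallback chain may have to try and fail on earlier formats, while a declared format goes straight to the right parser.

```python
from datetime import datetime, timezone


def parse_iso(value: str) -> datetime:
    # Hypothetical stand-in for strict_date_optional_time.
    # Require the "YYYY-" shape so digit-only epoch strings are rejected.
    if value[4:5] != "-":
        raise ValueError(f"not an ISO date: {value!r}")
    # Older Python versions do not accept a trailing "Z", so normalise it.
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    parsed = datetime.fromisoformat(value)
    # Assumption for this sketch: values without an offset are UTC.
    return parsed if parsed.tzinfo else parsed.replace(tzinfo=timezone.utc)


def parse_epoch_millis(value: str) -> datetime:
    # Hypothetical stand-in for epoch_millis given as a string.
    return datetime.fromtimestamp(int(value) / 1000, tz=timezone.utc)


PARSERS = {"iso": parse_iso, "epoch_millis": parse_epoch_millis}


def parse_any(value: str):
    """Try each format in order; return the result and the number of attempts."""
    attempts = 0
    for name in ("iso", "epoch_millis"):
        attempts += 1
        try:
            return PARSERS[name](value), attempts
        except ValueError:
            continue
    raise ValueError(f"no format matched: {value!r}")


def parse_known(value: str, fmt: str) -> datetime:
    """Parse with a single, declared format and skip the fallback chain."""
    return PARSERS[fmt](value)
```

With the fallback chain, an epoch millis string only succeeds on the second attempt, after the ISO parser has rejected it. With a declared format (for example a per data stream parameter, as the item suggests) the same string is parsed in one step.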