[Structured Streaming] Exactly-once guarantee with ILM/Rollover #1386

danielyahn · 2019-11-07T19:08:34Z

What kind of issue is this?

Feature Request.

Feature description

When using ILM (index lifecycle management) and its rollover API, your ingest job needs to point at the write alias. ILM, especially with its ability to roll over by size, has operational benefits.

However, it's not possible to get an exactly-once guarantee when writing through a write alias. The current checkpoint implementation for the ES sink doesn't capture the actual index that the write alias points to.

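Therefore, if you're replaying some batches (whether one batch that failed halfway or multiple batches for any operational reason), you can't guarantee that your records will go to the same index they were first written to. If a rollover happens between the original write and the replay, the same records can end up in two different indices.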
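To make the failure mode concrete, here is a minimal in-memory sketch. It does not use elasticsearch-hadoop or any Elasticsearch client: `FakeCluster`, the alias name `logs-write`, the index names, and the document IDs are all hypothetical and exist only for illustration. It assumes the sink assigns deterministic document IDs, so a replay into the same index overwrites instead of duplicating. It also assumes a batch lands in a single index, which a real batch spanning a rollover would not.

```python
class FakeCluster:
    """In-memory stand-in for a cluster with one write alias (hypothetical)."""

    def __init__(self, alias, first_index):
        self.alias = alias
        self.indices = {first_index: {}}
        self.write_index = first_index

    def rollover(self, new_index):
        # After a rollover the alias resolves to the new index for writes.
        self.indices[new_index] = {}
        self.write_index = new_index

    def index_doc(self, target, doc_id, doc):
        # Writing to the alias resolves to the current write index;
        # writing to a concrete index name goes to that index.
        index = self.write_index if target == self.alias else target
        self.indices[index][doc_id] = doc
        return index

    def total_docs(self):
        return sum(len(docs) for docs in self.indices.values())


# Deterministic IDs: the same record gets the same ID on replay.
BATCH = [("batch7-0", {"msg": "a"}), ("batch7-1", {"msg": "b"})]


def replay_through_alias():
    """Replay the batch via the alias after a rollover."""
    cluster = FakeCluster("logs-write", "logs-000001")
    for doc_id, doc in BATCH:
        cluster.index_doc("logs-write", doc_id, doc)
    cluster.rollover("logs-000002")
    for doc_id, doc in BATCH:  # replay lands in the new index
        cluster.index_doc("logs-write", doc_id, doc)
    return cluster.total_docs()


def replay_with_recorded_index():
    """Replay against the concrete index recorded on the first attempt."""
    cluster = FakeCluster("logs-write", "logs-000001")
    recorded = {cluster.index_doc("logs-write", doc_id, doc) for doc_id, doc in BATCH}
    cluster.rollover("logs-000002")
    (index,) = recorded  # single-index assumption for this sketch
    for doc_id, doc in BATCH:  # replay overwrites the original docs
        cluster.index_doc(index, doc_id, doc)
    return cluster.total_docs()


print(replay_through_alias())        # 4: each record exists in both indices
print(replay_with_recorded_index())  # 2: the replay was idempotent
```

The second function only illustrates the idea behind this request (remembering which concrete index a batch was written to). It is not how the connector's checkpointing currently works, and it is not a proposed design.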

jbaiera · 2019-11-08T19:00:07Z

I'm marking this as discuss for the team since this is a general problem across multiple streaming write workloads that are using ILM for managing indices.

jakelandis · 2019-11-14T22:06:34Z

Related: elastic/elasticsearch#44794

jbaiera added the discuss label Nov 8, 2019

jbaiera added the :Core label Jun 29, 2023
