exactly once
Some notes I took while watching the talk Introducing Exactly Once Semantics, presented by Apurva Mehta of Confluent.
1) Idempotent Producer
producer.send will always lead to one copy in the log. This does not extend to the consumer: you can still consume a message multiple times.
This is enabled through producer configuration.
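The configuration snippet itself did not survive in these notes. As a rough sketch, assuming the Java client: idempotence is controlled by the producer setting enable.idempotence. The class name IdempotentProducerConfig and the broker address passed in are made up for illustration.

```java
import java.util.Properties;

// Hypothetical helper: builds producer settings with idempotence switched on.
class IdempotentProducerConfig {
    static Properties build(String bootstrapServers) {
        Properties props = new Properties();
        // Broker address is supplied by the caller (placeholder in this sketch).
        props.setProperty("bootstrap.servers", bootstrapServers);
        // Ask the producer to attach producer ID and sequence metadata to each message.
        props.setProperty("enable.idempotence", "true");
        // Idempotence needs acknowledgement from all in-sync replicas.
        props.setProperty("acks", "all");
        return props;
    }
}
```

These properties would then be handed to the producer's constructor, together with the usual serializer settings.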
How is this done? Metadata on each message.
- producer ID (assigned by the broker)
- sequence ID for the message, agreed between the producer and the partition leader. Valid for the producer session only.
- this metadata is kept in the log (enabling resilience around changing leaders for example)
- If the producer doesn’t get an ack, it resends the message. If the broker has already processed the message, it just sends an ack without writing it to the log.
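The deduplication idea in the list above can be sketched as a toy, in-memory model. This is not Kafka's actual implementation: DedupLog and its methods are hypothetical, it models a single partition, and it ignores sequence gaps, which a real broker would reject.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy single-partition log that drops retried messages (illustration only).
class DedupLog {
    private final List<String> log = new ArrayList<>();
    // Highest sequence number already written, keyed by producer ID.
    private final Map<Long, Integer> lastSeq = new HashMap<>();

    // Returns true if the message was written, false if it was a duplicate.
    // In both cases the broker would send an ack back to the producer.
    boolean append(long producerId, int sequence, String value) {
        int last = lastSeq.getOrDefault(producerId, -1);
        if (sequence <= last) {
            return false; // already processed: ack again without writing
        }
        log.add(value);
        lastSeq.put(producerId, sequence);
        return true;
    }

    // Snapshot of the log contents, in write order.
    List<String> contents() {
        return new ArrayList<>(log);
    }
}
```

Calling append twice with the same producer ID and sequence number leaves a single copy in the log, which is the behaviour a retry after a lost ack relies on.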
2) Transaction API
New components in the Kafka ecosystem
- Transaction coordinator - maintains transaction state on a per-producer basis. Runs within broker.
- Transaction log - persists the state
Producer side:
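The producer-side snippet is missing from these notes. The sketch below shows the usual begin / send / commit-or-abort shape. TxProducer is a hypothetical stand-in interface whose method names mirror the transactional methods on the Java client's KafkaProducer; AtomicSender and FakeTxProducer are invented for illustration. With the real client, a transactional.id must also be configured and initTransactions() called once before the first transaction.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in; method names mirror KafkaProducer's transactional calls.
interface TxProducer {
    void beginTransaction();
    void send(String topic, String value);
    void commitTransaction();
    void abortTransaction();
}

class AtomicSender {
    // Sends all values in one transaction: either all are committed or none.
    static void sendAll(TxProducer producer, String topic, List<String> values) {
        producer.beginTransaction();
        try {
            for (String v : values) {
                producer.send(topic, v);
            }
            producer.commitTransaction();
        } catch (RuntimeException e) {
            // Simplification: abort on any failure, then let the caller decide.
            producer.abortTransaction();
            throw e;
        }
    }
}

// In-memory fake, used only to exercise the sketch without a broker.
class FakeTxProducer implements TxProducer {
    final List<String> committed = new ArrayList<>();
    private final List<String> pending = new ArrayList<>();

    public void beginTransaction() { pending.clear(); }

    public void send(String topic, String value) {
        if (value == null) throw new IllegalArgumentException("null value");
        pending.add(topic + ":" + value);
    }

    public void commitTransaction() { committed.addAll(pending); pending.clear(); }

    public void abortTransaction() { pending.clear(); }
}
```

With the real client, some errors (a fenced producer, for example) are fatal and call for closing the producer rather than aborting; the sketch does not model that distinction.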
Consumers can read in read_committed or read_uncommitted mode, which is a configuration setting.
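As a small sketch, assuming the Java client, the mode is chosen with the consumer setting isolation.level. The helper class ConsumerIsolationConfig is hypothetical.

```java
import java.util.Properties;

// Hypothetical helper: picks the consumer's transactional read mode.
class ConsumerIsolationConfig {
    static Properties build(boolean committedOnly) {
        Properties props = new Properties();
        // read_committed hides messages from aborted or still-open transactions.
        props.setProperty("isolation.level",
                committedOnly ? "read_committed" : "read_uncommitted");
        return props;
    }
}
```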
Messages are still delivered in offset order. The order in which transactions commit can differ from that, but this is invisible to consumers.
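To illustrate the point about ordering, here is a toy model of a log holding interleaved transactions. It is not how Kafka implements this: TxLogReader and Entry are hypothetical, and the sketch assumes every transaction in the log has already either committed or aborted.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

class TxLogReader {
    // One log entry: its offset, the transaction it belongs to, and its payload.
    static final class Entry {
        final long offset;
        final String txn;
        final String value;

        Entry(long offset, String txn, String value) {
            this.offset = offset;
            this.txn = txn;
            this.value = value;
        }
    }

    // Walks the log in offset order. In read_committed mode, entries from
    // transactions that did not commit are skipped; nothing is reordered.
    static List<String> read(List<Entry> log, Set<String> committedTxns, boolean readCommitted) {
        List<String> out = new ArrayList<>();
        for (Entry e : log) {
            if (!readCommitted || committedTxns.contains(e.txn)) {
                out.add(e.value);
            }
        }
        return out;
    }
}
```

Even if one transaction commits before another whose messages sit at lower offsets, the reader still sees the messages in offset order; only the filtering differs between the two modes.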