Real-Time Data Processing
Real-time (stream) processing handles each record as it arrives instead of waiting to collect a complete batch.
Frameworks
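Apache Kafka: a durable, partitioned log used for publish/subscribe messaging between producers and consumers. Apache Flink: stateful, record-at-a-time stream processing. Spark Streaming: micro-batching, which treats the stream as a sequence of small batches.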
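To make the difference between record-at-a-time processing and micro-batching concrete, here is a minimal pure-Python sketch. It uses none of the frameworks above; the helper `micro_batches` and the count-based `batch_size` are assumptions made for illustration (real micro-batch engines typically cut batches by time interval rather than by count).

```python
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def micro_batches(events: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Group an incoming stream into batches of at most batch_size events.

    Hypothetical helper: count-based batching stands in for the
    time-based batch interval a real engine would use.
    """
    batch: List[T] = []
    for event in events:
        batch.append(event)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # flush the final, possibly partial, batch
        yield batch


stream = [1, 2, 3, 4, 5]

# Record-at-a-time: one result per event, available as soon as it arrives.
per_event = [x * 10 for x in stream]

# Micro-batching: one result per small batch, available when the batch closes.
per_batch = [sum(batch) for batch in micro_batches(stream, batch_size=2)]
```

The trade-off this illustrates: micro-batching delays each result until its batch closes, in exchange for amortizing per-record overhead across the batch.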
Concepts
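Event time vs processing time: event time is when an event occurred at its source; processing time is when the system handles it. Watermarks: estimate how far event time has progressed, so the system can decide when a window is complete and how to treat late data. Windows: group events into bounded time ranges (for example tumbling, sliding, or session windows) to aggregate over time.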
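The interaction of these three concepts can be sketched in plain Python. This is an illustrative model, not how any particular framework implements it: the window size, the allowed lateness, the rule "watermark = largest event time seen minus allowed lateness", and the choice to drop late events are all assumptions.

```python
from typing import Dict, Iterable, List, Tuple

WINDOW_SIZE = 10      # seconds per tumbling window (assumed value)
ALLOWED_LATENESS = 5  # watermark trails the max event time seen by this much (assumed value)

Event = Tuple[int, str]  # (event_time_seconds, payload)


def window_counts(events: Iterable[Event]) -> Tuple[Dict[int, int], List[Event]]:
    """Count events per tumbling event-time window.

    Events are consumed in arrival (processing-time) order. Returns the
    counts of windows closed by the watermark, keyed by window start, and
    the events dropped because their window had already closed. Windows
    still open when the input ends are not emitted.
    """
    open_windows: Dict[int, int] = {}
    emitted: Dict[int, int] = {}
    dropped: List[Event] = []
    watermark = float("-inf")
    max_event_time = float("-inf")

    for event_time, payload in events:
        window_start = (event_time // WINDOW_SIZE) * WINDOW_SIZE
        window_end = window_start + WINDOW_SIZE

        # Late data: the watermark has already passed the end of this window.
        if window_end <= watermark:
            dropped.append((event_time, payload))
            continue

        open_windows[window_start] = open_windows.get(window_start, 0) + 1

        # Advance the watermark; it never moves backwards.
        max_event_time = max(max_event_time, event_time)
        watermark = max_event_time - ALLOWED_LATENESS

        # Emit every window whose end is at or before the watermark.
        for start in sorted(open_windows):
            if start + WINDOW_SIZE <= watermark:
                emitted[start] = open_windows.pop(start)

    return emitted, dropped


# Arrival order differs from event-time order: (3, "d") and (8, "f") are out of order.
arrivals = [(1, "a"), (4, "b"), (12, "c"), (3, "d"), (17, "e"), (8, "f"), (25, "g")]
emitted, dropped = window_counts(arrivals)
# emitted == {0: 3, 10: 2}; dropped == [(8, "f")]
```

Event `(3, "d")` arrives out of order but before the watermark passes its window, so it is counted. Event `(8, "f")` arrives after window [0, 10) has closed and is dropped. Routing late events to a side channel or updating the already-emitted result are alternative policies.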
Patterns
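Exactly-once: each event affects the results exactly once, end-to-end from source to sink, even after failures and retries. ETL: extract, transform, load. Enrichment: add features to each event, for example by joining it with reference data.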
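A small sketch of how these patterns fit together, under stated assumptions: the source delivers at-least-once (so duplicates can occur), every event carries a unique `event_id`, and the sink supports keyed upserts. The names `USER_REGION`, `enrich`, `IdempotentSink`, and `run_pipeline` are hypothetical. Production systems usually achieve exactly-once with checkpointing and transactional writes; the idempotent write shown here is one simpler way to get an exactly-once effect.

```python
from typing import Dict, Iterable

# Hypothetical reference data used for enrichment.
USER_REGION: Dict[str, str] = {"u1": "EU", "u2": "US"}


def enrich(event: dict) -> dict:
    """Transform step: add a 'region' feature looked up from reference data."""
    return {**event, "region": USER_REGION.get(event["user_id"], "unknown")}


class IdempotentSink:
    """Load step: a keyed upsert, so writing the same event twice leaves one row."""

    def __init__(self) -> None:
        self.rows: Dict[str, dict] = {}

    def write(self, event: dict) -> None:
        self.rows[event["event_id"]] = event


def run_pipeline(source: Iterable[dict], sink: IdempotentSink) -> None:
    """Extract each event from the source, enrich it, and load it into the sink."""
    for raw in source:
        sink.write(enrich(raw))


# At-least-once delivery: e1 is redelivered after a simulated retry.
deliveries = [
    {"event_id": "e1", "user_id": "u1", "amount": 5},
    {"event_id": "e2", "user_id": "u2", "amount": 7},
    {"event_id": "e1", "user_id": "u1", "amount": 5},
    {"event_id": "e3", "user_id": "u9", "amount": 1},
]

sink = IdempotentSink()
run_pipeline(deliveries, sink)
total = sum(row["amount"] for row in sink.rows.values())
# total == 13: the duplicate delivery of e1 is counted once, not twice.
```

The guarantee holds only end-to-end: if the sink appended rows instead of upserting by `event_id`, the redelivered event would be counted twice no matter how careful the upstream stages were.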
Key Takeaways
- Kafka for messaging, Flink for processing
- Distinguish event time from processing time, and use watermarks to handle late data
- Aim for exactly-once guarantees end-to-end, from source to sink