Thursday, December 10, 2020

Kafka vs Flume

Kafka: pub-sub mechanism; a durable, distributed message store with producers and consumers.

Messages are published to topics and read by offset.

Publishers: post to topics. Consumers: pull at their own pace, so they are not overwhelmed.
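A minimal sketch of the topic and offset idea, using a toy in-memory log rather than the real Kafka client API. The names `ToyTopicLog` and `ToyConsumer` are hypothetical and exist only for illustration. The point is that each consumer tracks its own offset and decides how much to pull.

```python
class ToyTopicLog:
    """In-memory stand-in for a single-partition topic: an append-only list."""

    def __init__(self):
        self._messages = []

    def publish(self, message):
        """Append a message and return the offset it was stored at."""
        self._messages.append(message)
        return len(self._messages) - 1

    def read(self, offset, max_messages):
        """Return up to max_messages starting at offset; the log is not modified."""
        return self._messages[offset:offset + max_messages]


class ToyConsumer:
    """Tracks its own offset and pulls only as many messages as it asks for."""

    def __init__(self, log):
        self._log = log
        self.offset = 0

    def poll(self, max_messages=2):
        batch = self._log.read(self.offset, max_messages)
        self.offset += len(batch)  # advance only past what was actually read
        return batch


log = ToyTopicLog()
for i in range(5):
    log.publish(f"event-{i}")

# Two independent consumers read the same log at different speeds.
slow = ToyConsumer(log)
fast = ToyConsumer(log)
first_slow_batch = slow.poll(max_messages=1)
first_fast_batch = fast.poll(max_messages=4)
```

Reading does not remove anything from the log, so the slow consumer can catch up later from its own offset.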

Replication factor (RF): set per topic; RF of 2 also supported.

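HDFS consumer possible. No processing inside the Kafka broker itself: it stores and serves messages, and any transformation happens in consumers or client-side libraries.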


Flume: push mechanism. Memory & file channels. Memory channel is fast but not durable; file channel persists events to disk.

Source & sink. Sources push events into a channel; sinks drain the channel and push events to the destination.

Many consumers: extend the Flume topology by adding channels. Going from a memory channel to a file channel is possible. File channels can feed sinks for HDFS, HBase, or Cassandra.
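The source, channel, and sink flow with one channel per consumer can be sketched as a toy model. This is not Flume code or Flume configuration. `ToySource`, `ToyChannel`, and `ToySink` are hypothetical names, the channel is a plain in-memory queue, and each sink's `delivered` list stands in for a store such as HDFS or HBase. The sketch assumes the source copies every event to all of its channels.

```python
from collections import deque


class ToyChannel:
    """Buffer between a source and a sink (stands in for a memory channel)."""

    def __init__(self, name):
        self.name = name
        self._events = deque()

    def put(self, event):
        self._events.append(event)

    def take(self):
        """Return the oldest buffered event, or None if the channel is empty."""
        return self._events.popleft() if self._events else None


class ToySource:
    """Pushes every event to all attached channels (fan-out by copying)."""

    def __init__(self, channels):
        self._channels = channels

    def push(self, event):
        for channel in self._channels:
            channel.put(event)


class ToySink:
    """Drains one channel and records what it delivered."""

    def __init__(self, channel):
        self._channel = channel
        self.delivered = []  # stands in for the external store

    def drain(self):
        event = self._channel.take()
        while event is not None:
            self.delivered.append(event)
            event = self._channel.take()


# One source, two channels, two sinks: adding a consumer means adding a channel.
hdfs_channel = ToyChannel("to-hdfs")
hbase_channel = ToyChannel("to-hbase")
source = ToySource([hdfs_channel, hbase_channel])
hdfs_sink = ToySink(hdfs_channel)
hbase_sink = ToySink(hbase_channel)

for line in ["log line 1", "log line 2"]:
    source.push(line)

hdfs_sink.drain()
hbase_sink.drain()
```

Each sink gets its own copy of the events because it has its own channel. That differs from the Kafka model, where several consumers read one shared log at their own offsets.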

Adv: optimized for HDFS; the HDFS sink is part of the same ecosystem. Data processing is possible within the topology, such as writing Parquet (a columnar format) or in-stream transforms.


Flafka: Flume integrated with Kafka, e.g. Flume feeding events into Kafka.


Both can coexist.

