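Kafka: pub-sub messaging on a durable, distributed store; producer/consumer model.
Messages are published to topics and read by offset.
Publishers: post to topics. Consumers: pull at their own pace, so they are not overwhelmed.
Replication factor (RF) is set per topic; an RF of 2 is also supported.
A consumer can write to HDFS. Kafka brokers only store and serve messages; no processing happens in the brokers.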
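The offset-based read model in the Kafka notes above can be sketched with a toy in-memory log. This is not the Kafka client API; `TopicLog` and `PullConsumer` are hypothetical names made up for illustration, and a single Python list stands in for one topic partition.

```python
from dataclasses import dataclass, field


@dataclass
class TopicLog:
    """Toy stand-in for one topic partition: an append-only list of messages."""
    messages: list = field(default_factory=list)

    def publish(self, message):
        """Append a message and return the offset it was stored at."""
        self.messages.append(message)
        return len(self.messages) - 1

    def read(self, offset, max_messages):
        """Return up to max_messages starting at offset; nothing is removed."""
        return self.messages[offset:offset + max_messages]


@dataclass
class PullConsumer:
    """Tracks its own offset, so each consumer reads at its own pace."""
    log: TopicLog
    offset: int = 0

    def poll(self, max_messages=2):
        batch = self.log.read(self.offset, max_messages)
        self.offset += len(batch)
        return batch


log = TopicLog()
for i in range(5):
    log.publish(f"event-{i}")

fast = PullConsumer(log)
slow = PullConsumer(log)

fast_batch = fast.poll(max_messages=5)  # reads everything in one pull
slow_batch = slow.poll(max_messages=1)  # reads a single message
print(fast_batch, fast.offset)
print(slow_batch, slow.offset)
```

Because reading does not remove messages, the slow consumer can catch up later from its own offset without affecting the fast one.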
Flume: push mechanism. Memory & file channels. Memory channel is fast but not durable; file channel persists events to disk.
Source & sink. The source pushes events to a channel; a sink takes events from the channel and delivers them.
Many consumers: extend the Flume topology by adding channels. Memory to file channel is possible. File channels can feed HDFS, HBase or Cassandra.
Adv: optimized for HDFS. HDFS sink is part of the same ecosystem. Data processing is possible inside the topology, such as conversion to the Parquet columnar format or other in-stream transforms.
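A minimal sketch of the fan-out idea in the Flume notes above: one source copies each event into several channels, and each channel is emptied into its own sink. This is a toy Python model, not Flume's API or configuration; `MemoryChannel`, `ListSink` and `source_push` are hypothetical names, and the upper-casing step is only an assumed stand-in for an in-stream transform.

```python
from collections import deque


class MemoryChannel:
    """Toy in-memory buffer: fast, but its contents vanish if the process dies."""

    def __init__(self):
        self.events = deque()

    def put(self, event):
        self.events.append(event)

    def take(self):
        # Return the oldest buffered event, or None when the channel is empty.
        return self.events.popleft() if self.events else None


class ListSink:
    """Hypothetical sink that records events in a list instead of writing to HDFS."""

    def __init__(self, channel):
        self.channel = channel
        self.delivered = []

    def drain(self):
        # Take events from the channel until it is empty.
        while (event := self.channel.take()) is not None:
            self.delivered.append(event)


def source_push(event, channels, transform=str.upper):
    """Apply an in-stream transform, then copy the event into every channel."""
    transformed = transform(event)
    for channel in channels:
        channel.put(transformed)


# One channel per destination, each with its own sink.
channels = [MemoryChannel(), MemoryChannel()]
sinks = [ListSink(ch) for ch in channels]

for line in ["login", "click"]:
    source_push(line, channels)

for sink in sinks:
    sink.drain()

print([sink.delivered for sink in sinks])
```

Each sink ends up with its own full copy of the transformed events, which is the reason for adding a channel per consumer.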
Flafka: Flume-to-Kafka integration.
Both can coexist.