https://community.hortonworks.com/articles/43057/rack-awareness-1.html
https://community.hortonworks.com/questions/71458/can-anyone-explain-kafka-rack-awareness-feature.html
Rack awareness is knowledge of the cluster topology, or more specifically of how the data nodes are distributed across the racks of a Hadoop cluster. This knowledge matters because of the assumption that data nodes collocated in the same rack have more bandwidth and lower latency between them, whereas two data nodes in separate racks have comparatively less bandwidth and higher latency.
The main purposes of rack awareness are:
- Increasing the availability of data blocks
- Improving cluster performance
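Hadoop does not discover racks on its own. In Hadoop 2 and later it is commonly told about them through a topology script referenced by the net.topology.script.file.name property in core-site.xml. Below is a minimal sketch of such a script. The addresses, rack names, and the RACK_BY_HOST table are hypothetical placeholders, not values from any real cluster.

```python
#!/usr/bin/env python3
"""Sketch of a Hadoop rack topology script (hypothetical host-to-rack table)."""
import sys

# Hypothetical mapping from data node address to rack path; replace with real data.
RACK_BY_HOST = {
    "10.0.1.11": "/rack1",
    "10.0.1.12": "/rack1",
    "10.0.2.21": "/rack2",
}
DEFAULT_RACK = "/default-rack"  # fallback for hosts missing from the table


def resolve_racks(hosts):
    """Return one rack path per host, in the same order as the input."""
    return [RACK_BY_HOST.get(host, DEFAULT_RACK) for host in hosts]


if __name__ == "__main__":
    # Hadoop passes one or more addresses as arguments and reads
    # one rack path per address from standard output.
    print("\n".join(resolve_racks(sys.argv[1:])))
```

The lookup table keeps the sketch self-contained. A real script might instead derive the rack from the subnet or read it from an inventory file.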
As you pointed out, Kafka 0.10.0.0 supports rack awareness. KAFKA-1215 added a rack ID to the Kafka broker config. You can specify that a broker belongs to a particular rack by adding a property to the broker config: broker.rack=my-rack-id.
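To make the setting concrete, here is a small sketch that renders the relevant server.properties lines for each broker. The broker IDs, the rack names, and the broker_properties helper are made up for illustration. Only the broker.id and broker.rack property names come from Kafka.

```python
def broker_properties(broker_id, rack_id):
    """Render the two server.properties lines that tie a broker to a rack."""
    return f"broker.id={broker_id}\nbroker.rack={rack_id}\n"


# Hypothetical layout: three brokers spread over two racks.
RACKS = {0: "rack-a", 1: "rack-a", 2: "rack-b"}

for broker_id, rack_id in RACKS.items():
    print(broker_properties(broker_id, rack_id))
```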
The rack awareness feature spreads replicas of the same partition across different racks. This extends the guarantees Kafka provides for broker failure to cover rack failure, limiting the risk of data loss should all the brokers on a rack fail at once. The feature can also be applied to other broker groupings, such as availability zones in EC2.
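The idea of spreading replicas across racks can be sketched as follows. This is a simplified illustration and not Kafka's actual assignment code. The function names and the broker-to-rack dictionary are assumptions made for the example. It interleaves brokers by rack, then picks replicas for each partition so that no rack is reused until every rack already holds one replica.

```python
from collections import defaultdict
from itertools import zip_longest


def rack_alternated(broker_racks):
    """Order brokers so consecutive entries come from different racks where possible."""
    by_rack = defaultdict(list)
    for broker, rack in sorted(broker_racks.items()):
        by_rack[rack].append(broker)
    columns = [by_rack[rack] for rack in sorted(by_rack)]
    # Interleave: first broker of each rack, then the second of each rack, and so on.
    return [b for row in zip_longest(*columns) for b in row if b is not None]


def assign_replicas(broker_racks, num_partitions, replication_factor):
    """Simplified rack-aware assignment: partition -> list of broker ids."""
    brokers = rack_alternated(broker_racks)
    if replication_factor > len(brokers):
        raise ValueError("replication factor exceeds broker count")
    num_racks = len(set(broker_racks.values()))
    assignment = {}
    for partition in range(num_partitions):
        replicas, used_racks = [], set()
        offset = partition  # shift the starting broker for each partition
        while len(replicas) < replication_factor:
            broker = brokers[offset % len(brokers)]
            offset += 1
            if broker in replicas:
                continue
            rack = broker_racks[broker]
            # Skip a rack that is already used until every rack holds one replica.
            if rack in used_racks and len(used_racks) < num_racks:
                continue
            replicas.append(broker)
            used_racks.add(rack)
        assignment[partition] = replicas
    return assignment
```

With four brokers on two racks, for example {0: "r1", 1: "r1", 2: "r2", 3: "r2"}, and a replication factor of 2, each partition gets one replica on each rack, so losing a whole rack still leaves one copy of every partition.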