Skip to main content

Why is ClickHouse Keeper recommended over ZooKeeper?

· 2 min read
ClickHouse Keeper improves upon ZooKeeper with features like reduced disk space usage, faster recovery, and less memory consumption, offering better performance for ClickHouse clusters.

Question

ClickHouse Keeper provides the coordination system for data replication and distributed DDL queries execution. ClickHouse Keeper is compatible with ZooKeeper, but it might not be obvious why you should use ClickHouse Keeper instead of ZooKeeper. This article discusses some of the benefits of Keeper.

Answer

ClickHouse Cloud uses clickhouse-keeper at large scale for thousands of services in a multi-tenant environment. We designed and built Keeper so that we could remove our dependency on the Java-based ZooKeeper implementation. ClickHouse Keeper solves many well-known drawbacks of ZooKeeper and makes additional improvements, including:

  • Snapshots and logs consume much less disk space due to better compression
  • No limit on the default packet and node data size (it is 1 MB in ZooKeeper)
  • No zxid overflow issue (it forces a restart for every 2B transactions in ZooKeeper)
  • Faster recovery after network partitions due to the use of a better distributed consensus protocol
  • It uses less memory for the same volume of data
  • It is easier to setup, and it does not require specifying the JVM heap size or a custom garbage collection implementation
  • A few custom commands in the protocol enable faster operations in ReplicatedMergeTree tables
  • A larger coverage by Jepsen tests

In addition, ClickHouse Support has observed a massive decrease in cluster problems in cases with sites who use clickhouse-keeper rather than ZooKeeper.

Check out the Keeper docs page for more details on how to configure and run ClickHouse Keeper.