
ClickHouse Cluster with ClickHouse Keeper

This project sets up a ClickHouse cluster with one shard and two replicas, coordinated by ClickHouse Keeper. Each component is containerized using Docker Compose.

Features

  • 1 Shard, 2 Replicas: Provides data replication and failover capabilities.
  • ClickHouse Keeper: Manages cluster state and leader election.
  • Separation of Components: Each ClickHouse and Keeper instance is defined in its own Docker Compose file, allowing the components to be deployed flexibly across different servers (a minimal sketch follows this list).
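
As a rough illustration of that separation, a single component's compose file might look like the sketch below. The image tag, ports, and volume path are assumptions for illustration, not the repo's actual contents:

    # clickhouse-01/docker-compose.yml (illustrative sketch)
    services:
      clickhouse-01:
        image: clickhouse/clickhouse-server:24.3   # version is an assumption
        hostname: clickhouse-01
        ports:
          - "8123:8123"   # HTTP interface
          - "9000:9000"   # native TCP interface
        volumes:
          - ./config.xml:/etc/clickhouse-server/config.d/config.xml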

Project Structure

  • clickhouse-01: First replica of the ClickHouse database.
  • clickhouse-02: Second replica of the ClickHouse database.
  • clickhouse-keeper-01, clickhouse-keeper-02, clickhouse-keeper-03: ClickHouse Keeper instances.
  • ch-proxy: Proxy to route requests to the ClickHouse cluster.

Setup

  1. Clone the repository:

    git clone https://github.com/saratqpy/clickhouse-cluster.git
  2. Start the containers using Docker Compose (since each component has its own compose file, run this in every component's directory):

    docker-compose up -d
  3. Verify the setup by connecting to the cluster:

    docker exec -it clickhouse-01 clickhouse-client
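
Optionally, confirm that the Keeper ensemble is healthy. ClickHouse Keeper answers ZooKeeper-style four-letter-word commands on its client port (9181 by default); the commands below assume that port is published to the host:

    # Should answer "imok" if the Keeper instance is serving requests
    echo ruok | nc 127.0.0.1 9181

    # Prints detailed state (role, connections, latency)
    echo mntr | nc 127.0.0.1 9181

From clickhouse-client, you can also verify that ClickHouse itself can reach Keeper:

    SELECT * FROM system.zookeeper WHERE path = '/';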

Creating the Clustered Database and Tables

  1. Open DBeaver or any SQL client and create a new SQL script:

    CREATE DATABASE test ON CLUSTER cluster_1S_2R;

    The cluster name cluster_1S_2R is defined in config.xml under the remote_servers section (a sketch of that section follows this list).

  2. Create a replicated table across the cluster:

    CREATE TABLE test.my_table ON CLUSTER cluster_1S_2R (
        id UInt32,
        name String
    ) ENGINE = ReplicatedMergeTree('/clickhouse/tables/{shard}/my_table', '{replica}') 
    ORDER BY id;

    The ReplicatedMergeTree engine keeps an identical copy of the data on every replica in the shard. The {shard} and {replica} placeholders in the ZooKeeper path are expanded from each node's macros (see the queries after this list).

  3. Create a distributed table:

    CREATE TABLE test.my_table_distributed ON CLUSTER cluster_1S_2R AS test.my_table
    ENGINE = Distributed(cluster_1S_2R, test, my_table, rand());

    The Distributed table routes reads and writes across the shards of the cluster (here there is only one shard). Replicas within a shard hold the same data because of ReplicatedMergeTree; rand() is the sharding key used to pick a shard for each inserted row.

  4. View cluster information:

    SELECT * FROM system.clusters;
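
For reference, the remote_servers section that step 1 relies on would look roughly like the sketch below; the hostnames and ports are assumptions based on the container names in this project, so check the actual config.xml in the repo:

    <remote_servers>
        <cluster_1S_2R>
            <shard>
                <replica>
                    <host>clickhouse-01</host>
                    <port>9000</port>
                </replica>
                <replica>
                    <host>clickhouse-02</host>
                    <port>9000</port>
                </replica>
            </shard>
        </cluster_1S_2R>
    </remote_servers>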
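
To see how the {shard} and {replica} placeholders from step 2 are resolved, and to check that replication is healthy, run on any node:

    -- Shows the macro substitutions configured for the current node
    SELECT * FROM system.macros;

    -- Confirms the replicated table is attached and its replicas are active
    SELECT database, table, replica_name, total_replicas, active_replicas
    FROM system.replicas
    WHERE table = 'my_table';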

Inserting Data

There are two ways to insert data into the tables:

  1. Manual SQL Execution:

    INSERT INTO test.my_table_distributed VALUES (1, 'abc');

    Note: an insert sent directly to a specific node will throw an error if that node is down; the proxy approach below avoids this.

  2. Using CH Proxy: CH Proxy automatically detects nodes that are down and avoids sending requests to them. Here's an example of inserting data through the proxy (a replication check follows this list):

    curl -u admin_1S_2R:1 -X POST http://127.0.0.1:80/ -d "INSERT INTO test.my_table VALUES (1, 'abc')"
    • admin_1S_2R: Username set in the config.yml file for CH Proxy.
    • 1: Password defined in the same file.
    • Port 80: The configured port for CH Proxy.
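
After an insert, you can confirm that the row reached both replicas. The clusterAllReplicas table function, available in recent ClickHouse versions, queries every replica in the cluster:

    -- Expect one row per replica, each reporting the same count
    SELECT hostName() AS node, count() AS cnt
    FROM clusterAllReplicas('cluster_1S_2R', test.my_table)
    GROUP BY node;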

Querying Data

To query data and see which node executed the query:

    curl -u admin_1S_2R:1 -X POST http://127.0.0.1:80/ -d "SELECT hostName(), * FROM test.my_table LIMIT 1"

The hostName() function returns the name of the node that executed the query.
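
Since the proxy load-balances across replicas, repeating the same query can return different host names. A quick way to observe this, assuming the proxy rotates between healthy replicas:

    # Each request may be routed to a different replica
    for i in 1 2 3; do
      curl -s -u admin_1S_2R:1 -X POST http://127.0.0.1:80/ -d "SELECT hostName()"
    done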
