• +91-8107108740
  • F-5, F-6 4th Floor Dana Pani Restaurant, Central Spine, Vidhyadhar Nagar Jaipur.
UseCase

The client requested assistance in setting up Cross-Cluster Replication (CCR) between two Elasticsearch clusters: one serving as the leader cluster and the other as the follower cluster. The client sought a detailed step-by-step guide to configure CCR in a Windows environment, including data synchronization and replication processes.

Upon receiving the request, the expert gathered details about the client’s Elasticsearch setup, including:

· Cluster information: Version, single-node setup, and network configuration.

· Licenses: Ensuring both clusters support CCR (requires Platinum or Enterprise license)

· Security: Confirmation that the clusters will operate without security for simplicity

· Replication scope: Identifying the indices requiring replication.

After gathering the details, the expert provided the client with a structured implementation plan and scheduled a follow-up meeting to address queries.

The expert designed a comprehensive plan to implement CCR as follows:

1. Pre-Configuration Preparations

A. Prerequisites:

Install Elasticsearch (7.16 or higher) on both leader and follower machines. Ensure both clusters can communicate over the network. Set up Java Runtime Environment (if not included in your Elasticsearch distribution). Verify that both clusters have compatible configurations and licenses.

B. Take Snapshots:

1. On the leader cluster:

· Create a backup of indices you wish to replicate

PUT _snapshot/my_backup/snapshot_1

· Confirm the snapshot is successful.

2. On the follower cluster:

· Import the snapshot repository to ensure access to any critical pre-existing data.

2. Setup Leader and Follower Clusters:

A. Configure Elasticsearch YAML Files:

1. On the Leader Cluster:

				

cluster.name: leader-cluster

node.name: leader-node

network.host: 0.0.0.0

http.port: 9200

discovery.seed_hosts: []

cluster.initial_master_nodes: ["leader-node"]

2. On the Follower Cluster:

				

cluster.name: follower-cluster

node.name: follower-node

network.host: 0.0.0.0

http.port: 9200

discovery.seed_hosts: []

cluster.initial_master_nodes: ["follower-node"]

3. Restart Elasticsearch services after modifying the configuration

B. Connect Clusters:

Use the proxy mode for CCR to simplify the setup in Windows environments:

				

PUT /_cluster/settings

{

"persistent": {

"cluster.remote.leader-cluster.mode": "proxy",

"cluster.remote.leader-cluster.proxy_address": "LEADER_CLUSTER_IP:9300",

"cluster.remote.leader-cluster.skip_unavailable": true

}

}

3. Create Leader Index for Replication

A. On the leader cluster, create an index for replication:

				

PUT /leader-index

{

"settings": {

"index.number_of_shards": 1,

"index.number_of_replicas": 0

}

}

B. Add data to the leader index:

				

POST /leader-index/_doc

{

"field1": "value1",

"field2": "value2"

}

4. Set Up CCR on Follower Cluster

A. Create a Follower Index:

				

PUT /follower-index/_ccr/follow

{

"remote_cluster": "leader-cluster",

"leader_index": "leader-index"

}

B. Verify Replication:

POST /leader-index/_doc

C. Test Data Replication:

1. Add data to the leader index:

				

POST /leader-index/_doc

{

"field1": "new value",

"field2": "additional data"

}

2. Query the follower index to confirm the replication:

GET /follower-index/_search

5. Post-Setup Monitoring

A. Cluster Health Checks:

Monitor the health of both clusters:

GET /_cluster/health

B. Log Monitoring:

Monitor Elasticsearch logs for any errors or warnings related to CCR:

On Windows, check logs in the logs directory of the Elasticsearch installation.

C. Performance Metrics:

Review replication performance using:

GET /_ccr/stats

The solution provided a step-by-step approach for setting up Cross-Cluster Replication in Elasticsearch. By ensuring proper pre-configuration, establishing reliable communication between clusters, and systematically setting up leader and follower indices, the client can achieve seamless data replication.

Benefits:

· High Availability: Data is replicated for fault tolerance.

· Real-Time Syncing: Changes on the leader cluster are mirrored in near real-time to the follower cluster.

· Ease of Management: Proxy mode simplifies cluster communication setup. The client is now equipped with a detailed guide to implement and maintain CCR, ensuring reliable data synchronization between the two clusters.