How to Set Up a Redpanda Cluster in Production

Redpanda is a distributed, Kafka API-compatible streaming platform designed as a replacement for Apache Kafka, with higher performance, easier deployment, and lower cost. It suits applications that process data streams, such as log analysis, system monitoring, and real-time event handling.

1. Prerequisites

Make sure you meet the hardware and software requirements.

Operating system

  • Minimum version required of RHEL/CentOS: 8. Recommended: 9+
  • Minimum version required of Ubuntu: 20.04 LTS. Recommended: 22.04+

Recommendation: Linux kernel 4.19 or later for better performance.
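A quick way to check the kernel recommendation on each node is sketched here. The helper name version_at_least is illustrative, not part of any Redpanda tooling, and the comparison assumes GNU sort with the -V (version sort) option.

```shell
#!/usr/bin/env bash
# Return 0 if version $1 is greater than or equal to version $2.
# Relies on GNU sort's -V (version sort) option.
version_at_least() {
    local have="$1" want="$2"
    [ "$(printf '%s\n%s\n' "$want" "$have" | sort -V | head -n 1)" = "$want" ]
}

# Strip the distro suffix (for example "-91-generic") from `uname -r`.
kernel="$(uname -r | cut -d- -f1)"

if version_at_least "$kernel" "4.19"; then
    echo "kernel $kernel: OK (4.19 or later)"
else
    echo "kernel $kernel: older than the recommended 4.19"
fi
```

Run it on every node before installing; an older kernel still works with the minimum OS versions listed above, but it falls short of the recommendation.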

Number of nodes

Recommendation: deploy at least three Redpanda brokers, so the cluster keeps a quorum if one broker fails.

Tuning


Before deploying Redpanda to production, tune each node that runs Redpanda so that the Linux kernel is optimized for Redpanda processes.
See "Tune the Linux kernel for production" in Deploy for Production: Manual.

TCP/IP ports

Redpanda uses the following default ports:

Port     Purpose
9092     Kafka API
8082     HTTP Proxy
8081     Schema Registry
9644     Admin API and Prometheus
33145    Internal RPC
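Once the brokers are running, a firewall problem on any of these ports is easy to miss. The sketch below probes each default port from another machine. The function names and the short purpose labels are illustrative; it assumes bash (for the built-in /dev/tcp redirection) and the coreutils timeout command.

```shell
#!/usr/bin/env bash
# Default Redpanda ports from the table above, as "port:purpose" pairs.
REDPANDA_PORTS="9092:kafka-api 8082:http-proxy 8081:schema-registry 9644:admin-api 33145:internal-rpc"

# Return 0 if a TCP connection to host $1, port $2 succeeds within 2 seconds.
# Uses bash's built-in /dev/tcp redirection, so no extra tools are needed.
port_open() {
    timeout 2 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

# Print one status line per default port for the given host.
check_redpanda_ports() {
    local host="$1" entry port purpose
    for entry in $REDPANDA_PORTS; do
        port="${entry%%:*}"
        purpose="${entry#*:}"
        if port_open "$host" "$port"; then
            echo "$host:$port ($purpose) reachable"
        else
            echo "$host:$port ($purpose) NOT reachable"
        fi
    done
}

# Example: check_redpanda_ports 192.168.1.100
```

A port reported as not reachable can also mean that the service behind it is simply not started yet, so read the result together with the service status.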

2. Set Up the Redpanda Cluster

2.1. Install Redpanda

Install Redpanda on each system you want to be part of your cluster. There are binaries available for Fedora/RedHat or Debian systems.

Unless you intend to run Redpanda in FIPS-compliance mode, the following packages should accommodate your needs (for both Debian and Redhat based systems):

redpanda

  • Contains the Redpanda application and all supporting libraries
  • Depends on redpanda-tuners and either redpanda-rpk or redpanda-rpk-fips

redpanda-rpk

  • Contains the rpk application, compiled from pure Go
  • If you want to use rpk only, this is the only package you need to install

redpanda-tuners

  • Contains the files used to run Redpanda tuners
  • Depends on redpanda-rpk or redpanda-rpk-fips

Fedora/RedHat

curl -1sLf 'https://dl.redpanda.com/nzc4ZYQK3WRGd9sy/redpanda/cfg/setup/bash.rpm.sh' | sudo -E bash && sudo yum install redpanda -y

Debian/Ubuntu


curl -1sLf 'https://dl.redpanda.com/nzc4ZYQK3WRGd9sy/redpanda/cfg/setup/bash.deb.sh' | sudo -E bash && sudo apt install redpanda -y

OUTPUT:


To get the most out of the fastest queue in the west, enable production mode by
running the following:
 
sudo rpk redpanda mode production
 
followed by:
 
sudo rpk redpanda tune all
sudo systemctl start redpanda
 
This will autotune your system to give you the best performance from Redpanda.
You can get more information on the tuning parameters here:
https://docs.redpanda.com/docs/introduction/autotune/

2.2. Install Redpanda Console

Redpanda Console is a developer-friendly web UI for managing and debugging your Redpanda cluster and your applications.
For each new release, Redpanda compiles the Redpanda Console to a single binary for Linux, macOS, and Windows. You can find the binaries in the attachments of each release on GitHub.

Fedora/RedHat


curl -1sLf 'https://dl.redpanda.com/nzc4ZYQK3WRGd9sy/redpanda/cfg/setup/bash.rpm.sh' | sudo -E bash && sudo yum install redpanda-console -y

Debian/Ubuntu


curl -1sLf 'https://dl.redpanda.com/nzc4ZYQK3WRGd9sy/redpanda/cfg/setup/bash.deb.sh' | sudo -E bash && sudo apt-get install redpanda-console -y

OUTPUT:


Unpacking redpanda-console (2.8.0) ...
Setting up redpanda-console (2.8.0) ...
redpanda:x:997:998::/var/lib/redpanda:/bin/sh
redpanda:x:998:
 
Redpanda Console is installed succesfully. To start Console, run the
following:
 
sudo systemctl start redpanda-console

2.3. Tune the Linux kernel for production

To get the best performance from your hardware, set Redpanda to production mode on each node and run the autotuner tool. The autotuner identifies the hardware configuration of your node and optimizes the Linux kernel to give you the best performance.
By default, Redpanda is installed in development mode, which turns off hardware optimization.

  1. Make sure that your current Linux user has root privileges. The autotuner requires privileged access to the Linux kernel settings.
  2. Set Redpanda to run in production mode:
sudo rpk redpanda mode production

OUTPUT: Successfully set mode to "production".

  3. Tune the Linux kernel:

sudo rpk redpanda tune all

Changes to the Linux kernel are not persisted. If a node restarts, make sure to run the autotuner again.
To automatically tune the Linux kernel on a Redpanda broker after the node restarts, enable the redpanda-tuner service, which runs rpk redpanda tune all:

  • For RHEL, after installing the rpm package, run systemctl to both start and enable the redpanda-tuner service:
sudo systemctl start redpanda-tuner
sudo systemctl enable redpanda-tuner
sudo systemctl status redpanda-tuner
  • For Ubuntu, after installing the apt package, run systemctl to start the redpanda-tuner service (which is already enabled):
sudo systemctl start redpanda-tuner
sudo systemctl enable redpanda-tuner
sudo systemctl status redpanda-tuner

For more details, see the autotuner reference.

2.4. Generate optimal I/O configuration settings


After tuning the Linux kernel, you can optimize Redpanda for the I/O capabilities of your worker node by using rpk to run benchmarks that capture its read/write IOPS and bandwidth capabilities. After running the benchmarks rpk saves the results to an I/O configuration file (io-config.yaml) that Redpanda reads upon startup to optimize itself for the node.

Note:
Unlike the autotuner, it isn’t necessary to run rpk iotune each time Redpanda is started, as its I/O output configuration file can be reused for each node that runs on the same type of hardware.

Run rpk iotune (this takes about 10 minutes):

sudo rpk iotune 

OUTPUT:


Starting iotune...
IO configuration file stored as "/etc/redpanda/io-config.yaml"

For reference, a local NVMe SSD should yield around 1 GB/s sustained writes. rpk iotune captures SSD wear and tear and gives accurate measurements of what your hardware is capable of delivering. Run this before benchmarking.
If you’re on AWS, GCP, or Azure, creating a new instance and upgrading to an image with a recent Linux kernel version is often the easiest way to work around bad devices.

2.5. Reboot all servers

sudo reboot

2.6. Configure the Redpanda servers

Example:
Node 01: 192.168.1.100
Node 02: 192.168.1.101
Node 03: 192.168.1.102

Seed servers help new brokers join a cluster by directing requests from newly started brokers to an existing cluster. The seed_servers broker configuration property controls how Redpanda finds its peers when initially forming a cluster. It depends on the empty_seed_starts_cluster broker configuration property.

Starting with Redpanda version 22.3, you should explicitly set empty_seed_starts_cluster to false on every broker, and every broker in the cluster should have the same value set for seed_servers. With this configuration, Redpanda clusters form according to these guidelines:

  • When a broker starts and it is a seed server (its address is in the seed_servers list), it waits for all other seed servers to start up, and it forms a cluster with all seed servers as members.
  • When a broker starts and it is not a seed server, it sends requests to the seed servers to join the cluster.

It is essential that all seed servers have identical values for the seed_servers list. Redpanda strongly recommends at least three seed servers when forming a cluster. Each seed server decreases the likelihood of unintentionally forming a split-brain cluster. To ensure brokers can always discover the cluster, at least one seed server should be available at all times.

By default, for backward compatibility, empty_seed_starts_cluster is set to true, and Redpanda clusters form according to the guidelines used prior to version 22.3:

  • When a broker starts with an empty seed_servers list, it creates a single-broker cluster with itself as the only member.
  • When a broker starts with a non-empty seed_servers list, it sends requests to the brokers in that list to join the cluster.

You should never have more than one broker with an empty seed_servers list, because that would result in the creation of multiple clusters.
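The node files later in this guide keep an empty seed list on Node 1, which is the pre-22.3 style. For comparison, the following sketch prints the seed-related part of redpanda.yaml in the 22.3+ style, with the same list on every broker. The function name render_seed_config is illustrative, and the RPC port 33145 is assumed to be the default on every seed server.

```shell
#!/usr/bin/env bash
# Print the seed-related part of redpanda.yaml for the 22.3+ style described
# above: the same seed list on every broker, empty_seed_starts_cluster: false.
# Arguments: the RPC addresses of all seed servers (port 33145 assumed).
render_seed_config() {
    local addr
    echo "redpanda:"
    echo "    empty_seed_starts_cluster: false"
    echo "    seed_servers:"
    for addr in "$@"; do
        echo "        - host:"
        echo "            address: $addr"
        echo "            port: 33145"
    done
}

render_seed_config 192.168.1.100 192.168.1.101 192.168.1.102
```

The output is only a fragment to merge into each broker's redpanda.yaml by hand. Because the arguments are the same on every node, the resulting seed list is identical everywhere, which is the property the guidelines above require.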

Do not configure broker IDs

Redpanda automatically generates a unique broker ID for each new broker and assigns it to the node_id field in the broker configuration. This ensures safe and consistent cluster operations without requiring manual configuration.

WARNING:


Do not set node_id manually.
Redpanda assigns unique IDs automatically to prevent issues such as:
 
- Brokers with empty disks rejoining the cluster.
- Conflicts during recovery or scaling.
 
Manually setting or reusing node_id values, even for decommissioned brokers, can cause cluster inconsistencies and operational failures.

Bootstrap broker configurations

Each broker requires a set of broker configurations that determine how all brokers communicate with each other and with clients. Bootstrapping a cluster configures the listener, seed servers, and advertised listeners, which ensure proper network connectivity and accessibility.

Node 1: ( 192.168.1.100 )


cat << FLAG | sudo tee /etc/redpanda/redpanda.yaml
redpanda:
    data_directory: /mnt/data/redpanda/data
    seed_servers: []
    rpc_server:
        address: 192.168.1.100
        port: 33145
    kafka_api:
        - address: 0.0.0.0
          port: 9092
          name: private
        - address: 0.0.0.0
          port: 19092
          name: public
    admin:
        - address: 192.168.1.100
          port: 9644
    advertised_rpc_api:
        address: 192.168.1.100
        port: 33145
    advertised_kafka_api:
        - address: 192.168.1.100
          port: 9092
          name: private
        #- address: 116.118.89.100
        #  port: 19092
        #  name: public
    rpc_server_tcp_recv_buf: 65536
rpk:
    smp: 4
    tune_network: true
    tune_disk_scheduler: true
    tune_disk_nomerges: true
    tune_disk_write_cache: true
    tune_disk_irq: true
    tune_cpu: true
    tune_aio_events: true
    tune_clocksource: true
    tune_swappiness: true
    coredump_dir: /mnt/data/redpanda/coredump
    tune_ballast_file: true
pandaproxy: {}
schema_registry: {}
cluster_id: redpanda
organization: redpanda-cluster
FLAG

Recommendations

  • Redpanda Data strongly recommends at least three seed servers when forming a cluster. A larger number of seed servers increases the robustness of consensus and minimizes any chance that new clusters get spuriously formed after brokers are lost or restarted without any data.
  • It’s important to have one or more seed servers in each fault domain (for example, in each rack or cloud AZ). A higher number provides a stronger guarantee that clusters don’t fracture unintentionally.
  • It’s possible to change the seed servers for a short period of time after a cluster has been created. For example, you may want to designate one additional broker as a seed server to increase availability. To do this without cluster downtime, add the new broker to the seed_servers property and restart Redpanda to apply the change on a broker-by-broker basis.

Listeners for mixed environments

For clusters serving both internal and external clients, configure multiple listeners for the Kafka API to separate internal from external traffic.
For more details, see Configure Listeners.

Start Redpanda on Node 1


sudo systemctl restart redpanda-tuner 
sudo systemctl enable redpanda-tuner
sudo systemctl status redpanda-tuner
 
sudo systemctl restart redpanda
sudo systemctl enable redpanda
sudo systemctl status redpanda

Start Redpanda Console on Node 1


sudo systemctl restart redpanda-console
sudo systemctl enable redpanda-console
sudo systemctl status redpanda-console

Node 2: ( 192.168.1.101 )


cat << FLAG | sudo tee /etc/redpanda/redpanda.yaml
redpanda:
    data_directory: /mnt/data/redpanda/data
    seed_servers:
        - host:
            address: 192.168.1.100
            port: 33145
        - host:
            address: 192.168.1.101
            port: 33145
        - host:
            address: 192.168.1.102
            port: 33145
    rpc_server:
        address: 192.168.1.101
        port: 33145
    kafka_api:
        - address: 0.0.0.0
          port: 9092
          name: private
        - address: 0.0.0.0
          port: 19092
          name: public
    admin:
        - address: 192.168.1.101
          port: 9644
    advertised_rpc_api:
        address: 192.168.1.101
        port: 33145
    advertised_kafka_api:
        - address: 192.168.1.101
          port: 9092
          name: private
        #- address: 116.118.89.101
        #  port: 19092
        #  name: public
    rpc_server_tcp_recv_buf: 65536
rpk:
    smp: 4
    tune_network: true
    tune_disk_scheduler: true
    tune_disk_nomerges: true
    tune_disk_write_cache: true
    tune_disk_irq: true
    tune_cpu: true
    tune_aio_events: true
    tune_clocksource: true
    tune_swappiness: true
    coredump_dir: /mnt/data/redpanda/coredump
    tune_ballast_file: true
pandaproxy: {}
schema_registry: {}
cluster_id: redpanda
organization: redpanda-cluster
FLAG
 
 
sudo systemctl restart redpanda-tuner
sudo systemctl enable redpanda-tuner
sudo systemctl status redpanda-tuner
 
sudo systemctl restart redpanda
sudo systemctl enable  redpanda
sudo systemctl status  redpanda
 
sudo systemctl restart redpanda-console
sudo systemctl enable redpanda-console
sudo systemctl status redpanda-console

Node 3: ( 192.168.1.102 )


cat << FLAG | sudo tee /etc/redpanda/redpanda.yaml
redpanda:
    data_directory: /mnt/data/redpanda/data
    seed_servers:
        - host:
            address: 192.168.1.100
            port: 33145
        - host:
            address: 192.168.1.101
            port: 33145
        - host:
            address: 192.168.1.102
            port: 33145
    rpc_server:
        address: 192.168.1.102
        port: 33145
    kafka_api:
        - address: 0.0.0.0
          port: 9092
          name: private
        - address: 0.0.0.0
          port: 19092
          name: public
    admin:
        - address: 192.168.1.102
          port: 9644
    advertised_rpc_api:
        address: 192.168.1.102
        port: 33145
    advertised_kafka_api:
        - address: 192.168.1.102
          port: 9092
          name: private
        #- address: 116.118.89.102
        #  port: 19092
        #  name: public
    rpc_server_tcp_recv_buf: 65536
rpk:
    smp: 4
    tune_network: true
    tune_disk_scheduler: true
    tune_disk_nomerges: true
    tune_disk_write_cache: true
    tune_disk_irq: true
    tune_cpu: true
    tune_aio_events: true
    tune_clocksource: true
    tune_swappiness: true
    coredump_dir: /mnt/data/redpanda/coredump
    tune_ballast_file: true
pandaproxy: {}
schema_registry: {}
cluster_id: redpanda
organization: redpanda-cluster
FLAG
 
 
sudo systemctl restart redpanda-tuner
sudo systemctl enable redpanda-tuner
sudo systemctl status redpanda-tuner
 
sudo systemctl restart redpanda
sudo systemctl enable  redpanda
sudo systemctl status  redpanda
 
sudo systemctl restart redpanda-console
sudo systemctl enable redpanda-console
sudo systemctl status redpanda-console

2.8. Verify the installation

To verify that the Redpanda cluster is up and running, use rpk to get information about the cluster:

rpk cluster info


OUTPUT

CLUSTER
=======
redpanda.3d97e763-1825-48a7-9e9b-32824bad8341
 
BROKERS
=======
ID    HOST          PORT
0*    192.168.1.100  9092
1     192.168.1.101  9092
2     192.168.1.102  9092
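To turn this check into something a script can act on, the broker rows can be counted. This is a rough sketch that assumes the output layout shown above (a BROKERS heading, a rule line, an ID header row, then one row per broker); other rpk versions may print additional columns or sections. The function names are illustrative.

```shell
#!/usr/bin/env bash
# Count broker rows in `rpk cluster info` output read from stdin.
# Assumes the layout shown above: a "BROKERS" heading, a "=======" rule,
# an "ID HOST PORT" header, then one row per broker until a blank line.
count_brokers() {
    awk '
        /^BROKERS/            { in_brokers = 1; next }
        in_brokers && /^=+$/  { next }
        in_brokers && /^ID/   { next }
        in_brokers && NF == 0 { in_brokers = 0 }
        in_brokers            { n++ }
        END                   { print n + 0 }
    '
}

# Fail unless the expected number of brokers (default 3) has joined.
# Usage: rpk cluster info | expect_brokers 3
expect_brokers() {
    local want="${1:-3}" have
    have="$(count_brokers)"
    if [ "$have" -eq "$want" ]; then
        echo "OK: $have brokers in the cluster"
    else
        echo "expected $want brokers, found $have" >&2
        return 1
    fi
}
```

Piping rpk cluster info into expect_brokers 3 on any node returns a non-zero status if a broker has not joined, which makes it usable in a deployment script.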

To create a topic:

rpk topic create <topic-name>
Example:
rpk topic create topic-hello

If topics were initially created in a test environment with a replication factor of 1, use rpk topic alter-config to change the topic replication factor:

rpk topic alter-config <topic-names> --set replication.factor=3
Example:
rpk topic alter-config topic-hello --set replication.factor=3

2.9. Access GUI console

Example: 
http://192.168.1.100:8080/

In conclusion

Setting up a Redpanda cluster is a straightforward process that combines simplicity with powerful performance. By following the installation steps and best practices, you can leverage Redpanda’s capabilities to handle real-time data streaming with ease and efficiency. Whether you’re scaling a small project or deploying a robust enterprise solution, Redpanda’s modern architecture ensures reliability and high throughput. With your cluster up and running, you’re now ready to build innovative applications and unlock the full potential of real-time data processing.
