Set up PMM in High Availability (HA) mode¶

High Availability (HA) is a critical aspect of any monitoring system, as it ensures that your monitoring infrastructure remains resilient and continues to function seamlessly, even if one or more instances encounter issues. HA implements redundant systems that are ready to take over to minimize downtime and maintain continuous visibility into the performance and health of PMM.

HA options PMM¶

Since HA can add complexity, before considering HA for PMM, keep in mind that:

Critical systems requiring immediate response benefit from sub-second failover HA, while less critical applications with some tolerance for downtime (seconds or minutes) have more flexibility.
PMM itself has a one-minute minimum alerting interval, so even with perfect HA, the fastest you’ll know about an issue is one minute after it occurs.
Consider your specific uptime needs, performance requirements, and potential data loss you can tolerate, while also keeping in mind PMM’s limitations.

1. Simple Docker restart with data caching¶

The most straightforward approach to increase availability in PMM is to launch the PMM Server within Docker using the --restart=always flag. See Setting up PMM Server with Docker for more information.

This ensures that the PMM Server automatically restarts if a minor issue occurs. Additionally, PMM’s data caching feature stores data locally on the PMM Client when the connection to the PMM Server is interrupted.

Once the connection is restored, the cached data is transferred to the PMM Server, ensuring no data loss during the restart process.

This option is suitable for scenarios where the primary concern is the ability to investigate potential issues later. However, it’s important to note that this approach is limited by the underlying physical infrastructure. If the failure stems from a hardware issue, automatic recovery might be challenging.

2. Leverage Kubernetes for enhanced isolation¶

If you are running PMM in a Kubernetes (K8s) environment, PMM offers a Helm chart that facilitates running PMM with enhanced isolation. See Setting up PMM Server with Helm.

In this setup, even if the physical infrastructure encounters a problem, K8s automatically handles failover, migrating the PMM instance to a healthy node.

While restarts within K8s can take up to several minutes (depending on your infrastructure configuration), PMM’s data caching ensures that information is preserved during this transition. Alerts will still be triggered to keep you informed about any issues that started during PMM’s restart and continue after PMM is back.

3. Fully-clustered PMM in Kubernetes (coming Q3/2024)¶

If you have a large deployment with numerous instances and distributed locations, you might find that a fully clustered PMM setup in Kubernetes is better suited to your needs. We are actively developing this solution, which is slated for release in Q3/2024, to cater specifically to users managing extensive and complex monitoring environments.

This option will provide a comprehensive HA solution, including clustered database setups (ClickHouse, VictoriaMetrics, and PostgreSQL). In this setup, multiple PMM instances will be configured, with one being the leader and the others as followers.

Leader election will be managed using the Raft consensus algorithm, ensuring a smooth transition of the leader role if the current leader fails. The architecture will consist of:

Multiple PMM instances for redundancy
Clustered PostgreSQL for storing metadata and configuration data
Clustered ClickHouse for storing query performance metrics (Query Analytics)
Clustered VictoriaMetrics for storing operational metrics from monitored databases and hosts
HAProxy for managing and directing network traffic to the current leader PMM instance

4. Manual setup for HA¶

Important

Manual setup for HA is feature is currently in Technical Preview. Early adopters are advised to use this feature for testing purposes only as it is subject to change.

If none of the above options work for your specific use case, consider setting up PMM in HA mode manually by following the steps below.

To enable communication and coordination among the PMM Server instances, two key protocols are used:

Gossip protocol: Enables PMM servers to discover and share information about their states. It is used for managing the PMM server list and failure detection, ensuring that all instances are aware of the current state of the cluster.
Raft protocol: Ensures that PMM servers agree on a leader and that logs are replicated among all machines to maintain data consistency.

These protocols work in tandem to ensure that the PMM Server instances can effectively store and manage the data collected from your monitored databases and systems.

In an HA configuration, three PMM Server instances are configured: one as the leader and the others as followers. The leader server handles all client requests. If the leader fails, the followers take over, minimizing downtime.

To eliminate single points of failure and provide better service level agreements (SLAs), the critical services typically bundled with PMM Server are extracted and set up as separate, clustered instances:

ClickHouse: A clustered setup of ClickHouse is used to store Query Analytics (QAN) metrics. This ensures that QAN data remains highly available and can be accessed even if one of the ClickHouse nodes fails.
VictoriaMetrics: A clustered setup of VictoriaMetrics is used to store Prometheus metrics. This provides a highly available and scalable solution for storing and querying metrics data.
PostgreSQL: A clustered setup of PostgreSQL is used to store PMM data, such as inventory and settings. This ensures that PMM’s configuration and metadata remain highly available and can be accessed by all PMM Server instances.

Prerequisites¶

You will need the following before you can begin the deployment:

Docker installed and configured on your system. If you haven’t installed Docker, you can follow this guide.

Note

The sections below provide instructions for setting up the services on both the same and separate instances. However, it is not recommended to run the services on a single machine for production purposes. This approach is only recommended for the development environment.
It is recommended to use clustered versions of PosgreSQL, Victoriametrics, Clickhouse, etc., instead of standalone versions when setting up the services.

To set up PMM in HA mode manually:

Step 1: Define environment variables¶

Before you start with the setup, define the necessary environment variables on each instance where the services will be running. These variables will be used in subsequent commands.

For all IP addresses, use the format 17.10.1.x, and for all usernames and passwords, use a string format like example.

Variable	Description
`CH_HOST_IP`	The IP address of the instance where the ClickHouse service is running or the desired IP address for the ClickHouse container within the Docker network, depending on your setup. Example: `17.10.1.2`
`VM_HOST_IP`	The IP address of the instance where the VictoriaMetrics service is running or the desired IP address for the VictoriaMetrics container within the Docker network, depending on your setup. Example: `17.10.1.3`
`PG_HOST_IP`	The IP address of the instance where the PostgreSQL service is running or the desired IP address for the PostgreSQL container within the Docker network, depending on your setup. Example: `17.10.1.4`
`PG_USERNAME`	The username for your PostgreSQL server. Example: `pmmuser`
`PG_PASSWORD`	The password for your PostgreSQL server. Example: `pgpassword`
`GF_USERNAME`	The username for your Grafana database user. Example: `gfuser`
`GF_PASSWORD`	The password for your Grafana database user. Example: `gfpassword`
`PMM_ACTIVE_IP`	The IP address of the instance where the active PMM server is running or the desired IP address for your active PMM server container within the Docker network, depending on your setup. Example: `17.10.1.5`
`PMM_ACTIVE_NODE_ID`	The unique ID for your active PMM server node. Example: `pmm-server-active`
`PMM_PASSIVE_IP`	The IP address of the instance where the first passive PMM server is running or the desired IP address for your first passive PMM server container within the Docker network, depending on your setup. Example: `17.10.1.6`
`PMM_PASSIVE_NODE_ID`	The unique ID for your first passive PMM server node. Example: `pmm-server-passive`
`PMM_PASSIVE2_IP`	The IP address of the instance where the second passive PMM server is running or the desired IP address for your second passive PMM server container within the Docker network, depending on your setup. Example: `17.10.1.7`
`PMM_PASSIVE2_NODE_ID`	The unique ID for your second passive PMM server node. Example: `pmm-server-passive2`
`PMM_DOCKER_IMAGE`	The specific PMM Server Docker image for this guide. Example: `percona/pmm-server:2`

Expected output

export CH_HOST_IP=17.10.1.2
export VM_HOST_IP=17.10.1.3
export PG_HOST_IP=17.10.1.4
export PG_USERNAME=pmmuser
export PG_PASSWORD=pgpassword
export GF_USERNAME=gfuser
export GF_PASSWORD=gfpassword
export PMM_ACTIVE_IP=17.10.1.5
export PMM_ACTIVE_NODE_ID=pmm-server-active
export PMM_PASSIVE_IP=17.10.1.6
export PMM_PASSIVE_NODE_ID=pmm-server-passive
export PMM_PASSIVE2_IP=17.10.1.7
export PMM_PASSIVE2_NODE_ID=pmm-server-passive2
export PMM_DOCKER_IMAGE=percona/pmm-server:2

Note

Ensure that you have all the environment variables from Step 1 set in each instance where you run these commands.

Step 2: Create Docker network (Optional)¶

Set up a Docker network for PMM services if you plan to run all the services on the same instance. As a result of this Docker network, your containers will be able to communicate with each other, which is essential for the High Availability (HA) mode to function properly in PMM. This step may be optional if you run your services on separate instances.

Run the following command to create a Docker network:

docker network create pmm-network --subnet=17.10.1.0/16

Step 3: Set up ClickHouse¶

ClickHouse is an open-source column-oriented database management system. In PMM, ClickHouse stores Query Analytics (QAN) metrics, which provide detailed information about your queries.

To set up ClickHouse:

Pull the ClickHouse Docker image.

docker pull clickhouse/clickhouse-server:23.8.2.7-alpine

Create a Docker volume for ClickHouse data.
```
docker volume create ch_data
```
Run the ClickHouse container.
Run services on same instanceRun services on a seperate instance
docker run -d \ --name ch \ --network pmm-network \ --ip ${CH_HOST_IP} \ -p 9000:9000 \ -v ch_data:/var/lib/clickhouse \ clickhouse/clickhouse-server:23.8.2.7-alpine
docker run -d \ --name ch \ -p 9000:9000 \ -v ch_data:/var/lib/clickhouse \ clickhouse/clickhouse-server:23.8.2.7-alpine
Note
- If you run the services on the same instance, the --network and --ip flags assign a specific IP address to the container within the Docker network created in the previous step. This IP address is referenced in subsequent steps as the ClickHouse service address.
- The --network and --ip flags are not required if the services are running on separate instances since ClickHouse will bind to the default network interface.

Step 4: Set up VictoriaMetrics¶

VictoriaMetrics provides a long-term storage solution for your time-series data. In PMM, it is used to store Prometheus metrics.

To set up VictoriaMetrics:

Pull the VictoriaMetrics Docker image.

docker pull victoriametrics/victoria-metrics:v1.93.4

Create a Docker volume for VictoriaMetrics data.
```
docker volume create vm_data
```

Run the VictoriaMetrics container.

You can either run all the services on the same instance or a separate instance.

Run services on same instanceRun services on a seperate instance

docker run -d \
--name vm \
--network pmm-network \
--ip ${VM_HOST_IP} \
-p 8428:8428 \
-p 8089:8089 \
-p 8089:8089/udp \
-p 2003:2003 \
-p 2003:2003/udp \
-p 4242:4242 \
-v vm_data:/storage \
victoriametrics/victoria-metrics:v1.93.4 \
--storageDataPath=/storage \
--graphiteListenAddr=:2003 \
--opentsdbListenAddr=:4242 \
--httpListenAddr=:8428 \
--influxListenAddr=:8089

docker run -d \
--name vm \
-p 8428:8428 \
-p 8089:8089 \
-p 8089:8089/udp \
-p 2003:2003 \
-p 2003:2003/udp \
-p 4242:4242 \
-v vm_data:/storage \
victoriametrics/victoria-metrics:v1.93.4 \
--storageDataPath=/storage \
--graphiteListenAddr=:2003 \
--opentsdbListenAddr=:4242 \
--httpListenAddr=:8428 \
--influxListenAddr=:8089

Note

If you run the services on the same instance, the --network and --ip flags are used to assign a specific IP address to the container within the Docker network created in Step 2. This IP address is referenced in subsequent steps as the VictoriaMetrics service address.
The --network and --ip flags are not required if the services are running on separate instances, as VictoriaMetrics will bind to the default network interface.

Step 5: Set up PostgreSQL¶

PostgreSQL is a powerful, open-source object-relational database system. In PMM, it’s used to store data related to inventory, settings, and other features.

To set up PostgreSQL:

Pull the Postgres Docker image.
```
docker pull postgres:14
```
Create a Docker volume for Postgres data:
```
docker volume create pg_data
```
Create a directory to store init SQL queries:
```
mkdir -p /path/to/queries
```
Replace /path/to/queries with the path where you want to store your init SQL queries.

Create an init.sql.template file in newly created directory with the following content:

CREATE DATABASE "pmm-managed";
CREATE USER <YOUR_PG_USERNAME> WITH ENCRYPTED PASSWORD '<YOUR_PG_PASSWORD>';
GRANT ALL PRIVILEGES ON DATABASE "pmm-managed" TO <YOUR_PG_USERNAME>;
CREATE DATABASE grafana;
CREATE USER <YOUR_GF_USERNAME> WITH ENCRYPTED PASSWORD '<YOUR_GF_PASSWORD>';
GRANT ALL PRIVILEGES ON DATABASE grafana TO <YOUR_GF_USERNAME>;

\c pmm-managed

CREATE EXTENSION IF NOT EXISTS pg_stat_statements;

Use sed to replace the placeholders with the environment variables and write the output to init.sql.

sed -e 's/<YOUR_PG_USERNAME>/'"$PG_USERNAME"'/g' \
    -e 's/<YOUR_PG_PASSWORD>/'"$PG_PASSWORD"'/g' \
    -e 's/<YOUR_GF_USERNAME>/'"$GF_USERNAME"'/g' \
    -e 's/<YOUR_GF_PASSWORD>/'"$GF_PASSWORD"'/g' \
    init.sql.template > init.sql

Run the PostgreSQL container.

You can either run all the services on the same instance or on a seperate instance.

Note

It is recommended to use absolute paths instead of relative paths for volume mounts.
Run services on same instanceRun services on a seperate instance
docker run -d \ --name pg \ --network pmm-network \ --ip ${PG_HOST_IP} \ -p 5432:5432 \ -e POSTGRES_PASSWORD=${PG_PASSWORD} \ -v /path/to/queries:/docker-entrypoint-initdb.d/ \ -v pg_data:/var/lib/postgresql/data \ postgres:14 \ postgres -c shared_preload_libraries=pg_stat_statements
docker run -d \ --name pg \ -p 5432:5432 \ -e POSTGRES_PASSWORD=${PG_PASSWORD} \ -v /path/to/queries:/docker-entrypoint-initdb.d \ -v pg_data:/var/lib/postgresql/data \ postgres:14 \ postgres -c shared_preload_libraries=pg_stat_statements
Replace /path/to/queries with the path to your init.sql file. This command mounts the init.sql file to the docker-entrypoint-initdb.d directory, which is automatically executed upon container startup.
Note
- If you run the services on the same instance, the --network and --ip flags are used to assign a specific IP address to the container within the Docker network created in Step 2. This IP address is referenced in subsequent steps as the PostgreSQL service address.
- The --network and --ip flags are not required if the services are running on separate instances, as PostgreSQL will bind to the default network interface.

Step 6: Running PMM Services¶

The PMM server orchestrates the collection, storage, and visualization of metrics. In our high-availability setup, we’ll have one active PMM server and two passive PMM servers.

Pull the PMM Server Docker image:
```
docker pull ${PMM_DOCKER_IMAGE}
```

Create a Docker volume for PMM-Server data:

docker volume create pmm-server-active_data
docker volume create pmm-server-passive_data
docker volume create pmm-server-passive-2_data

Run the active PMM managed server. This server will serve as the primary monitoring server.

You can either run all the services on the same instance or a separate instance.

Run services on same instanceRun services on a seperate instance

docker run -d \
--name ${PMM_ACTIVE_NODE_ID} \
--hostname ${PMM_ACTIVE_NODE_ID} \
--network pmm-network \
--ip ${PMM_ACTIVE_IP} \
-e PERCONA_TEST_PMM_DISABLE_BUILTIN_CLICKHOUSE=1 \
-e PERCONA_TEST_PMM_DISABLE_BUILTIN_POSTGRES=1 \
-e PERCONA_TEST_PMM_CLICKHOUSE_ADDR=${CH_HOST_IP}:9000 \
-e PERCONA_TEST_PMM_CLICKHOUSE_DATABASE=pmm \
-e PERCONA_TEST_PMM_CLICKHOUSE_BLOCK_SIZE=10000 \
-e PERCONA_TEST_PMM_CLICKHOUSE_POOL_SIZE=2 \
-e PERCONA_TEST_POSTGRES_ADDR=${PG_HOST_IP}:5432 \
-e PERCONA_TEST_POSTGRES_USERNAME=${PG_USERNAME} \
-e PERCONA_TEST_POSTGRES_DBPASSWORD=${PG_PASSWORD} \
-e GF_DATABASE_URL=postgres://${GF_USERNAME}:${GF_PASSWORD}@${PG_HOST_IP}:5432/grafana \
-e PMM_VM_URL=http://${VM_HOST_IP}:8428 \
-e PMM_TEST_HA_ENABLE=1 \
-e PMM_TEST_HA_BOOTSTRAP=1 \
-e PMM_TEST_HA_NODE_ID=${PMM_ACTIVE_NODE_ID} \
-e PMM_TEST_HA_ADVERTISE_ADDRESS=${PMM_ACTIVE_IP} \
-e PMM_TEST_HA_GOSSIP_PORT=9096 \
-e PMM_TEST_HA_RAFT_PORT=9097 \
-e PMM_TEST_HA_GRAFANA_GOSSIP_PORT=9094 \
-e PMM_TEST_HA_PEERS=${PMM_ACTIVE_IP},${PMM_PASSIVE_IP},${PMM_PASSIVE2_IP} \
-v pmm-server-active_data:/srv \
${PMM_DOCKER_IMAGE}

docker run -d \
--name ${PMM_ACTIVE_NODE_ID} \
-p 80:80 \
-p 443:443 \
-p 9094:9094 \
-p 9096:9096 \
-p 9094:9094/udp \
-p 9096:9096/udp \
-p 9097:9097 \
-e PERCONA_TEST_PMM_DISABLE_BUILTIN_CLICKHOUSE=1 \
-e PERCONA_TEST_PMM_DISABLE_BUILTIN_POSTGRES=1 \
-e PERCONA_TEST_PMM_CLICKHOUSE_ADDR=${CH_HOST_IP}:9000 \
-e PERCONA_TEST_PMM_CLICKHOUSE_DATABASE=pmm \
-e PERCONA_TEST_PMM_CLICKHOUSE_BLOCK_SIZE=10000 \
-e PERCONA_TEST_PMM_CLICKHOUSE_POOL_SIZE=2 \
-e PERCONA_TEST_POSTGRES_ADDR=${PG_HOST_IP}:5432 \
-e PERCONA_TEST_POSTGRES_USERNAME=${PG_USERNAME} \
-e PERCONA_TEST_POSTGRES_DBPASSWORD=${PG_PASSWORD} \
-e GF_DATABASE_URL=postgres://${GF_USERNAME}:${GF_PASSWORD}@${PG_HOST_IP}:5432/grafana \
-e PMM_VM_URL=http://${VM_HOST_IP}:8428 \
-e PMM_TEST_HA_ENABLE=1 \
-e PMM_TEST_HA_BOOTSTRAP=1 \
-e PMM_TEST_HA_NODE_ID=${PMM_ACTIVE_NODE_ID} \
-e PMM_TEST_HA_ADVERTISE_ADDRESS=${PMM_ACTIVE_IP} \
-e PMM_TEST_HA_GOSSIP_PORT=9096 \
-e PMM_TEST_HA_RAFT_PORT=9097 \
-e PMM_TEST_HA_GRAFANA_GOSSIP_PORT=9094 \
-e PMM_TEST_HA_PEERS=${PMM_ACTIVE_IP},${PMM_PASSIVE_IP},${PMM_PASSIVE2_IP} \
-v pmm-server-active_data:/srv \
${PMM_DOCKER_IMAGE}

Run the first passive PMM managed server. This server will act as a standby server, ready to take over if the active server fails.

You can either run all the services on the same instance or a separate instance.

Run services on same instanceRun services on a seperate instance

docker run -d \
--name ${PMM_PASSIVE_NODE_ID} \
--hostname ${PMM_PASSIVE_NODE_ID} \
--network pmm-network \
--ip ${PMM_PASSIVE_IP} \
-e PERCONA_TEST_PMM_DISABLE_BUILTIN_CLICKHOUSE=1 \
-e PERCONA_TEST_PMM_DISABLE_BUILTIN_POSTGRES=1 \
-e PERCONA_TEST_PMM_CLICKHOUSE_ADDR=${CH_HOST_IP}:9000 \
-e PERCONA_TEST_PMM_CLICKHOUSE_DATABASE=pmm \
-e PERCONA_TEST_PMM_CLICKHOUSE_BLOCK_SIZE=10000 \
-e PERCONA_TEST_PMM_CLICKHOUSE_POOL_SIZE=2 \
-e PERCONA_TEST_POSTGRES_ADDR=${PG_HOST_IP}:5432 \
-e PERCONA_TEST_POSTGRES_USERNAME=${PG_USERNAME} \
-e PERCONA_TEST_POSTGRES_DBPASSWORD=${PG_PASSWORD} \
-e GF_DATABASE_URL=postgres://${GF_USERNAME}:${GF_PASSWORD}@${PG_HOST_IP}:5432/grafana \
-e PMM_VM_URL=http://${VM_HOST_IP}:8428 \
-e PMM_TEST_HA_ENABLE=1 \
-e PMM_TEST_HA_BOOTSTRAP=0 \
-e PMM_TEST_HA_NODE_ID=${PMM_PASSIVE_NODE_ID} \
-e PMM_TEST_HA_ADVERTISE_ADDRESS=${PMM_PASSIVE_IP} \
-e PMM_TEST_HA_GOSSIP_PORT=9096 \
-e PMM_TEST_HA_RAFT_PORT=9097 \
-e PMM_TEST_HA_GRAFANA_GOSSIP_PORT=9094 \
-e PMM_TEST_HA_PEERS=${PMM_ACTIVE_IP},${PMM_PASSIVE_IP},${PMM_PASSIVE2_IP} \
-v pmm-server-passive_data:/srv \
${PMM_DOCKER_IMAGE}

docker run -d \
--name ${PMM_PASSIVE_NODE_ID} \
-p 80:80 \
-p 443:443 \
-p 9094:9094 \
-p 9096:9096 \
-p 9094:9094/udp \
-p 9096:9096/udp \
-p 9097:9097 \
-e PERCONA_TEST_PMM_DISABLE_BUILTIN_CLICKHOUSE=1 \
-e PERCONA_TEST_PMM_DISABLE_BUILTIN_POSTGRES=1 \
-e PERCONA_TEST_PMM_CLICKHOUSE_ADDR=${CH_HOST_IP}:9000 \
-e PERCONA_TEST_PMM_CLICKHOUSE_DATABASE=pmm \
-e PERCONA_TEST_PMM_CLICKHOUSE_BLOCK_SIZE=10000 \
-e PERCONA_TEST_PMM_CLICKHOUSE_POOL_SIZE=2 \
-e PERCONA_TEST_POSTGRES_ADDR=${PG_HOST_IP}:5432 \
-e PERCONA_TEST_POSTGRES_USERNAME=${PG_USERNAME} \
-e PERCONA_TEST_POSTGRES_DBPASSWORD=${PG_PASSWORD} \
-e GF_DATABASE_URL=postgres://${GF_USERNAME}:${GF_PASSWORD}@${PG_HOST_IP}:5432/grafana \
-e PMM_VM_URL=http://${VM_HOST_IP}:8428 \
-e PMM_TEST_HA_ENABLE=1 \
-e PMM_TEST_HA_BOOTSTRAP=0 \
-e PMM_TEST_HA_NODE_ID=${PMM_PASSIVE_NODE_ID} \
-e PMM_TEST_HA_ADVERTISE_ADDRESS=${PMM_PASSIVE_IP} \
-e PMM_TEST_HA_GOSSIP_PORT=9096 \
-e PMM_TEST_HA_RAFT_PORT=9097 \
-e PMM_TEST_HA_GRAFANA_GOSSIP_PORT=9094 \
-e PMM_TEST_HA_PEERS=${PMM_ACTIVE_IP},${PMM_PASSIVE_IP},${PMM_PASSIVE2_IP} \
-v pmm-server-passive_data:/srv \
${PMM_DOCKER_IMAGE}

Run the second passive PMM managed server. Like the first passive server, this server will also act as a standby server.

You can either run all the services on the same instance or a separate instance.

Run services on same instanceRun services on a seperate instance

docker run -d \
--name ${PMM_PASSIVE2_NODE_ID} \
--hostname ${PMM_PASSIVE2_NODE_ID} \
--network pmm-network \
--ip ${PMM_PASSIVE2_IP} \
-e PERCONA_TEST_PMM_DISABLE_BUILTIN_CLICKHOUSE=1 \
-e PERCONA_TEST_PMM_DISABLE_BUILTIN_POSTGRES=1 \
-e PERCONA_TEST_PMM_CLICKHOUSE_ADDR=${CH_HOST_IP}:9000 \
-e PERCONA_TEST_PMM_CLICKHOUSE_DATABASE=pmm \
-e PERCONA_TEST_PMM_CLICKHOUSE_BLOCK_SIZE=10000 \
-e PERCONA_TEST_PMM_CLICKHOUSE_POOL_SIZE=2 \
-e PERCONA_TEST_POSTGRES_ADDR=${PG_HOST_IP}:5432 \
-e PERCONA_TEST_POSTGRES_USERNAME=${PG_USERNAME} \
-e PERCONA_TEST_POSTGRES_DBPASSWORD=${PG_PASSWORD} \
-e GF_DATABASE_URL=postgres://${GF_USERNAME}:${GF_PASSWORD}@${PG_HOST_IP}:5432/grafana \
-e PMM_VM_URL=http://${VM_HOST_IP}:8428 \
-e PMM_TEST_HA_ENABLE=1 \
-e PMM_TEST_HA_BOOTSTRAP=0 \
-e PMM_TEST_HA_NODE_ID=${PMM_PASSIVE2_NODE_ID} \
-e PMM_TEST_HA_ADVERTISE_ADDRESS=${PMM_PASSIVE2_IP} \
-e PMM_TEST_HA_GOSSIP_PORT=9096 \
-e PMM_TEST_HA_RAFT_PORT=9097 \
-e PMM_TEST_HA_GRAFANA_GOSSIP_PORT=9094 \
-e PMM_TEST_HA_PEERS=${PMM_ACTIVE_IP},${PMM_PASSIVE_IP},${PMM_PASSIVE2_IP} \
-v pmm-server-passive-2_data:/srv \
${PMM_DOCKER_IMAGE}

docker run -d \
--name ${PMM_PASSIVE2_NODE_ID} \
-p 80:80 \
-p 443:443 \
-p 9094:9094 \
-p 9096:9096 \
-p 9094:9094/udp \
-p 9096:9096/udp \
-p 9097:9097 \
-e PERCONA_TEST_PMM_DISABLE_BUILTIN_CLICKHOUSE=1 \
-e PERCONA_TEST_PMM_DISABLE_BUILTIN_POSTGRES=1 \
-e PERCONA_TEST_PMM_CLICKHOUSE_ADDR=${CH_HOST_IP}:9000 \
-e PERCONA_TEST_PMM_CLICKHOUSE_DATABASE=pmm \
-e PERCONA_TEST_PMM_CLICKHOUSE_BLOCK_SIZE=10000 \
-e PERCONA_TEST_PMM_CLICKHOUSE_POOL_SIZE=2 \
-e PERCONA_TEST_POSTGRES_ADDR=${PG_HOST_IP}:5432 \
-e PERCONA_TEST_POSTGRES_USERNAME=${PG_USERNAME} \
-e PERCONA_TEST_POSTGRES_DBPASSWORD=${PG_PASSWORD} \
-e GF_DATABASE_URL=postgres://${GF_USERNAME}:${GF_PASSWORD}@${PG_HOST_IP}:5432/grafana \
-e PMM_VM_URL=http://${VM_HOST_IP}:8428 \
-e PMM_TEST_HA_ENABLE=1 \
-e PMM_TEST_HA_BOOTSTRAP=0 \
-e PMM_TEST_HA_NODE_ID=${PMM_PASSIVE2_NODE_ID} \
-e PMM_TEST_HA_ADVERTISE_ADDRESS=${PMM_PASSIVE2_IP} \
-e PMM_TEST_HA_GOSSIP_PORT=9096 \
-e PMM_TEST_HA_RAFT_PORT=9097 \
-e PMM_TEST_HA_GRAFANA_GOSSIP_PORT=9094 \
-e PMM_TEST_HA_PEERS=${PMM_ACTIVE_IP},${PMM_PASSIVE_IP},${PMM_PASSIVE2_IP} \
-v /srv/pmm-data:/srv \
${PMM_DOCKER_IMAGE}

Note

Ensure to set the environment variables from Step 1 in each instance where you run these commands.
If you run the service on the same instance, remove the -p flags.
If you run the service on a separate instance, remove the --network and --ip flags.

Step 7: Running HAProxy¶

HAProxy provides high availability for your PMM setup by directing traffic to the current leader server via the /v1/leaderHealthCheck endpoint.

Pull the HAProxy Docker image.
```
docker pull haproxy:2.4.2-alpine
```
Create a directory to store the SSL certificate.
```
mkdir -p /path/to/certs
```
Replace /path/to/certs with the path where you want to store your SSL certificates.
Navigate to this directory and generate a new private key.
```
openssl genrsa -out pmm.key 2048
```
This command generates a 2048-bit RSA private key and saves it to a file named pmm.key.
Using the private key, generate a self-signed certificate.
```
openssl req -new -x509 -key pmm.key -out pmm.crt -days 365
```
Enter country, state, organization name, etc. when asked. Use -days 365 option for 365-day certificate validity.
Copy your SSL certificate and private key to the directory you created in step 2. Ensure that the certificate file is named pmm.crt and the private key file is named pmm.key.

Concatenate these two files to create a PEM file:
```
cat pmm.crt pmm.key > pmm.pem
```
Create a directory to store HA Proxy configuration.
```
mkdir -p /path/to/haproxy-config
```
Replace /path/to/haproxy-config with the path where you want to store your HAProxy configuration.

Create an HAProxy configuration file named haproxy.cfg.template in that directory. This configuration tells HAProxy to use the /v1/leaderHealthCheck endpoint of each PMM server to identify the leader.

global
    log stdout    local0 debug
    log stdout    local1 info
    log stdout    local2 info
    daemon

defaults
    log     global
    mode    http
    option  httplog
    option  dontlognull
    timeout connect 5000
    timeout client  50000
    timeout server  50000

frontend http_front
    bind *:80
    default_backend http_back

frontend https_front
    bind *:443 ssl crt /etc/haproxy/certs/pmm.pem
    default_backend https_back

backend http_back
    option httpchk
    http-check send meth POST uri /v1/leaderHealthCheck ver HTTP/1.1 hdr Host www
    http-check expect status 200
    server pmm-server-active-http PMM_ACTIVE_IP:80 check
    server pmm-server-passive-http PMM_PASSIVE_IP:80 check backup
    server pmm-server-passive-2-http PMM_PASSIVE2_IP:80 check backup

backend https_back
    option httpchk
    http-check send meth POST uri /v1/leaderHealthCheck ver HTTP/1.1 hdr Host www
    http-check expect status 200
    server pmm-server-active-https PMM_ACTIVE_IP:443 check ssl verify none
    server pmm-server-passive-https PMM_PASSIVE_IP:443 check ssl verify none
    server pmm-server-passive-2-https PMM_PASSIVE2_IP:443 check ssl verify none

Before starting the HAProxy container, use sed to replace the placeholders in haproxy.cfg.template with the environment variables, and write the output to haproxy.cfg.

sed -e "s/PMM_ACTIVE_IP/$PMM_ACTIVE_IP/g" \
    -e "s/PMM_PASSIVE_IP/$PMM_PASSIVE_IP/g" \
    -e "s/PMM_PASSIVE2_IP/$PMM_PASSIVE2_IP/g" \
    /path/to/haproxy.cfg.template > /path/to/haproxy.cfg

Run the HAProxy container.

docker run -d \
  --name haproxy \
  --network pmm-network \
  -p 80:80 \
  -p 443:443 \
  -v /path/to/haproxy-config:/usr/local/etc/haproxy \
  -v /path/to/certs:/etc/haproxy/certs \
  haproxy:2.4.2-alpine

Replace /path/to/haproxy-config with the path to the haproxy.cfg file you created in step 6, and /path/to/certs with the path to the directory containing the SSL certificate and private key.

Note

It is recommended to use absolute paths instead of relative paths for volume mounts.
If you’re running services on separate instances, you can remove the --network flag.

HAProxy is now configured to redirect traffic to the leader PMM managed server. This ensures highly reliable service by redirecting requests to the remainder of the servers in the event that the leader server goes down.

Step 8: Access PMM¶

You can access the PMM web interface via HAProxy once all the components are set up and configured:

Access the PMM services by navigating to https://<HAProxy_IP> in your web browser. Replace <HAProxy_IP> with the IP address or hostname of the machine running the HAProxy container.
You should now see the PMM login screen. Log in using the default credentials, unless you changed them during setup.
You can use the PMM web interface to monitor your database infrastructure, analyze metrics, and perform various database management tasks.

When you register PMM Clients, you must use the HAProxy IP address (or hostname) rather than the PMM Server address once your PMM environment has been set up in high-availability (HA) mode. Even if one PMM server becomes unavailable, clients will still be able to communicate with the servers.

You have now successfully set up PMM in HA mode using Docker containers. Your PMM environment is more resilient to failures and can continue providing monitoring services if any of the instances fail.

Note

Ensure that all containers are running and accessible. You can use docker ps to check the status of your Docker containers. If a container is not running, you can view its logs using the command docker logs <container_name> to investigate the issue.

Get expert help¶

If you need assistance, visit the community forum for comprehensive and free database knowledge, or contact our Percona Database Experts for professional support and services.

Community Forum Get a Percona Expert