With the growing demand for databases for several applications in the modern world, scaling them is equally important to increase the database capability to store and process more data. Achieving the same can be critical. Database scaling can be done horizontally or vertically. While vertical scaling means adding resources such as a vCPU's, disk, or memory to the existing nodes, horizontal scaling deals with adding nodes with the existing nodes behind the load balancer. Horizontal scalability eliminates the need to add the new hardware in the existing node and cuts the related cost too.
Implementing horizontal scalability is not as easy as it appears and demands development effort. PostgreSQL is a stable database management system, known for its resilience and integrity. The qualities of this open-source relational database management system make it preferable for large-scale web, mobile, and analytical applications. Due to its added stability, PostgreSQL offers both vertical and horizontal scalability. This article explains one of the most important concepts of PostgreSQL, that is, multi-master replication, and how it is implemented practically for scaling resources.
What is Multi-master Replication?
Multi-Master Replication is a concept where data is replicated in a group of connected database servers and anyone can update the database which is replicated to other systems updated by those servers. The multi-master replication adds a bi-directionality element to the process.
In Multi-master replication, multiple nodes accept write queries, and all the nodes house the same data. The primary usage of multi-master replication is ensuring all-time availability, thus when the primary server experiences failure, standby is not required.
In the further section, we will explain the actual Multi-master Replication in PostgreSQL.
PostgreSQL Multi-master Replication
PostgreSQL is by default built to offer single-master replication and multi-master replication is not present in it. At present, some vendors offer Bidirectional replication products supporting multi-master replication in PostgreSQL. The latest PostgreSQL versions are not open-source.
Multi-master Replication is achievable using Bidirectional replication (BDR), which is explained further.
PostgreSQL Multi-master Replication Process
Multimaster replication in PostgreSQL is done in six parts and starts once the system configuration is done.
Part 1: System Configuration
This is the first part of replication where BDR installation is performed with the pglogical plugin (logical replication solution). Here, postgresql. conf and pg_hba.conf files are modified and the service is restarted explained in the following steps.
Step1: Install pglogical plugin and BDR.
Step 2: Configure postgresql.conf file using the commands:
(For resolving conflict).
Step 3: Create a user and grant it superuser privileges for managing BDR connectivity.
CREATE USER user1 WITH SUPERUSER REPLICATION PASSWORD ‘user_password’;
Step 4: Alter pg_hba.conf file using the commands
Step 5: Add .pgpass file user.
Step 6: Now, restart Postgresql
The next part will explain how a single BDR node is created.
Part 2: Single BDR Node Creation
For creating the BDR node, activate BDR on db1 and the host ‘host1’ as ‘user1’.
Step 1: Create an extension.
CREATE EXTENSION bdr CASCADE; (CASCADE is used to create the extension.)
Step 2: Initialise the current node.
SELECT bdr.create_node( node_name:= ‘initail_node’, local_dsn:= ‘dbname=db1 host= host1 user=user1’);
Step 3: Create BDR cluster definition.
Step 4: Wait for the bdr.wait_for_join_completion function to complete.
We have created the BDR cluster, now let’s create another BDR Node.
Part 3: Another BDR Node Creation
Step 1: Create a BDR extension using the command:
CREATE EXTENSION bdr CASCADE;
Step 2: Initialise the current node with the command on another host:
SELECT bdr.create_node( node_name:= ‘next_node’, local_dsn:= ‘dbname=db1 host= host2 user=user1’);
Step 3: Create the BDR cluster definition.
SELECT bdr.join_node_group(join_target_dsn:= ‘dbname=db1 host= host1 user=user1’, wait_for_completion:=True);
Part 4: HAProxy Configuration
For getting higher availability, install HAProxy software by following the steps below:
Step 1: Modify the haproxy.cfg file global section.
state socket /var/run/haproxy/sock level admin (For sending HAProxy commands and retrieving information)
Step 2: Edit the haproxy.cfg file’s bk_db.
stick-table type ip size 1
stick on dst
server bdr_initial_node host1: 5432 check
server bdr_next_node my_host2: 5432 backup check
Step 3: Reload the configuration file.
Part 5: Merge PgBouncer to HAProxy
PgBouncer pools PostgreSQL connection with which PostgreSQL interacts with the clients. HAProxy redirects traffic and PgBouncer completes the transactions. PgBouncer is using the below steps.
Step 1: In ft_postgresql, modify the haproxy.cfg file.
bind *: 5433
Step 2: Change the databases section in the pgbouncer.ini file
* = host= proxy_server port= 5433
For modifying the pgbouncer section, use:
listen_port = 5432
Step 3: Restart HAProxy and PgBouncer.
Part 6: Switching over the Node
Lastly, we will be moving away from the connections from the initial node by making some changes and reinitialising the server.
Step 1: Disable HAProxy initial_node.
echo “disable server bk_db/bdr_initial_node” | socat /var/run/haproxy/sock –
Step 2: Reconnect PgBouncer should reconnect and wait for connections to be re-established.
psql –h proxy_server –U pgbouncer pgbouncer –c “For reconnection”
psql –h proxy_server –U pgbouncer pgbouncer –c “For waiting”
Step 3: Manage the initial node before re-enabling.
Step 4: Re-enable the initial node.
echo “enable server bk_db/bdr_initial_node” | socat /var/run/haproxy/sock –
Here, we conclude the multi-master replication process in PostgreSQL, which goes through several stages: installation, upgradation, and testing. We have tried to explain how the process is accomplished using simple steps. We hope it proves to be useful for you.