DRBD

Getting state

| Command | Description |
| --- | --- |
| watch -n1 -d 'cat /proc/drbd' | Shows you the actual state and connection |
| drbd-overview | Shows you the state and connection with a bit less detail |

Description of output

If you ask yourself what the last two lines of cat /proc/drbd mean, here is a summary:

| Short | Long | Description |
| --- | --- | --- |
| cs | Connection State | This is the connection state. Possible states: Connected[^1], WFConnection[^2], StandAlone[^3], Disconnecting[^4], Unconnected[^5], WFReportParams[^6], SyncSource[^7], SyncTarget[^8] |
| ro | Role State | This is the role of the nodes. The first role is the local one, and the second role is the remote one. Possible states: Primary[^9], Secondary[^10], Unknown[^11] |
| ds | Disk State | This is the disk state. The first state is the local disk state, and the second state is the remote disk state. Possible states: Diskless[^12], Attaching[^13], Failed[^14], Negotiating[^15], Inconsistent[^16], Outdated[^17], DUnknown[^18], Consistent[^19], UpToDate[^20] |
| p | Replication Protocol | Protocol used by the resource. Either A[^21], B[^22] or C[^23] |
| I/O | I/O State | The state flag field (e.g. r-----) contains information about the current state of I/O operations associated with the resource. Possible states: r[^24], a[^25], p[^26], u[^27], locally blocked (d[^28], b[^29], n[^30], a[^31]), s[^32] |
| ns | Network Send | The amount of data that has been sent to the secondary instance via the network connection (in KB). |
| nr | Network Receive | The amount of data that has been received from the primary instance via the network connection (in KB). |
| dw | Disk Write | The amount of data that has been written to the local disk (in KB). |
| dr | Disk Read | The amount of data that has been read from the local disk (in KB). |
| al | Activity Log | The number of updates of the activity log area of the meta data. |
| bm | Bitmap | The number of bitmap updates of the meta data. This is not the number of bits set in the bitmap. |
| lo | Local Count | The number of requests that the drbd user-land process has issued but that have not been answered yet by the drbd kernel module (open requests). |
| pe | Pending | The number of requests that have been sent to the network layer by the drbd kernel module but that have not been acknowledged yet by the drbd peer. |
| ua | Unacknowledged | The number of requests that have been received by the drbd peer via the network connection but that have not been answered yet. |
| ap | Application Pending | The number of block I/O requests forwarded to drbd, but not yet answered by drbd. |
| ep | Epochs | The number of Epoch objects. An Epoch object is internally used by drbd to track write requests that need to be replicated. Usually 1, might increase under I/O load. |
| wo | Write Order | Currently used write ordering method (b = barrier, f = flush, d = drain, n = none) |
| oos | Out of Sync | The amount of data that is out of sync (in KB) |

[^1]: The normal and operating state; the host is communicating with its peer.

[^2]: The host is waiting for its peer node connection; usually seen when the other node is rebooting.

[^3]: The node is functioning alone because of a lack of network connection with its peer. It does not try to reconnect. If the cluster is in this state, it means that data is not being replicated. Manual intervention is required to fix this problem.

[^4]: Temporary state during disconnection. The next state is StandAlone.

[^5]: Temporary state, prior to a connection attempt. Possible next states: WFConnection and WFReportParams.

[^6]: TCP connection has been established; this node waits for the first network packet from the peer.

[^7]: Synchronization is currently running, with the local node being the source of synchronization.

[^8]: Synchronization is currently running, with the local node being the target of synchronization.

[^9]: The resource is currently in the primary role, and may be read from and written to. This role only occurs on one of the two nodes.

[^10]: The resource is currently in the secondary role. It normally receives updates from its peer (unless running in disconnected mode), but may neither be read from nor written to. This role may occur on one or both nodes.

[^11]: The resource's role is currently unknown. The local resource role never has this status. It is only displayed for the peer's resource role, and only in disconnected mode.

[^12]: No local block device has been assigned to the DRBD driver. This may mean that the resource has never attached to its backing device, that it has been manually detached using drbdadm detach, or that it automatically detached after a lower-level I/O error.

[^13]: Transient state while reading meta data.

[^14]: Transient state following an I/O failure report by the local block device. Next state: Diskless

[^15]: Transient state when an Attach is carried out on an already-Connected DRBD device.

[^16]: The data is inconsistent. This status occurs immediately upon creation of a new resource, on both nodes (before the initial full sync). Also, this status is found on one node (the synchronization target) during synchronization.

[^17]: Resource data is consistent, but outdated.

[^18]: This state is used for the peer disk if no network connection is available.

[^19]: Consistent data of a node without connection. When the connection is established, it is decided whether the data is UpToDate or Outdated.

[^20]: Consistent, up-to-date state of the data. This is the normal state.

[^21]: Asynchronous replication protocol. Local write operations on the primary node are considered completed as soon as the local disk write has finished and the replication packet has been placed in the local TCP send buffer. In the event of forced fail-over, data loss may occur. The data on the standby node is consistent after fail-over; however, the most recent updates performed prior to the crash could be lost. Protocol A is most often used in long distance replication scenarios. When used in combination with DRBD Proxy it makes an effective disaster recovery solution.

[^22]: Memory synchronous (semi-synchronous) replication protocol. Local write operations on the primary node are considered completed as soon as the local disk write has occurred and the replication packet has reached the peer node. Normally, no writes are lost in case of forced fail-over. However, in the event of simultaneous power failure on both nodes and concurrent, irreversible destruction of the primary's data store, the most recent writes completed on the primary may be lost.

[^23]: Synchronous replication protocol. Local write operations on the primary node are considered completed only after both the local and the remote disk write have been confirmed. As a result, loss of a single node is guaranteed not to lead to any data loss. Data loss is, of course, inevitable even with this replication protocol if both nodes (or their storage subsystems) are irreversibly destroyed at the same time.

[^24]: I/O suspension. r = running, s = suspended I/O. Normally r.

[^25]: Serial resynchronization. Set when the resource is awaiting resynchronization but has deferred it because of a resync-after dependency. Normally -.

[^26]: Peer-initiated sync suspension. Set when the resource is awaiting resynchronization, but the peer node has suspended it for any reason. Normally -.

[^27]: Locally initiated sync suspension. Set when the resource is awaiting resynchronization, but a user on the local node has suspended it. Normally -.

[^28]: Locally blocked I/O. Blocked for a reason internal to DRBD, such as a transient disk state. Normally -.

[^29]: Locally blocked I/O. Backing device I/O is blocking. Normally -.

[^30]: Locally blocked I/O. Congestion on the network socket. Normally -.

[^31]: Locally blocked I/O. Simultaneous combination of blocking device I/O and network congestion. Normally -.

[^32]: Activity log update suspension. Set when updates to the activity log are suspended. Normally -.

Create drbd on lvm

To create the DRBD, you first need to set up the disk/partition/LV. A short summary below, using an LV:

$ pvcreate /dev/sdx
$ vgcreate drbdvg /dev/sdx
$ lvcreate --name r0lv --size 10G drbdvg
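
To double-check that the LV was created with the expected size, e.g.:

$ lvs drbdvg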

and of course you need to have the package installed ;)

$ apt install drbd-utils

Next is to create the DRBD configuration. In our sample we use r0 as the resource name.

In it you specify the hosts which are part of the DRBD cluster and where the DRBD data gets stored.

This config needs to be present on all DRBD cluster members; the same goes, of course, for the drbd-utils package and the space needed to store the DRBD data.

$ cat << EOF >  /etc/drbd.d/r0.res
resource r0 {
  device    /dev/drbd0;
  disk      /dev/drbdvg/r0lv;
  meta-disk internal;

  on server01 {
    address   10.0.0.1:7789;
  }
  on server02 {
    address   10.0.0.2:7789;
  }
}
EOF
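
Since the config must be identical on all cluster members, you can simply copy it over; for example, assuming the hostname server02 from the sample config is resolvable:

$ scp /etc/drbd.d/r0.res server02:/etc/drbd.d/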

Now we are ready to create the resource r0 in DRBD and start up the service:

$ drbdadm create-md r0
$ systemctl start drbd.service
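
If the service should also come up after a reboot, enable it as well:

$ systemctl enable drbd.service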

You can also bring the DRBD up manually by running the following:

$ drbdadm up r0

Make sure that the members are now connected to each other by checking drbd-overview or cat /proc/drbd:

$ cat /proc/drbd
version: 8.4.10 (api:1/proto:86-101)
srcversion: 12341234123412341234123
 0: cs:Connected ro:Secondary/Secondary ds:UpToDate/UpToDate C r-----
    ns:0 nr:100 dw:100 dr:0 al:1 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0

If it looks like the above, you are good to go. If not, you need to figure out why the connection is not getting established; check with tcpdump and so on.

Now we set one of the members to primary:

$ drbdadm primary --force r0

If you are facing issues with the command above, use this one:

$ drbdadm -- --overwrite-data-of-peer primary r0
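
Once one node is primary, you would typically create a filesystem on the DRBD device and mount it. A minimal sketch, assuming XFS and the mount point used further down this page:

$ mkfs.xfs /dev/drbd0
$ mkdir -p /mnt/drbd_r0_data
$ mount /dev/drbd0 /mnt/drbd_r0_data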

Extend drbd live

To extend a DRBD, you first need to extend the underlying LV/PV/partition/MD or whatever you use, on all DRBD cluster members. In our sample we go with an LV.

# connect to the master and extend the LV
$ lvextend -L +[0-9]*G /dev/<drbdvg>/<drbdlv>         # e.g. lvextend -L +24G /dev/drbdvg/r0lv

# connect to the slave and do the same (be careful, it must have the !!! SAME SIZE !!!)
$ lvextend -L +[0-9]*G /dev/<drbdvg>/<drbdlv>         # e.g. lvextend -L +24G /dev/drbdvg/r0lv
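
To verify that both members really ended up with the same size, you can compare the exact byte counts, e.g.:

$ lvs --units b drbdvg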

Now you should start to monitor the DRBD state with one of the commands in Getting state.

On the primary server, we perform the resize command. Right after you have executed it, you will see that DRBD starts to sync the new “data” to the other cluster members.

$ drbdadm resize r0

This resync can take a while, depending on your drbd size, network, hardware,…

If you have more than one DRBD resource, you can use the keyword all instead of the resource name, but make sure that you have prepared everything:

$ drbdadm resize all

Let's assume the resync has finished; now you are ready to extend the filesystem inside the DRBD itself. Again, run this on the primary server:

$ xfs_growfs /mnt/drbd_r0_data
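
Afterwards you can check the new filesystem size, e.g.:

$ df -h /mnt/drbd_r0_data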

Remove DRBD resource/device

Let's assume we want to remove the resource r1.

First you need to see which resources you have:

$ drbd-overview
NOTE: drbd-overview will be deprecated soon.
Please consider using drbdtop.

 0:r0/0  Connected    Secondary/Primary UpToDate/UpToDate
 1:r1/0  Connected    Secondary/Primary UpToDate/UpToDate

If the system where you are currently connected is set to Secondary, you are good already; otherwise you need to change it first to reach that state.
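
Assuming this node is currently primary for r1, that would be:

$ drbdadm secondary r1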

Now you can disconnect it by running drbdadm disconnect r1; drbd-overview or a cat /proc/drbd will then show you the state StandAlone.

The next step is to detach it like this: drbdadm detach r1. If you check again, drbd-overview will look different from cat /proc/drbd:

$ drbd-overview | grep r1
 1:r1/0  .         .                 .

$ cat /proc/drbd | grep "1:"
 1: cs:Unconfigured

Good so far. As you don't want to keep the data on there, you should wipe it:

$ drbdadm wipe-md r1

Do you really want to wipe out the DRBD meta data?
[need to type 'yes' to confirm] yes

Wiping meta data...
DRBD meta data block successfully wiped out.

echo "yes" | drbdadm wipe-md r1 works as well, if you need it in a script.

Now we are nearly done; next is to remove the minor. del-minor wants the minor number, which you can see in drbd-overview 2>&1 when you pipe it through the greps shown below.
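
For example, to list the configured minors and their numbers:

$ drbd-overview 2>&1 | grep -E '^ *[0-9]:' | grep -E "[0-9]+"

With the minor number at hand, remove it: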

$ drbdsetup del-minor 1

Now we are good to go and can fully remove the resource:

$ drbdsetup del-resource r1

The last step is to remove the resource file /etc/drbd.d/r1.res, if you don't have that automated ;)
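
That is simply:

$ rm /etc/drbd.d/r1.res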

Solving issues

One part of drbd is corrupt

Assuming r0 is your resource name.

First we want to disconnect the cluster. Run the commands on one of the servers, mostly done on the corrupted one:

$ drbdadm disconnect r0
$ drbdadm detach r0

If they do not disconnect, restart the drbd service.
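
For example:

$ systemctl restart drbd.service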

Now remove the messed-up device and start to recreate it:

$ drbdadm wipe-md r0
$ drbdadm create-md r0

If you had to stop the drbd service, make sure that it is started again.

The next step is to go to the server which holds the working data and run:

$ drbdadm connect r0

If it's not working, or they are in the Secondary/Secondary state, run (only after they are in sync):

$ drbdadm -- --overwrite-data-of-peer primary r0

Situation Primary/Unknown - Secondary/Unknown

Connect to the slave and run:

$ drbdadm -- --discard-my-data connect all

If the secondary returns:

r0: Failure: (102) Local address(port) already in use.
Command 'drbdsetup-84 connect r0 ipv4:10.42.13.37:7789 ipv4:10.13.37.42:7789 --max-buffers=40k --discard-my-data' terminated with exit code 10

then just perform a drbdadm disconnect r0 and run the command from above again.

Connect to the master:

$ drbdadm connect all

Situation primary/primary

Option 1

Connect to the server which should be secondary.

Just make sure that this one really has no needed data on it:

$ drbdadm secondary r0

Option 2

Connect to the real master and run the following to make it the only primary:

$ drbdadm -- --overwrite-data-of-peer primary r0

Now you have the states Primary/Unknown and Secondary/Unknown.

Connect to the slave and discard its data:

$ drbdadm -- --discard-my-data connect all

Situation r0 Unconfigured

DRBD shows this status on the slave:

$ drbd-overview
Please consider using drbdtop.

 0:r0/0  Unconfigured . .

Run drbdadm up to bring the device up again:

$ drbdadm up r0

and check the status:

$ drbd-overview
Please consider using drbdtop.

 0:r0/0  SyncTarget Secondary/Primary Inconsistent/UpToDate
    [=================>..] sync'ed: 94.3% (9084/140536)K

Situation Connected Secondary/Primary Diskless/UpToDate

$ cat /proc/drbd
version: 8.4.10 (api:1/proto:86-101)
srcversion: 473968AD625BA317874A57E
 0: cs:Connected ro:Secondary/Primary ds:Diskless/UpToDate C r-----
     ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0

Recreate the resource, as it seems it was not fully created, and bring it up:

$ drbdadm create-md r0
$ drbdadm up r0