# DRBD
## Getting state
| Command | Description |
|---|---|
| `watch -n1 -d 'cat /proc/drbd'` | Shows you the actual state and connection |
| `drbd-overview` | Shows you the state and connection with a bit less detail |
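If you want to check the state from a script rather than interactively, here is a minimal sketch based on the `cs:` field of `/proc/drbd` (the field is described in the table below):

```
#!/bin/sh
# Sketch: exit non-zero if any resource line in /proc/drbd reports a
# connection state other than Connected (see the cs: field below)
if grep 'cs:' /proc/drbd | grep -qv 'cs:Connected'; then
    echo "WARNING: at least one DRBD resource is not Connected" >&2
    exit 1
fi
echo "all DRBD resources Connected"
```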
### Description of output
If you ask yourself what the last two lines of `cat /proc/drbd` mean, here is a summary:
| Short | Long | Description |
|---|---|---|
| cs | Connection State | The connection state. Possible states: Connected[^1], WFConnection[^2], StandAlone[^3], Disconnecting[^4], Unconnected[^5], WFReportParams[^6], SyncSource[^7], SyncTarget[^8] |
| ro | Role State | The roles of the nodes. The first role is the local one, the second role is the remote one. Possible states: Primary[^9], Secondary[^10], Unknown[^11] |
| ds | Disk State | The disk states. The first state is the local disk state, the second state is the remote disk state. Possible states: Diskless[^12], Attaching[^13], Failed[^14], Negotiating[^ds], Inconsistent[^15], Outdated[^16], DUnknown[^17], Consistent[^18], UpToDate[^19] |
| p | Replication Protocol | Protocol used by the resource. Either A[^20], B[^21] or C[^22] |
| I/O | I/O State (e.g. `r-----`) | The state flag field contains information about the current state of I/O operations associated with the resource. Possible flags: r[^23], a[^24], p[^25], u[^26], locally blocked (d[^27], b[^28], n[^29], a[^30], s[^31]) |
| ns | Network Send | The amount of data that has been sent to the secondary instance via the network connection (in KB). |
| nr | Network Receive | The amount of data that has been received from the primary instance via the network connection (in KB). |
| dw | Disk Write | The amount of data that has been written to the local disk (in KB). |
| dr | Disk Read | The amount of data that has been read from the local disk (in KB). |
| al | Activity Log | The number of updates of the activity log area of the meta data. |
| bm | Bitmap | The number of bitmap updates of the meta data. This is not the number of bits set in the bitmap. |
| lo | Local Count | The number of requests that the drbd user-land process has issued but that have not been answered yet by the drbd kernel module (open requests). |
| pe | Pending | The number of requests that have been sent to the network layer by the drbd kernel module but that have not been acknowledged yet by the drbd peer. |
| ua | Unacknowledged | The number of requests that have been received by the drbd peer via the network connection but that have not been answered yet. |
| ap | Application Pending | The number of block I/O requests forwarded to drbd, but not yet answered by drbd. |
| ep | Epochs | The number of Epoch objects. An Epoch object is internally used by drbd to track write requests that need to be replicated. Usually 1, might increase under I/O load. |
| wo | Write Order | Currently used write ordering (b = barrier, f = flush, d = drain, n = none) |
| oos | Out of Sync | The amount of data that is out of sync (in KB) |
[^1]: The normal and operating state; the host is communicating with its peer.
[^2]: The host is waiting for its peer node connection; usually seen when the other node is rebooting.
[^3]: The node is functioning alone because of a lack of network connection with its peer, and does not try to reconnect. If the cluster is in this state, it means that data is not being replicated. Manual intervention is required to fix this problem.
[^4]: Temporary state during disconnection. The next state is StandAlone.
[^5]: Temporary state, prior to a connection attempt. Possible next states: WFConnection and WFReportParams.
[^6]: TCP connection has been established; this node waits for the first network packet from the peer.
[^7]: Synchronization is currently running, with the local node being the source of synchronization.
[^8]: Synchronization is currently running, with the local node being the target of synchronization.
[^9]: The resource is currently in the primary role, and may be read from and written to. This role only occurs on one of the two nodes.
[^10]: The resource is currently in the secondary role. It normally receives updates from its peer (unless running in disconnected mode), but may neither be read from nor written to. This role may occur on one or both nodes.
[^11]: The resource's role is currently unknown. The local resource role never has this status. It is only displayed for the peer's resource role, and only in disconnected mode.
[^12]: No local block device has been assigned to the DRBD driver. This may mean that the resource has never attached to its backing device, that it has been manually detached using drbdadm detach, or that it automatically detached after a lower-level I/O error.
[^13]: Transient state while reading meta data.
[^14]: Transient state following an I/O failure report by the local block device. Next state: Diskless.
[^ds]: Transient state when an Attach is carried out on an already-Connected DRBD device.
[^15]: The data is inconsistent. This status occurs immediately upon creation of a new resource, on both nodes (before the initial full sync). Also, this status is found on one node (the synchronization target) during synchronization.
[^16]: Resource data is consistent, but outdated.
[^17]: This state is used for the peer disk if no network connection is available.
[^18]: Consistent data of a node without connection. When the connection is established, it is decided whether the data is UpToDate or Outdated.
[^19]: Consistent, up-to-date state of the data. This is the normal state.
[^20]: Asynchronous replication protocol. Local write operations on the primary node are considered completed as soon as the local disk write has finished and the replication packet has been placed in the local TCP send buffer. In the event of forced fail-over, data loss may occur. The data on the standby node is consistent after fail-over; however, the most recent updates performed prior to the crash could be lost. Protocol A is most often used in long-distance replication scenarios. When used in combination with DRBD Proxy it makes an effective disaster recovery solution.
[^21]: Memory synchronous (semi-synchronous) replication protocol. Local write operations on the primary node are considered completed as soon as the local disk write has occurred and the replication packet has reached the peer node. Normally, no writes are lost in case of forced fail-over. However, in the event of simultaneous power failure on both nodes and concurrent, irreversible destruction of the primary's data store, the most recent writes completed on the primary may be lost.
[^22]: Synchronous replication protocol. Local write operations on the primary node are considered completed only after both the local and the remote disk write have been confirmed. As a result, loss of a single node is guaranteed not to lead to any data loss. Data loss is, of course, inevitable even with this replication protocol if both nodes (or their storage subsystems) are irreversibly destroyed at the same time.
[^23]: I/O suspension: r = running, s = suspended I/O. Normally r.
[^24]: Serial resynchronization: the resource is awaiting resynchronization but has deferred it because of a resync-after dependency. Normally -.
[^25]: Peer-initiated sync suspension: the resource is awaiting resynchronization, but the peer node has suspended it for any reason. Normally -.
[^26]: Locally initiated sync suspension: the resource is awaiting resynchronization, but a user on the local node has suspended it. Normally -.
[^27]: Locally blocked I/O: blocked for a reason internal to DRBD, such as a transient disk state. Normally -.
[^28]: Locally blocked I/O: backing device I/O is blocking. Normally -.
[^29]: Locally blocked I/O: congestion on the network socket. Normally -.
[^30]: Locally blocked I/O: simultaneous combination of blocking device I/O and network congestion. Normally -.
[^31]: Activity log update suspension: updates to the activity log are suspended. Normally -.
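If you only need a few of the fields above in a script, you can pull them straight out of `/proc/drbd`; a small awk sketch for the `cs`, `ro` and `ds` fields:

```
$ awk '/cs:/ { for (i = 1; i <= NF; i++) if ($i ~ /^(cs|ro|ds):/) print $i }' /proc/drbd
```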
## Create drbd on lvm

To create the `drbd`, you first need to set up the disk/partition/lv; a short summary with `lv` below:
```
$ pvcreate /dev/sdx
$ vgcreate drbdvg /dev/sdx
$ lvcreate --name r0lv --size 10G drbdvg
```
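Before wiring the LV into DRBD, you can sanity-check that the PV/VG/LV from above actually exist, using the usual LVM reporting commands:

```
$ pvs /dev/sdx
$ vgs drbdvg
$ lvs drbdvg/r0lv
```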
And of course you need to have the package installed ;)

```
$ apt install drbd-utils
```
Next is to create the `drbd` configuration. In our sample we use `r0` as the resource name. Here you specify the hosts which are part of the `drbd` cluster and where the `drbd` data gets stored. This config needs to be present on all `drbd` cluster members; the same goes of course for the package `drbd-utils` and the needed space to store the `drbd`.
```
$ cat << EOF > /etc/drbd.d/r0.res
resource r0 {
  device /dev/drbd0;
  disk /dev/drbdvg/r0lv;
  meta-disk internal;
  on server01 {
    address 10.0.0.1:7789;
  }
  on server02 {
    address 10.0.0.2:7789;
  }
}
EOF
```
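Before creating the metadata, you can let `drbdadm` parse the file; `drbdadm dump` prints the parsed configuration back, so a typo in `r0.res` shows up here first:

```
$ drbdadm dump r0
```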
Now we are ready to create the resource `r0` in `drbd` and start up the service:

```
$ drbdadm create-md r0
$ systemctl start drbd.service
```
You can also start up the `drbd` manually by running the following:

```
$ drbdadm up r0
```
Make sure that the members are now connected to each other by checking `drbd-overview` or `cat /proc/drbd`:
```
$ cat /proc/drbd
version: 8.4.10 (api:1/proto:86-101)
srcversion: 12341234123412341234123
 0: cs:Connected ro:Secondary/Secondary ds:UpToDate/UpToDate C r-----
    ns:0 nr:100 dw:100 dr:0 al:1 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
```
If it looks like the above, you are good to go. If not, you need to figure out why the connection is not getting established; check `tcpdump` and so on.
Now we set one of the members to primary:

```
$ drbdadm primary --force r0
```
If you are facing issues with the command above, use this one:

```
$ drbdadm -- --overwrite-data-of-peer primary r0
```
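Once one node is `Primary`, you can put a filesystem on the DRBD device and mount it. A sketch assuming `xfs` and the mount point `/mnt/drbd_r0_data` that the grow example below uses:

```
$ mkfs.xfs /dev/drbd0
$ mkdir -p /mnt/drbd_r0_data
$ mount /dev/drbd0 /mnt/drbd_r0_data
```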
## Extend drbd live

To extend a `drbd`, you first need to extend the underlying `lv`/`pv`/partition/`md`, or whatever you use, on all `drbd` cluster members. In our sample we go with `lv`.
```
# connect to the master and extend the lv
$ lvextend -L +[0-9]*G /dev/<drbdvg>/<drbdlv>  # e.g. lvextend -L +24G /dev/drbdvg/r0lv
# connect to the slave and do the same (be careful, it must have the !!! SAME SIZE !!!)
$ lvextend -L +[0-9]*G /dev/<drbdvg>/<drbdlv>  # e.g. lvextend -L +24G /dev/drbdvg/r0lv
```
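To be sure both members really ended up with the same size, you can compare the LV size in bytes on both nodes; a sketch assuming SSH access and the hostnames from the resource file:

```
# compare LV sizes in bytes on both cluster members
for h in server01 server02; do
    ssh "$h" lvs --units b --noheadings -o lv_size drbdvg/r0lv
done
```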
Now you should start to monitor the `drbd` state with one of the commands from the Getting state section above.
On the primary server, we perform the resize command. Right after you have executed it, you will see that `drbd` starts to sync the “data” to the other cluster members from scratch.

```
$ drbdadm resize r0
```
This resync can take a while, depending on your drbd size, network, hardware,…
If you have more than one `drbd` resource, you could use the keyword `all` instead of the resource name, but make sure that you have prepared everything:

```
$ drbdadm resize all
```
Let's assume the resync has finished; now you are ready to extend the filesystem inside the `drbd` itself. Again, run this on the primary server:

```
$ xfs_growfs /mnt/drbd_r0_data
```
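Afterwards you can verify that the filesystem actually picked up the new size:

```
$ df -h /mnt/drbd_r0_data
```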
## Remove DRBD resource/device

Let's assume we want to remove the resource `r1`. First you need to see which resources you have:
```
$ drbd-overview
NOTE: drbd-overview will be deprecated soon.
Please consider using drbdtop.
 0:r0/0  Connected Secondary/Primary UpToDate/UpToDate
 1:r1/0  Connected Secondary/Primary UpToDate/UpToDate
```
If the system where you are currently connected is set to `Secondary`, you are good already; otherwise you need to change it first to reach that state.
Now you can disconnect it by running `drbdadm disconnect r1`; `drbd-overview` or a `cat /proc/drbd` will then show you the state `StandAlone`.
The next step is to detach it, like this: `drbdadm detach r1`. If you check `drbd-overview` again, it will look different from `cat /proc/drbd`:
```
$ drbd-overview | grep r1
 1:r1/0  . . .

$ cat /proc/drbd | grep "1:"
 1: cs:Unconfigured
```
Good so far. As you don't want to keep the data on there, you should wipe it:
```
$ drbdadm wipe-md r1
Do you really want to wipe out the DRBD meta data?
[need to type 'yes' to confirm] yes
Wiping meta data...
DRBD meta data block successfully wiped out.
```
echo "yes" | drbdadm wipe-md r1
is working, if you need it in a script
Now we are nearly done; next is to remove the minor. The minor wants the resource number, which you can see in the `drbd-overview 2>&1` output; just pipe it to the greps `grep -E '^ *[0-9]:' | grep -E "[0-9]+"`.

```
$ drbdsetup del-minor 1
```
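If you need the minor number in a script, it can be derived from the `drbd-overview` output with the greps mentioned above; a sketch for `r1`:

```
# extract the leading minor number of the r1 line, then delete that minor
minor=$(drbd-overview 2>&1 | grep -E '^ *[0-9]+:r1/' | grep -Eo '^ *[0-9]+' | tr -d ' ')
drbdsetup del-minor "$minor"
```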
Now we are good to go and can remove the resource fully:

```
$ drbdsetup del-resource r1
```
The last step is to remove the resource file beneath `/etc/drbd.d/r1.res`, if you don't have that automated ;)
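If you remove resources regularly, the whole sequence can be wrapped in a small script; a sketch (resource name as first argument, run on a node where the resource is already `Secondary`):

```
#!/bin/sh
# Sketch: remove a DRBD resource end to end, following the steps above
res="$1"
drbdadm disconnect "$res"
drbdadm detach "$res"
echo "yes" | drbdadm wipe-md "$res"
minor=$(drbd-overview 2>&1 | grep -E "^ *[0-9]+:${res}/" | grep -Eo '^ *[0-9]+' | tr -d ' ')
drbdsetup del-minor "$minor"
drbdsetup del-resource "$res"
rm -f "/etc/drbd.d/${res}.res"
```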
## Solving issues
### One part of drbd is corrupt

Assuming `r0` is your resource name.
First we want to disconnect the cluster; run the commands on one of the servers, mostly done on the corrupted one:

```
$ drbdadm disconnect r0
$ drbdadm detach r0
```
If they are not disconnected, restart the `drbd` service.
Now remove the messed-up device and start to recreate it:

```
$ drbdadm wipe-md r0
$ drbdadm create-md r0
```
If you had to stop the `drbd` service, make sure that it is started again.
The next step is to go to the server which holds the working data and run:

```
$ drbdadm connect r0
```
If it's not working, or they are in the `Secondary/Secondary` state, run (only after they are in sync):

```
$ drbdadm -- --overwrite-data-of-peer primary r0
```
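The "only after they are in sync" part can be checked via the `oos` counter from the table at the top; a sketch that waits until nothing is out of sync anymore:

```
$ while grep -q 'oos:[1-9]' /proc/drbd; do sleep 5; done
```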
### Situation Primary/Unknown - Secondary/Unknown
Connect to the slave and run:

```
$ drbdadm -- --discard-my-data connect all
```
If the secondary returns:

```
r0: Failure: (102) Local address(port) already in use. Command 'drbdsetup-84 connect r0 ipv4:10.42.13.37:7789 ipv4:10.13.37.42:7789 --max-buffers=40k --discard-my-data' terminated with exit code 10
```
then just perform a `drbdadm disconnect r0` and run the command from above again.
Connect to the master:

```
$ drbdadm connect all
```
### Situation Primary/Primary
#### Option 1
Connect to the server which should be secondary. Just make sure that this one really has no needed data on it:

```
$ drbdadm secondary r0
```
#### Option 2
Connect to the real master and run the following to make it the only primary:

```
$ drbdadm -- --overwrite-data-of-peer primary r0
```
Now you have the states `Primary/Unknown` and `Secondary/Unknown`.
Connect to the slave and remove the data:

```
$ drbdadm -- --discard-my-data connect all
```
### Situation r0 Unconfigured
`drbd` shows this status on the slave:

```
$ drbd-overview
Please consider using drbdtop.
 0:r0/0  Unconfigured . .
```
Run `drbdadm up` to bring the device up again:

```
$ drbdadm up r0
```
and check out the status:

```
$ drbd-overview
Please consider using drbdtop.
 0:r0/0  SyncTarget Secondary/Primary Inconsistent/UpToDate
        [=================>..] sync'ed: 94.3% (9084/140536)K
```
### Situation Connected Secondary/Primary Diskless/UpToDate
```
$ cat /proc/drbd
version: 8.4.10 (api:1/proto:86-101)
srcversion: 473968AD625BA317874A57E
 0: cs:Connected ro:Secondary/Primary ds:Diskless/UpToDate C r-----
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
```
Recreate the resource, as it seems like it was not fully created, and bring the resource up:

```
$ drbdadm create-md r0
$ drbdadm up r0
```