drbd: if the replication link breaks during handshake, keep retrying
The 8.3.12 commit drbd: Bugfix for the connection behavior fixes a
"wasted established connection", if a former connection attempt failed
during its early stages.
However it opened a window for a regression, if a connection attempt
fails during its last stages. The result was a terminated receiver
thread, that left behind the supposedly transient "C_UNCONNECTED" state.
Any later requests to change the connection state fail, as they wait for
the connection state to "stabilize".
Fix: short circuit and keep retrying to restablish a new connection,
if we don't reach C_WF_REPORT_PARAMS.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
diff --git a/drivers/block/drbd/drbd_receiver.c b/drivers/block/drbd/drbd_receiver.c
index 9a9d4fd..1599a1a 100644
--- a/drivers/block/drbd/drbd_receiver.c
+++ b/drivers/block/drbd/drbd_receiver.c
@@ -1051,7 +1051,7 @@
rcu_read_unlock();
rv = conn_request_state(tconn, NS(conn, C_WF_REPORT_PARAMS), CS_VERBOSE);
- if (rv < SS_SUCCESS) {
+ if (rv < SS_SUCCESS || tconn->cstate != C_WF_REPORT_PARAMS) {
clear_bit(STATE_SENT, &tconn->flags);
return 0;
}