From 7c33b1208632a9581d0ee7aabd1e0584a5d1fb20 Mon Sep 17 00:00:00 2001
From: David Gibson <david@gibson.dropbear.id.au>
Date: Sat, 15 Feb 2025 00:08:41 +1100
Subject: [PATCH] vhost_user: Clear ring address on GET_VRING_BASE

GET_VRING_BASE stops the queue, clearing the call and kick fds.  However,
we don't clear vring.avail.  That means that if vu_queue_notify() is called
it won't realise the queue isn't ready and will die with an EBADFD.

We get this during migration, because for some reason, qemu reconfigures
the vhost-user device when a migration is triggered.  There's a window
between the GET_VRING_BASE and re-establishing the call fd where the
notify function can be called, causing a crash.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
---
 vhost_user.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/vhost_user.c b/vhost_user.c
index 7ab1377..be1aa94 100644
--- a/vhost_user.c
+++ b/vhost_user.c
@@ -732,6 +732,7 @@ static bool vu_get_vring_base_exec(struct vu_dev *vdev,
 	msg->hdr.size = sizeof(msg->payload.state);
 
 	vdev->vq[idx].started = false;
+	vdev->vq[idx].vring.avail = 0;
 
 	if (vdev->vq[idx].call_fd != -1) {
 		close(vdev->vq[idx].call_fd);