[dpdk-dev] vhost: fix mem share between VM and host

Message ID 1460309332-21981-1-git-send-email-zhe.tao@intel.com (mailing list archive)
State Changes Requested, archived
Headers

Commit Message

Zhe Tao April 10, 2016, 5:28 p.m. UTC
  The reason cause this problem is that in QEMU, when assign the
memory-backend-file without share option, will cause QEMU mmap the mem file
without using the MAP_SHARED flag, so the page cache for that file will not
shared between other processes, all the upated to the mapping area in the VM
will not carry through to the vhost-user process.

According to kernel implementation, data for the new hugetlbfs file will be
all zero, so check the first RX virtqueue descriptor next field to see
whether the mem is shared or not, if the mem is shared, the next field should
not equal to zero, otherwise this mem is not shared between VM and host.
 
Signed-off-by: Zhe Tao <zhe.tao@intel.com>
---
 lib/librte_vhost/vhost-net.h                  |  1 +
 lib/librte_vhost/vhost_user/virtio-net-user.c |  7 ++++---
 lib/librte_vhost/virtio-net.c                 | 29 ++++++++++++++++++++++++++-
 3 files changed, 33 insertions(+), 4 deletions(-)
  

Comments

Thomas Monjalon April 10, 2016, 7:44 p.m. UTC | #1
2016-04-11 01:28, Zhe Tao:
> The reason cause this problem is that in QEMU, when assign the
> memory-backend-file without share option, will cause QEMU mmap the mem file
> without using the MAP_SHARED flag, so the page cache for that file will not
> shared between other processes, all the upated to the mapping area in the VM
> will not carry through to the vhost-user process.
> 
> According to kernel implementation, data for the new hugetlbfs file will be
> all zero, so check the first RX virtqueue descriptor next field to see
> whether the mem is shared or not, if the mem is shared, the next field should
> not equal to zero, otherwise this mem is not shared between VM and host.

I failed to understand.
Please try to do some short sentences and start by explaining what is the
bug you see. Then you can start explain the root cause and how it can be
fixed (using some short sentences).

I also think it is too late to integrate such code change in 16.04
(even if I don't understand how important it is).
  
Huawei Xie April 11, 2016, 6:22 a.m. UTC | #2
On 4/11/2016 1:29 AM, Zhe Tao wrote:
>  
> +/* Check the share memory in case the QEMU doesn't set the share option
> + * as on for the memory-backend-file object in the QEMU command line.
> + */
> +
> +int
> +vhost_check_mem_shared(struct vhost_device_ctx ctx)
> +{
> +	struct virtio_net *dev;
> +	struct vhost_virtqueue *vq;
> +	int ret = 0;
> +
> +	dev = get_device(ctx);
> +	if ((dev == NULL) || (dev->mem == NULL))
> +		return -1;
> +
> +	/* check first virtqueue 0 rxq. */
> +	vq = dev->virtqueue[VIRTIO_RXQ];
> +	ret = vq->desc[0].next == 0 ? -1 : 0;
> +
> +	if (ret)
> +		RTE_LOG(ERR, VHOST_CONFIG,
> +			"The mem is not shared between VM and host\n");
> +
> +	return ret;
> +}
> +

Zhe: This is known to vhost-user users that share=on is a must-to-have
option. I think this isn't an issue.
It is OK if we could do some early check in vhost, however we couldn't
assume the value of vq->desc[0].next.
Check if there is other simple and reliable way.
  
Yuanhan Liu April 11, 2016, 4:41 p.m. UTC | #3
On Mon, Apr 11, 2016 at 06:22:12AM +0000, Xie, Huawei wrote:
> On 4/11/2016 1:29 AM, Zhe Tao wrote:
> >  
> > +/* Check the share memory in case the QEMU doesn't set the share option
> > + * as on for the memory-backend-file object in the QEMU command line.
> > + */
> > +
> > +int
> > +vhost_check_mem_shared(struct vhost_device_ctx ctx)
> > +{
> > +	struct virtio_net *dev;
> > +	struct vhost_virtqueue *vq;
> > +	int ret = 0;
> > +
> > +	dev = get_device(ctx);
> > +	if ((dev == NULL) || (dev->mem == NULL))
> > +		return -1;
> > +
> > +	/* check first virtqueue 0 rxq. */
> > +	vq = dev->virtqueue[VIRTIO_RXQ];
> > +	ret = vq->desc[0].next == 0 ? -1 : 0;
> > +
> > +	if (ret)
> > +		RTE_LOG(ERR, VHOST_CONFIG,
> > +			"The mem is not shared between VM and host\n");
> > +
> > +	return ret;
> > +}
> > +
> 
> Zhe: This is known to vhost-user users that share=on is a must-to-have
> option. I think this isn't an issue.

Agreed.

> It is OK if we could do some early check in vhost, however we couldn't

I think if there is really a need to do the check, it might should be
done in QEMU: the option is from QEMU after all.

> assume the value of vq->desc[0].next.
> Check if there is other simple and reliable way.

I will not say no to an elegant check though.

	--yliu
  

Patch

diff --git a/lib/librte_vhost/vhost-net.h b/lib/librte_vhost/vhost-net.h
index f193a1f..588008f 100644
--- a/lib/librte_vhost/vhost-net.h
+++ b/lib/librte_vhost/vhost-net.h
@@ -105,6 +105,7 @@  int vhost_set_backend(struct vhost_device_ctx, struct vhost_vring_file *);
 
 int vhost_set_owner(struct vhost_device_ctx);
 int vhost_reset_owner(struct vhost_device_ctx);
+int vhost_check_mem_shared(struct vhost_device_ctx);
 
 /*
  * Backend-specific cleanup. Defined by vhost-cuse and vhost-user.
diff --git a/lib/librte_vhost/vhost_user/virtio-net-user.c b/lib/librte_vhost/vhost_user/virtio-net-user.c
index f5248bc..08dd2dd 100644
--- a/lib/librte_vhost/vhost_user/virtio-net-user.c
+++ b/lib/librte_vhost/vhost_user/virtio-net-user.c
@@ -286,9 +286,10 @@  user_set_vring_kick(struct vhost_device_ctx ctx, struct VhostUserMsg *pmsg)
 		"vring kick idx:%d file:%d\n", file.index, file.fd);
 	vhost_set_vring_kick(ctx, &file);
 
-	if (virtio_is_ready(dev) &&
-		!(dev->flags & VIRTIO_DEV_RUNNING))
-			notify_ops->new_device(dev);
+	if (!vhost_check_mem_shared(ctx) &&
+	    virtio_is_ready(dev) &&
+	    !(dev->flags & VIRTIO_DEV_RUNNING))
+		notify_ops->new_device(dev);
 }
 
 /*
diff --git a/lib/librte_vhost/virtio-net.c b/lib/librte_vhost/virtio-net.c
index 90da9ba..6b1a59d 100644
--- a/lib/librte_vhost/virtio-net.c
+++ b/lib/librte_vhost/virtio-net.c
@@ -712,7 +712,8 @@  vhost_set_backend(struct vhost_device_ctx ctx, struct vhost_vring_file *file)
 	 * If the device isn't already running and both backend fds are set,
 	 * we add the device.
 	 */
-	if (!(dev->flags & VIRTIO_DEV_RUNNING)) {
+	if (!vhost_check_mem_shared(ctx) &&
+	    !(dev->flags & VIRTIO_DEV_RUNNING)) {
 		if (((int)dev->virtqueue[VIRTIO_TXQ]->backend != VIRTIO_DEV_STOPPED) &&
 			((int)dev->virtqueue[VIRTIO_RXQ]->backend != VIRTIO_DEV_STOPPED)) {
 			return notify_ops->new_device(dev);
@@ -724,6 +725,32 @@  vhost_set_backend(struct vhost_device_ctx ctx, struct vhost_vring_file *file)
 	return 0;
 }
 
+/* Check the share memory in case the QEMU doesn't set the share option
+ * as on for the memory-backend-file object in the QEMU command line.
+ */
+
+int
+vhost_check_mem_shared(struct vhost_device_ctx ctx)
+{
+	struct virtio_net *dev;
+	struct vhost_virtqueue *vq;
+	int ret = 0;
+
+	dev = get_device(ctx);
+	if ((dev == NULL) || (dev->mem == NULL))
+		return -1;
+
+	/* check first virtqueue 0 rxq. */
+	vq = dev->virtqueue[VIRTIO_RXQ];
+	ret = vq->desc[0].next == 0 ? -1 : 0;
+
+	if (ret)
+		RTE_LOG(ERR, VHOST_CONFIG,
+			"The mem is not shared between VM and host\n");
+
+	return ret;
+}
+
 int rte_vhost_enable_guest_notification(struct virtio_net *dev,
 	uint16_t queue_id, int enable)
 {