[dpdk-dev,v2,1/1] vhost: fix leak of fds and mmaps
Commit Message
The common vhost code only supported a single mmap per device. vhost-user
worked around this by saving the address/length/fd of each mmap after the end
of the rte_virtio_memory struct. This only works if the vhost-user code frees
dev->mem, since the common code is unaware of the extra info. The
VHOST_USER_RESET_OWNER message is one situation where the common code frees
dev->mem and leaks the fds and mappings. This happens every time I shut down a
VM.
The new code calls back into the implementation (vhost-user or vhost-cuse) to
clean up these resources.
The vhost-cuse changes are only compile tested.
Signed-off-by: Rich Lane <rlane@bigswitch.com>
---
v1->v2:
- Call into vhost-user/vhost-cuse to free mmaps.
lib/librte_vhost/vhost-net.h | 6 ++++++
lib/librte_vhost/vhost_cuse/virtio-net-cdev.c | 12 ++++++++++++
lib/librte_vhost/vhost_user/vhost-net-user.c | 1 -
lib/librte_vhost/vhost_user/virtio-net-user.c | 25 ++++++++++---------------
lib/librte_vhost/vhost_user/virtio-net-user.h | 1 -
lib/librte_vhost/virtio-net.c | 8 +-------
6 files changed, 29 insertions(+), 24 deletions(-)
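To make the leak described in the commit message concrete, here is a minimal sketch, not the DPDK code. It assumes a header struct (standing in for rte_virtio_memory) with per-mmap address/length/fd records stored past its end. The names mem_hdr, region_info, common_only_cleanup, backend_cleanup and count_leaked_fds are all hypothetical, and /dev/zero stands in for the guest memory file.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

#define NREGIONS 2
#define REGION_LEN 4096

/* Hypothetical stand-in for the header the common code knows about. */
struct mem_hdr {
	uint64_t nregions;
};

/* Hypothetical per-mmap record kept past the end of the header. */
struct region_info {
	void *addr;
	size_t len;
	int fd;
};

static struct region_info *
regions_of(struct mem_hdr *mem)
{
	return (struct region_info *)(mem + 1);
}

/* Common-code style cleanup: frees the header, unaware of the records. */
static void
common_only_cleanup(struct mem_hdr **memp)
{
	free(*memp);
	*memp = NULL;
}

/* Backend-aware cleanup: release every mapping and fd, then the header. */
static void
backend_cleanup(struct mem_hdr **memp)
{
	struct mem_hdr *mem = *memp;
	uint64_t i;

	if (mem == NULL)
		return;
	for (i = 0; i < mem->nregions; i++) {
		munmap(regions_of(mem)[i].addr, regions_of(mem)[i].len);
		close(regions_of(mem)[i].fd);
	}
	free(mem);
	*memp = NULL;
}

/* Returns how many region fds are still open after cleanup, -1 on setup error. */
int
count_leaked_fds(int use_backend_cleanup)
{
	struct mem_hdr *mem;
	int fds[NREGIONS];
	int leaked = 0;
	int i;

	mem = calloc(1, sizeof(*mem) + NREGIONS * sizeof(struct region_info));
	if (mem == NULL)
		return -1;

	for (i = 0; i < NREGIONS; i++) {
		struct region_info *r = &regions_of(mem)[i];

		r->fd = open("/dev/zero", O_RDWR);
		if (r->fd < 0) {
			backend_cleanup(&mem);
			return -1;
		}
		r->len = REGION_LEN;
		r->addr = mmap(NULL, r->len, PROT_READ | PROT_WRITE,
			       MAP_PRIVATE, r->fd, 0);
		if (r->addr == MAP_FAILED) {
			close(r->fd);
			backend_cleanup(&mem);
			return -1;
		}
		fds[i] = r->fd;
		mem->nregions++;
	}

	if (use_backend_cleanup)
		backend_cleanup(&mem);
	else
		common_only_cleanup(&mem);

	for (i = 0; i < NREGIONS; i++) {
		if (fcntl(fds[i], F_GETFD) != -1) {
			leaked++;
			close(fds[i]);	/* tidy the fd; the mapping stays leaked */
		}
	}
	return leaked;
}
```

In this sketch, count_leaked_fds(0) takes the path where only the header is freed and reports every region fd as still open, while count_leaked_fds(1) takes the backend-aware path and reports none.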
Comments
On Sun, Jan 17, 2016 at 11:57:18AM -0800, Rich Lane wrote:
> diff --git a/lib/librte_vhost/vhost-net.h b/lib/librte_vhost/vhost-net.h
> index c69b60b..e8d7477 100644
> --- a/lib/librte_vhost/vhost-net.h
> +++ b/lib/librte_vhost/vhost-net.h
> @@ -115,4 +115,10 @@ struct vhost_net_device_ops {
>
>
> struct vhost_net_device_ops const *get_virtio_net_callbacks(void);
> +
> +/*
> + * Implementation-specific cleanup. Defined by vhost-cuse and vhost-user.
> + */
> +void vhost_impl_cleanup(struct virtio_net *dev);
TBH, I don't really like "_impl_"; maybe "_backend_" is better?
OTOH, what I had in mind is slightly different from yours: it's not
necessary to export a function; instead, call the vhost
backend-specific unmap function inside the backend itself. Say,
call vhost_user_unmap() on RESET_OWNER and on connection close.
What do you think of that?
--yliu
On Sun, Jan 17, 2016 at 11:58 PM, Yuanhan Liu <yuanhan.liu@linux.intel.com>
wrote:
> On Sun, Jan 17, 2016 at 11:57:18AM -0800, Rich Lane wrote:
> > diff --git a/lib/librte_vhost/vhost-net.h b/lib/librte_vhost/vhost-net.h
> > index c69b60b..e8d7477 100644
> > --- a/lib/librte_vhost/vhost-net.h
> > +++ b/lib/librte_vhost/vhost-net.h
> > @@ -115,4 +115,10 @@ struct vhost_net_device_ops {
> >
> >
> > struct vhost_net_device_ops const *get_virtio_net_callbacks(void);
> > +
> > +/*
> > + * Implementation-specific cleanup. Defined by vhost-cuse and vhost-user.
> > + */
> > +void vhost_impl_cleanup(struct virtio_net *dev);
>
> TBH, I don't really like "_impl_"; maybe "_backend_" is better?
>
If you have a strong preference I will change it. Let me know.
> OTOH, what I had in mind is slightly different from yours: it's not
> necessary to export a function; instead, call the vhost
> backend-specific unmap function inside the backend itself. Say,
> call vhost_user_unmap() on RESET_OWNER and on connection close.
> What do you think of that?
The munmap must be done after the notify_ops->destroy_device callback. That
means the backend can't call it before reset_owner() or destroy_device(). The
munmap could be done afterwards, but that requires saving dev->mem in the
caller in the case of destroy_device. The cleanest solution is for the vhost
common code to ask the backend to clean up at the correct time.
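The ordering constraint above can be sketched as follows. This is only an illustration with hypothetical names (struct device, app_destroy_device, backend_cleanup, common_destroy_device, teardown_order). It also assumes the reason for the constraint is that the application may still touch guest memory until its destroy callback returns, which the thread does not spell out.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical device model: flags only, no real mappings. */
struct device {
	int running;
	int mem_mapped;
	char log[32];
};

/* Stand-in for notify_ops->destroy_device(): runs in the application. */
static void
app_destroy_device(struct device *dev)
{
	/* Assumption: guest memory must still be mapped at this point. */
	assert(dev->mem_mapped);
	strcat(dev->log, "destroy;");
	dev->running = 0;
}

/* Stand-in for the backend hook that releases mappings and fds. */
static void
backend_cleanup(struct device *dev)
{
	if (dev->mem_mapped) {
		dev->mem_mapped = 0;
		strcat(dev->log, "unmap;");
	}
}

/* Common code decides the order: callback first, then backend cleanup. */
static void
common_destroy_device(struct device *dev)
{
	if (dev->running)
		app_destroy_device(dev);
	backend_cleanup(dev);
}

/* Runs one teardown and returns the recorded order of events. */
const char *
teardown_order(void)
{
	static struct device dev;

	memset(&dev, 0, sizeof(dev));
	dev.running = 1;
	dev.mem_mapped = 1;
	common_destroy_device(&dev);
	return dev.log;
}
```

Because the common code owns the sequence, neither backend has to know when the application callback has finished.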
Hey Rich,
Sorry for the long delay; I nearly forgot about it :(
On Tue, Jan 19, 2016 at 10:13:23AM -0800, Rich Lane wrote:
> On Sun, Jan 17, 2016 at 11:58 PM, Yuanhan Liu <yuanhan.liu@linux.intel.com>
> wrote:
>
> On Sun, Jan 17, 2016 at 11:57:18AM -0800, Rich Lane wrote:
> > +/*
> > + * Implementation-specific cleanup. Defined by vhost-cuse and vhost-user.
> > + */
> > +void vhost_impl_cleanup(struct virtio_net *dev);
>
> TBH, I don't really like "_impl_"; maybe "_backend_" is better?
>
>
> If you have a strong preference I will change it. Let me know.
"backend" is just a more common word to me, as well as in QEMU.
So I would suggest you make that change, and if you do, you can
add my ACK:
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
>
> OTOH, what I had in mind is slightly different from yours: it's not
> necessary to export a function; instead, call the vhost
> backend-specific unmap function inside the backend itself. Say,
> call vhost_user_unmap() on RESET_OWNER and on connection close.
> What do you think of that?
>
>
> The munmap must be done after the notify_ops->destroy_device callback. That
> means the backend can't call it before reset_owner() or destroy_device().
Well, you could:
    case VHOST_USER_RESET_OWNER:
        ops->reset_owner();
        vhost_user_unmap();
        break;
Anyway, it's not a big deal. Let's go with your solution first.
--yliu
> The munmap could be done afterwards, but that requires saving dev->mem in the
> caller in the case of destroy_device. The cleanest solution is for the vhost
> common code to ask the backend to clean up at the correct time.
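For comparison, the "unmap afterwards in the backend" alternative on the connection-close path might look like the sketch below. It assumes the common destroy frees the device without touching dev->mem, which is why the caller has to save the pointer first. All names here (guest_mem, handle_disconnect, backend_unmap, disconnect_demo) are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the real structures. */
struct guest_mem {
	int nregions;
};

struct device {
	struct guest_mem *mem;
};

/* Assumed: common destroy frees the device but leaves dev->mem alone. */
static void
common_destroy_device(struct device **devp)
{
	free(*devp);
	*devp = NULL;
}

/* Backend-side release; returns 1 if there was something to release. */
static int
backend_unmap(struct guest_mem *mem)
{
	if (mem == NULL)
		return 0;
	/* A real backend would munmap() each region and close() its fd here. */
	free(mem);
	return 1;
}

/* Connection-close path: the device is gone after destroy, so save mem. */
int
handle_disconnect(struct device **devp)
{
	struct guest_mem *saved = (*devp)->mem;

	common_destroy_device(devp);
	return backend_unmap(saved);
}

/* Builds a device, tears it down, and returns 1 on a clean release. */
int
disconnect_demo(void)
{
	struct device *dev = calloc(1, sizeof(*dev));
	int released;

	if (dev == NULL)
		return -1;
	dev->mem = calloc(1, sizeof(*dev->mem));
	if (dev->mem == NULL) {
		free(dev);
		return -1;
	}
	released = handle_disconnect(&dev);
	return released == 1 && dev == NULL;
}
```

The extra saved pointer in handle_disconnect() is the bookkeeping that a cleanup hook called from the common code avoids.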
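diff --git a/lib/librte_vhost/vhost-net.h b/lib/librte_vhost/vhost-net.h
index c69b60b..e8d7477 100644
--- a/lib/librte_vhost/vhost-net.h
+++ b/lib/librte_vhost/vhost-net.h
@@ -115,4 +115,10 @@ struct vhost_net_device_ops {
 struct vhost_net_device_ops const *get_virtio_net_callbacks(void);
+
+/*
+ * Implementation-specific cleanup. Defined by vhost-cuse and vhost-user.
+ */
+void vhost_impl_cleanup(struct virtio_net *dev);
+
 #endif /* _VHOST_NET_CDEV_H_ */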
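diff --git a/lib/librte_vhost/vhost_cuse/virtio-net-cdev.c b/lib/librte_vhost/vhost_cuse/virtio-net-cdev.c
--- a/lib/librte_vhost/vhost_cuse/virtio-net-cdev.c
+++ b/lib/librte_vhost/vhost_cuse/virtio-net-cdev.c
@@ -421,3 +421,15 @@ int cuse_set_backend(struct vhost_device_ctx ctx, struct vhost_vring_file *file)
 	return ops->set_backend(ctx, file);
 }
+
+void
+vhost_impl_cleanup(struct virtio_net *dev)
+{
+	/* Unmap QEMU memory file if mapped. */
+	if (dev->mem) {
+		munmap((void *)(uintptr_t)dev->mem->mapped_address,
+			(size_t)dev->mem->mapped_size);
+		free(dev->mem);
+		dev->mem = NULL;
+	}
+}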
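diff --git a/lib/librte_vhost/vhost_user/vhost-net-user.c b/lib/librte_vhost/vhost_user/vhost-net-user.c
--- a/lib/librte_vhost/vhost_user/vhost-net-user.c
+++ b/lib/librte_vhost/vhost_user/vhost-net-user.c
@@ -347,7 +347,6 @@ vserver_message_handler(int connfd, void *dat, int *remove)
 		close(connfd);
 		*remove = 1;
 		free(cfd_ctx);
-		user_destroy_device(ctx);
 		ops->destroy_device(ctx);
 		return;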
diff --git a/lib/librte_vhost/vhost_user/virtio-net-user.c b/lib/librte_vhost/vhost_user/virtio-net-user.c
--- a/lib/librte_vhost/vhost_user/virtio-net-user.c
+++ b/lib/librte_vhost/vhost_user/virtio-net-user.c
@@ -339,21 +339,6 @@ user_set_vring_enable(struct vhost_device_ctx ctx,
 }
 void
-user_destroy_device(struct vhost_device_ctx ctx)
-{
-	struct virtio_net *dev = get_device(ctx);
-
-	if (dev && (dev->flags & VIRTIO_DEV_RUNNING))
-		notify_ops->destroy_device(dev);
-
-	if (dev && dev->mem) {
-		free_mem_region(dev);
-		free(dev->mem);
-		dev->mem = NULL;
-	}
-}
-
-void
 user_set_protocol_features(struct vhost_device_ctx ctx,
 			   uint64_t protocol_features)
 {
@@ -365,3 +350,13 @@ user_set_protocol_features(struct vhost_device_ctx ctx,
 	dev->protocol_features = protocol_features;
 }
+
+void
+vhost_impl_cleanup(struct virtio_net *dev)
+{
+	if (dev->mem) {
+		free_mem_region(dev);
+		free(dev->mem);
+		dev->mem = NULL;
+	}
+}
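diff --git a/lib/librte_vhost/vhost_user/virtio-net-user.h b/lib/librte_vhost/vhost_user/virtio-net-user.h
--- a/lib/librte_vhost/vhost_user/virtio-net-user.h
+++ b/lib/librte_vhost/vhost_user/virtio-net-user.h
@@ -55,5 +55,4 @@ int user_get_vring_base(struct vhost_device_ctx, struct vhost_vring_state *);
 int user_set_vring_enable(struct vhost_device_ctx ctx,
 			  struct vhost_vring_state *state);
-void user_destroy_device(struct vhost_device_ctx);
 #endif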
diff --git a/lib/librte_vhost/virtio-net.c b/lib/librte_vhost/virtio-net.c
--- a/lib/librte_vhost/virtio-net.c
+++ b/lib/librte_vhost/virtio-net.c
@@ -199,13 +199,7 @@ cleanup_device(struct virtio_net *dev, int destroy)
 {
 	uint32_t i;
-	/* Unmap QEMU memory file if mapped. */
-	if (dev->mem) {
-		munmap((void *)(uintptr_t)dev->mem->mapped_address,
-			(size_t)dev->mem->mapped_size);
-		free(dev->mem);
-		dev->mem = NULL;
-	}
+	vhost_impl_cleanup(dev);
 	for (i = 0; i < dev->virt_qp_nb; i++) {
 		cleanup_vq(dev->virtqueue[i * VIRTIO_QNUM + VIRTIO_RXQ], destroy);