Message ID | 20151118025655.GW2326@yliu-dev.sh.intel.com (mailing list archive) |
---|---|
State | Not Applicable, archived |
Headers | From: Yuanhan Liu <yuanhan.liu@linux.intel.com>; To: Rich Lane <rich.lane@bigswitch.com>; Cc: dev@dpdk.org; Date: Wed, 18 Nov 2015 10:56:55 +0800; Subject: Re: [dpdk-dev] [PATCH] vhost: avoid buffer overflow in update_secure_len |
Commit Message
Yuanhan Liu
Nov. 18, 2015, 2:56 a.m. UTC
On Tue, Nov 17, 2015 at 08:39:30AM -0800, Rich Lane wrote:
>
> I don't think that adding a SIGINT handler is the right solution, though. The
> guest app could be killed with another signal (SIGKILL).

Good point.

> Worse, a malicious or
> buggy guest could write to just that field. vhost should not crash no matter
> what the guest writes into the virtqueues.

Yeah, I agree with you: although we could fix this issue on the source
side, we should also add some defensive checks here.

How about the following patch then?

Note that the vec_id overflow check should be done before referencing
it, not after. Hence I moved it earlier.

	--yliu

---
Comments
> -----Original Message-----
> From: Yuanhan Liu [mailto:yuanhan.liu@linux.intel.com]
> Sent: Wednesday, November 18, 2015 10:57 AM
> To: Rich Lane <rich.lane@bigswitch.com>
> Cc: dev@dpdk.org; Xie, Huawei <huawei.xie@intel.com>; Wang, Zhihong
> <zhihong.wang@intel.com>; Richardson, Bruce <bruce.richardson@intel.com>
> Subject: Re: [PATCH] vhost: avoid buffer overflow in update_secure_len
>
> On Tue, Nov 17, 2015 at 08:39:30AM -0800, Rich Lane wrote:
> >
> > I don't think that adding a SIGINT handler is the right solution,
> > though. The guest app could be killed with another signal (SIGKILL).
>
> Good point.
>
> > Worse, a malicious or
> > buggy guest could write to just that field. vhost should not crash no
> > matter what the guest writes into the virtqueues.
>
> Yeah, I agree with you: although we could fix this issue on the source side,
> we should also add some defensive checks here.
>

Exactly, DPDK should be able to take care of both ends:

# Provide an interface for resource cleanup
# Be prepared if the app doesn't shut down properly

> How about the following patch then?
>
> Note that the vec_id overflow check should be done before referencing it,
> not after. Hence I moved it earlier.
>
> --yliu
>
> [...]
On Tue, Nov 17, 2015 at 6:56 PM, Yuanhan Liu <yuanhan.liu@linux.intel.com> wrote:
> @@ -519,6 +526,8 @@ virtio_dev_merge_rx(struct virtio_net *dev, uint16_t queue_id,
>  					goto merge_rx_exit;
>  				} else {
>  					update_secure_len(vq, res_cur_idx, &secure_len, &vec_idx);
> +					if (secure_len == 0)
> +						goto merge_rx_exit;
>  					res_cur_idx++;
>  				}
>  			} while (pkt_len > secure_len);

I think this needs to check whether secure_len was modified. secure_len is
read-write and could have a nonzero value going into the call. It could be
cleaner to give update_secure_len a return value saying whether it was able to
reserve any buffers.

Otherwise looks good, thanks!
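For illustration only, the return-value idea could look roughly like the sketch below. It is not the code from the patch: the structures are simplified stand-ins, the name update_secure_len_checked is hypothetical, the value of BUF_VECTOR_MAX is assumed, and the function takes the head descriptor index directly instead of looking it up in the avail ring. The point it shows is that the caller learns about failure from the return value, so a nonzero secure_len carried in from an earlier iteration cannot hide it.

```c
#include <stdint.h>

#define BUF_VECTOR_MAX    256  /* assumed value for this sketch */
#define VRING_DESC_F_NEXT 1    /* "chain continues" descriptor flag */

/* Simplified, hypothetical stand-ins for the vhost structures. */
struct sketch_desc {
	uint64_t addr;
	uint32_t len;
	uint16_t flags;
	uint16_t next;
};

struct sketch_buf_vector {
	uint64_t buf_addr;
	uint32_t buf_len;
	uint32_t desc_idx;
};

struct sketch_vq {
	struct sketch_desc *desc;   /* descriptor table (guest-writable) */
	uint16_t size;              /* number of descriptors */
	struct sketch_buf_vector buf_vec[BUF_VECTOR_MAX];
};

/*
 * Walk the descriptor chain starting at 'head' and append it to buf_vec.
 * Returns 0 on success and -1 if the chain cannot be reserved safely.
 * On failure *secure_len and *vec_idx are left unchanged.
 */
static int
update_secure_len_checked(struct sketch_vq *vq, uint16_t head,
			  uint32_t *secure_len, uint32_t *vec_idx)
{
	uint32_t len = *secure_len;
	uint32_t vec_id = *vec_idx;
	uint16_t idx = head;

	for (;;) {
		/* Check before buf_vec[vec_id] is referenced, not after. */
		if (vec_id >= BUF_VECTOR_MAX || idx >= vq->size)
			return -1;

		len += vq->desc[idx].len;
		vq->buf_vec[vec_id].buf_addr = vq->desc[idx].addr;
		vq->buf_vec[vec_id].buf_len = vq->desc[idx].len;
		vq->buf_vec[vec_id].desc_idx = idx;
		vec_id++;

		if (!(vq->desc[idx].flags & VRING_DESC_F_NEXT))
			break;
		idx = vq->desc[idx].next;
	}

	*secure_len = len;
	*vec_idx = vec_id;
	return 0;
}
```

A caller in the merge-RX loop would then test the return value (for example, jump to merge_rx_exit when it is negative) rather than comparing secure_len against zero.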
On Tue, Nov 17, 2015 at 09:26:57PM -0800, Rich Lane wrote:
> [...]
> I think this needs to check whether secure_len was modified. secure_len is
> read-write and could have a nonzero value going into the call. It could be
> cleaner to give update_secure_len a return value saying whether it was able to
> reserve any buffers.

Good suggestion.

	--yliu

>
> Otherwise looks good, thanks!
On 11/18/2015 10:56 AM, Yuanhan Liu wrote:
> On Tue, Nov 17, 2015 at 08:39:30AM -0800, Rich Lane wrote:
>> I don't think that adding a SIGINT handler is the right solution, though. The
>> guest app could be killed with another signal (SIGKILL).
> Good point.
>
>> Worse, a malicious or
>> buggy guest could write to just that field. vhost should not crash no matter
>> what the guest writes into the virtqueues.

Rich, exactly, that has been on our list for a long time. We should
ensure that "no malicious guest can crash the host through vrings";
otherwise this vhost implementation cannot be deployed in a production
environment.
There are many other known security holes in the current dpdk vhost that I have in mind.
A very simple example is that we don't check the gpa_to_vva return value, so
you could easily put an invalid GPA into a vring entry to crash vhost.
My plan is to review the vhost implementation, fix all the possible
issues in one single patch set, and make the fixes performance
optimization friendly rather than fixing them here and there.

> Yeah, I agree with you: although we could fix this issue on the source
> side, we should also add some defensive checks here.
>
> How about the following patch then?
>
> [...]
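As a rough illustration of the gpa_to_vva point, the sketch below validates a guest physical address before it is used. Everything in it is hypothetical: the region structure, the helper names, and the convention that a failed translation is reported as 0 are assumptions made for the example, not the library's actual definitions. It also checks that the whole buffer, not only its first byte, lies inside one region.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical guest memory region: [gpa_start, gpa_start + size). */
struct sketch_region {
	uint64_t gpa_start;
	uint64_t size;
	uint64_t host_offset;   /* added to a GPA to get the host virtual address */
};

/*
 * Translate a guest physical range to a host virtual address.
 * Returns 0 when the range is not fully contained in any region.
 */
static uint64_t
sketch_gpa_to_vva(const struct sketch_region *regions, size_t nregions,
		  uint64_t gpa, uint64_t len)
{
	for (size_t i = 0; i < nregions; i++) {
		const struct sketch_region *r = &regions[i];

		/* Written so that none of the subtractions can wrap around. */
		if (gpa >= r->gpa_start && len <= r->size &&
		    gpa - r->gpa_start <= r->size - len)
			return gpa + r->host_offset;
	}
	return 0;
}

/*
 * Caller-side check: refuse the descriptor instead of dereferencing
 * whatever the guest wrote. Returns 0 on success, -1 on a bad address.
 */
static int
sketch_map_desc(const struct sketch_region *regions, size_t nregions,
		uint64_t gpa, uint32_t len, uint64_t *vva)
{
	uint64_t addr = sketch_gpa_to_vva(regions, nregions, gpa, len);

	if (addr == 0)
		return -1;
	*vva = addr;
	return 0;
}
```

The datapath would call the checked variant and stop processing the ring entry when it fails, so an invalid GPA costs the guest a dropped packet rather than crashing the host process.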
On Wed, Nov 18, 2015 at 06:13:08AM +0000, Xie, Huawei wrote:
> [...]
> There are many other known security holes in the current dpdk vhost that I have in mind.
> A very simple example is that we don't check the gpa_to_vva return value, so
> you could easily put an invalid GPA into a vring entry to crash vhost.
> My plan is to review the vhost implementation, fix all the possible
> issues in one single patch set, and make the fixes performance

First of all, there is no way you could find all of them at
once, for we simply make mistakes, and may miss something here
and there.

And fixing them in one single patch is not good practice; fixing
them with one issue per patch is. That makes each patch easier to
review, and easier to revert if it's a wrong fix. It's friendly
to bisect as well, if it breaks something.

	--yliu

> optimization friendly rather than fixing them here and there.
>
> [...]
On 11/18/2015 10:56 AM, Yuanhan Liu wrote:
> [...]
> @@ -519,6 +526,8 @@ virtio_dev_merge_rx(struct virtio_net *dev, uint16_t queue_id,
>  					goto merge_rx_exit;
>  				} else {
>  					update_secure_len(vq, res_cur_idx, &secure_len, &vec_idx);
> +					if (secure_len == 0)
> +						goto merge_rx_exit;

Why do we exit when secure_len is 0 rather than 1? :)
A malicious guest could easily forge the desc len so that secure_len never
reaches pkt_len even though it is not zero, so that the host enters a dead
loop here.
Generally speaking, we shouldn't fix for a specific issue, and the security
checks should be as few as possible. We need to consider refactoring the code
here for a generic fix.

>  					res_cur_idx++;
>  				}
>  			} while (pkt_len > secure_len);
> [...]
On 11/18/2015 2:25 PM, Yuanhan Liu wrote:
> On Wed, Nov 18, 2015 at 06:13:08AM +0000, Xie, Huawei wrote:
>> [...]
>> My plan is to review the vhost implementation, fix all the possible
>> issues in one single patch set, and make the fixes performance
> First of all, there is no way you could find all of them at
> once, for we simply make mistakes, and may miss something here
> and there.

Agree.

>
> And fixing them in one single patch is not good practice; fixing
> them with one issue per patch is. That makes each patch easier to
> review, and easier to revert if it's a wrong fix. It's friendly
> to bisect as well, if it breaks something.

One patch set, not one big patch. Anyway, that isn't the key point. The
key point I want to make is that we re-review the dpdk vhost implementation
from a security point of view, from a high level. Otherwise, as I commented
in another mail, we add checks here and there, but the fix isn't actually
generic, and some checks could be merged.

>
> --yliu
>
>> optimization friendly rather than fixing them here and there.
>>
>> [...]
On Wed, Nov 18, 2015 at 07:53:24AM +0000, Xie, Huawei wrote:
...
> >  					update_secure_len(vq, res_cur_idx, &secure_len, &vec_idx);
> > +					if (secure_len == 0)
> > +						goto merge_rx_exit;
> Why do we exit when secure_len is 0 rather than 1? :)

I confess it's not a proper fix. Making it return an error code, as Rich
suggested in an earlier email, is better. It's generic enough, as we have to
check the vec_buf overflow here.

BTW, can we move the vec_buf outside `struct vhost_virtqueue'? It makes
the structure huge.

> A malicious guest could easily forge the desc len so that secure_len never
> reaches pkt_len even though it is not zero, so that the host enters a dead
> loop here.
> Generally speaking, we shouldn't fix for a specific issue,

Agreed.

> and the
> security checks should be as few as possible.

Ideally, yes.

> We need to consider
> refactoring the code here for a generic fix.

What are your thoughts?

	--yliu

> [...]
On 11/18/2015 4:47 PM, Yuanhan Liu wrote:
> On Wed, Nov 18, 2015 at 07:53:24AM +0000, Xie, Huawei wrote:
> ...
>> Why do we exit when secure_len is 0 rather than 1? :)
> I confess it's not a proper fix. Making it return an error code, as Rich
> suggested in an earlier email, is better. It's generic enough, as we have to
> check the vec_buf overflow here.
>
> BTW, can we move the vec_buf outside `struct vhost_virtqueue'? It makes
> the structure huge.
>
>> [...]
>> We need to consider
>> refactoring the code here for a generic fix.
> What are your thoughts?

Maybe we merge update_secure_len with the outer loop into a simple
inline function, in which we consider both the max vector number and the
desc count to avoid being trapped in a dead loop. This function returns a
buf vec with which we can copy securely afterwards.

>
> --yliu
> [...]
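One possible shape for such a merged helper is sketched below. It is only an illustration under stated assumptions: the structures are simplified stand-ins, and the names sketch_reserve and SKETCH_VEC_MAX (with its value) are hypothetical. The helper bounds the walk in two ways, by the vector capacity and by the number of descriptors in the ring, so a forged chain that loops back on itself or has zero-length descriptors cannot keep the host spinning.

```c
#include <stdint.h>

#define SKETCH_VEC_MAX     256  /* assumed vector capacity */
#define SKETCH_DESC_F_NEXT 1    /* "chain continues" descriptor flag */

/* Simplified, hypothetical stand-ins for the vhost structures. */
struct sketch_desc {
	uint64_t addr;
	uint32_t len;
	uint16_t flags;
	uint16_t next;
};

struct sketch_vec {
	uint64_t buf_addr;
	uint32_t buf_len;
	uint32_t desc_idx;
};

struct sketch_vq {
	const struct sketch_desc *desc;  /* descriptor table (guest-writable) */
	const uint16_t *avail_ring;      /* avail ring entries (guest-writable) */
	uint16_t size;                   /* ring size, a power of two */
};

/*
 * Reserve guest buffers for pkt_len bytes, starting at avail index 'start'
 * and never going past 'end'. Fills vec[] (capacity SKETCH_VEC_MAX) and
 * stores the number of entries used in *nr_vec.
 * Returns the number of avail entries consumed, or -1 if the ring runs out
 * of buffers or looks malformed.
 */
static int
sketch_reserve(const struct sketch_vq *vq, uint16_t start, uint16_t end,
	       uint32_t pkt_len, struct sketch_vec *vec, uint32_t *nr_vec)
{
	uint64_t secure_len = 0;  /* 64-bit so forged lengths cannot wrap it */
	uint32_t vec_id = 0;
	uint32_t walked = 0;      /* descriptors visited across all chains */
	uint16_t cur = start;

	while (secure_len < pkt_len) {
		uint16_t idx;

		if (cur == end)
			return -1;  /* not enough buffers available */

		idx = vq->avail_ring[cur & (vq->size - 1)];
		for (;;) {
			/* A valid ring never needs more than 'size' descriptors. */
			if (idx >= vq->size || vec_id >= SKETCH_VEC_MAX ||
			    walked >= vq->size)
				return -1;
			walked++;

			vec[vec_id].buf_addr = vq->desc[idx].addr;
			vec[vec_id].buf_len = vq->desc[idx].len;
			vec[vec_id].desc_idx = idx;
			vec_id++;
			secure_len += vq->desc[idx].len;

			if (!(vq->desc[idx].flags & SKETCH_DESC_F_NEXT))
				break;
			idx = vq->desc[idx].next;
		}
		cur++;
	}

	*nr_vec = vec_id;
	return (uint16_t)(cur - start);
}
```

With this shape the security checks sit in one place, and the copy loop that follows only has to trust the returned vector rather than re-validating the ring.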
On Wed, 18 Nov 2015 06:13:08 +0000
"Xie, Huawei" <huawei.xie@intel.com> wrote:

> [...]
> There are many other known security holes in the current dpdk vhost that I have in mind.
> A very simple example is that we don't check the gpa_to_vva return value, so
> you could easily put an invalid GPA into a vring entry to crash vhost.
> My plan is to review the vhost implementation, fix all the possible
> issues in one single patch set, and make the fixes performance
> optimization friendly rather than fixing them here and there.
>

Both virtio and vhost need to adopt the "other side is broken" flag
model that is in Linux drivers. What this means is that the virtio
and vhost driver would check parameters for consistency, and if out
of bounds set a broken flag and refuse to do anything more with the
device until reset.
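A minimal sketch of that "broken flag" model follows. It is not taken from the Linux drivers or from DPDK; the structure, the function names, and the particular consistency checks are hypothetical and only show the control flow: validate what the other side wrote, latch a flag on the first inconsistency, and refuse all further work until the device is reset.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-device state for this sketch. */
struct sketch_dev {
	bool broken;         /* latched when the other side misbehaves */
	uint16_t ring_size;  /* number of descriptors in the ring */
	uint32_t hdr_len;    /* size of the per-packet header */
};

/* Consistency check on values written by the other side. */
static bool
sketch_desc_ok(const struct sketch_dev *dev, uint16_t idx, uint32_t len)
{
	return idx < dev->ring_size && len >= dev->hdr_len;
}

/*
 * Process one descriptor. Returns the payload length on success, or -1
 * if the device is broken or becomes broken because of this descriptor.
 */
static int64_t
sketch_process(struct sketch_dev *dev, uint16_t idx, uint32_t len)
{
	if (dev->broken)
		return -1;  /* refuse to touch the rings until reset */

	if (!sketch_desc_ok(dev, idx, len)) {
		dev->broken = true;
		return -1;
	}

	return (int64_t)len - dev->hdr_len;
}

/* A device reset is the only way to clear the flag. */
static void
sketch_reset(struct sketch_dev *dev)
{
	dev->broken = false;
}
```

The appeal of this design is that the datapath pays for one flag test per call, and a misbehaving peer is cut off once instead of being re-validated on every packet.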
On 11/18/2015 11:53 PM, Stephen Hemminger wrote:
> On Wed, 18 Nov 2015 06:13:08 +0000
> "Xie, Huawei" <huawei.xie@intel.com> wrote:
>
>> [...]
>> My plan is to review the vhost implementation, fix all the possible
>> issues in one single patch set, and make the fixes performance
>> optimization friendly rather than fixing them here and there.
>>
> Both virtio and vhost need to adopt the "other side is broken" flag
> model that is in Linux drivers. What this means is that the virtio
> and vhost driver would check parameters for consistency, and if out
> of bounds set a broken flag and refuse to do anything more with the
> device until reset.

Stephen:
You raise an important point. The current DPDK virtio driver implementation
chooses to trust the vhost, so it doesn't do any consistency checks. What is
the reason that the virtio driver also needs consistency checks? Is it that
vhost might be buggy, or that vhost might also not be trusted in some use
cases?

/huawei
>
diff --git a/lib/librte_vhost/vhost_rxtx.c b/lib/librte_vhost/vhost_rxtx.c
index 9322ce6..08f5942 100644
--- a/lib/librte_vhost/vhost_rxtx.c
+++ b/lib/librte_vhost/vhost_rxtx.c
@@ -132,6 +132,8 @@ virtio_dev_rx(struct virtio_net *dev, uint16_t queue_id,
 
 		/* Get descriptor from available ring */
 		desc = &vq->desc[head[packet_success]];
+		if (desc->len == 0)
+			break;
 
 		buff = pkts[packet_success];
 
@@ -153,6 +155,8 @@ virtio_dev_rx(struct virtio_net *dev, uint16_t queue_id,
 			/* Buffer address translation. */
 			buff_addr = gpa_to_vva(dev, desc->addr);
 		} else {
+			if (desc->len < vq->vhost_hlen)
+				break;
 			vb_offset += vq->vhost_hlen;
 			hdr = 1;
 		}
@@ -446,6 +450,9 @@ update_secure_len(struct vhost_virtqueue *vq, uint32_t id,
 	uint32_t vec_id = *vec_idx;
 
 	do {
+		if (vec_id >= BUF_VECTOR_MAX)
+			break;
+
 		next_desc = 0;
 		len += vq->desc[idx].len;
 		vq->buf_vec[vec_id].buf_addr = vq->desc[idx].addr;
@@ -519,6 +526,8 @@ virtio_dev_merge_rx(struct virtio_net *dev, uint16_t queue_id,
 					goto merge_rx_exit;
 				} else {
 					update_secure_len(vq, res_cur_idx, &secure_len, &vec_idx);
+					if (secure_len == 0)
+						goto merge_rx_exit;
 					res_cur_idx++;
 				}
 			} while (pkt_len > secure_len);
@@ -631,6 +640,8 @@ rte_vhost_dequeue_burst(struct virtio_net *dev, uint16_t queue_id,
 		uint8_t alloc_err = 0;
 
 		desc = &vq->desc[head[entry_success]];
+		if (desc->len == 0)
+			break;
 
 		/* Discard first buffer as it is the virtio header */
 		if (desc->flags & VRING_DESC_F_NEXT) {
@@ -638,6 +649,8 @@ rte_vhost_dequeue_burst(struct virtio_net *dev, uint16_t queue_id,
 			vb_offset = 0;
 			vb_avail = desc->len;
 		} else {
+			if (desc->len < vq->vhost_hlen)
+				break;
 			vb_offset = vq->vhost_hlen;
 			vb_avail = desc->len - vb_offset;
 		}