[dpdk-dev] vhost: avoid buffer overflow in update_secure_len

Message ID 20151118025655.GW2326@yliu-dev.sh.intel.com (mailing list archive)
State Not Applicable, archived

Commit Message

Yuanhan Liu Nov. 18, 2015, 2:56 a.m. UTC
  On Tue, Nov 17, 2015 at 08:39:30AM -0800, Rich Lane wrote:
> 
> I don't think that adding a SIGINT handler is the right solution, though. The
> guest app could be killed with another signal (SIGKILL).

Good point.

> Worse, a malicious or
> buggy guest could write to just that field. vhost should not crash no matter
> what the guest writes into the virtqueues.

Yeah, I agree with you: though we could fix this issue on the source
side, we should also add some defensive checks here.

How about the following patch, then?

Note that the vec_id overflow check has to be done before vec_id is
used as an index, not after. Hence I moved it up.

	--yliu

---
  

Comments

Zhihong Wang Nov. 18, 2015, 5:23 a.m. UTC | #1
> -----Original Message-----
> From: Yuanhan Liu [mailto:yuanhan.liu@linux.intel.com]
> Sent: Wednesday, November 18, 2015 10:57 AM
> To: Rich Lane <rich.lane@bigswitch.com>
> Cc: dev@dpdk.org; Xie, Huawei <huawei.xie@intel.com>; Wang, Zhihong
> <zhihong.wang@intel.com>; Richardson, Bruce <bruce.richardson@intel.com>
> Subject: Re: [PATCH] vhost: avoid buffer overflow in update_secure_len
> 
> On Tue, Nov 17, 2015 at 08:39:30AM -0800, Rich Lane wrote:
> >
> > I don't think that adding a SIGINT handler is the right solution,
> > though. The guest app could be killed with another signal (SIGKILL).
> 
> Good point.
> 
> > Worse, a malicious or
> > buggy guest could write to just that field. vhost should not crash no
> > matter what the guest writes into the virtqueues.
> 
> Yeah, I agree with you: though we could fix this issue in the source side, we also
> should do some defend here.
> 

Exactly, DPDK should be able to take care of both ends:
# Provide an interface for resource cleanup
# Be prepared if the app doesn't shut down properly

> How about following patch then?
> 
> Note that the vec_id overflow check should be done before referencing it, but
> not after. Hence I moved it ahead.
> 
> 	--yliu
> 
> ---
> diff --git a/lib/librte_vhost/vhost_rxtx.c b/lib/librte_vhost/vhost_rxtx.c
> index 9322ce6..08f5942 100644
> --- a/lib/librte_vhost/vhost_rxtx.c
> +++ b/lib/librte_vhost/vhost_rxtx.c
> @@ -132,6 +132,8 @@ virtio_dev_rx(struct virtio_net *dev, uint16_t queue_id,
> 
>  		/* Get descriptor from available ring */
>  		desc = &vq->desc[head[packet_success]];
> +		if (desc->len == 0)
> +			break;
> 
>  		buff = pkts[packet_success];
> 
> @@ -153,6 +155,8 @@ virtio_dev_rx(struct virtio_net *dev, uint16_t queue_id,
>  			/* Buffer address translation. */
>  			buff_addr = gpa_to_vva(dev, desc->addr);
>  		} else {
> +			if (desc->len < vq->vhost_hlen)
> +				break;
>  			vb_offset += vq->vhost_hlen;
>  			hdr = 1;
>  		}
> @@ -446,6 +450,9 @@ update_secure_len(struct vhost_virtqueue *vq, uint32_t id,
>  	uint32_t vec_id = *vec_idx;
> 
>  	do {
> +		if (vec_id >= BUF_VECTOR_MAX)
> +			break;
> +
>  		next_desc = 0;
>  		len += vq->desc[idx].len;
>  		vq->buf_vec[vec_id].buf_addr = vq->desc[idx].addr;
> @@ -519,6 +526,8 @@ virtio_dev_merge_rx(struct virtio_net *dev, uint16_t queue_id,
>  					goto merge_rx_exit;
>  				} else {
>  					update_secure_len(vq, res_cur_idx, &secure_len, &vec_idx);
> +					if (secure_len == 0)
> +						goto merge_rx_exit;
>  					res_cur_idx++;
>  				}
>  			} while (pkt_len > secure_len);
> @@ -631,6 +640,8 @@ rte_vhost_dequeue_burst(struct virtio_net *dev, uint16_t queue_id,
>  		uint8_t alloc_err = 0;
> 
>  		desc = &vq->desc[head[entry_success]];
> +		if (desc->len == 0)
> +			break;
> 
>  		/* Discard first buffer as it is the virtio header */
>  		if (desc->flags & VRING_DESC_F_NEXT) {
> @@ -638,6 +649,8 @@ rte_vhost_dequeue_burst(struct virtio_net *dev, uint16_t queue_id,
>  			vb_offset = 0;
>  			vb_avail = desc->len;
>  		} else {
> +			if (desc->len < vq->vhost_hlen)
> +				break;
>  			vb_offset = vq->vhost_hlen;
>  			vb_avail = desc->len - vb_offset;
>  		}
  
Rich Lane Nov. 18, 2015, 5:26 a.m. UTC | #2
On Tue, Nov 17, 2015 at 6:56 PM, Yuanhan Liu <yuanhan.liu@linux.intel.com>
wrote:

> @@ -519,6 +526,8 @@ virtio_dev_merge_rx(struct virtio_net *dev, uint16_t queue_id,
>                                         goto merge_rx_exit;
>                                 } else {
>                                         update_secure_len(vq, res_cur_idx, &secure_len, &vec_idx);
> +                                       if (secure_len == 0)
> +                                               goto merge_rx_exit;
>                                         res_cur_idx++;
>                                 }
>                         } while (pkt_len > secure_len);
>

I think this needs to check whether secure_len was modified. secure_len is
read-write and could have a nonzero value going into the call. It could be
cleaner to give update_secure_len a return value saying whether it was able
to reserve any buffers.
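
A minimal sketch of what such a return value could look like. This is not the code in vhost_rxtx.c: `reserve_chain`, the two struct layouts, and the `BUF_VECTOR_MAX` value of 256 are assumptions made for illustration. The function reports how many vector entries it added, so the caller no longer has to infer progress from secure_len.

```c
#include <stdint.h>

#define BUF_VECTOR_MAX 256   /* assumed capacity, for illustration only */
#define DESC_F_NEXT    1     /* stands in for VRING_DESC_F_NEXT */

/* Simplified stand-ins for the vring descriptor and buffer vector entry. */
struct desc    { uint64_t addr; uint32_t len; uint16_t flags; uint16_t next; };
struct buf_vec { uint64_t buf_addr; uint32_t buf_len; uint32_t desc_idx; };

/*
 * Hypothetical variant of update_secure_len(): walks one descriptor chain
 * starting at 'head', appends each descriptor to 'vec', and returns the
 * number of entries it added. A return of 0 means nothing was reserved,
 * whatever value *secure_len held on entry.
 */
static uint32_t
reserve_chain(const struct desc *ring, uint16_t ring_size, uint16_t head,
	      struct buf_vec *vec, uint32_t *vec_idx, uint32_t *secure_len)
{
	uint32_t vec_id = *vec_idx;
	uint32_t len = *secure_len;
	uint32_t added = 0;
	uint16_t idx = head;

	for (;;) {
		/* Check both indexes before they are used, not after. */
		if (vec_id >= BUF_VECTOR_MAX || idx >= ring_size)
			break;

		vec[vec_id].buf_addr = ring[idx].addr;
		vec[vec_id].buf_len  = ring[idx].len;
		vec[vec_id].desc_idx = idx;
		len += ring[idx].len;
		vec_id++;
		added++;

		if (!(ring[idx].flags & DESC_F_NEXT))
			break;
		idx = ring[idx].next;
	}

	*vec_idx = vec_id;
	*secure_len = len;
	return added;
}
```

With this shape the caller would test the return value (for example, `if (reserve_chain(...) == 0) goto merge_rx_exit;`) rather than comparing secure_len against zero. A chain that loops back on itself still terminates here, because every iteration consumes one of the `BUF_VECTOR_MAX` slots.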

Otherwise looks good, thanks!
  
Yuanhan Liu Nov. 18, 2015, 5:32 a.m. UTC | #3
On Tue, Nov 17, 2015 at 09:26:57PM -0800, Rich Lane wrote:
> On Tue, Nov 17, 2015 at 6:56 PM, Yuanhan Liu <yuanhan.liu@linux.intel.com>
> wrote:
> 
>     @@ -519,6 +526,8 @@ virtio_dev_merge_rx(struct virtio_net *dev, uint16_t queue_id,
>                                             goto merge_rx_exit;
>                                     } else {
>                                             update_secure_len(vq, res_cur_idx, &secure_len, &vec_idx);
>     +                                       if (secure_len == 0)
>     +                                               goto merge_rx_exit;
>                                             res_cur_idx++;
>                                     }
>                             } while (pkt_len > secure_len);
> 
> 
> I think this needs to check whether secure_len was modified. secure_len is
> read-write and could have a nonzero value going into the call. It could be
> cleaner to give update_secure_len a return value saying whether it was able to
> reserve any buffers.

Good suggestion.

	--yliu
> 
> Otherwise looks good, thanks!
  
Huawei Xie Nov. 18, 2015, 6:13 a.m. UTC | #4
On 11/18/2015 10:56 AM, Yuanhan Liu wrote:
> On Tue, Nov 17, 2015 at 08:39:30AM -0800, Rich Lane wrote:
>> I don't think that adding a SIGINT handler is the right solution, though. The
>> guest app could be killed with another signal (SIGKILL).
> Good point.
>
>> Worse, a malicious or
>> buggy guest could write to just that field. vhost should not crash no matter
>> what the guest writes into the virtqueues.
Rich, exactly, that has been on our list for a long time. We should
ensure that "a malicious guest cannot crash the host through the vrings";
otherwise this vhost implementation cannot be deployed in a production
environment.
I have many other known security holes in the current DPDK vhost in mind.
A very simple example is that we don't check the gpa_to_vva return value,
so a guest could easily put an invalid GPA into a vring entry to crash vhost.
My plan is to review the vhost implementation, fix all the possible
issues in one single patch set, and make the fixes friendly to performance
optimization rather than fixing them here and there.
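
To illustrate the gpa_to_vva point, here is a rough sketch of a translation helper whose result the caller can check. It is not the DPDK gpa_to_vva: `struct mem_region` and `gpa_range_to_vva` are hypothetical names, and the single-region lookup is an assumption made to keep the example short.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical guest memory region: [gpa_start, gpa_start + size)
 * is mapped at host_base in the vhost process. */
struct mem_region {
	uint64_t gpa_start;
	uint64_t size;
	uint8_t *host_base;
};

/*
 * Translate the guest physical range [gpa, gpa + len) to a host pointer.
 * Returns NULL when the range does not lie entirely inside one region,
 * so a forged descriptor address is rejected instead of dereferenced.
 */
static void *
gpa_range_to_vva(const struct mem_region *regs, size_t nregs,
		 uint64_t gpa, uint64_t len)
{
	for (size_t i = 0; i < nregs; i++) {
		const struct mem_region *r = &regs[i];
		uint64_t off;

		if (gpa < r->gpa_start)
			continue;
		off = gpa - r->gpa_start;
		/* Compare against the remaining room so gpa + len cannot wrap. */
		if (off < r->size && len <= r->size - off)
			return r->host_base + off;
	}
	return NULL;
}
```

The datapath would then skip or drop the descriptor when the helper returns NULL, rather than copying to or from whatever address the guest supplied.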

  
Yuanhan Liu Nov. 18, 2015, 6:25 a.m. UTC | #5
On Wed, Nov 18, 2015 at 06:13:08AM +0000, Xie, Huawei wrote:
> On 11/18/2015 10:56 AM, Yuanhan Liu wrote:
> > On Tue, Nov 17, 2015 at 08:39:30AM -0800, Rich Lane wrote:
> >> I don't think that adding a SIGINT handler is the right solution, though. The
> >> guest app could be killed with another signal (SIGKILL).
> > Good point.
> >
> >> Worse, a malicious or
> >> buggy guest could write to just that field. vhost should not crash no matter
> >> what the guest writes into the virtqueues.
> Rich, exactly, that has been in our list for a long time. We should
> ensure that "Any malicious guest couldn't crash host through vrings"
> otherwise this vhost implementation couldn't be deployed into production
> environment.
> There are many other known security holes in current dpdk vhost in my mind.
> A very simple example is we don't check the gpa_to_vva return value, so
> you could easily put a invalid GPA to vring entry to crash vhost.
> My plan is to review the vhost implementation, fix all the possible
> issues in one single patch set, and make the fix performance

First of all, there is no way you could find all of them at
once: we simply make mistakes, and may miss something here
and there.

And fixing them all in one single patch is not good practice; fixing
one issue per patch is. That makes each patch easier to review and
easier to revert if it turns out to be a wrong fix. It is also
friendly to bisect if it breaks something.

	--yliu

> optimization friendly rather than fix them here and there.
> 
  
Huawei Xie Nov. 18, 2015, 7:53 a.m. UTC | #6
On 11/18/2015 10:56 AM, Yuanhan Liu wrote:
> On Tue, Nov 17, 2015 at 08:39:30AM -0800, Rich Lane wrote:
>> I don't think that adding a SIGINT handler is the right solution, though. The
>> guest app could be killed with another signal (SIGKILL).
> Good point.
>
>> Worse, a malicious or
>> buggy guest could write to just that field. vhost should not crash no matter
>> what the guest writes into the virtqueues.
> Yeah, I agree with you: though we could fix this issue in the source
> side, we also should do some defend here.
>
> How about following patch then?
>
> Note that the vec_id overflow check should be done before referencing
> it, but not after. Hence I moved it ahead.
>
> 	--yliu
>
> ---
> diff --git a/lib/librte_vhost/vhost_rxtx.c b/lib/librte_vhost/vhost_rxtx.c
> index 9322ce6..08f5942 100644
> --- a/lib/librte_vhost/vhost_rxtx.c
> +++ b/lib/librte_vhost/vhost_rxtx.c
> @@ -132,6 +132,8 @@ virtio_dev_rx(struct virtio_net *dev, uint16_t queue_id,
>  
>  		/* Get descriptor from available ring */
>  		desc = &vq->desc[head[packet_success]];
> +		if (desc->len == 0)
> +			break;
>  
>  		buff = pkts[packet_success];
>  
> @@ -153,6 +155,8 @@ virtio_dev_rx(struct virtio_net *dev, uint16_t queue_id,
>  			/* Buffer address translation. */
>  			buff_addr = gpa_to_vva(dev, desc->addr);
>  		} else {
> +			if (desc->len < vq->vhost_hlen)
> +				break;
>  			vb_offset += vq->vhost_hlen;
>  			hdr = 1;
>  		}
> @@ -446,6 +450,9 @@ update_secure_len(struct vhost_virtqueue *vq, uint32_t id,
>  	uint32_t vec_id = *vec_idx;
>  
>  	do {
> +		if (vec_id >= BUF_VECTOR_MAX)
> +			break;
> +
>  		next_desc = 0;
>  		len += vq->desc[idx].len;
>  		vq->buf_vec[vec_id].buf_addr = vq->desc[idx].addr;
> @@ -519,6 +526,8 @@ virtio_dev_merge_rx(struct virtio_net *dev, uint16_t queue_id,
>  					goto merge_rx_exit;
>  				} else {
>  					update_secure_len(vq, res_cur_idx, &secure_len, &vec_idx);
> +					if (secure_len == 0)
> +						goto merge_rx_exit;
Why do we exit when secure_len is 0 rather than 1? :) A malicious guest
could easily forge the desc len so that secure_len never reaches pkt_len
even though it is nonzero, and the host would enter a dead loop here.
Generally speaking, we shouldn't fix one specific issue at a time, and
the security checks should be as few as possible. We need to consider
refactoring the code here for a generic fix.

>  					res_cur_idx++;
>  				}
>  			} while (pkt_len > secure_len);
  
Huawei Xie Nov. 18, 2015, 8:13 a.m. UTC | #7
On 11/18/2015 2:25 PM, Yuanhan Liu wrote:
> On Wed, Nov 18, 2015 at 06:13:08AM +0000, Xie, Huawei wrote:
>> On 11/18/2015 10:56 AM, Yuanhan Liu wrote:
>>> On Tue, Nov 17, 2015 at 08:39:30AM -0800, Rich Lane wrote:
>>>> I don't think that adding a SIGINT handler is the right solution, though. The
>>>> guest app could be killed with another signal (SIGKILL).
>>> Good point.
>>>
>>>> Worse, a malicious or
>>>> buggy guest could write to just that field. vhost should not crash no matter
>>>> what the guest writes into the virtqueues.
>> Rich, exactly, that has been in our list for a long time. We should
>> ensure that "Any malicious guest couldn't crash host through vrings"
>> otherwise this vhost implementation couldn't be deployed into production
>> environment.
>> There are many other known security holes in current dpdk vhost in my mind.
>> A very simple example is we don't check the gpa_to_vva return value, so
>> you could easily put a invalid GPA to vring entry to crash vhost.
>> My plan is to review the vhost implementation, fix all the possible
>> issues in one single patch set, and make the fix performance
> First of all, there is no way you could find all of them out at
> once, for we simply make mistakes, and may miss something here
> and there.
Agree.
>
> And, fixing them in one single patch is not a good pratice; fixing
> them with one issue per patch is. That will make patch eaiser to
> review, yet easier to revert if it's a wrong fix. And it's friendly
> to bisect as well, if it breaks something.
One patch set, not one big patch. Anyway, that isn't the key point.
The key point I want to make is that we should re-review the DPDK vhost
implementation from a security point of view, at a high level.
Otherwise, as I commented in another mail, we add checks here and there,
but the fix isn't a generic one, and some checks could be merged.

  
Yuanhan Liu Nov. 18, 2015, 8:48 a.m. UTC | #8
On Wed, Nov 18, 2015 at 07:53:24AM +0000, Xie, Huawei wrote:
...
> >  	do {
> > +		if (vec_id >= BUF_VECTOR_MAX)
> > +			break;
> > +
> >  		next_desc = 0;
> >  		len += vq->desc[idx].len;
> >  		vq->buf_vec[vec_id].buf_addr = vq->desc[idx].addr;
> > @@ -519,6 +526,8 @@ virtio_dev_merge_rx(struct virtio_net *dev, uint16_t queue_id,
> >  					goto merge_rx_exit;
> >  				} else {
> >  					update_secure_len(vq, res_cur_idx, &secure_len, &vec_idx);
> > +					if (secure_len == 0)
> > +						goto merge_rx_exit;
> Why do we exit when secure_len is 0 rather than 1? :). Malicious guest

I confess it's not a proper fix. Making it return an error code, as Rich
suggested in an earlier email, is better. It's generic enough, as we have
to check for buf_vec overflow here.

BTW, can we move buf_vec outside `struct vhost_virtqueue'? It makes
the structure huge.

> could easily forge the desc len so that secure_len never reach pkt_len
> even it is not zero so that host enters into dead loop here.
> Generally speaking, we shouldn't fix for a specific issue,

Agreed.

> and the
> security checks should be as few as possible.

Ideally, yes.

> We need to consider
> refactor the code here for the generic fix.

What are your thoughts?

	--yliu
  
Huawei Xie Nov. 18, 2015, 11:15 a.m. UTC | #9
On 11/18/2015 4:47 PM, Yuanhan Liu wrote:
> On Wed, Nov 18, 2015 at 07:53:24AM +0000, Xie, Huawei wrote:
> ...
>>>  	do {
>>> +		if (vec_id >= BUF_VECTOR_MAX)
>>> +			break;
>>> +
>>>  		next_desc = 0;
>>>  		len += vq->desc[idx].len;
>>>  		vq->buf_vec[vec_id].buf_addr = vq->desc[idx].addr;
>>> @@ -519,6 +526,8 @@ virtio_dev_merge_rx(struct virtio_net *dev, uint16_t queue_id,
>>>  					goto merge_rx_exit;
>>>  				} else {
>>>  					update_secure_len(vq, res_cur_idx, &secure_len, &vec_idx);
>>> +					if (secure_len == 0)
>>> +						goto merge_rx_exit;
>> Why do we exit when secure_len is 0 rather than 1? :). Malicious guest
> I confess it's not a proper fix. Making it return an error code, as Rich
> suggested in early email, is better. It's generic enough, as we have to
> check the vec_buf overflow here.
>
> BTW, can we move the vec_buf outside `struct vhost_virtqueue'? It makes
> the structure huge.
>
>> could easily forge the desc len so that secure_len never reach pkt_len
>> even it is not zero so that host enters into dead loop here.
>> Generally speaking, we shouldn't fix for a specific issue,
> Agreed.
>
>> and the
>> security checks should be as few as possible.
> Idealy, yes.
>
>> We need to consider
>> refactor the code here for the generic fix.
> What's your thougths?
Maybe we could merge update_secure_len and the outer loop into a simple
inline function, in which we consider both the max vector number and the
desc count to avoid being trapped in a dead loop. This function returns a
buf vec with which we can copy securely afterwards.
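
A rough sketch of that merged helper, under stated assumptions: `reserve_bufs`, the struct layouts, and the `BUF_VECTOR_MAX` value of 256 are hypothetical and only illustrate the idea. The walk is bounded by both the vector capacity and the number of descriptors in the ring, so neither a looping chain nor a run of zero-length descriptors can keep the host spinning.

```c
#include <stdint.h>

#define BUF_VECTOR_MAX 256   /* assumed capacity, for illustration only */
#define DESC_F_NEXT    1     /* stands in for VRING_DESC_F_NEXT */

struct desc    { uint64_t addr; uint32_t len; uint16_t flags; uint16_t next; };
struct buf_vec { uint64_t buf_addr; uint32_t buf_len; uint32_t desc_idx; };

/*
 * Hypothetical merged helper: take descriptor chains from heads[0..nheads)
 * until they offer at least pkt_len bytes. Returns the number of entries
 * written to 'vec' and stores the number of chains consumed in *heads_used.
 * Returns -1 if the ring is malformed or does not offer enough room.
 */
static int
reserve_bufs(const struct desc *ring, uint16_t ring_size,
	     const uint16_t *heads, uint16_t nheads,
	     uint32_t pkt_len, struct buf_vec *vec, uint16_t *heads_used)
{
	uint32_t nvec = 0;
	uint32_t visited = 0;   /* descriptors walked so far */
	uint64_t room = 0;      /* 64-bit so forged lengths cannot wrap it */
	uint16_t h;

	for (h = 0; h < nheads && room < pkt_len; h++) {
		uint16_t idx = heads[h];

		for (;;) {
			/* Bound by index range, vector size and desc count. */
			if (idx >= ring_size || nvec >= BUF_VECTOR_MAX ||
			    visited++ >= ring_size)
				return -1;

			vec[nvec].buf_addr = ring[idx].addr;
			vec[nvec].buf_len  = ring[idx].len;
			vec[nvec].desc_idx = idx;
			nvec++;
			room += ring[idx].len;

			if (!(ring[idx].flags & DESC_F_NEXT))
				break;
			idx = ring[idx].next;
		}
	}

	if (room < pkt_len)
		return -1;

	*heads_used = h;
	return (int)nvec;
}
```

On success the caller would copy the packet using only the returned vector entries; on -1 it would stop processing the burst without having touched guest memory.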
  
Stephen Hemminger Nov. 18, 2015, 3:53 p.m. UTC | #10
On Wed, 18 Nov 2015 06:13:08 +0000
"Xie, Huawei" <huawei.xie@intel.com> wrote:

> On 11/18/2015 10:56 AM, Yuanhan Liu wrote:
> > On Tue, Nov 17, 2015 at 08:39:30AM -0800, Rich Lane wrote:
> >> I don't think that adding a SIGINT handler is the right solution, though. The
> >> guest app could be killed with another signal (SIGKILL).
> > Good point.
> >
> >> Worse, a malicious or
> >> buggy guest could write to just that field. vhost should not crash no matter
> >> what the guest writes into the virtqueues.
> Rich, exactly, that has been in our list for a long time. We should
> ensure that "Any malicious guest couldn't crash host through vrings"
> otherwise this vhost implementation couldn't be deployed into production
> environment.
> There are many other known security holes in current dpdk vhost in my mind.
> A very simple example is we don't check the gpa_to_vva return value, so
> you could easily put a invalid GPA to vring entry to crash vhost.
> My plan is to review the vhost implementation, fix all the possible
> issues in one single patch set, and make the fix performance
> optimization friendly rather than fix them here and there.
> 

Both virtio and vhost need to adopt the "other side is broken" flag
model that is in Linux drivers.  What this means is that the virtio
and vhost driver would check parameters for consistency, and if out
of bounds set a broken flag and refuse to do anything more with the
device until reset.
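
A minimal sketch of that model, for illustration only. `struct vq_state`, `vq_check_index` and `vq_reset` are hypothetical names and do not correspond to any existing DPDK or Linux structure; the point is only that one failed consistency check latches the queue into a state where nothing further is processed.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-queue state, reduced to what the example needs. */
struct vq_state {
	uint16_t size;     /* number of descriptors in the ring */
	bool     broken;   /* set once the other side is seen misbehaving */
};

/*
 * Returns true if descriptor index 'idx' may be used. The first
 * out-of-range index marks the queue broken; after that every call
 * fails until vq_reset() is invoked.
 */
static bool
vq_check_index(struct vq_state *vq, uint16_t idx)
{
	if (vq->broken)
		return false;
	if (idx >= vq->size) {
		vq->broken = true;
		return false;
	}
	return true;
}

/* Clears the broken state, as a device reset would. */
static void
vq_reset(struct vq_state *vq)
{
	vq->broken = false;
}
```

The enqueue and dequeue paths would call the check before touching a descriptor and return early when it fails, so a single bad value from the peer stops all further processing instead of being handled case by case.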
  
Huawei Xie Nov. 18, 2015, 4 p.m. UTC | #11
On 11/18/2015 11:53 PM, Stephen Hemminger wrote:
> On Wed, 18 Nov 2015 06:13:08 +0000
> "Xie, Huawei" <huawei.xie@intel.com> wrote:
>
>> On 11/18/2015 10:56 AM, Yuanhan Liu wrote:
>>> On Tue, Nov 17, 2015 at 08:39:30AM -0800, Rich Lane wrote:
>>>> I don't think that adding a SIGINT handler is the right solution, though. The
>>>> guest app could be killed with another signal (SIGKILL).
>>> Good point.
>>>
>>>> Worse, a malicious or
>>>> buggy guest could write to just that field. vhost should not crash no matter
>>>> what the guest writes into the virtqueues.
>> Rich, exactly, that has been in our list for a long time. We should
>> ensure that "Any malicious guest couldn't crash host through vrings"
>> otherwise this vhost implementation couldn't be deployed into production
>> environment.
>> There are many other known security holes in current dpdk vhost in my mind.
>> A very simple example is we don't check the gpa_to_vva return value, so
>> you could easily put a invalid GPA to vring entry to crash vhost.
>> My plan is to review the vhost implementation, fix all the possible
>> issues in one single patch set, and make the fix performance
>> optimization friendly rather than fix them here and there.
>>
> Both virtio and vhost need to adopt the "other side is broken" flag
> model that is in Linux drivers.  What this means is that the virtio
> and vhost driver would check parameters for consistency, and if out
> of bounds set a broken flag and refuse to do anything more with the
> device until reset.
Stephen:
You raise an important point.
The current DPDK virtio driver implementation chooses to trust vhost, so
it doesn't do any consistency checks.
What is the reason that the virtio driver also needs consistency checks?
Is it that vhost might be buggy, or that vhost might not be trusted in
some use cases?
/huawei
>
  

Patch

diff --git a/lib/librte_vhost/vhost_rxtx.c b/lib/librte_vhost/vhost_rxtx.c
index 9322ce6..08f5942 100644
--- a/lib/librte_vhost/vhost_rxtx.c
+++ b/lib/librte_vhost/vhost_rxtx.c
@@ -132,6 +132,8 @@  virtio_dev_rx(struct virtio_net *dev, uint16_t queue_id,
 
 		/* Get descriptor from available ring */
 		desc = &vq->desc[head[packet_success]];
+		if (desc->len == 0)
+			break;
 
 		buff = pkts[packet_success];
 
@@ -153,6 +155,8 @@  virtio_dev_rx(struct virtio_net *dev, uint16_t queue_id,
 			/* Buffer address translation. */
 			buff_addr = gpa_to_vva(dev, desc->addr);
 		} else {
+			if (desc->len < vq->vhost_hlen)
+				break;
 			vb_offset += vq->vhost_hlen;
 			hdr = 1;
 		}
@@ -446,6 +450,9 @@  update_secure_len(struct vhost_virtqueue *vq, uint32_t id,
 	uint32_t vec_id = *vec_idx;
 
 	do {
+		if (vec_id >= BUF_VECTOR_MAX)
+			break;
+
 		next_desc = 0;
 		len += vq->desc[idx].len;
 		vq->buf_vec[vec_id].buf_addr = vq->desc[idx].addr;
@@ -519,6 +526,8 @@  virtio_dev_merge_rx(struct virtio_net *dev, uint16_t queue_id,
 					goto merge_rx_exit;
 				} else {
 					update_secure_len(vq, res_cur_idx, &secure_len, &vec_idx);
+					if (secure_len == 0)
+						goto merge_rx_exit;
 					res_cur_idx++;
 				}
 			} while (pkt_len > secure_len);
@@ -631,6 +640,8 @@  rte_vhost_dequeue_burst(struct virtio_net *dev, uint16_t queue_id,
 		uint8_t alloc_err = 0;
 
 		desc = &vq->desc[head[entry_success]];
+		if (desc->len == 0)
+			break;
 
 		/* Discard first buffer as it is the virtio header */
 		if (desc->flags & VRING_DESC_F_NEXT) {
@@ -638,6 +649,8 @@  rte_vhost_dequeue_burst(struct virtio_net *dev, uint16_t queue_id,
 			vb_offset = 0;
 			vb_avail = desc->len;
 		} else {
+			if (desc->len < vq->vhost_hlen)
+				break;
 			vb_offset = vq->vhost_hlen;
 			vb_avail = desc->len - vb_offset;
 		}