[dpdk-dev,v2] ring: check for zero objects mc dequeue / mp enqueue

Message ID 1458229783-15547-1-git-send-email-l@nofutznetworks.com (mailing list archive)
State Rejected, archived
Delegated to: Thomas Monjalon
Headers

Commit Message

Lazaros Koromilas March 17, 2016, 3:49 p.m. UTC
  Issuing a zero objects dequeue with a single consumer has no effect.
Doing so with multiple consumers, can get more than one thread to succeed
the compare-and-set operation and observe starvation or even deadlock in
the while loop that checks for preceding dequeues.  The problematic piece
of code when n = 0:

    cons_next = cons_head + n;
    success = rte_atomic32_cmpset(&r->cons.head, cons_head, cons_next);

The same is possible on the enqueue path.

Signed-off-by: Lazaros Koromilas <l@nofutznetworks.com>
---
 lib/librte_ring/rte_ring.h | 10 ++++++++++
 1 file changed, 10 insertions(+)
  

Comments

Mauricio Vasquez B March 17, 2016, 4:09 p.m. UTC | #1
Hi Lazaros,

On Thu, Mar 17, 2016 at 4:49 PM, Lazaros Koromilas <l@nofutznetworks.com>
wrote:

> Issuing a zero objects dequeue with a single consumer has no effect.
> Doing so with multiple consumers, can get more than one thread to succeed
> the compare-and-set operation and observe starvation or even deadlock in
> the while loop that checks for preceding dequeues.  The problematic piece
> of code when n = 0:
>
>     cons_next = cons_head + n;
>     success = rte_atomic32_cmpset(&r->cons.head, cons_head, cons_next);
>
> The same is possible on the enqueue path.
>
> Signed-off-by: Lazaros Koromilas <l@nofutznetworks.com>
> ---
>  lib/librte_ring/rte_ring.h | 10 ++++++++++
>  1 file changed, 10 insertions(+)
>
> diff --git a/lib/librte_ring/rte_ring.h b/lib/librte_ring/rte_ring.h
> index 943c97c..eb45e41 100644
> --- a/lib/librte_ring/rte_ring.h
> +++ b/lib/librte_ring/rte_ring.h
> @@ -431,6 +431,11 @@ __rte_ring_mp_do_enqueue(struct rte_ring *r, void *
> const *obj_table,
>         uint32_t mask = r->prod.mask;
>         int ret;
>
> +       /* Avoid the unnecessary cmpset operation below, which is also
> +        * potentially harmful when n equals 0. */
> +       if (n == 0)
>

What about using unlikely here?


> +               return 0;
> +
>         /* move prod.head atomically */
>         do {
>                 /* Reset n to the initial burst count */
> @@ -618,6 +623,11 @@ __rte_ring_mc_do_dequeue(struct rte_ring *r, void
> **obj_table,
>         unsigned i, rep = 0;
>         uint32_t mask = r->prod.mask;
>
> +       /* Avoid the unnecessary cmpset operation below, which is also
> +        * potentially harmful when n equals 0. */
> +       if (n == 0)
>

Also here.


> +               return 0;
> +
>         /* move cons.head atomically */
>         do {
>                 /* Restore n as it may change every loop */
> --
> 1.9.1
>
>
  
Bruce Richardson March 18, 2016, 10:18 a.m. UTC | #2
On Thu, Mar 17, 2016 at 05:09:08PM +0100, Mauricio Vásquez wrote:
> Hi Lazaros,
> 
> On Thu, Mar 17, 2016 at 4:49 PM, Lazaros Koromilas <l@nofutznetworks.com>
> wrote:
> 
> > Issuing a zero objects dequeue with a single consumer has no effect.
> > Doing so with multiple consumers, can get more than one thread to succeed
> > the compare-and-set operation and observe starvation or even deadlock in
> > the while loop that checks for preceding dequeues.  The problematic piece
> > of code when n = 0:
> >
> >     cons_next = cons_head + n;
> >     success = rte_atomic32_cmpset(&r->cons.head, cons_head, cons_next);
> >
> > The same is possible on the enqueue path.
> >
> > Signed-off-by: Lazaros Koromilas <l@nofutznetworks.com>
> > ---
> >  lib/librte_ring/rte_ring.h | 10 ++++++++++
> >  1 file changed, 10 insertions(+)
> >
> > diff --git a/lib/librte_ring/rte_ring.h b/lib/librte_ring/rte_ring.h
> > index 943c97c..eb45e41 100644
> > --- a/lib/librte_ring/rte_ring.h
> > +++ b/lib/librte_ring/rte_ring.h
> > @@ -431,6 +431,11 @@ __rte_ring_mp_do_enqueue(struct rte_ring *r, void *
> > const *obj_table,
> >         uint32_t mask = r->prod.mask;
> >         int ret;
> >
> > +       /* Avoid the unnecessary cmpset operation below, which is also
> > +        * potentially harmful when n equals 0. */
> > +       if (n == 0)
> >
> 
> What about using unlikely here?
> 

Unless there is a measurable performance increase by adding in likely/unlikely
I'd suggest avoiding it's use. In general, likely/unlikely should only be used
for things like catestrophic errors because the penalty for taking the unlikely
leg of the code can be quite severe. For normal stuff, where the code nearly
always goes one way in the branch but occasionally goes the other, the hardware
branch predictors generally do a good enough job.

Just my 2c.

/Bruce
  
Olivier Matz March 18, 2016, 10:27 a.m. UTC | #3
Hi,

On 03/18/2016 11:18 AM, Bruce Richardson wrote:
>>> diff --git a/lib/librte_ring/rte_ring.h b/lib/librte_ring/rte_ring.h
>>> index 943c97c..eb45e41 100644
>>> --- a/lib/librte_ring/rte_ring.h
>>> +++ b/lib/librte_ring/rte_ring.h
>>> @@ -431,6 +431,11 @@ __rte_ring_mp_do_enqueue(struct rte_ring *r, void *
>>> const *obj_table,
>>>         uint32_t mask = r->prod.mask;
>>>         int ret;
>>>
>>> +       /* Avoid the unnecessary cmpset operation below, which is also
>>> +        * potentially harmful when n equals 0. */
>>> +       if (n == 0)
>>>
>>
>> What about using unlikely here?
>>
> 
> Unless there is a measurable performance increase by adding in likely/unlikely
> I'd suggest avoiding it's use. In general, likely/unlikely should only be used
> for things like catestrophic errors because the penalty for taking the unlikely
> leg of the code can be quite severe. For normal stuff, where the code nearly
> always goes one way in the branch but occasionally goes the other, the hardware
> branch predictors generally do a good enough job.

Do you mean using likely/unlikely could be worst than not using it
in this case?

To me, using unlikely here is not a bad idea: it shows to the compiler
and to the reader of the code that is case is not the usual case.


Olivier
  
Bruce Richardson March 18, 2016, 10:35 a.m. UTC | #4
On Fri, Mar 18, 2016 at 11:27:18AM +0100, Olivier Matz wrote:
> Hi,
> 
> On 03/18/2016 11:18 AM, Bruce Richardson wrote:
> >>> diff --git a/lib/librte_ring/rte_ring.h b/lib/librte_ring/rte_ring.h
> >>> index 943c97c..eb45e41 100644
> >>> --- a/lib/librte_ring/rte_ring.h
> >>> +++ b/lib/librte_ring/rte_ring.h
> >>> @@ -431,6 +431,11 @@ __rte_ring_mp_do_enqueue(struct rte_ring *r, void *
> >>> const *obj_table,
> >>>         uint32_t mask = r->prod.mask;
> >>>         int ret;
> >>>
> >>> +       /* Avoid the unnecessary cmpset operation below, which is also
> >>> +        * potentially harmful when n equals 0. */
> >>> +       if (n == 0)
> >>>
> >>
> >> What about using unlikely here?
> >>
> > 
> > Unless there is a measurable performance increase by adding in likely/unlikely
> > I'd suggest avoiding it's use. In general, likely/unlikely should only be used
> > for things like catestrophic errors because the penalty for taking the unlikely
> > leg of the code can be quite severe. For normal stuff, where the code nearly
> > always goes one way in the branch but occasionally goes the other, the hardware
> > branch predictors generally do a good enough job.
> 
> Do you mean using likely/unlikely could be worst than not using it
> in this case?
> 
> To me, using unlikely here is not a bad idea: it shows to the compiler
> and to the reader of the code that is case is not the usual case.
>
Hi Olivier,

it might be worse if the user makes a lot of calls with n == 0. It almost
certainly would depend upon the compiler. Overall, I'd rather see us err on the
side of not putting in the calls unless there is a proven case to do so.

I don't think the documentation benefit is huge here either, it's just standard
parameter checking at the start of the function.

/Bruce
  
Thomas Monjalon March 18, 2016, 10:35 a.m. UTC | #5
2016-03-18 11:27, Olivier Matz:
> On 03/18/2016 11:18 AM, Bruce Richardson wrote:
> >>> +       /* Avoid the unnecessary cmpset operation below, which is also
> >>> +        * potentially harmful when n equals 0. */
> >>> +       if (n == 0)
> >>>
> >>
> >> What about using unlikely here?
> >>
> > 
> > Unless there is a measurable performance increase by adding in likely/unlikely
> > I'd suggest avoiding it's use. In general, likely/unlikely should only be used
> > for things like catestrophic errors because the penalty for taking the unlikely
> > leg of the code can be quite severe. For normal stuff, where the code nearly
> > always goes one way in the branch but occasionally goes the other, the hardware
> > branch predictors generally do a good enough job.
> 
> Do you mean using likely/unlikely could be worst than not using it
> in this case?
> 
> To me, using unlikely here is not a bad idea: it shows to the compiler
> and to the reader of the code that is case is not the usual case.

It would be nice to have a guideline section about likely/unlikely in
doc/guides/contributing/design.rst

Bruce gave a talk at Dublin about this kind of things.
I'm sure he could contribute more design guidelines ;)
  
Mauricio Vasquez B March 18, 2016, 12:47 p.m. UTC | #6
Hi,


On Fri, Mar 18, 2016 at 11:35 AM, Thomas Monjalon <thomas.monjalon@6wind.com
> wrote:

> 2016-03-18 11:27, Olivier Matz:
> > On 03/18/2016 11:18 AM, Bruce Richardson wrote:
> > >>> +       /* Avoid the unnecessary cmpset operation below, which is
> also
> > >>> +        * potentially harmful when n equals 0. */
> > >>> +       if (n == 0)
> > >>>
> > >>
> > >> What about using unlikely here?
> > >>
> > >
> > > Unless there is a measurable performance increase by adding in
> likely/unlikely
> > > I'd suggest avoiding it's use. In general, likely/unlikely should only
> be used
> > > for things like catestrophic errors because the penalty for taking the
> unlikely
> > > leg of the code can be quite severe. For normal stuff, where the code
> nearly
> > > always goes one way in the branch but occasionally goes the other, the
> hardware
> > > branch predictors generally do a good enough job.
> >
> > Do you mean using likely/unlikely could be worst than not using it
> > in this case?
> >
> > To me, using unlikely here is not a bad idea: it shows to the compiler
> > and to the reader of the code that is case is not the usual case.
>
> It would be nice to have a guideline section about likely/unlikely in
> doc/guides/contributing/design.rst
>
> Bruce gave a talk at Dublin about this kind of things.
> I'm sure he could contribute more design guidelines ;)
>

There is a small explanation in the section "Branch Prediction" of
doc/guides/contributing/coding_style.rst, but I do not know if that is
enough to understand when to use them.

I've made a fast check and there are many PMDs that use them to check if
number of packets is zero in the transmission function.
  
Bruce Richardson March 18, 2016, 2:16 p.m. UTC | #7
On Fri, Mar 18, 2016 at 01:47:29PM +0100, Mauricio Vásquez wrote:
> Hi,
> 
> 
> On Fri, Mar 18, 2016 at 11:35 AM, Thomas Monjalon <thomas.monjalon@6wind.com
> > wrote:
> 
> > 2016-03-18 11:27, Olivier Matz:
> > > On 03/18/2016 11:18 AM, Bruce Richardson wrote:
> > > >>> +       /* Avoid the unnecessary cmpset operation below, which is
> > also
> > > >>> +        * potentially harmful when n equals 0. */
> > > >>> +       if (n == 0)
> > > >>>
> > > >>
> > > >> What about using unlikely here?
> > > >>
> > > >
> > > > Unless there is a measurable performance increase by adding in
> > likely/unlikely
> > > > I'd suggest avoiding it's use. In general, likely/unlikely should only
> > be used
> > > > for things like catestrophic errors because the penalty for taking the
> > unlikely
> > > > leg of the code can be quite severe. For normal stuff, where the code
> > nearly
> > > > always goes one way in the branch but occasionally goes the other, the
> > hardware
> > > > branch predictors generally do a good enough job.
> > >
> > > Do you mean using likely/unlikely could be worst than not using it
> > > in this case?
> > >
> > > To me, using unlikely here is not a bad idea: it shows to the compiler
> > > and to the reader of the code that is case is not the usual case.
> >
> > It would be nice to have a guideline section about likely/unlikely in
> > doc/guides/contributing/design.rst
> >
> > Bruce gave a talk at Dublin about this kind of things.
> > I'm sure he could contribute more design guidelines ;)
> >
> 
> There is a small explanation in the section "Branch Prediction" of
> doc/guides/contributing/coding_style.rst, but I do not know if that is
> enough to understand when to use them.
> 
> I've made a fast check and there are many PMDs that use them to check if
> number of packets is zero in the transmission function.

Yeah, and I wonder how many of those are actually necessary too :-)

It's not a big deal either way, I just think the patch is fine as-is without
the extra macros.

/Bruce
  
Olivier Matz March 21, 2016, 12:23 p.m. UTC | #8
Hi,

On 03/17/2016 04:49 PM, Lazaros Koromilas wrote:
> Issuing a zero objects dequeue with a single consumer has no effect.
> Doing so with multiple consumers, can get more than one thread to succeed
> the compare-and-set operation and observe starvation or even deadlock in
> the while loop that checks for preceding dequeues.  The problematic piece
> of code when n = 0:
> 
>     cons_next = cons_head + n;
>     success = rte_atomic32_cmpset(&r->cons.head, cons_head, cons_next);
> 
> The same is possible on the enqueue path.
> 
> Signed-off-by: Lazaros Koromilas <l@nofutznetworks.com>

Acked-by: Olivier Matz <olivier.matz@6wind.com>
  
Huawei Xie March 21, 2016, 5:47 p.m. UTC | #9
On 3/18/2016 10:17 PM, Bruce Richardson wrote:
> On Fri, Mar 18, 2016 at 01:47:29PM +0100, Mauricio Vásquez wrote:
>> Hi,
>>
>>
>> On Fri, Mar 18, 2016 at 11:35 AM, Thomas Monjalon <thomas.monjalon@6wind.com
>>> wrote:
>>> 2016-03-18 11:27, Olivier Matz:
>>>> On 03/18/2016 11:18 AM, Bruce Richardson wrote:
>>>>>>> +       /* Avoid the unnecessary cmpset operation below, which is
>>> also
>>>>>>> +        * potentially harmful when n equals 0. */
>>>>>>> +       if (n == 0)
>>>>>>>
>>>>>> What about using unlikely here?
>>>>>>
>>>>> Unless there is a measurable performance increase by adding in
>>> likely/unlikely
>>>>> I'd suggest avoiding it's use. In general, likely/unlikely should only
>>> be used
>>>>> for things like catestrophic errors because the penalty for taking the
>>> unlikely
>>>>> leg of the code can be quite severe. For normal stuff, where the code
>>> nearly
>>>>> always goes one way in the branch but occasionally goes the other, the
>>> hardware
>>>>> branch predictors generally do a good enough job.
>>>> Do you mean using likely/unlikely could be worst than not using it
>>>> in this case?
>>>>
>>>> To me, using unlikely here is not a bad idea: it shows to the compiler
>>>> and to the reader of the code that is case is not the usual case.
>>> It would be nice to have a guideline section about likely/unlikely in
>>> doc/guides/contributing/design.rst
>>>
>>> Bruce gave a talk at Dublin about this kind of things.
>>> I'm sure he could contribute more design guidelines ;)
>>>
>> There is a small explanation in the section "Branch Prediction" of
>> doc/guides/contributing/coding_style.rst, but I do not know if that is
>> enough to understand when to use them.
>>
>> I've made a fast check and there are many PMDs that use them to check if
>> number of packets is zero in the transmission function.
> Yeah, and I wonder how many of those are actually necessary too :-)
>
> It's not a big deal either way, I just think the patch is fine as-is without
> the extra macros.

IMO we use likely/unlikely in two cases, catastrophic errors and the
code nearly always goes one way, i.e, preferred/favored fast path.
Likely/unlikely helps not only for branch predication but also for cache
usage. The code generated for the likely path will directly follow the
branch instruction. To me, it is reasonable enough to add unlikely for n
== 0, which we don't expect to happen.
I remember with/without likely, compiler could generate three kind of
instructions. Didn't deep dive into it.

>
> /Bruce
>
  
Bruce Richardson March 22, 2016, 10:13 a.m. UTC | #10
On Mon, Mar 21, 2016 at 05:47:44PM +0000, Xie, Huawei wrote:
> On 3/18/2016 10:17 PM, Bruce Richardson wrote:
> > On Fri, Mar 18, 2016 at 01:47:29PM +0100, Mauricio Vásquez wrote:
> >> Hi,
> >>
> >>
> >> On Fri, Mar 18, 2016 at 11:35 AM, Thomas Monjalon <thomas.monjalon@6wind.com
> >>> wrote:
> >>> 2016-03-18 11:27, Olivier Matz:
> >>>> On 03/18/2016 11:18 AM, Bruce Richardson wrote:
> >>>>>>> +       /* Avoid the unnecessary cmpset operation below, which is
> >>> also
> >>>>>>> +        * potentially harmful when n equals 0. */
> >>>>>>> +       if (n == 0)
> >>>>>>>
> >>>>>> What about using unlikely here?
> >>>>>>
> >>>>> Unless there is a measurable performance increase by adding in
> >>> likely/unlikely
> >>>>> I'd suggest avoiding it's use. In general, likely/unlikely should only
> >>> be used
> >>>>> for things like catestrophic errors because the penalty for taking the
> >>> unlikely
> >>>>> leg of the code can be quite severe. For normal stuff, where the code
> >>> nearly
> >>>>> always goes one way in the branch but occasionally goes the other, the
> >>> hardware
> >>>>> branch predictors generally do a good enough job.
> >>>> Do you mean using likely/unlikely could be worst than not using it
> >>>> in this case?
> >>>>
> >>>> To me, using unlikely here is not a bad idea: it shows to the compiler
> >>>> and to the reader of the code that is case is not the usual case.
> >>> It would be nice to have a guideline section about likely/unlikely in
> >>> doc/guides/contributing/design.rst
> >>>
> >>> Bruce gave a talk at Dublin about this kind of things.
> >>> I'm sure he could contribute more design guidelines ;)
> >>>
> >> There is a small explanation in the section "Branch Prediction" of
> >> doc/guides/contributing/coding_style.rst, but I do not know if that is
> >> enough to understand when to use them.
> >>
> >> I've made a fast check and there are many PMDs that use them to check if
> >> number of packets is zero in the transmission function.
> > Yeah, and I wonder how many of those are actually necessary too :-)
> >
> > It's not a big deal either way, I just think the patch is fine as-is without
> > the extra macros.
> 
> IMO we use likely/unlikely in two cases, catastrophic errors and the
> code nearly always goes one way, i.e, preferred/favored fast path.
> Likely/unlikely helps not only for branch predication but also for cache

For branch prediction, anything after the first time through the code path
the prediction will be based on what happened before rather than any static
hints in the code.

> usage. The code generated for the likely path will directly follow the
> branch instruction. To me, it is reasonable enough to add unlikely for n
> == 0, which we don't expect to happen.
> I remember with/without likely, compiler could generate three kind of
> instructions. Didn't deep dive into it.
> 
> >
> > /Bruce
> >
>
  
Huawei Xie March 22, 2016, 2:38 p.m. UTC | #11
On 3/22/2016 6:13 PM, Richardson, Bruce wrote:
> On Mon, Mar 21, 2016 at 05:47:44PM +0000, Xie, Huawei wrote:
>> On 3/18/2016 10:17 PM, Bruce Richardson wrote:
>>> On Fri, Mar 18, 2016 at 01:47:29PM +0100, Mauricio Vásquez wrote:
>>>> Hi,
>>>>
>>>>
>>>> On Fri, Mar 18, 2016 at 11:35 AM, Thomas Monjalon <thomas.monjalon@6wind.com
>>>>> wrote:
>>>>> 2016-03-18 11:27, Olivier Matz:
>>>>>> On 03/18/2016 11:18 AM, Bruce Richardson wrote:
>>>>>>>>> +       /* Avoid the unnecessary cmpset operation below, which is
>>>>> also
>>>>>>>>> +        * potentially harmful when n equals 0. */
>>>>>>>>> +       if (n == 0)
>>>>>>>>>
>>>>>>>> What about using unlikely here?
>>>>>>>>
>>>>>>> Unless there is a measurable performance increase by adding in
>>>>> likely/unlikely
>>>>>>> I'd suggest avoiding it's use. In general, likely/unlikely should only
>>>>> be used
>>>>>>> for things like catestrophic errors because the penalty for taking the
>>>>> unlikely
>>>>>>> leg of the code can be quite severe. For normal stuff, where the code
>>>>> nearly
>>>>>>> always goes one way in the branch but occasionally goes the other, the
>>>>> hardware
>>>>>>> branch predictors generally do a good enough job.
>>>>>> Do you mean using likely/unlikely could be worst than not using it
>>>>>> in this case?
>>>>>>
>>>>>> To me, using unlikely here is not a bad idea: it shows to the compiler
>>>>>> and to the reader of the code that is case is not the usual case.
>>>>> It would be nice to have a guideline section about likely/unlikely in
>>>>> doc/guides/contributing/design.rst
>>>>>
>>>>> Bruce gave a talk at Dublin about this kind of things.
>>>>> I'm sure he could contribute more design guidelines ;)
>>>>>
>>>> There is a small explanation in the section "Branch Prediction" of
>>>> doc/guides/contributing/coding_style.rst, but I do not know if that is
>>>> enough to understand when to use them.
>>>>
>>>> I've made a fast check and there are many PMDs that use them to check if
>>>> number of packets is zero in the transmission function.
>>> Yeah, and I wonder how many of those are actually necessary too :-)
>>>
>>> It's not a big deal either way, I just think the patch is fine as-is without
>>> the extra macros.
>> IMO we use likely/unlikely in two cases, catastrophic errors and the
>> code nearly always goes one way, i.e, preferred/favored fast path.
>> Likely/unlikely helps not only for branch predication but also for cache
> For branch prediction, anything after the first time through the code path
> the prediction will be based on what happened before rather than any static
> hints in the code.

Yes, maybe i didn't make myself clear? My main concern isn't about
branch predication...

>> usage. The code generated for the likely path will directly follow the
>> branch instruction. To me, it is reasonable enough to add unlikely for n
>> == 0, which we don't expect to happen.
>> I remember with/without likely, compiler could generate three kind of
>> instructions. Didn't deep dive into it.
>>
>>> /Bruce
>>>
  
Thomas Monjalon March 22, 2016, 4:49 p.m. UTC | #12
> > Issuing a zero objects dequeue with a single consumer has no effect.
> > Doing so with multiple consumers, can get more than one thread to succeed
> > the compare-and-set operation and observe starvation or even deadlock in
> > the while loop that checks for preceding dequeues.  The problematic piece
> > of code when n = 0:
> > 
> >     cons_next = cons_head + n;
> >     success = rte_atomic32_cmpset(&r->cons.head, cons_head, cons_next);
> > 
> > The same is possible on the enqueue path.
> > 
> > Signed-off-by: Lazaros Koromilas <l@nofutznetworks.com>
> 
> Acked-by: Olivier Matz <olivier.matz@6wind.com>

Applied, thanks
  
Olivier Matz March 25, 2016, 11:15 a.m. UTC | #13
Hi Lazaros,

On 03/17/2016 04:49 PM, Lazaros Koromilas wrote:
> Issuing a zero objects dequeue with a single consumer has no effect.
> Doing so with multiple consumers, can get more than one thread to succeed
> the compare-and-set operation and observe starvation or even deadlock in
> the while loop that checks for preceding dequeues.  The problematic piece
> of code when n = 0:
> 
>     cons_next = cons_head + n;
>     success = rte_atomic32_cmpset(&r->cons.head, cons_head, cons_next);
> 
> The same is possible on the enqueue path.

Just a question about this patch (that has been applied). Thomas
retitled the commit from your log message:

  ring: fix deadlock in zero object multi enqueue or dequeue
  http://dpdk.org/browse/dpdk/commit/?id=d0979646166e

I think this patch does not fix a deadlock, or did I miss something?

As explained in the following links, the ring may not perform well
if several threads running on the same cpu use it:

  http://dpdk.org/ml/archives/dev/2013-November/000714.html
  http://www.dpdk.org/ml/archives/dev/2014-January/001070.html
  http://www.dpdk.org/ml/archives/dev/2014-January/001162.html
  http://dpdk.org/ml/archives/dev/2015-July/020659.html

A deadlock could occur if the threads running on the same core
have different priority.

Regards,
Olivier
  
Lazaros Koromilas March 28, 2016, 3:48 p.m. UTC | #14
Hi Olivier,

We could have two threads (running on different cores in the general
case) that both succeed the cmpset operation. In the dequeue path,
when n == 0, then cons_next == cons_head, and cmpset will always
succeed. Now, if they both see an old r->cons.tail value from a
previous dequeue, they can get stuck in the while
(unlikely(r->cons.tail != cons_head)) loop. I tried, however, to
reproduce (without the patch) and it seems that there is still a
window for deadlock.

I'm pasting some debug output below that shows two processes' state.
It's two receivers doing interleaved mc_dequeue(32)/mc_dequeue(0), and
one sender doing mp_enqueue(32) on the same ring.

gdb --args ./ring-test -l0 --proc-type=primary
gdb --args ./ring-test -l1 --proc-type=secondary
gdb --args ./ring-test -l2 --proc-type=secondary -- tx

This is what I would usually see, process 0 and 1 both stuck at the same state:

663             while (unlikely(r->cons.tail != cons_head)) {
(gdb) p n
$1 = 0
(gdb) p r->cons.tail
$2 = 576416
(gdb) p cons_head
$3 = 576448
(gdb) p cons_next
$4 = 576448

But now I managed to get the two processes stuck at this state too.

process 0:
663             while (unlikely(r->cons.tail != cons_head)) {
(gdb) p n
$1 = 32
(gdb) p r->cons.tail
$2 = 254348832
(gdb) p cons_head
$3 = 254348864
(gdb) p cons_next
$4 = 254348896

proccess 1:
663             while (unlikely(r->cons.tail != cons_head)) {
(gdb) p n
$1 = 32
(gdb) p r->cons.tail
$2 = 254348832
(gdb) p cons_head
$3 = 254348896
(gdb) p cons_next
$4 = 254348928

I haven't been able to trigger this with the patch so far, but it
should be possible.

Lazaros.

On Fri, Mar 25, 2016 at 1:15 PM, Olivier Matz <olivier.matz@6wind.com> wrote:
> Hi Lazaros,
>
> On 03/17/2016 04:49 PM, Lazaros Koromilas wrote:
>> Issuing a zero objects dequeue with a single consumer has no effect.
>> Doing so with multiple consumers, can get more than one thread to succeed
>> the compare-and-set operation and observe starvation or even deadlock in
>> the while loop that checks for preceding dequeues.  The problematic piece
>> of code when n = 0:
>>
>>     cons_next = cons_head + n;
>>     success = rte_atomic32_cmpset(&r->cons.head, cons_head, cons_next);
>>
>> The same is possible on the enqueue path.
>
> Just a question about this patch (that has been applied). Thomas
> retitled the commit from your log message:
>
>   ring: fix deadlock in zero object multi enqueue or dequeue
>   http://dpdk.org/browse/dpdk/commit/?id=d0979646166e
>
> I think this patch does not fix a deadlock, or did I miss something?
>
> As explained in the following links, the ring may not perform well
> if several threads running on the same cpu use it:
>
>   http://dpdk.org/ml/archives/dev/2013-November/000714.html
>   http://www.dpdk.org/ml/archives/dev/2014-January/001070.html
>   http://www.dpdk.org/ml/archives/dev/2014-January/001162.html
>   http://dpdk.org/ml/archives/dev/2015-July/020659.html
>
> A deadlock could occur if the threads running on the same core
> have different priority.
>
> Regards,
> Olivier
  
Bruce Richardson March 29, 2016, 8:54 a.m. UTC | #15
On Mon, Mar 28, 2016 at 06:48:07PM +0300, Lazaros Koromilas wrote:
> Hi Olivier,
> 
> We could have two threads (running on different cores in the general
> case) that both succeed the cmpset operation. In the dequeue path,
> when n == 0, then cons_next == cons_head, and cmpset will always
> succeed. Now, if they both see an old r->cons.tail value from a
> previous dequeue, they can get stuck in the while

Hi,

I don't see how threads reading an "old r->cons.tail" value is even possible.
The head and tail pointers on the ring are marked in the code as volatile, so
all reads and writes to those values are always done from memory and not cached
in registers. No deadlock should be possible on that while loop, unless a 
process crashes in the middle of a ring operation. Each thread which updates
the head pointer from x to y, is responsible for updating the tail pointer in
a similar manner. The loop ensures the tail updates are in the same order as the
head updates.

If you believe deadlock is possible, can you outline the sequence of operations
which would lead to such a state, because I cannot see how it could occur without
a crash inside one of the threads.

/Bruce

> (unlikely(r->cons.tail != cons_head)) loop. I tried, however, to
> reproduce (without the patch) and it seems that there is still a
> window for deadlock.
> 
> I'm pasting some debug output below that shows two processes' state.
> It's two receivers doing interleaved mc_dequeue(32)/mc_dequeue(0), and
> one sender doing mp_enqueue(32) on the same ring.
> 
> gdb --args ./ring-test -l0 --proc-type=primary
> gdb --args ./ring-test -l1 --proc-type=secondary
> gdb --args ./ring-test -l2 --proc-type=secondary -- tx
> 
> This is what I would usually see, process 0 and 1 both stuck at the same state:
> 
> 663             while (unlikely(r->cons.tail != cons_head)) {
> (gdb) p n
> $1 = 0
> (gdb) p r->cons.tail
> $2 = 576416
> (gdb) p cons_head
> $3 = 576448
> (gdb) p cons_next
> $4 = 576448
> 
> But now I managed to get the two processes stuck at this state too.
> 
> process 0:
> 663             while (unlikely(r->cons.tail != cons_head)) {
> (gdb) p n
> $1 = 32
> (gdb) p r->cons.tail
> $2 = 254348832
> (gdb) p cons_head
> $3 = 254348864
> (gdb) p cons_next
> $4 = 254348896
> 
> proccess 1:
> 663             while (unlikely(r->cons.tail != cons_head)) {
> (gdb) p n
> $1 = 32
> (gdb) p r->cons.tail
> $2 = 254348832
> (gdb) p cons_head
> $3 = 254348896
> (gdb) p cons_next
> $4 = 254348928
> 

Where is the thread which updated the head pointer from 832 to 864? That thread
was the one which would update the tail pointer to 864 to allow your thread 0
to continue.

/Bruce

> I haven't been able to trigger this with the patch so far, but it
> should be possible.
> 
> Lazaros.
> 
> On Fri, Mar 25, 2016 at 1:15 PM, Olivier Matz <olivier.matz@6wind.com> wrote:
> > Hi Lazaros,
> >
> > On 03/17/2016 04:49 PM, Lazaros Koromilas wrote:
> >> Issuing a zero objects dequeue with a single consumer has no effect.
> >> Doing so with multiple consumers, can get more than one thread to succeed
> >> the compare-and-set operation and observe starvation or even deadlock in
> >> the while loop that checks for preceding dequeues.  The problematic piece
> >> of code when n = 0:
> >>
> >>     cons_next = cons_head + n;
> >>     success = rte_atomic32_cmpset(&r->cons.head, cons_head, cons_next);
> >>
> >> The same is possible on the enqueue path.
> >
> > Just a question about this patch (that has been applied). Thomas
> > retitled the commit from your log message:
> >
> >   ring: fix deadlock in zero object multi enqueue or dequeue
> >   http://dpdk.org/browse/dpdk/commit/?id=d0979646166e
> >
> > I think this patch does not fix a deadlock, or did I miss something?
> >
> > As explained in the following links, the ring may not perform well
> > if several threads running on the same cpu use it:
> >
> >   http://dpdk.org/ml/archives/dev/2013-November/000714.html
> >   http://www.dpdk.org/ml/archives/dev/2014-January/001070.html
> >   http://www.dpdk.org/ml/archives/dev/2014-January/001162.html
> >   http://dpdk.org/ml/archives/dev/2015-July/020659.html
> >
> > A deadlock could occur if the threads running on the same core
> > have different priority.
> >
> > Regards,
> > Olivier
  
Olivier Matz March 29, 2016, 3:29 p.m. UTC | #16
Hi,


On 03/29/2016 10:54 AM, Bruce Richardson wrote:
> On Mon, Mar 28, 2016 at 06:48:07PM +0300, Lazaros Koromilas wrote:
>> Hi Olivier,
>>
>> We could have two threads (running on different cores in the general
>> case) that both succeed the cmpset operation. In the dequeue path,
>> when n == 0, then cons_next == cons_head, and cmpset will always
>> succeed. Now, if they both see an old r->cons.tail value from a
>> previous dequeue, they can get stuck in the while
>
> Hi,
>
> I don't see how threads reading an "old r->cons.tail" value is even possible.
> The head and tail pointers on the ring are marked in the code as volatile, so
> all reads and writes to those values are always done from memory and not cached
> in registers. No deadlock should be possible on that while loop, unless a
> process crashes in the middle of a ring operation. Each thread which updates
> the head pointer from x to y, is responsible for updating the tail pointer in
> a similar manner. The loop ensures the tail updates are in the same order as the
> head updates.
>
> If you believe deadlock is possible, can you outline the sequence of operations
> which would lead to such a state, because I cannot see how it could occur without
> a crash inside one of the threads.

I think the deadlock Lazaros describes could occur in the following
condition:

current ring state
  r->prod.head = 0
  r->prod.tail = 0

core 0                               core 1
====================================================================
enqueue 0 object
  cmpset(&r->prod.head, 0, 0)
  core 0 is interrupted here
                                      enqueue 1 object
                                       cmpset(&r->prod.head, 0, 1)
                                       copy the objects in box 0
                                       while (r->prod.tail != prod_head))
                                       r->prod.tail = prod_next
  copy 0 object (-> nothing to do)
  while (r->prod.tail != prod_head))
     <loop forever>


I think this issue is indeed fixed by Lazaros' patch (I missed it
in previous review). However, I don't think this deadlock could
happen once we avoided the (n == 0) case.


Olivier
  
Bruce Richardson March 29, 2016, 4:04 p.m. UTC | #17
On Tue, Mar 29, 2016 at 05:29:12PM +0200, Olivier MATZ wrote:
> Hi,
> 
> 
> On 03/29/2016 10:54 AM, Bruce Richardson wrote:
> >On Mon, Mar 28, 2016 at 06:48:07PM +0300, Lazaros Koromilas wrote:
> >>Hi Olivier,
> >>
> >>We could have two threads (running on different cores in the general
> >>case) that both succeed the cmpset operation. In the dequeue path,
> >>when n == 0, then cons_next == cons_head, and cmpset will always
> >>succeed. Now, if they both see an old r->cons.tail value from a
> >>previous dequeue, they can get stuck in the while
> >
> >Hi,
> >
> >I don't see how threads reading an "old r->cons.tail" value is even possible.
> >The head and tail pointers on the ring are marked in the code as volatile, so
> >all reads and writes to those values are always done from memory and not cached
> >in registers. No deadlock should be possible on that while loop, unless a
> >process crashes in the middle of a ring operation. Each thread which updates
> >the head pointer from x to y, is responsible for updating the tail pointer in
> >a similar manner. The loop ensures the tail updates are in the same order as the
> >head updates.
> >
> >If you believe deadlock is possible, can you outline the sequence of operations
> >which would lead to such a state, because I cannot see how it could occur without
> >a crash inside one of the threads.
> 
> I think the deadlock Lazaros describes could occur in the following
> condition:
> 
> current ring state
>  r->prod.head = 0
>  r->prod.tail = 0
> 
> core 0                               core 1
> ====================================================================
> enqueue 0 object
>  cmpset(&r->prod.head, 0, 0)
>  core 0 is interrupted here
>                                      enqueue 1 object
>                                       cmpset(&r->prod.head, 0, 1)
>                                       copy the objects in box 0
>                                       while (r->prod.tail != prod_head))
>                                       r->prod.tail = prod_next
>  copy 0 object (-> nothing to do)
>  while (r->prod.tail != prod_head))
>     <loop forever>
> 
> 
> I think this issue is indeed fixed by Lazaros' patch (I missed it
> in previous review). However, I don't think this deadlock could
> happen once we avoided the (n == 0) case.
>
Yes, I agree with your scenario. However, I thought the issue was occuring with
non-zero updates, but I may well be mistaken.
If it's fixed now, all good... :-)

/Bruce
  
Lazaros Koromilas March 29, 2016, 5:35 p.m. UTC | #18
On Tue, Mar 29, 2016 at 7:04 PM, Bruce Richardson
<bruce.richardson@intel.com> wrote:
>
> On Tue, Mar 29, 2016 at 05:29:12PM +0200, Olivier MATZ wrote:
> > Hi,
> >
> >
> > On 03/29/2016 10:54 AM, Bruce Richardson wrote:
> > >On Mon, Mar 28, 2016 at 06:48:07PM +0300, Lazaros Koromilas wrote:
> > >>Hi Olivier,
> > >>
> > >>We could have two threads (running on different cores in the general
> > >>case) that both succeed the cmpset operation. In the dequeue path,
> > >>when n == 0, then cons_next == cons_head, and cmpset will always
> > >>succeed. Now, if they both see an old r->cons.tail value from a
> > >>previous dequeue, they can get stuck in the while
> > >
> > >Hi,
> > >
> > >I don't see how threads reading an "old r->cons.tail" value is even possible.
> > >The head and tail pointers on the ring are marked in the code as volatile, so
> > >all reads and writes to those values are always done from memory and not cached
> > >in registers. No deadlock should be possible on that while loop, unless a
> > >process crashes in the middle of a ring operation. Each thread which updates
> > >the head pointer from x to y, is responsible for updating the tail pointer in
> > >a similar manner. The loop ensures the tail updates are in the same order as the
> > >head updates.
> > >
> > >If you believe deadlock is possible, can you outline the sequence of operations
> > >which would lead to such a state, because I cannot see how it could occur without
> > >a crash inside one of the threads.
> >
> > I think the deadlock Lazaros describes could occur in the following
> > condition:
> >
> > current ring state
> >  r->prod.head = 0
> >  r->prod.tail = 0
> >
> > core 0                               core 1
> > ====================================================================
> > enqueue 0 object
> >  cmpset(&r->prod.head, 0, 0)
> >  core 0 is interrupted here
> >                                      enqueue 1 object
> >                                       cmpset(&r->prod.head, 0, 1)
> >                                       copy the objects in box 0
> >                                       while (r->prod.tail != prod_head))
> >                                       r->prod.tail = prod_next
> >  copy 0 object (-> nothing to do)
> >  while (r->prod.tail != prod_head))
> >     <loop forever>
> >
> >
> > I think this issue is indeed fixed by Lazaros' patch (I missed it
> > in previous review). However, I don't think this deadlock could
> > happen once we avoided the (n == 0) case.
> >
> Yes, I agree with your scenario. However, I thought the issue was occuring with
> non-zero updates, but I may well be mistaken.
> If it's fixed now, all good... :-)
>
> /Bruce

Hi all,

Bruce, I thought it could be still possible because of my threads
being stuck inside two dequeue(32) calls.  But haven't been able to
reproduce it with non-zero dequeues.  And by trying to find a scenario
of my own, it seems that at least one dequeue(0) is needed, similarly
to Olivier's example on the enqueue path.

Thanks,
Lazaros.
  

Patch

diff --git a/lib/librte_ring/rte_ring.h b/lib/librte_ring/rte_ring.h
index 943c97c..eb45e41 100644
--- a/lib/librte_ring/rte_ring.h
+++ b/lib/librte_ring/rte_ring.h
@@ -431,6 +431,11 @@  __rte_ring_mp_do_enqueue(struct rte_ring *r, void * const *obj_table,
 	uint32_t mask = r->prod.mask;
 	int ret;
 
+	/* Avoid the unnecessary cmpset operation below, which is also
+	 * potentially harmful when n equals 0. */
+	if (n == 0)
+		return 0;
+
 	/* move prod.head atomically */
 	do {
 		/* Reset n to the initial burst count */
@@ -618,6 +623,11 @@  __rte_ring_mc_do_dequeue(struct rte_ring *r, void **obj_table,
 	unsigned i, rep = 0;
 	uint32_t mask = r->prod.mask;
 
+	/* Avoid the unnecessary cmpset operation below, which is also
+	 * potentially harmful when n equals 0. */
+	if (n == 0)
+		return 0;
+
 	/* move cons.head atomically */
 	do {
 		/* Restore n as it may change every loop */