[dpdk-dev] timer: fix list breakage when timer_cb resets a running timer

Message ID 1468778880-25515-1-git-send-email-h.mikita89@gmail.com (mailing list archive)
State Accepted, archived

Commit Message

Hiroyuki Mikita July 17, 2016, 6:08 p.m. UTC
When a timer_cb resets another running timer on the same lcore,
the list of expired timers is chained to the pending list.
This commit prevents a running timer from being reset
by anything other than its own timer_cb.

Signed-off-by: Hiroyuki Mikita <h.mikita89@gmail.com>
---
 lib/librte_timer/rte_timer.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)
  

Comments

Sanford, Robert July 20, 2016, 8:17 p.m. UTC | #1
Hi Hiroyuki,

I am reviewing your 3 timer patches.
Can you please explain in more detail the use case that results in the
problem?

For example, is it when timer A's callback tries to reset
(rte_timer_reset) timer B?
If yes, is timer B in PENDING state and likely to expire soon?

--
Thanks,
Robert


On 7/17/16 2:08 PM, "Hiroyuki Mikita" <h.mikita89@gmail.com> wrote:

>When a timer_cb resets another running timer on the same lcore,
>the list of expired timers is chained to the pending list.
>This commit prevents a running timer from being reset
>by anything other than its own timer_cb.
>
>Signed-off-by: Hiroyuki Mikita <h.mikita89@gmail.com>
>---
> lib/librte_timer/rte_timer.c | 12 ++++++++++--
> 1 file changed, 10 insertions(+), 2 deletions(-)
>
>diff --git a/lib/librte_timer/rte_timer.c b/lib/librte_timer/rte_timer.c
>index 3dcdab5..00e80c9 100644
>--- a/lib/librte_timer/rte_timer.c
>+++ b/lib/librte_timer/rte_timer.c
>@@ -69,6 +69,9 @@ struct priv_timer {
> 
> 	unsigned prev_lcore;              /**< used for lcore round robin */
> 
>+	/** running timer on this lcore now */
>+	struct rte_timer *running_tim;
>+
> #ifdef RTE_LIBRTE_TIMER_DEBUG
> 	/** per-lcore statistics */
> 	struct rte_timer_debug_stats stats;
>@@ -135,9 +138,12 @@ timer_set_config_state(struct rte_timer *tim,
> 	while (success == 0) {
> 		prev_status.u32 = tim->status.u32;
> 
>-		/* timer is running on another core, exit */
>+		/* timer is running on another core
>+		 * or ready to run on local core, exit
>+		 */
> 		if (prev_status.state == RTE_TIMER_RUNNING &&
>-		    prev_status.owner != (uint16_t)lcore_id)
>+		    (prev_status.owner != (uint16_t)lcore_id ||
>+		     tim != priv_timer[lcore_id].running_tim))
> 			return -1;
> 
> 		/* timer is being configured on another core */
>@@ -580,6 +586,7 @@ void rte_timer_manage(void)
> 	for (tim = run_first_tim; tim != NULL; tim = next_tim) {
> 		next_tim = tim->sl_next[0];
> 		priv_timer[lcore_id].updated = 0;
>+		priv_timer[lcore_id].running_tim = tim;
> 
> 		/* execute callback function with list unlocked */
> 		tim->f(tim, tim->arg);
>@@ -610,6 +617,7 @@ void rte_timer_manage(void)
> 			rte_spinlock_unlock(&priv_timer[lcore_id].list_lock);
> 		}
> 	}
>+	priv_timer[lcore_id].running_tim = NULL;
> }
> 
> /* dump statistics about timers */
>-- 
>2.7.4
>
  
Hiroyuki Mikita July 21, 2016, 10:34 a.m. UTC | #2
Hi Robert,

Thank you for reviewing.

In the following case, the skip list is broken:
- Timer A and timer B are configured on the same lcore, in the same
pending list.
- The expiry time of timer A is earlier than that of timer B.
- rte_timer_manage() is called on the lcore after the expiry time of timer B.
- The callback of timer A resets timer B.

In that rte_timer_manage() pass, both timers have already expired, so
the running list includes both of them.
Both timers transition from PENDING to RUNNING, and then their
callbacks are executed sequentially.
The callback of timer A resets timer B, which is still RUNNING, so
timer B is added to the pending list.
At this point, timer B is in both the running list and the pending
list, which means the running list is chained to the pending list.
Both lists are broken.
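
For illustration, here is a minimal sketch of this pattern. The timer
names, callbacks, and tick values are hypothetical and chosen only so
that both timers are already overdue when rte_timer_manage() runs
(16.07-era rte_timer API assumed):

    #include <rte_timer.h>
    #include <rte_lcore.h>
    #include <rte_cycles.h>

    static struct rte_timer tim_a;
    static struct rte_timer tim_b;

    static void
    timer_b_cb(struct rte_timer *tim, void *arg)
    {
            /* work for timer B */
    }

    static void
    timer_a_cb(struct rte_timer *tim, void *arg)
    {
            /* tim_b expired in the same rte_timer_manage() pass, so it
             * is still in RUNNING state here.  Without the fix, this
             * reset links tim_b into the pending list while it is still
             * on the running list, corrupting both lists. */
            rte_timer_reset(&tim_b, rte_get_timer_hz(), SINGLE,
                            rte_lcore_id(), timer_b_cb, NULL);
    }

    static void
    setup_and_run(void)
    {
            rte_timer_subsystem_init();
            rte_timer_init(&tim_a);
            rte_timer_init(&tim_b);

            /* A expires one cycle before B; both are overdue by the
             * time rte_timer_manage() is called, so both land on the
             * same running list. */
            rte_timer_reset(&tim_a, 1, SINGLE, rte_lcore_id(),
                            timer_a_cb, NULL);
            rte_timer_reset(&tim_b, 2, SINGLE, rte_lcore_id(),
                            timer_b_cb, NULL);

            rte_delay_us(10);    /* let both expiry times pass */
            rte_timer_manage();  /* runs timer_a_cb, then timer_b_cb */
    }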

Regards,
Hiroyuki
  
Sanford, Robert July 22, 2016, 10:14 p.m. UTC | #3
On 7/17/16 2:08 PM, "Hiroyuki Mikita" <h.mikita89@gmail.com> wrote:

>When a timer_cb resets another running timer on the same lcore,
>the list of expired timers is chained to the pending list.
>This commit prevents a running timer from being reset
>by anything other than its own timer_cb.
>
>Signed-off-by: Hiroyuki Mikita <h.mikita89@gmail.com>
>---
> lib/librte_timer/rte_timer.c | 12 ++++++++++--
> 1 file changed, 10 insertions(+), 2 deletions(-)
>
>diff --git a/lib/librte_timer/rte_timer.c b/lib/librte_timer/rte_timer.c
>index 3dcdab5..00e80c9 100644
>--- a/lib/librte_timer/rte_timer.c
>+++ b/lib/librte_timer/rte_timer.c
>@@ -69,6 +69,9 @@ struct priv_timer {
> 
> 	unsigned prev_lcore;              /**< used for lcore round robin */
> 
>+	/** running timer on this lcore now */
>+	struct rte_timer *running_tim;
>+
> #ifdef RTE_LIBRTE_TIMER_DEBUG
> 	/** per-lcore statistics */
> 	struct rte_timer_debug_stats stats;
>@@ -135,9 +138,12 @@ timer_set_config_state(struct rte_timer *tim,
> 	while (success == 0) {
> 		prev_status.u32 = tim->status.u32;
> 
>-		/* timer is running on another core, exit */
>+		/* timer is running on another core
>+		 * or ready to run on local core, exit
>+		 */
> 		if (prev_status.state == RTE_TIMER_RUNNING &&
>-		    prev_status.owner != (uint16_t)lcore_id)
>+		    (prev_status.owner != (uint16_t)lcore_id ||
>+		     tim != priv_timer[lcore_id].running_tim))
> 			return -1;
> 
> 		/* timer is being configured on another core */
>@@ -580,6 +586,7 @@ void rte_timer_manage(void)
> 	for (tim = run_first_tim; tim != NULL; tim = next_tim) {
> 		next_tim = tim->sl_next[0];
> 		priv_timer[lcore_id].updated = 0;
>+		priv_timer[lcore_id].running_tim = tim;
> 
> 		/* execute callback function with list unlocked */
> 		tim->f(tim, tim->arg);
>@@ -610,6 +617,7 @@ void rte_timer_manage(void)
> 			rte_spinlock_unlock(&priv_timer[lcore_id].list_lock);
> 		}
> 	}
>+	priv_timer[lcore_id].running_tim = NULL;
> }
> 
> /* dump statistics about timers */
>-- 
>2.7.4
>

Acked-by: Robert Sanford <rsanford@akamai.com>


I tested the three timer patches with app/test timer_autotest and
timer_racecond_autotest, and additional private tests.
  
Thomas Monjalon July 23, 2016, 8:49 a.m. UTC | #4
2016-07-23 0:14 GMT+02:00 Sanford, Robert <rsanford@akamai.com>:
> Acked-by: Robert Sanford <rsanford@akamai.com>
>
> I tested the three timer patches with app/test timer_autotest and
> timer_racecond_autotest, and additional private tests.

Thanks Robert.
Are you confident enough to integrate them in the last days of 16.07?
How critical are they?
Should we make an RC5 for them?
  
Sanford, Robert July 25, 2016, 12:29 p.m. UTC | #5
On 7/23/16 4:49 AM, "Thomas Monjalon" <thomas.monjalon@6wind.com> wrote:

>2016-07-23 0:14 GMT+02:00 Sanford, Robert <rsanford@akamai.com>:
>> Acked-by: Robert Sanford <rsanford@akamai.com>
>>
>> I tested the three timer patches with app/test timer_autotest and
>> timer_racecond_autotest, and additional private tests.
>
>Thanks Robert.
>Are you confident enough to integrate them in the last days of 16.07?
>How critical are they?
>Should we make an RC5 for them?

Yes, I'm confident that the patches are safe and correct.
However, I'm not sure that we should make an RC just for them.

Patch 1 fixes a problem where we incorrectly lower the depth of the skip
list, resulting in performance that does not live up to the O(log n) we
expect. Summary: performance degradation with a large number of timers.

Patch 2 fixes a minor inefficiency when timer_manage() races with
timer_stop() or timer_reset().

Patch 3 fixes the most serious problem: we may corrupt the timer list
links if multiple timers expire at roughly the same time and one of
those timers' callbacks tries to stop/reset other(s) scheduled to run in
the same call to timer_manage().
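
A side effect of the fix is that, in this window, rte_timer_reset() now
returns -1 instead of corrupting the lists, so a callback that needs to
reset a sibling timer can check the return value. A minimal sketch
follows; the names and the deferred-retry policy are assumptions, not
part of the patch:

    #include <rte_timer.h>
    #include <rte_lcore.h>
    #include <rte_cycles.h>

    static void sibling_cb(struct rte_timer *tim, void *arg);

    static void
    timer_a_cb(struct rte_timer *tim, void *arg)
    {
            struct rte_timer *sibling = arg;

            /* With the fix, this reset fails with -1 while the sibling
             * is RUNNING in the same rte_timer_manage() pass, rather
             * than silently chaining the running list to the pending
             * list. */
            if (rte_timer_reset(sibling, rte_get_timer_hz(), SINGLE,
                                rte_lcore_id(), sibling_cb, NULL) < 0) {
                    /* the sibling is running or being configured
                     * elsewhere; e.g. record it and retry after
                     * rte_timer_manage() returns */
            }
    }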

Question for Hiroyuki:
How did you discover the timer linked-list corruption? By code
inspection, or do you have a real application that needs that
functionality (timers that stop/reset each other at roughly the same
time)?

Regards,
Robert
  
Hiroyuki Mikita July 25, 2016, 2:21 p.m. UTC | #6
Hi Robert,

My application had timers that reset another timer, and it sometimes
did not work.
A profile taken with the 'perf' command showed timer_add occupying
99.9% of the CPU; it appeared that an infinite loop was occurring in
rte_timer.
I inspected the timer code and found that the main cause was the bug
fixed by patch 3. In the process, I also found the other bugs.

Regards,
Hiroyuki
  
Thomas Monjalon July 25, 2016, 2:43 p.m. UTC | #7
Hiroyuki, Robert,
I would like to apply these patches quickly.
Could you please provide "Fixes:" lines so we know the origin
of the bugs?
Thanks
  
Hiroyuki Mikita July 25, 2016, 3:14 p.m. UTC | #8
Fixes: a4b7a5a45cf5 ("timer: fix race condition")

2016-07-25 23:43 GMT+09:00 Thomas Monjalon <thomas.monjalon@6wind.com>:
> Hiroyuki, Robert,
> I would like to apply these patches quickly.
> Could you please provide "Fixes:" lines so we know the origin
> of the bugs?
> Thanks
>
  
Thomas Monjalon July 25, 2016, 3:21 p.m. UTC | #9
2016-07-26 00:14, Hiroyuki Mikita:
> Fixes: a4b7a5a45cf5 ("timer: fix race condition")
> 
> 2016-07-25 23:43 GMT+09:00 Thomas Monjalon <thomas.monjalon@6wind.com>:
> > Hiroyuki, Robert,
> > I would like to apply these patches quickly.
> > Could you please provide "Fixes:" lines so we know the origin
> > of the bugs?
> > Thanks

Thanks a lot, Hiroyuki
  
Thomas Monjalon July 25, 2016, 3:53 p.m. UTC | #10
> >When a timer_cb resets another running timer on the same lcore,
> >the list of expired timers is chained to the pending list.
> >This commit prevents a running timer from being reset
> >by anything other than its own timer_cb.
> >
> >Signed-off-by: Hiroyuki Mikita <h.mikita89@gmail.com>
> 
> Acked-by: Robert Sanford <rsanford@akamai.com>
> 
> I tested the three timer patches with app/test timer_autotest and
> timer_racecond_autotest, and additional private tests.

Applied, thanks
  

Patch

diff --git a/lib/librte_timer/rte_timer.c b/lib/librte_timer/rte_timer.c
index 3dcdab5..00e80c9 100644
--- a/lib/librte_timer/rte_timer.c
+++ b/lib/librte_timer/rte_timer.c
@@ -69,6 +69,9 @@  struct priv_timer {
 
 	unsigned prev_lcore;              /**< used for lcore round robin */
 
+	/** running timer on this lcore now */
+	struct rte_timer *running_tim;
+
 #ifdef RTE_LIBRTE_TIMER_DEBUG
 	/** per-lcore statistics */
 	struct rte_timer_debug_stats stats;
@@ -135,9 +138,12 @@  timer_set_config_state(struct rte_timer *tim,
 	while (success == 0) {
 		prev_status.u32 = tim->status.u32;
 
-		/* timer is running on another core, exit */
+		/* timer is running on another core
+		 * or ready to run on local core, exit
+		 */
 		if (prev_status.state == RTE_TIMER_RUNNING &&
-		    prev_status.owner != (uint16_t)lcore_id)
+		    (prev_status.owner != (uint16_t)lcore_id ||
+		     tim != priv_timer[lcore_id].running_tim))
 			return -1;
 
 		/* timer is being configured on another core */
@@ -580,6 +586,7 @@  void rte_timer_manage(void)
 	for (tim = run_first_tim; tim != NULL; tim = next_tim) {
 		next_tim = tim->sl_next[0];
 		priv_timer[lcore_id].updated = 0;
+		priv_timer[lcore_id].running_tim = tim;
 
 		/* execute callback function with list unlocked */
 		tim->f(tim, tim->arg);
@@ -610,6 +617,7 @@  void rte_timer_manage(void)
 			rte_spinlock_unlock(&priv_timer[lcore_id].list_lock);
 		}
 	}
+	priv_timer[lcore_id].running_tim = NULL;
 }
 
 /* dump statistics about timers */