[dpdk-dev,v6] eal: add function to check if primary proc alive
Commit Message
This patch adds a new function to the EAL API:
int rte_eal_primary_proc_alive(const char *path);
The function indicates whether a primary process is currently
alive. It is implemented by testing for a write-lock on the
primary's config file.
The use case is a secondary process that waits for a primary
process to start by polling the function. Once the primary is
running, the secondary keeps polling so that it can detect
when the primary process has quit unexpectedly.
The RTE_MAGIC number is written to the shared config by the
primary process; this is the signal to the secondary process
that the EAL is set up and ready to be used. The function
rte_eal_mcfg_complete() writes RTE_MAGIC. This call has been
delayed in the EAL init procedure, as the PCI probing in
the primary process can interfere with the secondary running.
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Maryam Tahhan <maryam.tahhan@intel.com>
---
v6:
- Fix license header
v5:
- Renamed returns in doc from words to digits
- Fixed line spacing in docs
- Fixed line spacing in EAL header
- Rebased to master (Makefile conflicts)
v4:
- Rebased to git head (2.3 -> 16.04 changes)
v3:
- Fixed Copyright years
v2:
- Passing NULL as const char* uses default /var/run/.rte_config
- Moved code into /common/ instead of /linuxapp/, should work on BSD now
---
doc/guides/rel_notes/release_16_04.rst | 8 ++++
lib/librte_eal/bsdapp/eal/Makefile | 1 +
lib/librte_eal/bsdapp/eal/rte_eal_version.map | 1 +
lib/librte_eal/common/eal_common_proc.c | 61 +++++++++++++++++++++++++
lib/librte_eal/common/include/rte_eal.h | 20 +++++++-
lib/librte_eal/linuxapp/eal/Makefile | 3 +-
lib/librte_eal/linuxapp/eal/eal.c | 6 +--
lib/librte_eal/linuxapp/eal/rte_eal_version.map | 1 +
8 files changed, 96 insertions(+), 5 deletions(-)
create mode 100644 lib/librte_eal/common/eal_common_proc.c
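The two polling phases described in the commit message (wait for the primary to appear, then keep watching for it to disappear) could look roughly like the sketch below. It is not part of the patch. The names alive_fn, monitor_primary and fake_check are hypothetical, and the liveness check is passed in as a callback so that the loop can be exercised without a running primary. In a real secondary the callback would presumably wrap rte_eal_primary_proc_alive().

```c
#define _DEFAULT_SOURCE /* for usleep() under strict -std= modes */
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical callback type: returns 1 if the primary is alive, else 0.
 * A real application would wrap rte_eal_primary_proc_alive() here. */
typedef int (*alive_fn)(const char *path);

enum monitor_result {
	MONITOR_NEVER_STARTED,  /* primary did not appear within max_polls */
	MONITOR_PRIMARY_EXITED, /* primary appeared, then went away */
	MONITOR_STILL_RUNNING   /* primary still alive when polls ran out */
};

enum monitor_result
monitor_primary(alive_fn check, const char *path,
		unsigned int max_polls, unsigned int interval_us)
{
	unsigned int polls = 0;

	assert(check != NULL);

	/* Phase 1: wait until a primary process shows up. */
	while (!check(path)) {
		if (++polls >= max_polls)
			return MONITOR_NEVER_STARTED;
		usleep(interval_us);
	}

	/* Phase 2: keep polling to notice an unexpected exit. */
	while (polls < max_polls) {
		if (!check(path))
			return MONITOR_PRIMARY_EXITED;
		polls++;
		usleep(interval_us);
	}
	return MONITOR_STILL_RUNNING;
}

/* Stand-in check for demonstration only: reports "alive" on the
 * third, fourth and fifth call, and "dead" otherwise. */
unsigned int fake_calls;

int
fake_check(const char *path)
{
	(void)path;
	fake_calls++;
	return fake_calls >= 3 && fake_calls <= 5;
}
```

Calling monitor_primary(fake_check, NULL, 10, 0) with fake_calls reset to 0 walks through both phases and reports that the primary exited. Bounding the loop with max_polls is a choice made for this sketch; a monitoring daemon would more likely poll indefinitely.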
Comments
Hello Harry,
On Mon, Mar 7, 2016 at 1:02 PM, Harry van Haaren
<harry.van.haaren@intel.com> wrote:
> This patch adds a new function to the EAL API:
> int rte_eal_primary_proc_alive(const char *path);
>
> The function indicates if a primary process is alive right now.
> This functionality is implemented by testing for a write-
> lock on the config file, and the function tests for a lock.
>
> The use case for this functionality is that a secondary
> process can wait until a primary process starts by polling
> the function and waiting. When the primary is running, the
> secondary continues to poll to detect if the primary process
> has quit unexpectedly, the secondary process can detect this.
>
> The RTE_MAGIC number is written to the shared config by the
> primary process, this is the signal to the secondary process
> that the EAL is set up, and ready to be used. The function
> rte_eal_mcfg_complete() writes RTE_MAGIC. This has been
> delayed in the EAL init procedure, as the PCI probing in
> the primary process can interfere with the secondary running.
Well, this sounds odd.
There might be an issue, but I can't see it at the moment.
When I look at this new api, I am under the impression that you are
supposed to check for primary liveliness once dpdk init has finished
(from your secondary process point of view), not before and not while
it is initialising.
Why do you need to move this ?
> diff --git a/lib/librte_eal/common/eal_common_proc.c b/lib/librte_eal/common/eal_common_proc.c
> new file mode 100644
> index 0000000..12e0fca
> --- /dev/null
> +++ b/lib/librte_eal/common/eal_common_proc.c
[snip]
> +int
> +rte_eal_primary_proc_alive(const char *config_file_path)
> +{
> + int config_fd;
> +
> + if (config_file_path)
> + config_fd = open(config_file_path, O_RDONLY);
> + else {
> + char default_path[PATH_MAX+1];
> + snprintf(default_path, PATH_MAX, RUNTIME_CONFIG_FMT,
> + default_config_dir, "rte");
> + config_fd = open(default_path, O_RDONLY);
Can't you reuse eal_runtime_config_path() here ?
Hi David,
> From: David Marchand [mailto:david.marchand@6wind.com]
> Subject: Re: [PATCH v6] eal: add function to check if primary proc alive
> When I look at this new api, I am under the impression that you are
> supposed to check for primary liveliness once dpdk init has finished
> (from your secondary process point of view), not before and not while
> it is initialising.
The issue is that if a secondary process is initialized, it holds a read
lock on /var/run/.rte_config and this prevents a primary from starting.
So we *must* detect a primary process being ready to attach to, *without*
having called rte_eal_init() in the secondary process.
> Why do you need to move this ?
Issues arise when a primary and a secondary process both scan the PCI
devices at the same time. Moving rte_eal_mcfg_complete() solves this race
condition because the secondary process will wait until the primary is finished.
> > + if (config_file_path)
> > + config_fd = open(config_file_path, O_RDONLY);
> > + else {
> > + char default_path[PATH_MAX+1];
> > + snprintf(default_path, PATH_MAX, RUNTIME_CONFIG_FMT,
> > + default_config_dir, "rte");
> > + config_fd = open(default_path, O_RDONLY);
>
> Can't you reuse eal_runtime_config_path() here ?
No, for the same reason as above: rte_eal_init() has not been called,
so the shared config that is read by eal_runtime_config_path() has not
been initialized.
-Harry
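To illustrate the point about building the default path without rte_eal_init(), here is a minimal standalone sketch. The format string and directory are assumptions chosen to produce the /var/run/.rte_config default mentioned in the v2 changelog; the real values are RUNTIME_CONFIG_FMT and default_config_dir in the EAL headers. build_config_path and the SKETCH_* macros are hypothetical names.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Assumed stand-ins for RUNTIME_CONFIG_FMT and default_config_dir;
 * check eal_filesystem.h and eal_internal_cfg.h for the real values. */
#define SKETCH_CONFIG_FMT "%s/.%s_config"
#define SKETCH_CONFIG_DIR "/var/run"

/* Build the config file path from constants only, with no EAL state.
 * Returns 0 on success, -1 if the prefix is empty or the path does
 * not fit in buf. */
int
build_config_path(char *buf, size_t len, const char *dir, const char *prefix)
{
	int n;

	assert(buf != NULL && dir != NULL && prefix != NULL);
	if (strlen(prefix) == 0)
		return -1;
	n = snprintf(buf, len, SKETCH_CONFIG_FMT, dir, prefix);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

Unlike the patch, this sketch checks the snprintf() return value so that a truncated path is reported rather than opened.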
2016-03-08 09:58, Van Haaren, Harry:
> From: David Marchand [mailto:david.marchand@6wind.com]
> > When I look at this new api, I am under the impression that you are
> > supposed to check for primary liveliness once dpdk init has finished
> > (from your secondary process point of view), not before and not while
> > it is initialising.
>
> The issue is that if a secondary process is initialized, it holds a read
> lock on /var/run/.rte_config and this prevents a primary from starting.
The new function is advertised as a monitoring feature.
But it seems to be also a workaround for an ordering issue when starting
primary and secondary processes concurrently, right?
On Tue, Mar 8, 2016 at 12:13 PM, Thomas Monjalon
<thomas.monjalon@6wind.com> wrote:
> 2016-03-08 09:58, Van Haaren, Harry:
>> From: David Marchand [mailto:david.marchand@6wind.com]
>> > When I look at this new api, I am under the impression that you are
>> > supposed to check for primary liveliness once dpdk init has finished
>> > (from your secondary process point of view), not before and not while
>> > it is initialising.
>>
>> The issue is that if a secondary process is initialized, it holds a read
>> lock on /var/run/.rte_config and this prevents a primary from starting.
>
> The new function is advertised as a monitoring feature.
> But it seems to be also a workaround for an ordering issue when starting
> primary and secondary processes concurrently, right?
+1
> From: David Marchand [mailto:david.marchand@6wind.com]
> >> The issue is that if a secondary process is initialized, it holds a read
> >> lock on /var/run/.rte_config and this prevents a primary from starting.
> >
> > The new function is advertised as a monitoring feature.
> > But it seems to be also a workaround for an ordering issue when starting
> > primary and secondary processes concurrently, right?
>
> +1
You are correct, the function rte_eal_primary_proc_alive() added here is
for monitoring if there is a primary process alive.
The rte_eal_mcfg_complete() function call in rte_eal_init() is delayed
to avoid a race-condition between secondary and primary processes.
This race-condition occurs when two processes probe the PCI devices
at the same time.
Delaying the rte_eal_mcfg_complete() call until after the primary has
finished rte_eal_pci_probe() ensures that this race condition is avoided.
-Harry
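The ordering argument above can be modelled outside DPDK. The sketch below is only an illustration under stated assumptions: a forked child stands in for the primary, a shared anonymous mapping stands in for the shared config, and SKETCH_MAGIC is an arbitrary stand-in for RTE_MAGIC. All names are hypothetical. The point it shows is that if the magic value is published last, a secondary that waits for it can never observe an unfinished probe.

```c
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS and usleep() */
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

#define SKETCH_MAGIC 0x5ca1ab1eu /* arbitrary stand-in for RTE_MAGIC */

struct sketch_mem_config {
	_Atomic uint32_t magic; /* written last by the primary */
	int probe_done;         /* stands in for "PCI probe finished" */
};

/* Primary side: finish probing first, publish the magic value last. */
static void
primary_init(struct sketch_mem_config *cfg)
{
	assert(cfg != NULL);
	usleep(10000); /* pretend PCI probing takes a while */
	cfg->probe_done = 1;
	atomic_store(&cfg->magic, SKETCH_MAGIC);
}

/* Secondary side: wait for the magic value, then report whether the
 * primary's probe had completed at that point. */
static int
secondary_wait(struct sketch_mem_config *cfg)
{
	while (atomic_load(&cfg->magic) != SKETCH_MAGIC)
		usleep(1000);
	return cfg->probe_done;
}

/* Returns what the secondary saw (1 = probe finished), or -1 on error. */
int
run_ordering_demo(void)
{
	struct sketch_mem_config *cfg;
	pid_t pid;
	int seen;

	/* Shared, zero-filled memory visible to both processes. */
	cfg = mmap(NULL, sizeof(*cfg), PROT_READ | PROT_WRITE,
		   MAP_SHARED | MAP_ANONYMOUS, -1, 0);
	if (cfg == MAP_FAILED)
		return -1;

	pid = fork();
	if (pid < 0) {
		munmap(cfg, sizeof(*cfg));
		return -1;
	}
	if (pid == 0) {
		primary_init(cfg); /* child plays the primary */
		_exit(0);
	}
	seen = secondary_wait(cfg); /* parent plays the secondary */
	waitpid(pid, NULL, 0);
	munmap(cfg, sizeof(*cfg));
	return seen;
}
```

If the two stores in primary_init() were swapped, which corresponds to calling rte_eal_mcfg_complete() before the probe, the secondary could return 0 here.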
On Tue, Mar 8, 2016 at 2:57 PM, Van Haaren, Harry
<harry.van.haaren@intel.com> wrote:
>> From: David Marchand [mailto:david.marchand@6wind.com]
>> >> The issue is that if a secondary process is initialized, it holds a read
>> >> lock on /var/run/.rte_config and this prevents a primary from starting.
>> >
>> > The new function is advertised as a monitoring feature.
>> > But it seems to be also a workaround for an ordering issue when starting
>> > primary and secondary processes concurrently, right?
>>
>> +1
>
> You are correct, the function rte_eal_primary_proc_alive() added here is
> for monitoring if there is a primary process alive.
>
> The rte_eal_mcfg_complete() function call in rte_eal_init() is delayed
> to avoid a race-condition between secondary and primary processes.
> This race-condition occurs when two processes probe the PCI devices
> at the same time.
>
> Delaying the rte_eal_mcfg_complete() call until after the primary has
> finished rte_eal_pci_probe() ensures that this race condition is avoided.
Then, those are two different things.
Can you split this into two patches: one for the fix and one for the
new function ?
CCing sergio, who is the multi process maintainer.
Thanks.
The first patch of this patchset contains a fix for EAL PCI probing,
to avoid a race-condition where a primary and secondary probe PCI
devices at the same time.
The second patch adds a function that can be polled by a process to
detect if a DPDK primary process is alive. This function does not
rely on rte_eal_init(), because initializing the EAL in the polling
process would stop a primary from starting.
The functionality provided by this patch is very useful for providing
additional services to DPDK primary applications such as monitoring
statistics and performing fault detection.
Harry van Haaren (2):
eal: fix race-condition in pri/sec proc startup
eal: add function to check if primary proc alive
doc/guides/rel_notes/release_16_04.rst | 8 ++++++++
lib/librte_eal/bsdapp/eal/Makefile | 1 +
lib/librte_eal/bsdapp/eal/eal.c | 6 +++---
lib/librte_eal/bsdapp/eal/rte_eal_version.map | 1 +
lib/librte_eal/common/include/rte_eal.h | 20 +++++++++++++++++++-
lib/librte_eal/linuxapp/eal/Makefile | 3 ++-
lib/librte_eal/linuxapp/eal/eal.c | 6 +++---
lib/librte_eal/linuxapp/eal/rte_eal_version.map | 1 +
8 files changed, 38 insertions(+), 8 deletions(-)
The first patch of this patchset contains a fix for EAL PCI probing,
to avoid a race-condition where a primary and secondary probe PCI
devices at the same time.
The second patch adds a function that can be polled by a process to
detect if a DPDK primary process is alive. This function does not
rely on rte_eal_init(), because initializing the EAL in the polling
process would stop a primary from starting.
The functionality provided by this patch is very useful for providing
additional services to DPDK primary applications such as monitoring
statistics and performing fault detection.
v8:
- include implementation of function (got lost in v7)
v7:
- split patch into two, one for eal fix, one for adding functionality
v6:
- Fix license header
v5:
- Renamed returns in doc from words to digits
- Fixed line spacing in docs
- Fixed line spacing in EAL header
- Rebased to master (Makefile conflicts)
v4:
- Rebased to git head (2.3 -> 16.04 changes)
v3:
- Fixed Copyright years
v2:
- Passing NULL as const char* uses default /var/run/.rte_config
- Moved co
Harry van Haaren (2):
eal: fix race-condition in pri/sec proc startup
eal: add function to check if primary proc alive
doc/guides/rel_notes/release_16_04.rst | 8 ++++
lib/librte_eal/bsdapp/eal/Makefile | 1 +
lib/librte_eal/bsdapp/eal/eal.c | 6 +--
lib/librte_eal/bsdapp/eal/rte_eal_version.map | 1 +
lib/librte_eal/common/eal_common_proc.c | 61 +++++++++++++++++++++++++
lib/librte_eal/common/include/rte_eal.h | 20 +++++++-
lib/librte_eal/linuxapp/eal/Makefile | 3 +-
lib/librte_eal/linuxapp/eal/eal.c | 6 +--
lib/librte_eal/linuxapp/eal/rte_eal_version.map | 1 +
9 files changed, 99 insertions(+), 8 deletions(-)
create mode 100644 lib/librte_eal/common/eal_common_proc.c
On Wed, Mar 9, 2016 at 11:12 AM, Harry van Haaren
<harry.van.haaren@intel.com> wrote:
>
> The first patch of this patchset contains a fix for EAL PCI probing,
> to avoid a race-condition where a primary and secondary probe PCI
> devices at the same time.
>
> The second patch adds a function that can be polled by a process to
> detect if a DPDK primary process is alive. This function does not
> rely on rte_eal_init(), as this uses the EAL and thus stops a
> primary from starting.
>
> The functionality provided by this patch is very useful for providing
> additional services to DPDK primary applications such as monitoring
> statistics and performing fault detection.
Sergio, please can you have a look at this patchset ?
Thanks.
On 09/03/2016 11:07, David Marchand wrote:
> On Wed, Mar 9, 2016 at 11:12 AM, Harry van Haaren
> <harry.van.haaren@intel.com> wrote:
>> The first patch of this patchset contains a fix for EAL PCI probing,
>> to avoid a race-condition where a primary and secondary probe PCI
>> devices at the same time.
>>
>> The second patch adds a function that can be polled by a process to
>> detect if a DPDK primary process is alive. This function does not
>> rely on rte_eal_init(), as this uses the EAL and thus stops a
>> primary from starting.
>>
>> The functionality provided by this patch is very useful for providing
>> additional services to DPDK primary applications such as monitoring
>> statistics and performing fault detection.
> Sergio, please can you have a look at this patchset ?
Yes, will do.
Sergio
> Thanks.
>
The first patch of this patchset contains a fix for EAL PCI probing,
to avoid a race-condition where a primary and secondary probe PCI
devices at the same time.
The second patch adds a function that can be polled by a process to
detect if a DPDK primary process is alive. This function does not
rely on rte_eal_init(), because initializing the EAL in the polling
process would stop a primary from starting.
The functionality provided by this patch is very useful for providing
additional services to DPDK primary applications such as monitoring
statistics and performing fault detection.
v9:
- Improve commit message for EAL fix
v8:
- include implementation of function (got lost in v7)
v7:
- split patch into two, one for eal fix, one for adding functionality
v6:
- Fix license header
v5:
- Renamed returns in doc from words to digits
- Fixed line spacing in docs
- Fixed line spacing in EAL header
- Rebased to master (Makefile conflicts)
v4:
- Rebased to git head (2.3 -> 16.04 changes)
v3:
- Fixed Copyright years
v2:
- Passing NULL as const char* uses default /var/run/.rte_config
- Moved co
Harry van Haaren (2):
eal: fix race-condition in pri/sec proc startup
eal: add function to check if primary proc alive
doc/guides/rel_notes/release_16_04.rst | 8 ++++
lib/librte_eal/bsdapp/eal/Makefile | 1 +
lib/librte_eal/bsdapp/eal/eal.c | 6 +--
lib/librte_eal/bsdapp/eal/rte_eal_version.map | 1 +
lib/librte_eal/common/eal_common_proc.c | 61 +++++++++++++++++++++++++
lib/librte_eal/common/include/rte_eal.h | 20 +++++++-
lib/librte_eal/linuxapp/eal/Makefile | 3 +-
lib/librte_eal/linuxapp/eal/eal.c | 6 +--
lib/librte_eal/linuxapp/eal/rte_eal_version.map | 1 +
9 files changed, 99 insertions(+), 8 deletions(-)
create mode 100644 lib/librte_eal/common/eal_common_proc.c
2016-03-09 13:37, Harry van Haaren:
> The first patch of this patchset contains a fix for EAL PCI probing,
> to avoid a race-condition where a primary and secondary probe PCI
> devices at the same time.
>
> The second patch adds a function that can be polled by a process to
> detect if a DPDK primary process is alive. This function does not
> rely on rte_eal_init(), as this uses the EAL and thus stops a
> primary from starting.
>
> The functionality provided by this patch is very useful for providing
> additional services to DPDK primary applications such as monitoring
> statistics and performing fault detection.
Applied, thanks
@@ -74,6 +74,14 @@ EAL
~~~
+* **Added rte_eal_primary_proc_alive() function**
+
+ A new function ``rte_eal_primary_proc_alive()`` has been added
+ to allow the user to detect if a primary process is running.
+ Use cases for this feature include fault detection, and monitoring
+ using secondary processes.
+
+
Drivers
~~~~~~~
@@ -79,6 +79,7 @@ SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_common_devargs.c
SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_common_dev.c
SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_common_options.c
SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_common_thread.c
+SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_common_proc.c
SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += rte_malloc.c
SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += malloc_elem.c
SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += malloc_heap.c
@@ -148,5 +148,6 @@ DPDK_16.04 {
rte_eal_pci_ioport_write;
rte_eal_pci_map_device;
rte_eal_pci_unmap_device;
+ rte_eal_primary_proc_alive;
} DPDK_2.2;
new file mode 100644
@@ -0,0 +1,61 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright(c) 2016 Intel Corporation. All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdio.h>
+#include <fcntl.h>
+#include <stdlib.h>
+#include <rte_eal.h>
+
+#include "eal_filesystem.h"
+#include "eal_internal_cfg.h"
+
+int
+rte_eal_primary_proc_alive(const char *config_file_path)
+{
+ int config_fd;
+
+ if (config_file_path)
+ config_fd = open(config_file_path, O_RDONLY);
+ else {
+ char default_path[PATH_MAX+1];
+ snprintf(default_path, PATH_MAX, RUNTIME_CONFIG_FMT,
+ default_config_dir, "rte");
+ config_fd = open(default_path, O_RDONLY);
+ }
+ if (config_fd < 0)
+ return 0;
+
+ int ret = lockf(config_fd, F_TEST, 0);
+ close(config_fd);
+
+ return !!ret;
+}
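For readers who want to see the lockf(F_TEST) behaviour that the function above relies on, here is a standalone sketch that does not need DPDK. It assumes the primary holds a write lock on its config file, as the commit message states. A temporary file stands in for the config file and a forked child stands in for the primary. file_locked_by_other and run_lock_demo are hypothetical names.

```c
#define _DEFAULT_SOURCE /* for lockf() and mkstemp() under strict -std= modes */
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

/* Same test as the patch, on an arbitrary path:
 * 1 if another process holds a conflicting lock, 0 otherwise. */
int
file_locked_by_other(const char *path)
{
	int fd, ret;

	assert(path != NULL);
	fd = open(path, O_RDONLY);
	if (fd < 0)
		return 0;
	ret = lockf(fd, F_TEST, 0);
	close(fd);
	return !!ret;
}

/* Fills results[] with the check outcome before the child locks the file,
 * while it holds the lock, and after it has exited. Returns 0 on success. */
int
run_lock_demo(int results[3])
{
	char path[] = "/tmp/lock_sketch_XXXXXX";
	int ready[2], done[2];
	char byte = 0;
	pid_t pid;
	int fd = mkstemp(path);

	if (fd < 0)
		return -1;
	close(fd);
	if (pipe(ready) < 0 || pipe(done) < 0) {
		unlink(path);
		return -1;
	}
	results[0] = file_locked_by_other(path);

	pid = fork();
	if (pid < 0) {
		unlink(path);
		return -1;
	}
	if (pid == 0) {
		/* Child plays the primary: take a write lock and hold it. */
		struct flock wr = { .l_type = F_WRLCK, .l_whence = SEEK_SET };
		int cfd = open(path, O_RDWR);

		close(ready[0]);
		close(done[1]);
		if (cfd < 0 || fcntl(cfd, F_SETLK, &wr) < 0)
			_exit(1);
		if (write(ready[1], &byte, 1) != 1)
			_exit(1);
		if (read(done[0], &byte, 1) < 0) /* blocks until parent closes */
			_exit(1);
		_exit(0); /* exiting releases the lock */
	}
	close(ready[1]);
	close(done[0]);
	/* A failed read means the child could not take the lock. */
	results[1] = (read(ready[0], &byte, 1) == 1) ?
		file_locked_by_other(path) : -1;
	close(done[1]);
	waitpid(pid, NULL, 0);
	results[2] = file_locked_by_other(path);
	close(ready[0]);
	unlink(path);
	return 0;
}
```

The results should come out as 0, 1, 0: unlocked, locked while the child is alive, and unlocked again once it has exited. Note that F_TEST ignores locks held by the calling process itself, so the check only makes sense from a process other than the primary.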
@@ -1,7 +1,7 @@
/*-
* BSD LICENSE
*
- * Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ * Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
@@ -156,6 +156,24 @@ int rte_eal_iopl_init(void);
* - On failure, a negative error value.
*/
int rte_eal_init(int argc, char **argv);
+
+/**
+ * Check if a primary process is currently alive
+ *
+ * This function returns true when a primary process is currently
+ * active.
+ *
+ * @param config_file_path
+ * The config_file_path argument provided should point at the location
+ * that the primary process will create its config file. If NULL, the default
+ * config file path is used.
+ *
+ * @return
+ * - If alive, returns 1.
+ * - If dead, returns 0.
+ */
+int rte_eal_primary_proc_alive(const char *config_file_path);
+
/**
* Usage function typedef used by the application usage function.
*
@@ -1,6 +1,6 @@
# BSD LICENSE
#
-# Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
+# Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
# All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
@@ -89,6 +89,7 @@ SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_common_devargs.c
SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_common_dev.c
SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_common_options.c
SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_common_thread.c
+SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_common_proc.c
SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += rte_malloc.c
SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += malloc_elem.c
SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += malloc_heap.c
@@ -1,7 +1,7 @@
/*-
* BSD LICENSE
*
- * Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ * Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
* Copyright(c) 2012-2014 6WIND S.A.
* All rights reserved.
*
@@ -821,8 +821,6 @@ rte_eal_init(int argc, char **argv)
eal_check_mem_on_local_socket();
- rte_eal_mcfg_complete();
-
if (eal_plugins_init() < 0)
rte_panic("Cannot init plugins\n");
@@ -880,6 +878,8 @@ rte_eal_init(int argc, char **argv)
if (rte_eal_pci_probe())
rte_panic("Cannot probe PCI\n");
+ rte_eal_mcfg_complete();
+
return fctret;
}
@@ -151,5 +151,6 @@ DPDK_16.04 {
rte_eal_pci_ioport_write;
rte_eal_pci_map_device;
rte_eal_pci_unmap_device;
+ rte_eal_primary_proc_alive;
} DPDK_2.2;