[dpdk-dev] eal: add option --avail-cores to detect lcores

Message ID 1457085957-115339-1-git-send-email-jianfeng.tan@intel.com (mailing list archive)
State Changes Requested, archived
Delegated to: Thomas Monjalon

Commit Message

Jianfeng Tan March 4, 2016, 10:05 a.m. UTC
  This patch adds an option, --avail-cores, that restricts EAL to the lcores
that are actually available. It calls pthread_getaffinity_np() to narrow down
the detected cores before parsing coremask (-c), corelist (-l), and coremap
(--lcores).

Test example:
$ taskset 0xc0000 ./examples/helloworld/build/helloworld \
		--avail-cores -m 1024

Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
---
 lib/librte_eal/common/eal_common_options.c | 52 ++++++++++++++++++++++++++++++
 lib/librte_eal/common/eal_options.h        |  2 ++
 2 files changed, 54 insertions(+)
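
The diff itself is not reproduced in this archive, so the following is only a minimal sketch of the narrowing step the commit message describes, not the patch. The helper names get_allowed_cpus() and cpu_is_allowed() are hypothetical; the only call taken from the thread is pthread_getaffinity_np().

```c
#define _GNU_SOURCE
#include <assert.h>
#include <pthread.h>
#include <sched.h>

/* Hypothetical helper: fill *set with the CPUs the calling thread is
 * allowed to run on. Returns 0 on success, an error number otherwise. */
static int get_allowed_cpus(cpu_set_t *set)
{
	assert(set != NULL);
	CPU_ZERO(set);
	return pthread_getaffinity_np(pthread_self(), sizeof(*set), set);
}

/* Hypothetical helper: nonzero if `cpu` is inside the allowed set, i.e.
 * a core that would stay "detected" after the narrowing step. */
static int cpu_is_allowed(const cpu_set_t *set, int cpu)
{
	assert(set != NULL);
	return cpu >= 0 && cpu < CPU_SETSIZE && CPU_ISSET(cpu, set);
}
```

When the process is started under "taskset 0xc0000", as in the test example, the mask selects CPUs 18 and 19, so a check like cpu_is_allowed() would be expected to reject every other core.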
  

Comments

Panu Matilainen March 8, 2016, 8:54 a.m. UTC | #1

Hmm, to me this sounds like something that should be done always so 
there's no need for an option. Or if there's a chance it might do the 
wrong thing in some rare circumstance then perhaps there should be a 
disabler option instead?

Or am I just missing something?

	- Panu -
  
Jianfeng Tan March 8, 2016, 5:38 p.m. UTC | #2
Hi Panu,

On 3/8/2016 4:54 PM, Panu Matilainen wrote:
> Hmm, to me this sounds like something that should be done always so 
> there's no need for an option. Or if there's a chance it might do the 
> wrong thing in some rare circumstance then perhaps there should be a 
> disabler option instead?

Thanks for the comments.

Yes, there's a use case that we cannot handle.

If we make it the default, DPDK applications may fail to start when the
user specifies a core in isolcpus and the parent process (say bash) has a
cpuset affinity that excludes isolcpus. Originally, DPDK applications
just blindly call pthread_setaffinity_np(), and it always succeeds because
they always have root privilege to change any CPU affinity.

Now, if we do the check in rte_eal_cpu_init(), those lcores will be
flagged as undetected (as in my older implementation), which leads to
failure. To make it work, we would always have to add "taskset mask" (or
some other mechanism) before the DPDK application command line.

What do you think?

Thanks,
Jianfeng
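
The failure case described above can be modelled in a few lines. This is a toy model under stated assumptions, not EAL code: cores are plain bits in a 64-bit mask, the function names are hypothetical, and the "no check" branch simply encodes the thread's statement that pthread_setaffinity_np() succeeds for a root process.

```c
#include <assert.h>
#include <stdint.h>

/* Cores named on the command line (bit N = core N) that are missing from
 * the affinity mask inherited from the parent process. */
static uint64_t cores_outside_affinity(uint64_t requested, uint64_t inherited)
{
	return requested & ~inherited;
}

/* Toy model of startup: with the affinity check enabled, any requested
 * core outside the inherited mask counts as undetected and startup fails.
 * Without the check the request is accepted, on the thread's stated
 * assumption that the later pthread_setaffinity_np() call succeeds. */
static int model_startup_ok(uint64_t requested, uint64_t inherited,
			    int check_affinity)
{
	assert(requested != 0);
	if (!check_affinity)
		return 1;
	return cores_outside_affinity(requested, inherited) == 0;
}
```

As an illustrative case (the numbers are assumptions, not from the thread): with isolcpus covering CPUs 4-7, a shell affinity of 0x0f, and a request for cores 4 and 5 (0x30), the model starts without the check and fails with it.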

  
Panu Matilainen March 9, 2016, 1:05 p.m. UTC | #3

I still think it sounds like something that should be done by default 
and maybe be overridable with some flag, rather than the other way 
around. Another alternative might be detecting the cores always but if 
running as root, override but with a warning.

But I don't know, just wondering. To look at it from another angle: why 
would somebody use this new --avail-cores option and in what situation, 
if things "just work" otherwise anyway?

	- Panu -
  
Jianfeng Tan March 9, 2016, 1:53 p.m. UTC | #4
On 3/9/2016 9:05 PM, Panu Matilainen wrote:
> I still think it sounds like something that should be done by default 
> and maybe be overridable with some flag, rather than the other way 
> around. Another alternative might be detecting the cores always but if 
> running as root, override but with a warning.

Regarding your second solution: can only root set affinity to isolcpus?
Your first solution seems promising to me.

>
> But I dont know, just wondering. To look at it from another angle: why 
> would somebody use this new --avail-cores option and in what 
> situation, if things "just work" otherwise anyway?

For DPDK applications, the most common way to initialize DPDK is
"$dpdk-app [options for DPDK] -- [options for app]", so users need to
specify which cores to run on and how many hugepages to use. Suppose we
need this dpdk-app to run in a container: users have already given that
information when they built the cgroup for it to run inside. This option
(or this patch) makes DPDK smarter about discovering how many resources
it may use. Does that make sense?

Thanks,
Jianfeng


  
Ananyev, Konstantin March 9, 2016, 2:01 p.m. UTC | #5

But then, all we need might be just a script that would extract this information from the system
and form a proper cmdline parameter for DPDK? 
Konstantin
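
One way to read the suggestion above is a small wrapper that turns an affinity mask into an EAL corelist. The sketch below covers only the formatting half and makes its own assumptions: the mask is a plain 64-bit value rather than a cpu_set_t, and format_corelist() is a hypothetical name. The output uses the range syntax accepted by the -l option.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: render the set bits of `mask` as a corelist such
 * as "0-3,8". Returns the string length, or -1 if `buf` is too small. */
static int format_corelist(uint64_t mask, char *buf, size_t len)
{
	size_t used = 0;
	int cpu = 0;

	assert(buf != NULL && len > 0);
	buf[0] = '\0';
	while (cpu < 64) {
		int start, end, n;

		if (!(mask & (UINT64_C(1) << cpu))) {
			cpu++;
			continue;
		}
		/* Extend the run while the next CPU is also set. */
		start = cpu;
		while (cpu + 1 < 64 && (mask & (UINT64_C(1) << (cpu + 1))))
			cpu++;
		end = cpu++;

		if (start == end)
			n = snprintf(buf + used, len - used, "%s%d",
				     used ? "," : "", start);
		else
			n = snprintf(buf + used, len - used, "%s%d-%d",
				     used ? "," : "", start, end);
		if (n < 0 || (size_t)n >= len - used)
			return -1;
		used += (size_t)n;
	}
	return (int)used;
}
```

For the taskset mask in the test example, 0xc0000, this yields "18-19", which a wrapper could pass as "-l 18-19".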

  
Jianfeng Tan March 9, 2016, 2:17 p.m. UTC | #6
On 3/9/2016 10:01 PM, Ananyev, Konstantin wrote:
> But then, all we need might be just a script that would extract this information from the system
> and form a proper cmdline parameter for DPDK?

Yes, a script would work, as would constructing (argc, argv) in the
application before calling rte_eal_init(). But as Neil Horman once
suggested, a simple pthread_getaffinity_np() call gets everything done.
So is it worth a patch here?

Thanks,
Jianfeng

  
Ananyev, Konstantin March 9, 2016, 2:44 p.m. UTC | #7

Don't know...
Personally, I would prefer not to put extra logic inside EAL.
For me, there are too many different options already.
On the other hand, looking at the patch itself:
you are updating lcore_count and lcore_config[] based on physical CPU availability,
but these days there is not always a one-to-one mapping between an EAL lcore and a physical CPU.
Shouldn't that be taken into account?
Konstantin
  
Jianfeng Tan March 9, 2016, 2:55 p.m. UTC | #8
Hi Konstantin,

On 3/9/2016 10:44 PM, Ananyev, Konstantin wrote:
> Don't know...
> Personally I would prefer not to put extra logic inside EAL.
> For me - there are too many different options already.

Then how about making it the default in rte_eal_cpu_init()? It is already
known that this will cause trouble for isolcpus users: they would need to
add "taskset [mask]" before starting a DPDK app.

>  From other side looking at the patch itself:
> You are updating lcore_count and lcore_config[],based on physical cpu availability,
> but these days it is not always one-to-one mapping between EAL lcore and physical cpu.
> Shouldn't that be taken into account?

I have not seen a problem so far, because this work is done before
parsing coremask (-c), corelist (-l), and coremap (--lcores). If a core
is disabled here, it is as if it had not been detected in
rte_eal_cpu_init(). Or could you please give more hints?

Thanks,
Jianfeng

  
Ananyev, Konstantin March 9, 2016, 3:17 p.m. UTC | #9
Hi Jianfeng,

> Then how about make it default in rte_eal_cpu_init()? And it is already
> known it will bring trouble to those use isolcpus users, they need to
> add "taskset [mask]" before starting a DPDK app.

As I said: provide a script?
The same might apply to the amount of hugepage memory available to the user.

> >  From other side looking at the patch itself:
> > You are updating lcore_count and lcore_config[],based on physical cpu availability,
> > but these days it is not always one-to-one mapping between EAL lcore and physical cpu.
> > Shouldn't that be taken into account?
>
> I have not see the problem so far, because this work is done before
> parsing coremask (-c), corelist (-l), and coremap (--lcores). If a core
> is disabled here, it's like it is not detected in rte_eal_cpu_init(). Or
> could you please give more hints?

I didn't test your changes, so I am probably missing something.
Let's say the user is allowed to use only CPUs 0-3.
If they run with:
 --avail-cores  --lcores='(1-7)@2',
then only lcores 1-3 would be started.
Again, if the user specified '2@(1-7)', it would also go undetected
that CPUs 4-7 are not available to the user.
Is that so?

Konstantin

  
Jianfeng Tan March 9, 2016, 5:45 p.m. UTC | #10
Hi Konstantin,

On 3/9/2016 11:17 PM, Ananyev, Konstantin wrote:
> Hi Jianfeng,
>
>> -----Original Message-----
>> From: Tan, Jianfeng
>> Sent: Wednesday, March 09, 2016 2:56 PM
>> To: Ananyev, Konstantin; Panu Matilainen; dev@dpdk.org
>> Subject: Re: [dpdk-dev] [PATCH] eal: add option --avail-cores to detect lcores
>>
>> Hi Konstantin,
>>
>> On 3/9/2016 10:44 PM, Ananyev, Konstantin wrote:
>>>> -----Original Message-----
>>>> From: Tan, Jianfeng
>>>> Sent: Wednesday, March 09, 2016 2:17 PM
>>>> To: Ananyev, Konstantin; Panu Matilainen; dev@dpdk.org
>>>> Subject: Re: [dpdk-dev] [PATCH] eal: add option --avail-cores to detect lcores
>>>>
>>>>
>>>>
>>>> On 3/9/2016 10:01 PM, Ananyev, Konstantin wrote:
>>>>>> -----Original Message-----
>>>>>> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Tan, Jianfeng
>>>>>> Sent: Wednesday, March 09, 2016 1:53 PM
>>>>>> To: Panu Matilainen; dev@dpdk.org
>>>>>> Subject: Re: [dpdk-dev] [PATCH] eal: add option --avail-cores to detect lcores
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 3/9/2016 9:05 PM, Panu Matilainen wrote:
>>>>>>> On 03/08/2016 07:38 PM, Tan, Jianfeng wrote:
>>>>>>>> Hi Panu,
>>>>>>>>
>>>>>>>> On 3/8/2016 4:54 PM, Panu Matilainen wrote:
>>>>>>>>> On 03/04/2016 12:05 PM, Jianfeng Tan wrote:
>>>>>>>>>> This patch adds option, --avail-cores, to use lcores which are
>>>>>>>>>> available
>>>>>>>>>> by calling pthread_getaffinity_np() to narrow down detected cores
>>>>>>>>>> before
>>>>>>>>>> parsing coremask (-c), corelist (-l), and coremap (--lcores).
>>>>>>>>>>
>>>>>>>>>> Test example:
>>>>>>>>>> $ taskset 0xc0000 ./examples/helloworld/build/helloworld \
>>>>>>>>>>            --avail-cores -m 1024
>>>>>>>>>>
>>>>>>>>>> Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
>>>>>>>>>> Acked-by: Neil Horman <nhorman@tuxdriver.com>
>>>>>>>>> Hmm, to me this sounds like something that should be done always so
>>>>>>>>> there's no need for an option. Or if there's a chance it might do the
>>>>>>>>> wrong thing in some rare circumstance then perhaps there should be a
>>>>>>>>> disabler option instead?
>>>>>>>> Thanks for comments.
>>>>>>>>
>>>>>>>> Yes, there's a use case that we cannot handle.
>>>>>>>>
>>>>>>>> If we make it as default, DPDK applications may fail to start, when user
>>>>>>>> specifies a core in isolcpus and its parent process (say bash) has a
>>>>>>>> cpuset affinity that excludes isolcpus. Originally, DPDK applications
>>>>>>>> just blindly do pthread_setaffinity_np() and it always succeeds because
>>>>>>>> it always has root privilege to change any cpu affinity.
>>>>>>>>
>>>>>>>> Now, if we do the checking in rte_eal_cpu_init(), those lcores will be
>>>>>>>> flagged as undetected (in my older implementation) and leads to failure.
>>>>>>>> To make it correct, we would always add "taskset mask" (or other ways)
>>>>>>>> before DPDK application cmd lines.
>>>>>>>>
>>>>>>>> How do you think?
>>>>>>> I still think it sounds like something that should be done by default
>>>>>>> and maybe be overridable with some flag, rather than the other way
>>>>>>> around. Another alternative might be detecting the cores always but if
>>>>>>> running as root, override but with a warning.
>>>>>> For your second solution, only root can setaffinity to isolcpus?
>>>>>> Your first solution seems like a promising way for me.
>>>>>>
>>>>>>> But I dont know, just wondering. To look at it from another angle: why
>>>>>>> would somebody use this new --avail-cores option and in what
>>>>>>> situation, if things "just work" otherwise anyway?
>>>>>> For DPDK applications, the most common case to initialize DPDK is like
>>>>>> this: "$dpdk-app [options for DPDK] -- [options for app]", so users need
>>>>>> to specify which cores to run and how much hugepages are used. Suppose
>>>>>> we need this dpdk-app to run in a container, users already give those
>>>>>> information when they build up the cgroup for it to run inside, this
>>>>>> option or this patch is to make DPDK more smart to discover how much
>>>>>> resource will be used. Make sense?
>>>>> But then, all we need might be just a script that would extract this information from the system
>>>>> and form a proper cmdline parameter for DPDK?
>>>> Yes, a script will work. Or to construct (argc, argv) to call
>>>> rte_eal_init() in the application. But as Neil Horman once suggested, a
>>>> simple pthread_getaffinity_np() will get all things done. So if it worth
>>>> a patch here?
>>> Don't know...
>>> Personally I would prefer not to put extra logic inside EAL.
>>> For me - there are too many different options already.
>> Then how about make it default in rte_eal_cpu_init()? And it is already
>> known it will bring trouble to those use isolcpus users, they need to
>> add "taskset [mask]" before starting a DPDK app.
> As I said - provide a script?

Yes. But what I want to say is that this script is hard to get right if 
there are different kinds of limitations. (That rarely happens, though :-) )

> Same might be for amount of hugepage memory available to the user?

Ditto. Limitations like hugetlbfs quota, cgroup hugetlb, some are used 
by the app itself (more like an artificial argument) ...
>
>>>   From other side looking at the patch itself:
>>> You are updating lcore_count and lcore_config[],based on physical cpu availability,
>>> but these days it is not always one-to-one mapping between EAL lcore and physical cpu.
>>> Shouldn't that be taken into account?
>> I have not see the problem so far, because this work is done before
>> parsing coremask (-c), corelist (-l), and coremap (--lcores). If a core
>> is disabled here, it's like it is not detected in rte_eal_cpu_init(). Or
>> could you please give more hints?
> I didn't test try changes, so probably I am missing something.
> Let say iuser allowed to use only cpus 0-3.
> If he would type with:
>   --avail-cores  --lcores='(1-7)@2',
> then only lcores 1-3 would be started.
> Again if user would specify '2@(1-7)' it would also be undetected
> that cpus 4-7 are note available to the user.
> Is that so?

After reading the code:
For case --lcores='(1-7)@2', lcores 1-7 would be started and bound to 
pcore 2.
For case --lcores='2@(1-7)', this will fail with "core 4 unavailable".

It's because:
a. at first, a 1:1 mapping is built up, and an lcore is flagged as detected 
if its pcore is found in sysfs. (ROLE_RTE, cpuset, detected is true)
b. at the beginning of eal_parse_lcores(), the lcore config is reset. 
(ROLE_OFF, cpuset is empty, detected is still true)
c. the pcore cpuset is then checked by convert_to_cpuset() using the 
previous "detected" value.

I have tested it with the patch. The results align with the analysis above.
For case --lcores='(1-7)@2':
$ sudo taskset 0xf ./examples/helloworld/build/helloworld --avail-cores --lcores='(1-7)@2'
...
hello from core 2
hello from core 3
hello from core 4
hello from core 5
hello from core 6
hello from core 7
hello from core 1

For case --lcores='2@(1-7)':
$ sudo taskset 0xf ./examples/helloworld/build/helloworld --avail-cores --lcores='2@(1-7)'
...
EAL: core 4 unavailable
EAL: invalid parameter for --lcores
...
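
The analysis and test results above can be mirrored in a small toy model. This is only a sketch: toy_lcore, toy_detect, toy_reset and toy_first_unavailable are hypothetical names, not EAL functions, and plain bit masks stand in for the sysfs scan and for the affinity mask that "taskset 0xf" would impose.

```c
#include <stdbool.h>

#define TOY_MAX_LCORE 8

/* Hypothetical, heavily simplified stand-in for struct lcore_config. */
struct toy_lcore {
	bool detected; /* the pcore with this index may be used */
	bool enabled;  /* stands in for ROLE_RTE (true) vs ROLE_OFF (false) */
};

static struct toy_lcore toy_cfg[TOY_MAX_LCORE];

/* Step a, narrowed as --avail-cores would: 1:1 mapping, and a core counts
 * as detected only if it is both present and in the affinity mask. */
static void toy_detect(unsigned int present_mask, unsigned int affinity_mask)
{
	for (int i = 0; i < TOY_MAX_LCORE; i++) {
		bool usable = ((present_mask & affinity_mask) >> i) & 1u;

		toy_cfg[i].detected = usable;
		toy_cfg[i].enabled = usable;
	}
}

/* Step b: the reset clears the roles but leaves "detected" untouched. */
static void toy_reset(void)
{
	for (int i = 0; i < TOY_MAX_LCORE; i++)
		toy_cfg[i].enabled = false;
}

/* Step c: every pcore on the right-hand side of '@' must still be detected.
 * Returns the first unavailable pcore in [lo, hi], or -1 if all are usable. */
static int toy_first_unavailable(int lo, int hi)
{
	for (int i = lo; i <= hi; i++) {
		if (i < 0 || i >= TOY_MAX_LCORE || !toy_cfg[i].detected)
			return i;
	}
	return -1;
}
```

With toy_detect(0xff, 0x0f) followed by toy_reset(), toy_first_unavailable(2, 2) returns -1 (the '(1-7)@2' case: only pcore 2 is checked, so any lcore ids are accepted) and toy_first_unavailable(1, 7) returns 4 (the '2@(1-7)' case).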

One thing may be worth mentioning: should "detected" be maintained in struct 
lcore_config? Maybe we need to maintain a data structure for pcores?

Thanks,
Jianfeng

>
> Konstantin
>
>> Thanks,
>> Jianfeng
>>
>>> Konstantin
>>>
>>>
>>>
  
Ananyev, Konstantin March 9, 2016, 7:33 p.m. UTC | #11
> >>>>>>>> On 3/8/2016 4:54 PM, Panu Matilainen wrote:
> >>>>>>>>> On 03/04/2016 12:05 PM, Jianfeng Tan wrote:
> >>>>>>>>>> This patch adds option, --avail-cores, to use lcores which are
> >>>>>>>>>> available
> >>>>>>>>>> by calling pthread_getaffinity_np() to narrow down detected cores
> >>>>>>>>>> before
> >>>>>>>>>> parsing coremask (-c), corelist (-l), and coremap (--lcores).
> >>>>>>>>>>
> >>>>>>>>>> Test example:
> >>>>>>>>>> $ taskset 0xc0000 ./examples/helloworld/build/helloworld \
> >>>>>>>>>>            --avail-cores -m 1024
> >>>>>>>>>>
> >>>>>>>>>> Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
> >>>>>>>>>> Acked-by: Neil Horman <nhorman@tuxdriver.com>
> >>>>>>>>> Hmm, to me this sounds like something that should be done always so
> >>>>>>>>> there's no need for an option. Or if there's a chance it might do the
> >>>>>>>>> wrong thing in some rare circumstance then perhaps there should be a
> >>>>>>>>> disabler option instead?
> >>>>>>>> Thanks for comments.
> >>>>>>>>
> >>>>>>>> Yes, there's a use case that we cannot handle.
> >>>>>>>>
> >>>>>>>> If we make it as default, DPDK applications may fail to start, when user
> >>>>>>>> specifies a core in isolcpus and its parent process (say bash) has a
> >>>>>>>> cpuset affinity that excludes isolcpus. Originally, DPDK applications
> >>>>>>>> just blindly do pthread_setaffinity_np() and it always succeeds because
> >>>>>>>> it always has root privilege to change any cpu affinity.
> >>>>>>>>
> >>>>>>>> Now, if we do the checking in rte_eal_cpu_init(), those lcores will be
> >>>>>>>> flagged as undetected (in my older implementation) and leads to failure.
> >>>>>>>> To make it correct, we would always add "taskset mask" (or other ways)
> >>>>>>>> before DPDK application cmd lines.
> >>>>>>>>
> >>>>>>>> How do you think?
> >>>>>>> I still think it sounds like something that should be done by default
> >>>>>>> and maybe be overridable with some flag, rather than the other way
> >>>>>>> around. Another alternative might be detecting the cores always but if
> >>>>>>> running as root, override but with a warning.
> >>>>>> For your second solution, only root can setaffinity to isolcpus?
> >>>>>> Your first solution seems like a promising way for me.
> >>>>>>
> >>>>>>> But I dont know, just wondering. To look at it from another angle: why
> >>>>>>> would somebody use this new --avail-cores option and in what
> >>>>>>> situation, if things "just work" otherwise anyway?
> >>>>>> For DPDK applications, the most common case to initialize DPDK is like
> >>>>>> this: "$dpdk-app [options for DPDK] -- [options for app]", so users need
> >>>>>> to specify which cores to run and how much hugepages are used. Suppose
> >>>>>> we need this dpdk-app to run in a container, users already give those
> >>>>>> information when they build up the cgroup for it to run inside, this
> >>>>>> option or this patch is to make DPDK more smart to discover how much
> >>>>>> resource will be used. Make sense?
> >>>>> But then, all we need might be just a script that would extract this information from the system
> >>>>> and form a proper cmdline parameter for DPDK?
> >>>> Yes, a script will work. Or to construct (argc, argv) to call
> >>>> rte_eal_init() in the application. But as Neil Horman once suggested, a
> >>>> simple pthread_getaffinity_np() will get all things done. So if it worth
> >>>> a patch here?
> >>> Don't know...
> >>> Personally I would prefer not to put extra logic inside EAL.
> >>> For me - there are too many different options already.
> >> Then how about make it default in rte_eal_cpu_init()? And it is already
> >> known it will bring trouble to those use isolcpus users, they need to
> >> add "taskset [mask]" before starting a DPDK app.
> > As I said - provide a script?
>
> Yes. But what I want to say is this script is hard to be right, if there
> are different kinds of limitations. (Barely happen though :-) )

My thought was to keep the dpdk code untouched, i.e. let it still blindly call pthread_setaffinity_np()
based on the input parameters, and in addition provide a script for those who want to run
in '--avail-cores' mode.
So it could do 'taskset -p $$' and then either form the -c parameter list for the app,
or check the existing -c/-l/--lcores parameter and complain if a disallowed pcpu is detected.
But OK, maybe it is easier and more convenient to have this logic inside EAL
than in a separate script.
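
The first half of that wrapper idea, turning an affinity mask into a core list for the command line, can be sketched as follows. This is an illustration under assumptions: mask_to_corelist is a hypothetical helper, and the mask is passed in as a plain 64-bit value (as printed by taskset) rather than read from the system.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Format the set bits of mask as a core list such as "0-3,6".
 * Returns 0 on success, -1 if buf is too small to hold the list.
 */
static int mask_to_corelist(uint64_t mask, char *buf, size_t len)
{
	size_t used = 0;
	int cpu = 0;

	if (len == 0)
		return -1;
	buf[0] = '\0';

	while (cpu < 64) {
		if (!((mask >> cpu) & 1)) {
			cpu++;
			continue;
		}

		/* Extend the run [first, last] of consecutive set bits. */
		int first = cpu;
		while (cpu + 1 < 64 && ((mask >> (cpu + 1)) & 1))
			cpu++;
		int last = cpu++;

		int n;
		if (first == last)
			n = snprintf(buf + used, len - used, "%s%d",
				     used ? "," : "", first);
		else
			n = snprintf(buf + used, len - used, "%s%d-%d",
				     used ? "," : "", first, last);

		/* snprintf reports the length it wanted; detect truncation. */
		if (n < 0 || (size_t)n >= len - used)
			return -1;
		used += (size_t)n;
	}
	return 0;
}
```

For the mask 0xc0000 from the commit message this yields "18-19", which a wrapper could hand to the application as the -l argument; the checking half (complaining about a disallowed pcpu) would compare the user's requested cores against the same mask.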

>
> > Same might be for amount of hugepage memory available to the user?
>
> Ditto. Limitations like hugetlbfs quota, cgroup hugetlb, some are used
> by app themself (more like an artificial argument) ...
> >
> >>>   From other side looking at the patch itself:
> >>> You are updating lcore_count and lcore_config[],based on physical cpu availability,
> >>> but these days it is not always one-to-one mapping between EAL lcore and physical cpu.
> >>> Shouldn't that be taken into account?
> >> I have not see the problem so far, because this work is done before
> >> parsing coremask (-c), corelist (-l), and coremap (--lcores). If a core
> >> is disabled here, it's like it is not detected in rte_eal_cpu_init(). Or
> >> could you please give more hints?
> > I didn't test try changes, so probably I am missing something.
> > Let say iuser allowed to use only cpus 0-3.
> > If he would type with:
> >   --avail-cores  --lcores='(1-7)@2',
> > then only lcores 1-3 would be started.
> > Again if user would specify '2@(1-7)' it would also be undetected
> > that cpus 4-7 are note available to the user.
> > Is that so?
>
> After reading the code:
> For case --lcores='(1-7)@2', lcores 1-7 would be started, and bind to
> pcore 2.
> For case --lcores='2@(1-7)', this will fail with "core 4 unavailable".
>
> It's because:
> a.  although 1:1 mapping is built-up and flagged as detected if pcore is
> found in sysfs. (ROLE_RTE, cpuset, detected is true)
> b. in the beginning of eal_parse_lcores(), "reset lcore config".
> (ROLE_OFF, cpuset is empty, detected is still true)
> c. pcore cpuset will be checked by convert_to_cpuset using the previous
> "detected" value.

Ok, my bad then - I misunderstood the code.
Thanks for the explanation.
So if I get it right now: first, inside lib/librte_eal/common/eal_common_lcore.c,
both lcore_count and lcore_config relate to the pcpus.
Then later, in lib/librte_eal/common/eal_common_options.c,
they are overwritten with lcore-related information,
except lcore_config[].detected, which seems to be kept intact.
Is that correct?

>
> I have tested it with the patch. Result aligns above analysis.
> For case --lcores='(1-7)@2': sudo taskset 0xf
> ./examples/helloworld/build/helloworld --avail-cores --lcores='(1-7)@2'
> ...
> hello from core 2
> hello from core 3
> hello from core 4
> hello from core 5
> hello from core 6
> hello from core 7
> hello from core 1
>
> For case --lcores='2@(1-7)': sudo taskset 0xf
> ./examples/helloworld/build/helloworld --avail-cores --lcores='2@(1-7)'
> ...
> EAL: core 4 unavailable
> EAL: invalid parameter for --lcores
> ...
>
> One thing may worth mention: shall "detected" be maintained in struct
> lcore_config? Maybe we need to maintain an data structure for pcores?

Yes, it might be good to split pcpu and lcores information somehow,
as it is a bit confusing right now.
But I suppose this is a subject for another patch/discussion.
Konstantin
  
Jianfeng Tan March 10, 2016, 1:36 a.m. UTC | #12
On 3/10/2016 3:33 AM, Ananyev, Konstantin wrote:
>
>>>>>>>>>> On 3/8/2016 4:54 PM, Panu Matilainen wrote:
>>>>>>>>>>> On 03/04/2016 12:05 PM, Jianfeng Tan wrote:
>>>>>>>>>>>> This patch adds option, --avail-cores, to use lcores which are
>>>>>>>>>>>> available
>>>>>>>>>>>> by calling pthread_getaffinity_np() to narrow down detected cores
>>>>>>>>>>>> before
>>>>>>>>>>>> parsing coremask (-c), corelist (-l), and coremap (--lcores).
>>>>>>>>>>>>
>>>>>>>>>>>> Test example:
>>>>>>>>>>>> $ taskset 0xc0000 ./examples/helloworld/build/helloworld \
>>>>>>>>>>>>             --avail-cores -m 1024
>>>>>>>>>>>>
>>>>>>>>>>>> Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
>>>>>>>>>>>> Acked-by: Neil Horman <nhorman@tuxdriver.com>
>>>>>>>>>>> Hmm, to me this sounds like something that should be done always so
>>>>>>>>>>> there's no need for an option. Or if there's a chance it might do the
>>>>>>>>>>> wrong thing in some rare circumstance then perhaps there should be a
>>>>>>>>>>> disabler option instead?
>>>>>>>>>> Thanks for comments.
>>>>>>>>>>
>>>>>>>>>> Yes, there's a use case that we cannot handle.
>>>>>>>>>>
>>>>>>>>>> If we make it as default, DPDK applications may fail to start, when user
>>>>>>>>>> specifies a core in isolcpus and its parent process (say bash) has a
>>>>>>>>>> cpuset affinity that excludes isolcpus. Originally, DPDK applications
>>>>>>>>>> just blindly do pthread_setaffinity_np() and it always succeeds because
>>>>>>>>>> it always has root privilege to change any cpu affinity.
>>>>>>>>>>
>>>>>>>>>> Now, if we do the checking in rte_eal_cpu_init(), those lcores will be
>>>>>>>>>> flagged as undetected (in my older implementation) and leads to failure.
>>>>>>>>>> To make it correct, we would always add "taskset mask" (or other ways)
>>>>>>>>>> before DPDK application cmd lines.
>>>>>>>>>>
>>>>>>>>>> How do you think?
>>>>>>>>> I still think it sounds like something that should be done by default
>>>>>>>>> and maybe be overridable with some flag, rather than the other way
>>>>>>>>> around. Another alternative might be detecting the cores always but if
>>>>>>>>> running as root, override but with a warning.
>>>>>>>> For your second solution, only root can setaffinity to isolcpus?
>>>>>>>> Your first solution seems like a promising way for me.
>>>>>>>>
>>>>>>>>> But I dont know, just wondering. To look at it from another angle: why
>>>>>>>>> would somebody use this new --avail-cores option and in what
>>>>>>>>> situation, if things "just work" otherwise anyway?
>>>>>>>> For DPDK applications, the most common case to initialize DPDK is like
>>>>>>>> this: "$dpdk-app [options for DPDK] -- [options for app]", so users need
>>>>>>>> to specify which cores to run and how much hugepages are used. Suppose
>>>>>>>> we need this dpdk-app to run in a container, users already give those
>>>>>>>> information when they build up the cgroup for it to run inside, this
>>>>>>>> option or this patch is to make DPDK more smart to discover how much
>>>>>>>> resource will be used. Make sense?
>>>>>>> But then, all we need might be just a script that would extract this information from the system
>>>>>>> and form a proper cmdline parameter for DPDK?
>>>>>> Yes, a script will work. Or to construct (argc, argv) to call
>>>>>> rte_eal_init() in the application. But as Neil Horman once suggested, a
>>>>>> simple pthread_getaffinity_np() will get all things done. So if it worth
>>>>>> a patch here?
>>>>> Don't know...
>>>>> Personally I would prefer not to put extra logic inside EAL.
>>>>> For me - there are too many different options already.
>>>> Then how about make it default in rte_eal_cpu_init()? And it is already
>>>> known it will bring trouble to those use isolcpus users, they need to
>>>> add "taskset [mask]" before starting a DPDK app.
>>> As I said - provide a script?
>> Yes. But what I want to say is this script is hard to be right, if there
>> are different kinds of limitations. (Barely happen though :-) )
> My thought was to keep dpdk code untouched - i.e. let it still blindly set_pthread_affinity()
> based on the input parameters, and in addition provide a script for those who want to run
> in '--avail-cores' mode.
> So it could do 'taskset -p $$' and then either form -c parameter list  for the app,
> or check existing -c/-l/--lcores parameter and complain if not allowed pcpu detected.
> But ok, might be it is easier and more convenient to have this logic inside EAL,
> then in a separate script.
>
>>> Same might be for amount of hugepage memory available to the user?
>> Ditto. Limitations like hugetlbfs quota, cgroup hugetlb, some are used
>> by app themself (more like an artificial argument) ...
>>>>>    From other side looking at the patch itself:
>>>>> You are updating lcore_count and lcore_config[],based on physical cpu availability,
>>>>> but these days it is not always one-to-one mapping between EAL lcore and physical cpu.
>>>>> Shouldn't that be taken into account?
>>>> I have not see the problem so far, because this work is done before
>>>> parsing coremask (-c), corelist (-l), and coremap (--lcores). If a core
>>>> is disabled here, it's like it is not detected in rte_eal_cpu_init(). Or
>>>> could you please give more hints?
>>> I didn't test try changes, so probably I am missing something.
>>> Let say iuser allowed to use only cpus 0-3.
>>> If he would type with:
>>>    --avail-cores  --lcores='(1-7)@2',
>>> then only lcores 1-3 would be started.
>>> Again if user would specify '2@(1-7)' it would also be undetected
>>> that cpus 4-7 are note available to the user.
>>> Is that so?
>> After reading the code:
>> For case --lcores='(1-7)@2', lcores 1-7 would be started, and bind to
>> pcore 2.
>> For case --lcores='2@(1-7)', this will fail with "core 4 unavailable".
>>
>> It's because:
>> a.  although 1:1 mapping is built-up and flagged as detected if pcore is
>> found in sysfs. (ROLE_RTE, cpuset, detected is true)
>> b. in the beginning of eal_parse_lcores(), "reset lcore config".
>> (ROLE_OFF, cpuset is empty, detected is still true)
>> c. pcore cpuset will be checked by convert_to_cpuset using the previous
>> "detected" value.
> Ok, my bad then - I misunderstood the code.
> Thanks for explanation.
> So if I get it right now - first inside lib/librte_eal/common/eal_common_lcore.c
> Both lcore_count and lcore_config relate to the pcpus.
> Then later, at lib/librte_eal/common/eal_common_options.c
> they are overwritten related to lcores information.
> Except lcore_config[].detected, which seems kept intact.
> Is that correct?

Yes, exactly. And I really appreciate that you raised this question for 
discussion.

>
>> I have tested it with the patch. Result aligns above analysis.
>> For case --lcores='(1-7)@2': sudo taskset 0xf
>> ./examples/helloworld/build/helloworld --avail-cores --lcores='(1-7)@2'
>> ...
>> hello from core 2
>> hello from core 3
>> hello from core 4
>> hello from core 5
>> hello from core 6
>> hello from core 7
>> hello from core 1
>>
>> For case --lcores='2@(1-7)': sudo taskset 0xf
>> ./examples/helloworld/build/helloworld --avail-cores --lcores='2@(1-7)'
>> ...
>> EAL: core 4 unavailable
>> EAL: invalid parameter for --lcores
>> ...
>>
>> One thing may worth mention: shall "detected" be maintained in struct
>> lcore_config? Maybe we need to maintain an data structure for pcores?
> Yes, it might be good to split pcpu and lcores information somehow,
> as it is a bit confusing right now.
> But I suppose this is a subject for another patch/discussion.

Yes, just another topic.

Thanks,
Jianfeng

> Konstantin
>
>
  
Jianfeng Tan April 26, 2016, 12:39 p.m. UTC | #13
Hi,

Since some people are asking about the status of this patch, I'd like to 
ask whether anyone still has concerns.
The current conclusion is: go with the option --avail-cores.

Thanks,
Jianfeng

On 3/4/2016 6:05 PM, Jianfeng Tan wrote:
> This patch adds option, --avail-cores, to use lcores which are available
> by calling pthread_getaffinity_np() to narrow down detected cores before
> parsing coremask (-c), corelist (-l), and coremap (--lcores).
>
> Test example:
> $ taskset 0xc0000 ./examples/helloworld/build/helloworld \
> 		--avail-cores -m 1024
>
> Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
> Acked-by: Neil Horman <nhorman@tuxdriver.com>
> ---
>   lib/librte_eal/common/eal_common_options.c | 52 ++++++++++++++++++++++++++++++
>   lib/librte_eal/common/eal_options.h        |  2 ++
>   2 files changed, 54 insertions(+)
>
> diff --git a/lib/librte_eal/common/eal_common_options.c b/lib/librte_eal/common/eal_common_options.c
> index 29942ea..dc4882d 100644
> --- a/lib/librte_eal/common/eal_common_options.c
> +++ b/lib/librte_eal/common/eal_common_options.c
> @@ -95,6 +95,7 @@ eal_long_options[] = {
>   	{OPT_VFIO_INTR,         1, NULL, OPT_VFIO_INTR_NUM        },
>   	{OPT_VMWARE_TSC_MAP,    0, NULL, OPT_VMWARE_TSC_MAP_NUM   },
>   	{OPT_XEN_DOM0,          0, NULL, OPT_XEN_DOM0_NUM         },
> +	{OPT_AVAIL_CORES,       0, NULL, OPT_AVAIL_CORES_NUM      },
>   	{0,                     0, NULL, 0                        }
>   };
>   
> @@ -681,6 +682,37 @@ err:
>   }
>   
>   static int
> +eal_parse_avail_cores(void)
> +{
> +	int i, count;
> +	pthread_t tid;
> +	rte_cpuset_t cpuset;
> +	struct rte_config *cfg = rte_eal_get_configuration();
> +
> +	tid = pthread_self();
> +	if (pthread_getaffinity_np(tid, sizeof(rte_cpuset_t), &cpuset) != 0)
> +		return -1;
> +
> +	for (i = 0, count = 0; i < RTE_MAX_LCORE; i++) {
> +		if (lcore_config[i].detected && !CPU_ISSET(i, &cpuset)) {
> +			RTE_LOG(DEBUG, EAL, "Flag lcore %u as undetected\n", i);
> +			lcore_config[i].detected = 0;
> +			lcore_config[i].core_index = -1;
> +			cfg->lcore_role[i] = ROLE_OFF;
> +			count++;
> +		}
> +	}
> +	cfg->lcore_count -= count;
> +	if (cfg->lcore_count == 0) {
> +		RTE_LOG(ERR, EAL, "No lcores available\n");
> +		return -1;
> +	}
> +
> +	return 0;
> +}
> +
> +
> +static int
>   eal_parse_syslog(const char *facility, struct internal_config *conf)
>   {
>   	int i;
> @@ -754,6 +786,10 @@ eal_parse_proc_type(const char *arg)
>   	return RTE_PROC_INVALID;
>   }
>   
> +static int param_coremask;
> +static int param_corelist;
> +static int param_coremap;
> +
>   int
>   eal_parse_common_option(int opt, const char *optarg,
>   			struct internal_config *conf)
> @@ -775,6 +811,7 @@ eal_parse_common_option(int opt, const char *optarg,
>   		break;
>   	/* coremask */
>   	case 'c':
> +		param_coremask = 1;
>   		if (eal_parse_coremask(optarg) < 0) {
>   			RTE_LOG(ERR, EAL, "invalid coremask\n");
>   			return -1;
> @@ -782,6 +819,7 @@ eal_parse_common_option(int opt, const char *optarg,
>   		break;
>   	/* corelist */
>   	case 'l':
> +		param_corelist = 1;
>   		if (eal_parse_corelist(optarg) < 0) {
>   			RTE_LOG(ERR, EAL, "invalid core list\n");
>   			return -1;
> @@ -890,12 +928,25 @@ eal_parse_common_option(int opt, const char *optarg,
>   		break;
>   	}
>   	case OPT_LCORES_NUM:
> +		param_coremap = 1;
>   		if (eal_parse_lcores(optarg) < 0) {
>   			RTE_LOG(ERR, EAL, "invalid parameter for --"
>   				OPT_LCORES "\n");
>   			return -1;
>   		}
>   		break;
> +	case OPT_AVAIL_CORES_NUM:
> +		if (param_coremask || param_corelist || param_coremap) {
> +			RTE_LOG(ERR, EAL, "should put --" OPT_AVAIL_CORES
> +				" before -c, -l and --" OPT_LCORES "\n");
> +			return -1;
> +		}
> +		if (eal_parse_avail_cores() < 0) {
> +			RTE_LOG(ERR, EAL, "failed to use --"
> +				OPT_AVAIL_CORES "\n");
> +			return -1;
> +		}
> +		break;
>   
>   	/* don't know what to do, leave this to caller */
>   	default:
> @@ -990,6 +1041,7 @@ eal_common_usage(void)
>   	       "                      ',' is used for single number separator.\n"
>   	       "                      '( )' can be omitted for single element group,\n"
>   	       "                      '@' can be omitted if cpus and lcores have the same value\n"
> +	       "  --"OPT_AVAIL_CORES"       Use pthread_getaffinity_np() to detect cores to be used\n"
>   	       "  --"OPT_MASTER_LCORE" ID   Core ID that is used as master\n"
>   	       "  -n CHANNELS         Number of memory channels\n"
>   	       "  -m MB               Memory to allocate (see also --"OPT_SOCKET_MEM")\n"
> diff --git a/lib/librte_eal/common/eal_options.h b/lib/librte_eal/common/eal_options.h
> index a881c62..b2ddea3 100644
> --- a/lib/librte_eal/common/eal_options.h
> +++ b/lib/librte_eal/common/eal_options.h
> @@ -83,6 +83,8 @@ enum {
>   	OPT_VMWARE_TSC_MAP_NUM,
>   #define OPT_XEN_DOM0          "xen-dom0"
>   	OPT_XEN_DOM0_NUM,
> +#define OPT_AVAIL_CORES       "avail-cores"
> +	OPT_AVAIL_CORES_NUM,
>   	OPT_LONG_MAX_NUM
>   };
>
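
The ordering rule that the quoted patch enforces (--avail-cores must come before -c, -l and --lcores) can be illustrated outside of EAL. A minimal sketch, assuming a hypothetical helper check_avail_cores_order that scans argv directly; the patch itself instead sets the param_coremask/param_corelist/param_coremap flags while the options are being parsed.

```c
#include <string.h>

/*
 * Hypothetical helper, not part of the patch: returns 0 if the ordering is
 * acceptable, -1 if --avail-cores appears after a core-selection option.
 * Simplification: any argument starting with "-c", "-l" or "--lcores" is
 * treated as a core-selection option, and scanning stops at "--".
 */
static int check_avail_cores_order(int argc, const char *argv[])
{
	int seen_core_opt = 0;

	for (int i = 1; i < argc; i++) {
		if (strcmp(argv[i], "--") == 0)
			break; /* the rest belongs to the application */

		if (strncmp(argv[i], "-c", 2) == 0 ||
		    strncmp(argv[i], "-l", 2) == 0 ||
		    strncmp(argv[i], "--lcores", 8) == 0)
			seen_core_opt = 1;
		else if (strcmp(argv[i], "--avail-cores") == 0 && seen_core_opt)
			return -1;
	}
	return 0;
}
```

The reason for the rule is that the narrowing has to happen before any core option is parsed; otherwise -c/-l/--lcores would already have been validated against the full set of detected cores.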
  
David Marchand May 18, 2016, 12:46 p.m. UTC | #14
Hello Jianfeng,

On Wed, Mar 9, 2016 at 2:05 PM, Panu Matilainen <pmatilai@redhat.com> wrote:
> On 03/08/2016 07:38 PM, Tan, Jianfeng wrote:
>>
>> Hi Panu,
>>
>> On 3/8/2016 4:54 PM, Panu Matilainen wrote:
>>>
>>> On 03/04/2016 12:05 PM, Jianfeng Tan wrote:
>>>>
>>>> This patch adds option, --avail-cores, to use lcores which are available
>>>> by calling pthread_getaffinity_np() to narrow down detected cores before
>>>> parsing coremask (-c), corelist (-l), and coremap (--lcores).
>>>>
>>>> Test example:
>>>> $ taskset 0xc0000 ./examples/helloworld/build/helloworld \
>>>>         --avail-cores -m 1024
>>>>
>>>> Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
>>>> Acked-by: Neil Horman <nhorman@tuxdriver.com>
>>>
>>>
>>> Hmm, to me this sounds like something that should be done always so
>>> there's no need for an option. Or if there's a chance it might do the
>>> wrong thing in some rare circumstance then perhaps there should be a
>>> disabler option instead?
>>
>>
>> Thanks for comments.
>>
>> Yes, there's a use case that we cannot handle.
>>
>> If we make it as default, DPDK applications may fail to start, when user
>> specifies a core in isolcpus and its parent process (say bash) has a
>> cpuset affinity that excludes isolcpus. Originally, DPDK applications
>> just blindly do pthread_setaffinity_np() and it always succeeds because
>> it always has root privilege to change any cpu affinity.
>>
>> Now, if we do the checking in rte_eal_cpu_init(), those lcores will be
>> flagged as undetected (in my older implementation) and leads to failure.
>> To make it correct, we would always add "taskset mask" (or other ways)
>> before DPDK application cmd lines.
>>
>> How do you think?
>
>
> I still think it sounds like something that should be done by default and
> maybe be overridable with some flag, rather than the other way around.
> Another alternative might be detecting the cores always but if running as
> root, override but with a warning.
>
> But I dont know, just wondering. To look at it from another angle: why would
> somebody use this new --avail-cores option and in what situation, if things
> "just work" otherwise anyway?

+1 and I don't even see why we should have an option to disable this,
since taskset would do the job.

Looking at your special case, if the user did set an isolcpus option
for another use, with no -c/-l, I understand the dpdk application
won't care too much about it.
So this seems somewhat rude to the rest of the system, and unwanted.

We can still help a user who starts the application as root (without
taskset) by adding a warning message if a requested cpu (-c / -l ..)
is not part of the available cpus.
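
A warning of that kind could look roughly like the sketch below. It is an illustration only: warn_unavailable_cpus is a hypothetical name, both sets are plain 64-bit masks, and in a real program the allowed mask would come from the thread's current affinity.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Print one warning per requested cpu that is not in the allowed mask.
 * Returns the number of such cpus (0 means the request fits the mask).
 */
static int warn_unavailable_cpus(uint64_t requested, uint64_t allowed)
{
	uint64_t missing = requested & ~allowed;
	int count = 0;

	for (int cpu = 0; cpu < 64; cpu++) {
		if ((missing >> cpu) & 1) {
			fprintf(stderr,
				"warning: requested cpu %d is outside the current affinity mask\n",
				cpu);
			count++;
		}
	}
	return count;
}
```

With a request for cpus 1-7 (0xfe) and an allowed mask of 0xf, this warns about cpus 4, 5, 6 and 7 and returns 4; whether the caller then continues or quits is a separate policy decision.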
  
Jianfeng Tan May 19, 2016, 2:25 a.m. UTC | #15
Hi David,


On 5/18/2016 8:46 PM, David Marchand wrote:
> Hello Jianfeng,
>
> On Wed, Mar 9, 2016 at 2:05 PM, Panu Matilainen <pmatilai@redhat.com> wrote:
>> On 03/08/2016 07:38 PM, Tan, Jianfeng wrote:
>>> Hi Panu,
>>>
>>> On 3/8/2016 4:54 PM, Panu Matilainen wrote:
>>>> On 03/04/2016 12:05 PM, Jianfeng Tan wrote:
>>>>> This patch adds option, --avail-cores, to use lcores which are available
>>>>> by calling pthread_getaffinity_np() to narrow down detected cores before
>>>>> parsing coremask (-c), corelist (-l), and coremap (--lcores).
>>>>>
>>>>> Test example:
>>>>> $ taskset 0xc0000 ./examples/helloworld/build/helloworld \
>>>>>          --avail-cores -m 1024
>>>>>
>>>>> Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
>>>>> Acked-by: Neil Horman <nhorman@tuxdriver.com>
>>>>
>>>> Hmm, to me this sounds like something that should be done always so
>>>> there's no need for an option. Or if there's a chance it might do the
>>>> wrong thing in some rare circumstance then perhaps there should be a
>>>> disabler option instead?
>>>
>>> Thanks for comments.
>>>
>>> Yes, there's a use case that we cannot handle.
>>>
>>> If we make it as default, DPDK applications may fail to start, when user
>>> specifies a core in isolcpus and its parent process (say bash) has a
>>> cpuset affinity that excludes isolcpus. Originally, DPDK applications
>>> just blindly do pthread_setaffinity_np() and it always succeeds because
>>> it always has root privilege to change any cpu affinity.
>>>
>>> Now, if we do the checking in rte_eal_cpu_init(), those lcores will be
>>> flagged as undetected (in my older implementation) and leads to failure.
>>> To make it correct, we would always add "taskset mask" (or other ways)
>>> before DPDK application cmd lines.
>>>
>>> How do you think?
>>
>> I still think it sounds like something that should be done by default and
>> maybe be overridable with some flag, rather than the other way around.
>> Another alternative might be detecting the cores always but if running as
>> root, override but with a warning.
>>
>> But I dont know, just wondering. To look at it from another angle: why would
>> somebody use this new --avail-cores option and in what situation, if things
>> "just work" otherwise anyway?
> +1 and I don't even see why we should have an option to disable this,
> since taskset would do the job.
>
> Looking at your special case, if the user did set an isolcpus option
> for another use, with no -c/-l, I understand the dpdk application
> won't care too much about it.
> So, this seems like somehow rude to the rest of the system and unwanted.

The case you mention above is not the one I meant, but you make a fair 
point about it.
The case I originally meant: the user sets an isolcpus option for DPDK 
applications. Previously, DPDK apps would start without any problem. 
With this change, they fail to start because the required cores are 
excluded before -c/-l is parsed. Following your comments below, we can 
add a warning message (or should we quit in this situation?). But it 
does affect existing users (they would have to switch to "taskset 
./dpdk_app ..."). Do you think that's a problem?

Thanks,
Jianfeng


>
> We can still help the user starting its application as root (without
> taskset) by adding a warning message if a requested cpu (-c / -l ..)
> is not part of the available cpus.
>
>
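
The warning suggested in the quoted text above (tell the user when a CPU
requested with -c / -l is not in the available set) could be sketched
roughly as follows. This is only an illustration, not DPDK code: the helper
names get_allowed_cpus() and warn_unavailable_cpus() are hypothetical, and
it reads the mask with sched_getaffinity(0, ...) instead of the
pthread_getaffinity_np(pthread_self(), ...) call used in the patch. Both
report the calling thread's affinity on Linux.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* needed for CPU_* macros and sched_getaffinity() */
#endif
#include <sched.h>
#include <stdio.h>

/* Hypothetical helper, not DPDK code: read the calling thread's mask. */
int
get_allowed_cpus(cpu_set_t *allowed)
{
	CPU_ZERO(allowed);
	/* pid 0 means "the calling thread"; returns 0 on success, -1 on error */
	return sched_getaffinity(0, sizeof(*allowed), allowed);
}

/*
 * Hypothetical helper: print one warning per requested CPU that is not in
 * the allowed set, and return how many such CPUs were found.
 */
int
warn_unavailable_cpus(const int *requested, int n, const cpu_set_t *allowed)
{
	int i, missing = 0;

	for (i = 0; i < n; i++) {
		int cpu = requested[i];

		if (cpu < 0 || cpu >= CPU_SETSIZE || !CPU_ISSET(cpu, allowed)) {
			fprintf(stderr,
				"warning: cpu %d is outside the affinity mask\n",
				cpu);
			missing++;
		}
	}
	return missing;
}
```

A caller would pass in the CPUs parsed from -c / -l. Whether a non-zero
result should only warn or should abort startup is exactly the open
question in this thread, so the sketch leaves that decision to the caller.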
  
Thomas Monjalon June 30, 2016, 1:43 p.m. UTC | #16
2016-05-19 10:25, Tan, Jianfeng:
> On 5/18/2016 8:46 PM, David Marchand wrote:
> > On Wed, Mar 9, 2016 at 2:05 PM, Panu Matilainen <pmatilai@redhat.com> wrote:
> >> On 03/08/2016 07:38 PM, Tan, Jianfeng wrote:
> >>> On 3/8/2016 4:54 PM, Panu Matilainen wrote:
> >>>> On 03/04/2016 12:05 PM, Jianfeng Tan wrote:
> >>>>> This patch adds option, --avail-cores, to use lcores which are available
> >>>>> by calling pthread_getaffinity_np() to narrow down detected cores before
> >>>>> parsing coremask (-c), corelist (-l), and coremap (--lcores).
> >>>>>
> >>>>> Test example:
> >>>>> $ taskset 0xc0000 ./examples/helloworld/build/helloworld \
> >>>>>          --avail-cores -m 1024
> >>>>>
> >>>>> Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
> >>>>> Acked-by: Neil Horman <nhorman@tuxdriver.com>
> >>>>
> >>>> Hmm, to me this sounds like something that should be done always so
> >>>> there's no need for an option. Or if there's a chance it might do the
> >>>> wrong thing in some rare circumstance then perhaps there should be a
> >>>> disabler option instead?
> >>>
> >>> Thanks for comments.
> >>>
> >>> Yes, there's a use case that we cannot handle.
> >>>
> >>> If we make it as default, DPDK applications may fail to start, when user
> >>> specifies a core in isolcpus and its parent process (say bash) has a
> >>> cpuset affinity that excludes isolcpus. Originally, DPDK applications
> >>> just blindly do pthread_setaffinity_np() and it always succeeds because
> >>> it always has root privilege to change any cpu affinity.
> >>>
> >>> Now, if we do the checking in rte_eal_cpu_init(), those lcores will be
> >>> flagged as undetected (in my older implementation) and leads to failure.
> >>> To make it correct, we would always add "taskset mask" (or other ways)
> >>> before DPDK application cmd lines.
> >>>
> >>> How do you think?
> >>
> >> I still think it sounds like something that should be done by default and
> >> maybe be overridable with some flag, rather than the other way around.
> >> Another alternative might be detecting the cores always but if running as
> >> root, override but with a warning.
> >>
> >> But I dont know, just wondering. To look at it from another angle: why would
> >> somebody use this new --avail-cores option and in what situation, if things
> >> "just work" otherwise anyway?
> > +1 and I don't even see why we should have an option to disable this,
> > since taskset would do the job.
> >
> > Looking at your special case, if the user did set an isolcpus option
> > for another use, with no -c/-l, I understand the dpdk application
> > won't care too much about it.
> > So, this seems like somehow rude to the rest of the system and unwanted.
> 
> The case you mentioned above is not the case I meant. But your point
> about that one is taken.
> The case I originally meant: the user sets an isolcpus option for DPDK
> applications. Previously, DPDK apps would start without any problem.
> With this change they fail to start, because the required cores are
> excluded before -c/-l is parsed. Following your comments below, we can
> add a warning message (or should we quit in this situation?). But it
> does affect existing users (they would have to change to "taskset
> ./dpdk_app ..."). Do you think that's a problem?

There is no activity on this patch.
Jianfeng, do not hesitate to ping if needed.
Should we class this patch as "changes requested"?
  
Jianfeng Tan July 1, 2016, 12:52 a.m. UTC | #17
Hi Thomas,

> > >
> > > Looking at your special case, if the user did set an isolcpus option
> > > for another use, with no -c/-l, I understand the dpdk application
> > > won't care too much about it.
> > > So, this seems like somehow rude to the rest of the system and unwanted.
> >
> > The case you mentioned above is not the case I meant. But your point
> > about that one is taken.
> > The case I originally meant: the user sets an isolcpus option for DPDK
> > applications. Previously, DPDK apps would start without any problem.
> > With this change they fail to start, because the required cores are
> > excluded before -c/-l is parsed. Following your comments below, we can
> > add a warning message (or should we quit in this situation?). But it
> > does affect existing users (they would have to change to "taskset
> > ./dpdk_app ..."). Do you think that's a problem?
> 
> There is no activity on this patch.
> Jianfeng, do not hesitate to ping if needed.
> Should we class this patch as "changes requested"?

Yes, according to the latest comments, it should be classified as "changes requested" (I've done that).

I'll send a new version.

Thanks,
Jianfeng
  

Patch

diff --git a/lib/librte_eal/common/eal_common_options.c b/lib/librte_eal/common/eal_common_options.c
index 29942ea..dc4882d 100644
--- a/lib/librte_eal/common/eal_common_options.c
+++ b/lib/librte_eal/common/eal_common_options.c
@@ -95,6 +95,7 @@  eal_long_options[] = {
 	{OPT_VFIO_INTR,         1, NULL, OPT_VFIO_INTR_NUM        },
 	{OPT_VMWARE_TSC_MAP,    0, NULL, OPT_VMWARE_TSC_MAP_NUM   },
 	{OPT_XEN_DOM0,          0, NULL, OPT_XEN_DOM0_NUM         },
+	{OPT_AVAIL_CORES,       0, NULL, OPT_AVAIL_CORES_NUM      },
 	{0,                     0, NULL, 0                        }
 };
 
@@ -681,6 +682,37 @@  err:
 }
 
 static int
+eal_parse_avail_cores(void)
+{
+	int i, count;
+	pthread_t tid;
+	rte_cpuset_t cpuset;
+	struct rte_config *cfg = rte_eal_get_configuration();
+
+	tid = pthread_self();
+	if (pthread_getaffinity_np(tid, sizeof(rte_cpuset_t), &cpuset) != 0)
+		return -1;
+
+	for (i = 0, count = 0; i < RTE_MAX_LCORE; i++) {
+		if (lcore_config[i].detected && !CPU_ISSET(i, &cpuset)) {
+			RTE_LOG(DEBUG, EAL, "Flag lcore %u as undetected\n", i);
+			lcore_config[i].detected = 0;
+			lcore_config[i].core_index = -1;
+			cfg->lcore_role[i] = ROLE_OFF;
+			count++;
+		}
+	}
+	cfg->lcore_count -= count;
+	if (cfg->lcore_count == 0) {
+		RTE_LOG(ERR, EAL, "No lcores available\n");
+		return -1;
+	}
+
+	return 0;
+}
+
+
+static int
 eal_parse_syslog(const char *facility, struct internal_config *conf)
 {
 	int i;
@@ -754,6 +786,10 @@  eal_parse_proc_type(const char *arg)
 	return RTE_PROC_INVALID;
 }
 
+static int param_coremask;
+static int param_corelist;
+static int param_coremap;
+
 int
 eal_parse_common_option(int opt, const char *optarg,
 			struct internal_config *conf)
@@ -775,6 +811,7 @@  eal_parse_common_option(int opt, const char *optarg,
 		break;
 	/* coremask */
 	case 'c':
+		param_coremask = 1;
 		if (eal_parse_coremask(optarg) < 0) {
 			RTE_LOG(ERR, EAL, "invalid coremask\n");
 			return -1;
@@ -782,6 +819,7 @@  eal_parse_common_option(int opt, const char *optarg,
 		break;
 	/* corelist */
 	case 'l':
+		param_corelist = 1;
 		if (eal_parse_corelist(optarg) < 0) {
 			RTE_LOG(ERR, EAL, "invalid core list\n");
 			return -1;
@@ -890,12 +928,25 @@  eal_parse_common_option(int opt, const char *optarg,
 		break;
 	}
 	case OPT_LCORES_NUM:
+		param_coremap = 1;
 		if (eal_parse_lcores(optarg) < 0) {
 			RTE_LOG(ERR, EAL, "invalid parameter for --"
 				OPT_LCORES "\n");
 			return -1;
 		}
 		break;
+	case OPT_AVAIL_CORES_NUM:
+		if (param_coremask || param_corelist || param_coremap) {
+			RTE_LOG(ERR, EAL, "should put --" OPT_AVAIL_CORES
+				" before -c, -l and --" OPT_LCORES "\n");
+			return -1;
+		}
+		if (eal_parse_avail_cores() < 0) {
+			RTE_LOG(ERR, EAL, "failed to use --"
+				OPT_AVAIL_CORES "\n");
+			return -1;
+		}
+		break;
 
 	/* don't know what to do, leave this to caller */
 	default:
@@ -990,6 +1041,7 @@  eal_common_usage(void)
 	       "                      ',' is used for single number separator.\n"
 	       "                      '( )' can be omitted for single element group,\n"
 	       "                      '@' can be omitted if cpus and lcores have the same value\n"
+	       "  --"OPT_AVAIL_CORES"       Use pthread_getaffinity_np() to detect cores to be used\n"
 	       "  --"OPT_MASTER_LCORE" ID   Core ID that is used as master\n"
 	       "  -n CHANNELS         Number of memory channels\n"
 	       "  -m MB               Memory to allocate (see also --"OPT_SOCKET_MEM")\n"
diff --git a/lib/librte_eal/common/eal_options.h b/lib/librte_eal/common/eal_options.h
index a881c62..b2ddea3 100644
--- a/lib/librte_eal/common/eal_options.h
+++ b/lib/librte_eal/common/eal_options.h
@@ -83,6 +83,8 @@  enum {
 	OPT_VMWARE_TSC_MAP_NUM,
 #define OPT_XEN_DOM0          "xen-dom0"
 	OPT_XEN_DOM0_NUM,
+#define OPT_AVAIL_CORES       "avail-cores"
+	OPT_AVAIL_CORES_NUM,
 	OPT_LONG_MAX_NUM
 };
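
For readers who want to try the narrowing step of eal_parse_avail_cores()
outside of DPDK, here is a minimal standalone sketch. It is not the patch
itself: struct lcore_state, MAX_LCORE (with an arbitrary value of 128),
cpuset_from_mask() and narrow_detected_lcores() are hypothetical stand-ins
for lcore_config[], RTE_MAX_LCORE and the EAL code, and the affinity mask is
built from a hex value the way taskset would set it, rather than read from
the running thread.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* needed for the CPU_* macros */
#endif
#include <sched.h>

#define MAX_LCORE 128 /* stand-in for RTE_MAX_LCORE; the value is an assumption */

/* Hypothetical, simplified stand-in for DPDK's per-lcore state. */
struct lcore_state {
	int detected;
};

/* Build a cpu set from a taskset-style bitmask (bit N set means CPU N). */
void
cpuset_from_mask(unsigned long mask, cpu_set_t *set)
{
	unsigned int bit;

	CPU_ZERO(set);
	for (bit = 0; bit < sizeof(mask) * 8; bit++) {
		if (mask & (1UL << bit))
			CPU_SET(bit, set);
	}
}

/*
 * Clear the detected flag of every lcore that is absent from `allowed`.
 * Returns the number of lcores still detected, or -1 if none remain.
 */
int
narrow_detected_lcores(struct lcore_state *lcores, const cpu_set_t *allowed)
{
	int i, remaining = 0;

	for (i = 0; i < MAX_LCORE; i++) {
		if (!lcores[i].detected)
			continue;
		if (!CPU_ISSET(i, allowed)) {
			lcores[i].detected = 0; /* flag as undetected */
			continue;
		}
		remaining++;
	}
	return remaining > 0 ? remaining : -1;
}
```

With the mask from the test example in the commit message, 0xc0000, only
bits 18 and 19 are set, so on a machine where lcores 0-31 were detected the
function leaves lcores 18 and 19 flagged as detected and returns 2.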