[dpdk-dev] IGB_UIO: PCI Resources Management

Message ID 3355891.l3I590SjcV@polaris (mailing list archive)
State Rejected, archived
Delegated to: Thomas Monjalon
Headers

Checks

Context Check Description
ci/checkpatch warning coding style issues

Commit Message

Gregory Etelson Dec. 9, 2016, 8:54 a.m. UTC
  Hello,

IGB_UIO driver does not close port PCI activities after DPDK process exits.
DPDK API provides rte_eth_dev_close() to manage port PCI,
but it can be skipped if process receives SIGKILL signal
The patches below provide IGB_UIO release callback and IXGBEVF release function
With the patches, each time DPDK process terminates,
UIO release callback will trigger port PCI close.
On the down side, patched IGB_UIO can be bound to a single adapter type

Regards,
Gregory
  

Comments

Ferruh Yigit Jan. 12, 2017, 11:55 a.m. UTC | #1
On 12/9/2016 8:54 AM, Gregory Etelson wrote:
> Hello,
> 
> IGB_UIO driver does not close port PCI activities after DPDK process exits.
> DPDK API provides rte_eth_dev_close() to manage port PCI,
> but it can be skipped if process receives SIGKILL signal

I guess I understand the problem.

> The patches below provide IGB_UIO release callback and IXGBEVF release function

But adding ixgbe specific code into igb_uio may not be good idea.
Can be anything done one upper layer, pci layer, generic to all drivers?

> With the patches, each time DPDK process terminates,
> UIO release callback will trigger port PCI close.
> On the down side, patched IGB_UIO can be bound to a single adapter type
> 
> Regards,
> Gregory

<...>
  
Alejandro Lucero Jan. 12, 2017, 12:12 p.m. UTC | #2
On Thu, Jan 12, 2017 at 11:55 AM, Ferruh Yigit <ferruh.yigit@intel.com>
wrote:

> On 12/9/2016 8:54 AM, Gregory Etelson wrote:
> > Hello,
> >
> > IGB_UIO driver does not close port PCI activities after DPDK process
> exits.
> > DPDK API provides rte_eth_dev_close() to manage port PCI,
> > but it can be skipped if process receives SIGKILL signal
>
> I guess I understand the problem.
>

This is a known problem, but it is not just a UIO problem, and this patch
does not solve it, maybe it just solves part of it.

In fact, a DPDK program crashing could imply the NIC DMAing after that and
after that memory was assigned to another program.



>
> > The patches below provide IGB_UIO release callback and IXGBEVF release
> function
>
> But adding ixgbe specific code into igb_uio may not be good idea.
> Can be anything done one upper layer, pci layer, generic to all drivers?
>
>
This  module is not just being used for Intel cards, so this addition will
break, at least, the NFP PMD support.

I was told to use igb_uio instead of adding a new NFP uio driver, so I
guess that implies this igb_uio driver should be considered not only a igb
driver.


> > With the patches, each time DPDK process terminates,
> > UIO release callback will trigger port PCI close.
> > On the down side, patched IGB_UIO can be bound to a single adapter type
> >
> > Regards,
> > Gregory
>
> <...>
>
  
Ferruh Yigit Jan. 12, 2017, 12:22 p.m. UTC | #3
On 1/12/2017 12:12 PM, Alejandro Lucero wrote:
> 
> 
> On Thu, Jan 12, 2017 at 11:55 AM, Ferruh Yigit <ferruh.yigit@intel.com
> <mailto:ferruh.yigit@intel.com>> wrote:
> 
>     On 12/9/2016 8:54 AM, Gregory Etelson wrote:
>     > Hello,
>     >
>     > IGB_UIO driver does not close port PCI activities after DPDK process exits.
>     > DPDK API provides rte_eth_dev_close() to manage port PCI,
>     > but it can be skipped if process receives SIGKILL signal
> 
>     I guess I understand the problem.
> 
> 
> This is a known problem, but it is not just a UIO problem, and this
> patch does not solve it, maybe it just solves part of it.
> 
> In fact, a DPDK program crashing could imply the NIC DMAing after that
> and after that memory was assigned to another program.

Yes.
Can there be a way to stop NIC DMA, (or prevent it access to mem
anymore) when app crashes?
I think that is what this patch is looking for.

> 
>  
> 
> 
>     > The patches below provide IGB_UIO release callback and IXGBEVF release function
> 
>     But adding ixgbe specific code into igb_uio may not be good idea.
>     Can be anything done one upper layer, pci layer, generic to all drivers?
> 
> 
> This  module is not just being used for Intel cards, so this addition
> will break, at least, the NFP PMD support.
> 
> I was told to use igb_uio instead of adding a new NFP uio driver, so I
> guess that implies this igb_uio driver should be considered not only a
> igb driver.

No it is generic, I think names has igb_ just for historical reasons.

>  
> 
>     > With the patches, each time DPDK process terminates,
>     > UIO release callback will trigger port PCI close.
>     > On the down side, patched IGB_UIO can be bound to a single adapter type
>     >
>     > Regards,
>     > Gregory
> 
>     <...>
> 
>
  
Alejandro Lucero Jan. 12, 2017, 12:58 p.m. UTC | #4
On Thu, Jan 12, 2017 at 12:22 PM, Ferruh Yigit <ferruh.yigit@intel.com>
wrote:

> On 1/12/2017 12:12 PM, Alejandro Lucero wrote:
> >
> >
> > On Thu, Jan 12, 2017 at 11:55 AM, Ferruh Yigit <ferruh.yigit@intel.com
> > <mailto:ferruh.yigit@intel.com>> wrote:
> >
> >     On 12/9/2016 8:54 AM, Gregory Etelson wrote:
> >     > Hello,
> >     >
> >     > IGB_UIO driver does not close port PCI activities after DPDK
> process exits.
> >     > DPDK API provides rte_eth_dev_close() to manage port PCI,
> >     > but it can be skipped if process receives SIGKILL signal
> >
> >     I guess I understand the problem.
> >
> >
> > This is a known problem, but it is not just a UIO problem, and this
> > patch does not solve it, maybe it just solves part of it.
> >
> > In fact, a DPDK program crashing could imply the NIC DMAing after that
> > and after that memory was assigned to another program.
>
> Yes.
> Can there be a way to stop NIC DMA, (or prevent it access to mem
> anymore) when app crashes?
> I think that is what this patch is looking for.
>
>
But with the patch, it just happens when igb_uio module is removed. I guess
this is good for then loading or binding the device to another module, but
that does not solve the problem about stopping the NIC asap.

As I see it, the EAL should catch signals forcing always to close ports,
even when no signal, because it could be just the app exiting without error
but the port/NIC able to receive packets. But for SIGKILL, that would not
be enough. So we need something else for always calling a destructor before
fully exiting.



> >
> >
> >
> >
> >     > The patches below provide IGB_UIO release callback and IXGBEVF
> release function
> >
> >     But adding ixgbe specific code into igb_uio may not be good idea.
> >     Can be anything done one upper layer, pci layer, generic to all
> drivers?
> >
> >
> > This  module is not just being used for Intel cards, so this addition
> > will break, at least, the NFP PMD support.
> >
> > I was told to use igb_uio instead of adding a new NFP uio driver, so I
> > guess that implies this igb_uio driver should be considered not only a
> > igb driver.
>
> No it is generic, I think names has igb_ just for historical reasons.
>

Great.


>
> >
> >
> >     > With the patches, each time DPDK process terminates,
> >     > UIO release callback will trigger port PCI close.
> >     > On the down side, patched IGB_UIO can be bound to a single adapter
> type
> >     >
> >     > Regards,
> >     > Gregory
> >
> >     <...>
> >
> >
>
>
  
Jianfeng Tan Jan. 13, 2017, 1:51 a.m. UTC | #5
> -----Original Message-----

> From: users [mailto:users-bounces@dpdk.org] On Behalf Of Ferruh Yigit

> Sent: Thursday, January 12, 2017 8:22 PM

> To: Alejandro Lucero

> Cc: Gregory Etelson; dev; users@dpdk.org

> Subject: Re: [dpdk-users] [dpdk-dev] IGB_UIO: PCI Resources Management

> 

> On 1/12/2017 12:12 PM, Alejandro Lucero wrote:

> >

> >

> > On Thu, Jan 12, 2017 at 11:55 AM, Ferruh Yigit <ferruh.yigit@intel.com

> > <mailto:ferruh.yigit@intel.com>> wrote:

> >

> >     On 12/9/2016 8:54 AM, Gregory Etelson wrote:

> >     > Hello,

> >     >

> >     > IGB_UIO driver does not close port PCI activities after DPDK process

> exits.

> >     > DPDK API provides rte_eth_dev_close() to manage port PCI,

> >     > but it can be skipped if process receives SIGKILL signal

> >

> >     I guess I understand the problem.

> >

> >

> > This is a known problem, but it is not just a UIO problem, and this

> > patch does not solve it, maybe it just solves part of it.

> >

> > In fact, a DPDK program crashing could imply the NIC DMAing after that

> > and after that memory was assigned to another program.

> 

> Yes.

> Can there be a way to stop NIC DMA, (or prevent it access to mem

> anymore) when app crashes?

> I think that is what this patch is looking for.


If I understand it correctly, you are looking for this patch?
http://dpdk.org/dev/patchwork/patch/17495/
  
Ferruh Yigit Jan. 13, 2017, 2:04 a.m. UTC | #6
On 1/13/2017 1:51 AM, Tan, Jianfeng wrote:
> 
> 
>> -----Original Message-----
>> From: users [mailto:users-bounces@dpdk.org] On Behalf Of Ferruh Yigit
>> Sent: Thursday, January 12, 2017 8:22 PM
>> To: Alejandro Lucero
>> Cc: Gregory Etelson; dev; users@dpdk.org
>> Subject: Re: [dpdk-users] [dpdk-dev] IGB_UIO: PCI Resources Management
>>
>> On 1/12/2017 12:12 PM, Alejandro Lucero wrote:
>>>
>>>
>>> On Thu, Jan 12, 2017 at 11:55 AM, Ferruh Yigit <ferruh.yigit@intel.com
>>> <mailto:ferruh.yigit@intel.com>> wrote:
>>>
>>>     On 12/9/2016 8:54 AM, Gregory Etelson wrote:
>>>     > Hello,
>>>     >
>>>     > IGB_UIO driver does not close port PCI activities after DPDK process
>> exits.
>>>     > DPDK API provides rte_eth_dev_close() to manage port PCI,
>>>     > but it can be skipped if process receives SIGKILL signal
>>>
>>>     I guess I understand the problem.
>>>
>>>
>>> This is a known problem, but it is not just a UIO problem, and this
>>> patch does not solve it, maybe it just solves part of it.
>>>
>>> In fact, a DPDK program crashing could imply the NIC DMAing after that
>>> and after that memory was assigned to another program.
>>
>> Yes.
>> Can there be a way to stop NIC DMA, (or prevent it access to mem
>> anymore) when app crashes?
>> I think that is what this patch is looking for.
> 
> If I understand it correctly, you are looking for this patch?
> http://dpdk.org/dev/patchwork/patch/17495/
> 

That is good, thanks Jianfeng, I will check it.

btw, patch's current state is rejected, which is by mistake, it seems I
confused it with "iomem and ioport mapping" patch, sorry about it, I
will update its status immediately.
  
Jianfeng Tan Jan. 13, 2017, 5:33 a.m. UTC | #7
> -----Original Message-----

> From: Yigit, Ferruh

> Sent: Friday, January 13, 2017 10:05 AM

> To: Tan, Jianfeng; Alejandro Lucero

> Cc: Gregory Etelson; dev; users@dpdk.org

> Subject: Re: [dpdk-users] [dpdk-dev] IGB_UIO: PCI Resources Management

> 

> On 1/13/2017 1:51 AM, Tan, Jianfeng wrote:

> >

> >

> >> -----Original Message-----

> >> From: users [mailto:users-bounces@dpdk.org] On Behalf Of Ferruh Yigit

> >> Sent: Thursday, January 12, 2017 8:22 PM

> >> To: Alejandro Lucero

> >> Cc: Gregory Etelson; dev; users@dpdk.org

> >> Subject: Re: [dpdk-users] [dpdk-dev] IGB_UIO: PCI Resources

> Management

> >>

> >> On 1/12/2017 12:12 PM, Alejandro Lucero wrote:

> >>>

> >>>

> >>> On Thu, Jan 12, 2017 at 11:55 AM, Ferruh Yigit <ferruh.yigit@intel.com

> >>> <mailto:ferruh.yigit@intel.com>> wrote:

> >>>

> >>>     On 12/9/2016 8:54 AM, Gregory Etelson wrote:

> >>>     > Hello,

> >>>     >

> >>>     > IGB_UIO driver does not close port PCI activities after DPDK process

> >> exits.

> >>>     > DPDK API provides rte_eth_dev_close() to manage port PCI,

> >>>     > but it can be skipped if process receives SIGKILL signal

> >>>

> >>>     I guess I understand the problem.

> >>>

> >>>

> >>> This is a known problem, but it is not just a UIO problem, and this

> >>> patch does not solve it, maybe it just solves part of it.

> >>>

> >>> In fact, a DPDK program crashing could imply the NIC DMAing after that

> >>> and after that memory was assigned to another program.

> >>

> >> Yes.

> >> Can there be a way to stop NIC DMA, (or prevent it access to mem

> >> anymore) when app crashes?

> >> I think that is what this patch is looking for.

> >

> > If I understand it correctly, you are looking for this patch?

> > http://dpdk.org/dev/patchwork/patch/17495/

> >

> 

> That is good, thanks Jianfeng, I will check it.

> 

> btw, patch's current state is rejected, which is by mistake, it seems I

> confused it with "iomem and ioport mapping" patch, sorry about it, I

> will update its status immediately.


No problem at all. This patch is rejected as it's based on "iomem and ioport mapping" patch. As "iomem and ioport mapping" patch has backward compatibility issue, we need to figure out a way to resubmit this patch without changing the original "iomem and ioport mapping" in igb_uio.

Thanks,
Jianfeng
  
Alejandro Lucero Jan. 13, 2017, 11:10 a.m. UTC | #8
I completely misread the patch, and I wrongly thought that code was linked
to module removal, but I see this is not about that, but about releasing
the /dev/uio file calling release function, what is done by the kernel when
the process exits.

So yes, the patch avoids the problem I talked about.

However, calling that specific ixgbe driver function will break other
devices relying on igb_uio. What about implementing a notifier chain for
this? The igb_uio code would be agnostic and each interested driver, like
ixgbe or nfp_net, could execute the specific port close code when the
notifier chain triggers.


On Fri, Jan 13, 2017 at 5:33 AM, Tan, Jianfeng <jianfeng.tan@intel.com>
wrote:

>
>
> > -----Original Message-----
> > From: Yigit, Ferruh
> > Sent: Friday, January 13, 2017 10:05 AM
> > To: Tan, Jianfeng; Alejandro Lucero
> > Cc: Gregory Etelson; dev; users@dpdk.org
> > Subject: Re: [dpdk-users] [dpdk-dev] IGB_UIO: PCI Resources Management
> >
> > On 1/13/2017 1:51 AM, Tan, Jianfeng wrote:
> > >
> > >
> > >> -----Original Message-----
> > >> From: users [mailto:users-bounces@dpdk.org] On Behalf Of Ferruh Yigit
> > >> Sent: Thursday, January 12, 2017 8:22 PM
> > >> To: Alejandro Lucero
> > >> Cc: Gregory Etelson; dev; users@dpdk.org
> > >> Subject: Re: [dpdk-users] [dpdk-dev] IGB_UIO: PCI Resources
> > Management
> > >>
> > >> On 1/12/2017 12:12 PM, Alejandro Lucero wrote:
> > >>>
> > >>>
> > >>> On Thu, Jan 12, 2017 at 11:55 AM, Ferruh Yigit <
> ferruh.yigit@intel.com
> > >>> <mailto:ferruh.yigit@intel.com>> wrote:
> > >>>
> > >>>     On 12/9/2016 8:54 AM, Gregory Etelson wrote:
> > >>>     > Hello,
> > >>>     >
> > >>>     > IGB_UIO driver does not close port PCI activities after DPDK
> process
> > >> exits.
> > >>>     > DPDK API provides rte_eth_dev_close() to manage port PCI,
> > >>>     > but it can be skipped if process receives SIGKILL signal
> > >>>
> > >>>     I guess I understand the problem.
> > >>>
> > >>>
> > >>> This is a known problem, but it is not just a UIO problem, and this
> > >>> patch does not solve it, maybe it just solves part of it.
> > >>>
> > >>> In fact, a DPDK program crashing could imply the NIC DMAing after
> that
> > >>> and after that memory was assigned to another program.
> > >>
> > >> Yes.
> > >> Can there be a way to stop NIC DMA, (or prevent it access to mem
> > >> anymore) when app crashes?
> > >> I think that is what this patch is looking for.
> > >
> > > If I understand it correctly, you are looking for this patch?
> > > http://dpdk.org/dev/patchwork/patch/17495/
> > >
> >
> > That is good, thanks Jianfeng, I will check it.
> >
> > btw, patch's current state is rejected, which is by mistake, it seems I
> > confused it with "iomem and ioport mapping" patch, sorry about it, I
> > will update its status immediately.
>
> No problem at all. This patch is rejected as it's based on "iomem and
> ioport mapping" patch. As "iomem and ioport mapping" patch has backward
> compatibility issue, we need to figure out a way to resubmit this patch
> without changing the original "iomem and ioport mapping" in igb_uio.
>
> Thanks,
> Jianfeng
>
  
Ferruh Yigit Jan. 19, 2017, 3:59 p.m. UTC | #9
On 1/13/2017 5:33 AM, Tan, Jianfeng wrote:
> 
> 
>> -----Original Message-----
>> From: Yigit, Ferruh
>> Sent: Friday, January 13, 2017 10:05 AM
>> To: Tan, Jianfeng; Alejandro Lucero
>> Cc: Gregory Etelson; dev; users@dpdk.org
>> Subject: Re: [dpdk-users] [dpdk-dev] IGB_UIO: PCI Resources Management
>>
>> On 1/13/2017 1:51 AM, Tan, Jianfeng wrote:
>>>
>>>
>>>> -----Original Message-----
>>>> From: users [mailto:users-bounces@dpdk.org] On Behalf Of Ferruh Yigit
>>>> Sent: Thursday, January 12, 2017 8:22 PM
>>>> To: Alejandro Lucero
>>>> Cc: Gregory Etelson; dev; users@dpdk.org
>>>> Subject: Re: [dpdk-users] [dpdk-dev] IGB_UIO: PCI Resources
>> Management
>>>>
>>>> On 1/12/2017 12:12 PM, Alejandro Lucero wrote:
>>>>>
>>>>>
>>>>> On Thu, Jan 12, 2017 at 11:55 AM, Ferruh Yigit <ferruh.yigit@intel.com
>>>>> <mailto:ferruh.yigit@intel.com>> wrote:
>>>>>
>>>>>     On 12/9/2016 8:54 AM, Gregory Etelson wrote:
>>>>>     > Hello,
>>>>>     >
>>>>>     > IGB_UIO driver does not close port PCI activities after DPDK process
>>>> exits.
>>>>>     > DPDK API provides rte_eth_dev_close() to manage port PCI,
>>>>>     > but it can be skipped if process receives SIGKILL signal
>>>>>
>>>>>     I guess I understand the problem.
>>>>>
>>>>>
>>>>> This is a known problem, but it is not just a UIO problem, and this
>>>>> patch does not solve it, maybe it just solves part of it.
>>>>>
>>>>> In fact, a DPDK program crashing could imply the NIC DMAing after that
>>>>> and after that memory was assigned to another program.
>>>>
>>>> Yes.
>>>> Can there be a way to stop NIC DMA, (or prevent it access to mem
>>>> anymore) when app crashes?
>>>> I think that is what this patch is looking for.
>>>
>>> If I understand it correctly, you are looking for this patch?
>>> http://dpdk.org/dev/patchwork/patch/17495/
>>>
>>
>> That is good, thanks Jianfeng, I will check it.
>>
>> btw, patch's current state is rejected, which is by mistake, it seems I
>> confused it with "iomem and ioport mapping" patch, sorry about it, I
>> will update its status immediately.
> 
> No problem at all. This patch is rejected as it's based on "iomem and ioport mapping" patch. As "iomem and ioport mapping" patch has backward compatibility issue, we need to figure out a way to resubmit this patch without changing the original "iomem and ioport mapping" in igb_uio.

I thinks implementing uio_info->release and uio_info.open is good idea,
but I have a few questions:

1- What is the the dependency to "iomem and ioport mapping" patch?

2- If we keep pci_enable_device() in probe() can this prevent moving
registering/freeing interrupts in open()/release()

3- And is pci_disable_device() done in release is enough to stop NIC DMA
to access memory?


I did a simple test, implemented simple uio_info->release and
uio_info.open, which only does pci_disable_device() and
pci_enable_device(),
but this prevent app receiving packets in its second run, independent
from app terminated gracefully or not. Any idea why this is not working?


btw, I can produce the problematic case, as George Prekas described in:
http://dpdk.org/ml/archives/users/2016-September/001026.html

CC'ed George, since he also seems interested in issue.

> 
> Thanks,
> Jianfeng
>
  
George Prekas Jan. 19, 2017, 4:09 p.m. UTC | #10
Nice to see that the community is finally interested in this bug. For 
the record, we have fixed this bug in our project (IX). Check out our 
kernel module here:

https://github.com/ix-project/pcidma

I will be interested to know when and how DPDK will resolve this 
situation so that we can actually use your solution in our project.


On 19/01/2017 16:59, Ferruh Yigit wrote:
> On 1/13/2017 5:33 AM, Tan, Jianfeng wrote:
>>
>>> -----Original Message-----
>>> From: Yigit, Ferruh
>>> Sent: Friday, January 13, 2017 10:05 AM
>>> To: Tan, Jianfeng; Alejandro Lucero
>>> Cc: Gregory Etelson; dev; users@dpdk.org
>>> Subject: Re: [dpdk-users] [dpdk-dev] IGB_UIO: PCI Resources Management
>>>
>>> On 1/13/2017 1:51 AM, Tan, Jianfeng wrote:
>>>>
>>>>> -----Original Message-----
>>>>> From: users [mailto:users-bounces@dpdk.org] On Behalf Of Ferruh Yigit
>>>>> Sent: Thursday, January 12, 2017 8:22 PM
>>>>> To: Alejandro Lucero
>>>>> Cc: Gregory Etelson; dev; users@dpdk.org
>>>>> Subject: Re: [dpdk-users] [dpdk-dev] IGB_UIO: PCI Resources
>>> Management
>>>>> On 1/12/2017 12:12 PM, Alejandro Lucero wrote:
>>>>>>
>>>>>> On Thu, Jan 12, 2017 at 11:55 AM, Ferruh Yigit <ferruh.yigit@intel.com
>>>>>> <mailto:ferruh.yigit@intel.com>> wrote:
>>>>>>
>>>>>>      On 12/9/2016 8:54 AM, Gregory Etelson wrote:
>>>>>>      > Hello,
>>>>>>      >
>>>>>>      > IGB_UIO driver does not close port PCI activities after DPDK process
>>>>> exits.
>>>>>>      > DPDK API provides rte_eth_dev_close() to manage port PCI,
>>>>>>      > but it can be skipped if process receives SIGKILL signal
>>>>>>
>>>>>>      I guess I understand the problem.
>>>>>>
>>>>>>
>>>>>> This is a known problem, but it is not just a UIO problem, and this
>>>>>> patch does not solve it, maybe it just solves part of it.
>>>>>>
>>>>>> In fact, a DPDK program crashing could imply the NIC DMAing after that
>>>>>> and after that memory was assigned to another program.
>>>>> Yes.
>>>>> Can there be a way to stop NIC DMA, (or prevent it access to mem
>>>>> anymore) when app crashes?
>>>>> I think that is what this patch is looking for.
>>>> If I understand it correctly, you are looking for this patch?
>>>> http://dpdk.org/dev/patchwork/patch/17495/
>>>>
>>> That is good, thanks Jianfeng, I will check it.
>>>
>>> btw, patch's current state is rejected, which is by mistake, it seems I
>>> confused it with "iomem and ioport mapping" patch, sorry about it, I
>>> will update its status immediately.
>> No problem at all. This patch is rejected as it's based on "iomem and ioport mapping" patch. As "iomem and ioport mapping" patch has backward compatibility issue, we need to figure out a way to resubmit this patch without changing the original "iomem and ioport mapping" in igb_uio.
> I thinks implementing uio_info->release and uio_info.open is good idea,
> but I have a few questions:
>
> 1- What is the the dependency to "iomem and ioport mapping" patch?
>
> 2- If we keep pci_enable_device() in probe() can this prevent moving
> registering/freeing interrupts in open()/release()
>
> 3- And is pci_disable_device() done in release is enough to stop NIC DMA
> to access memory?
>
>
> I did a simple test, implemented simple uio_info->release and
> uio_info.open, which only does pci_disable_device() and
> pci_enable_device(),
> but this prevent app receiving packets in its second run, independent
> from app terminated gracefully or not. Any idea why this is not working?
>
>
> btw, I can produce the problematic case, as George Prekas described in:
> http://dpdk.org/ml/archives/users/2016-September/001026.html
>
> CC'ed George, since he also seems interested in issue.
>
>> Thanks,
>> Jianfeng
>>
  
Ferruh Yigit Jan. 19, 2017, 4:36 p.m. UTC | #11
On 1/13/2017 11:10 AM, Alejandro Lucero wrote:
> I completely misread the patch, and I wrongly thought that code was
> linked to module removal, but I see this is not about that, but about
> releasing the /dev/uio file calling release function, what is done by
> the kernel when the process exits.
> 
> So yes, the patch avoids the problem I talked about.
> 
> However, calling that specific ixgbe driver function will break other
> devices relying on igb_uio. What about implementing a notifier chain for
> this? The igb_uio code would be agnostic and each interested driver,
> like ixgbe or nfp_net, could execute the specific port close code when
> the notifier chain triggers.

Can you please elaborate, how can this be done?
This is kernel code, and interested drivers are in userspace. This patch
benefit from Linux kernel driver of same device.

> 
> 
> On Fri, Jan 13, 2017 at 5:33 AM, Tan, Jianfeng <jianfeng.tan@intel.com
> <mailto:jianfeng.tan@intel.com>> wrote:
> 
> 
> 
>     > -----Original Message-----
>     > From: Yigit, Ferruh
>     > Sent: Friday, January 13, 2017 10:05 AM
>     > To: Tan, Jianfeng; Alejandro Lucero
>     > Cc: Gregory Etelson; dev; users@dpdk.org <mailto:users@dpdk.org>
>     > Subject: Re: [dpdk-users] [dpdk-dev] IGB_UIO: PCI Resources Management
>     >
>     > On 1/13/2017 1:51 AM, Tan, Jianfeng wrote:
>     > >
>     > >
>     > >> -----Original Message-----
>     > >> From: users [mailto:users-bounces@dpdk.org
>     <mailto:users-bounces@dpdk.org>] On Behalf Of Ferruh Yigit
>     > >> Sent: Thursday, January 12, 2017 8:22 PM
>     > >> To: Alejandro Lucero
>     > >> Cc: Gregory Etelson; dev; users@dpdk.org <mailto:users@dpdk.org>
>     > >> Subject: Re: [dpdk-users] [dpdk-dev] IGB_UIO: PCI Resources
>     > Management
>     > >>
>     > >> On 1/12/2017 12:12 PM, Alejandro Lucero wrote:
>     > >>>
>     > >>>
>     > >>> On Thu, Jan 12, 2017 at 11:55 AM, Ferruh Yigit
>     <ferruh.yigit@intel.com <mailto:ferruh.yigit@intel.com>
>     > >>> <mailto:ferruh.yigit@intel.com
>     <mailto:ferruh.yigit@intel.com>>> wrote:
>     > >>>
>     > >>>     On 12/9/2016 8:54 AM, Gregory Etelson wrote:
>     > >>>     > Hello,
>     > >>>     >
>     > >>>     > IGB_UIO driver does not close port PCI activities after
>     DPDK process
>     > >> exits.
>     > >>>     > DPDK API provides rte_eth_dev_close() to manage port PCI,
>     > >>>     > but it can be skipped if process receives SIGKILL signal
>     > >>>
>     > >>>     I guess I understand the problem.
>     > >>>
>     > >>>
>     > >>> This is a known problem, but it is not just a UIO problem, and
>     this
>     > >>> patch does not solve it, maybe it just solves part of it.
>     > >>>
>     > >>> In fact, a DPDK program crashing could imply the NIC DMAing
>     after that
>     > >>> and after that memory was assigned to another program.
>     > >>
>     > >> Yes.
>     > >> Can there be a way to stop NIC DMA, (or prevent it access to mem
>     > >> anymore) when app crashes?
>     > >> I think that is what this patch is looking for.
>     > >
>     > > If I understand it correctly, you are looking for this patch?
>     > > http://dpdk.org/dev/patchwork/patch/17495/
>     <http://dpdk.org/dev/patchwork/patch/17495/>
>     > >
>     >
>     > That is good, thanks Jianfeng, I will check it.
>     >
>     > btw, patch's current state is rejected, which is by mistake, it
>     seems I
>     > confused it with "iomem and ioport mapping" patch, sorry about it, I
>     > will update its status immediately.
> 
>     No problem at all. This patch is rejected as it's based on "iomem
>     and ioport mapping" patch. As "iomem and ioport mapping" patch has
>     backward compatibility issue, we need to figure out a way to
>     resubmit this patch without changing the original "iomem and ioport
>     mapping" in igb_uio.
> 
>     Thanks,
>     Jianfeng
> 
>
  
Alejandro Lucero Jan. 19, 2017, 8:33 p.m. UTC | #12
On Thu, Jan 19, 2017 at 4:36 PM, Ferruh Yigit <ferruh.yigit@intel.com>
wrote:

> On 1/13/2017 11:10 AM, Alejandro Lucero wrote:
> > I completely misread the patch, and I wrongly thought that code was
> > linked to module removal, but I see this is not about that, but about
> > releasing the /dev/uio file calling release function, what is done by
> > the kernel when the process exits.
> >
> > So yes, the patch avoids the problem I talked about.
> >
> > However, calling that specific ixgbe driver function will break other
> > devices relying on igb_uio. What about implementing a notifier chain for
> > this? The igb_uio code would be agnostic and each interested driver,
> > like ixgbe or nfp_net, could execute the specific port close code when
> > the notifier chain triggers.
>
> Can you please elaborate, how can this be done?
> This is kernel code, and interested drivers are in userspace. This patch
> benefit from Linux kernel driver of same device.
>
>
The patch seems to add  a function to the ixgve_vf kernel driver which will
be called from igb_uio. The patch does not show the usual path to the
kernel ethernet drivers because, I guess, it is a patch for a specific
software build. So the interested drivers are not in user space, and a
driver, igb_uio, calling a userspace function? No way.

Any netdev driver interested to close ports, those left working by a
crashed DPDK app, could register a notifier like this:


int device_notifier(struct notifier_block *nb, unsigned long action, void
*data) {

...
    if (action == NETDEV_GOING_DOWN) {

...
}


struct notifier_block device_nb = {
   .notifier_call =device_notifier,
};

...

register_netdev_notifer(&device_nb);

...


Inside igb_uio, something like this:

raw_notifier_call_chain(&netdev_chain, val, dev);

which would trigger the invocation of all registered notifier functions.

Please, note that this is just a possibility, and the notifier chain to use
could not be the best one, same for the event NETDEV_GOING_DOWN.

Also, how can the drivers know if that event is for them?, This is
something to work on.

Because UIO is part of the kernel, maybe it could be pushed a specific
chain for this purpose if we can not use any currently available.


> >
> >
> > On Fri, Jan 13, 2017 at 5:33 AM, Tan, Jianfeng <jianfeng.tan@intel.com
> > <mailto:jianfeng.tan@intel.com>> wrote:
> >
> >
> >
> >     > -----Original Message-----
> >     > From: Yigit, Ferruh
> >     > Sent: Friday, January 13, 2017 10:05 AM
> >     > To: Tan, Jianfeng; Alejandro Lucero
> >     > Cc: Gregory Etelson; dev; users@dpdk.org <mailto:users@dpdk.org>
> >     > Subject: Re: [dpdk-users] [dpdk-dev] IGB_UIO: PCI Resources
> Management
> >     >
> >     > On 1/13/2017 1:51 AM, Tan, Jianfeng wrote:
> >     > >
> >     > >
> >     > >> -----Original Message-----
> >     > >> From: users [mailto:users-bounces@dpdk.org
> >     <mailto:users-bounces@dpdk.org>] On Behalf Of Ferruh Yigit
> >     > >> Sent: Thursday, January 12, 2017 8:22 PM
> >     > >> To: Alejandro Lucero
> >     > >> Cc: Gregory Etelson; dev; users@dpdk.org <mailto:users@dpdk.org
> >
> >     > >> Subject: Re: [dpdk-users] [dpdk-dev] IGB_UIO: PCI Resources
> >     > Management
> >     > >>
> >     > >> On 1/12/2017 12:12 PM, Alejandro Lucero wrote:
> >     > >>>
> >     > >>>
> >     > >>> On Thu, Jan 12, 2017 at 11:55 AM, Ferruh Yigit
> >     <ferruh.yigit@intel.com <mailto:ferruh.yigit@intel.com>
> >     > >>> <mailto:ferruh.yigit@intel.com
> >     <mailto:ferruh.yigit@intel.com>>> wrote:
> >     > >>>
> >     > >>>     On 12/9/2016 8:54 AM, Gregory Etelson wrote:
> >     > >>>     > Hello,
> >     > >>>     >
> >     > >>>     > IGB_UIO driver does not close port PCI activities after
> >     DPDK process
> >     > >> exits.
> >     > >>>     > DPDK API provides rte_eth_dev_close() to manage port PCI,
> >     > >>>     > but it can be skipped if process receives SIGKILL signal
> >     > >>>
> >     > >>>     I guess I understand the problem.
> >     > >>>
> >     > >>>
> >     > >>> This is a known problem, but it is not just a UIO problem, and
> >     this
> >     > >>> patch does not solve it, maybe it just solves part of it.
> >     > >>>
> >     > >>> In fact, a DPDK program crashing could imply the NIC DMAing
> >     after that
> >     > >>> and after that memory was assigned to another program.
> >     > >>
> >     > >> Yes.
> >     > >> Can there be a way to stop NIC DMA, (or prevent it access to mem
> >     > >> anymore) when app crashes?
> >     > >> I think that is what this patch is looking for.
> >     > >
> >     > > If I understand it correctly, you are looking for this patch?
> >     > > http://dpdk.org/dev/patchwork/patch/17495/
> >     <http://dpdk.org/dev/patchwork/patch/17495/>
> >     > >
> >     >
> >     > That is good, thanks Jianfeng, I will check it.
> >     >
> >     > btw, patch's current state is rejected, which is by mistake, it
> >     seems I
> >     > confused it with "iomem and ioport mapping" patch, sorry about it,
> I
> >     > will update its status immediately.
> >
> >     No problem at all. This patch is rejected as it's based on "iomem
> >     and ioport mapping" patch. As "iomem and ioport mapping" patch has
> >     backward compatibility issue, we need to figure out a way to
> >     resubmit this patch without changing the original "iomem and ioport
> >     mapping" in igb_uio.
> >
> >     Thanks,
> >     Jianfeng
> >
> >
>
>
  
Jianfeng Tan Jan. 20, 2017, 5:09 a.m. UTC | #13
On 1/19/2017 11:59 PM, Ferruh Yigit wrote:
> On 1/13/2017 5:33 AM, Tan, Jianfeng wrote:
>>
>>> -----Original Message-----
>>> From: Yigit, Ferruh
>>> Sent: Friday, January 13, 2017 10:05 AM
>>> To: Tan, Jianfeng; Alejandro Lucero
>>> Cc: Gregory Etelson; dev; users@dpdk.org
>>> Subject: Re: [dpdk-users] [dpdk-dev] IGB_UIO: PCI Resources Management
>>>
>>> On 1/13/2017 1:51 AM, Tan, Jianfeng wrote:
>>>>
>>>>> -----Original Message-----
>>>>> From: users [mailto:users-bounces@dpdk.org] On Behalf Of Ferruh Yigit
>>>>> Sent: Thursday, January 12, 2017 8:22 PM
>>>>> To: Alejandro Lucero
>>>>> Cc: Gregory Etelson; dev; users@dpdk.org
>>>>> Subject: Re: [dpdk-users] [dpdk-dev] IGB_UIO: PCI Resources
>>> Management
>>>>> On 1/12/2017 12:12 PM, Alejandro Lucero wrote:
>>>>>>
>>>>>> On Thu, Jan 12, 2017 at 11:55 AM, Ferruh Yigit <ferruh.yigit@intel.com
>>>>>> <mailto:ferruh.yigit@intel.com>> wrote:
>>>>>>
>>>>>>      On 12/9/2016 8:54 AM, Gregory Etelson wrote:
>>>>>>      > Hello,
>>>>>>      >
>>>>>>      > IGB_UIO driver does not close port PCI activities after DPDK process
>>>>> exits.
>>>>>>      > DPDK API provides rte_eth_dev_close() to manage port PCI,
>>>>>>      > but it can be skipped if process receives SIGKILL signal
>>>>>>
>>>>>>      I guess I understand the problem.
>>>>>>
>>>>>>
>>>>>> This is a known problem, but it is not just a UIO problem, and this
>>>>>> patch does not solve it, maybe it just solves part of it.
>>>>>>
>>>>>> In fact, a DPDK program crashing could imply the NIC DMAing after that
>>>>>> and after that memory was assigned to another program.
>>>>> Yes.
>>>>> Can there be a way to stop NIC DMA, (or prevent it access to mem
>>>>> anymore) when app crashes?
>>>>> I think that is what this patch is looking for.
>>>> If I understand it correctly, you are looking for this patch?
>>>> http://dpdk.org/dev/patchwork/patch/17495/
>>>>
>>> That is good, thanks Jianfeng, I will check it.
>>>
>>> btw, patch's current state is rejected, which is by mistake, it seems I
>>> confused it with "iomem and ioport mapping" patch, sorry about it, I
>>> will update its status immediately.
>> No problem at all. This patch is rejected as it's based on "iomem and ioport mapping" patch. As "iomem and ioport mapping" patch has backward compatibility issue, we need to figure out a way to resubmit this patch without changing the original "iomem and ioport mapping" in igb_uio.
> I thinks implementing uio_info->release and uio_info.open is good idea,
> but I have a few questions:
>
> 1- What is the the dependency to "iomem and ioport mapping" patch?

igb_uio is based on UIO framework to do iomem and ioport mapping, and 
it's done once in probe. If we do pci_disable_device() in release(), 
iomem/ioport mapping will be missing. And I did not figure out a way to 
do the iomem/ioport remapping in open() (may be we can check how 
vfio-pci succeeds to do this).


>
> 2- If we keep pci_enable_device() in probe() can this prevent moving
> registering/freeing interrupts in open()/release()

Yes. But then how can we stop a device in release()?

>
> 3- And is pci_disable_device() done in release is enough to stop NIC DMA
> to access memory?

To find out the answer, we can also use vfio-pci as a reference. Please 
refer to vfio_pci_release().

>
>
> I did a simple test, implemented simple uio_info->release and
> uio_info.open, which only does pci_disable_device() and
> pci_enable_device(),
> but this prevent app receiving packets in its second run, independent
> from app terminated gracefully or not. Any idea why this is not working?

After calling pci_disable_device() in first uio_info->release, 
iomem/ioport remap is missing. So my original idea is to let DPDK 
initialization not depends on this igb_uio's sysfs files, instead use 
the info what pci bus driver provides.

Thanks,
Jianfeng

>
>
> btw, I can produce the problematic case, as George Prekas described in:
> http://dpdk.org/ml/archives/users/2016-September/001026.html
>
> CC'ed George, since he also seems interested in issue.
>
>> Thanks,
>> Jianfeng
>>
  

Patch

diff --git a/lib/librte_eal/linuxapp/igb_uio/igb_uio.c b/lib/librte_eal/linuxapp/igb_uio/igb_uio.c
index df41e45..97b8501 100644
--- a/lib/librte_eal/linuxapp/igb_uio/igb_uio.c
+++ b/lib/librte_eal/linuxapp/igb_uio/igb_uio.c
@@ -170,6 +170,16 @@  igbuio_pci_irqhandler(int irq, struct uio_info *info)
        return IRQ_HANDLED;
 }
 
+static int
+igbuio_pci_release(struct uio_info *info, struct inode *inode)
+{
+       extern int ixgbevf_uio_release(struct pci_dev *pdev, u8 __iomem *hw_addr);
+       int ret;
+       struct rte_uio_pci_dev *udev = info->priv;
+       ret = ixgbevf_uio_release(udev->pdev, info->mem[0].internal_addr);
+       return 0;
+}
+
 #ifdef CONFIG_XEN_DOM0
 static int
 igbuio_dom0_mmap_phys(struct uio_info *info, struct vm_area_struct *vma)
@@ -368,6 +378,7 @@  igbuio_pci_probe(struct pci_dev *dev, const struct pci_device_id *id)
        udev->info.version = "0.1";
        udev->info.handler = igbuio_pci_irqhandler;
        udev->info.irqcontrol = igbuio_pci_irqcontrol;
+       udev->info.release = igbuio_pci_release;
 #ifdef CONFIG_XEN_DOM0
        /* check if the driver run on Xen Dom0 */
        if (xen_initial_domain())





diff --git a/src/ixgbevf_main.c b/src/ixgbevf_main.c
index c5a922f..b96d9f0 100644
--- a/src/ixgbevf_main.c
+++ b/src/ixgbevf_main.c
@@ -3002,6 +3002,39 @@  static void ixgbevf_clear_interrupt_scheme(struct ixgbevf_adapter *adapter)
        ixgbevf_reset_interrupt_capability(adapter);
 }
 
+int ixgbevf_uio_release(struct pci_dev *pdev, u8 __iomem *hw_addr)
+{
+       int err;
+       struct ixgbe_hw __hw;
+       struct ixgbe_hw *hw = &__hw;
+
+       if (!hw_addr) {
+               dev_warn(&pdev->dev, "ixgbevf_uio_release: no HW addr. Bailing out\n");
+       }
+
+       memset(hw, 0, sizeof(*hw));
+       hw->hw_addr = hw_addr;
+
+       /* PCI config space info */
+       hw->vendor_id = pdev->vendor;
+       hw->device_id = pdev->device;
+       pci_read_config_byte(pdev, PCI_REVISION_ID, &hw->revision_id);
+       hw->subsystem_vendor_id = pdev->subsystem_vendor;
+       hw->subsystem_device_id = pdev->subsystem_device;
+
+       ixgbe_init_ops_vf(hw);
+       hw->mbx.ops.init_params(hw);
+
+       /* assume legacy case in which PF would only give VF 2 queues */
+       hw->mac.max_tx_queues = 2;
+       hw->mac.max_rx_queues = 2;
+
+       err = hw->mac.ops.stop_adapter(hw);
+       dev_info(&pdev->dev, "ixgbevf_uio_release: UIO adapter stopped\n");
+       return err;
+}
+EXPORT_SYMBOL(ixgbevf_uio_release);
+
 /**
  * ixgbevf_sw_init - Initialize general software structures
  * (struct ixgbevf_adapter)