[dpdk-dev,v6,9/9] table: align rte table hash structs for cache line size

Message ID 1471343279-24014-10-git-send-email-gowrishankar.m@linux.vnet.ibm.com (mailing list archive)
State Superseded, archived
Delegated to: Thomas Monjalon
Headers

Commit Message

Gowrishankar Aug. 16, 2016, 10:27 a.m. UTC
  rte table hash structs rte_bucket_4_8, rte_bucket_4_16 and rte_bucket_4_32 have
to be cache aligned as required by their corresponding hash create functions
rte_table_hash_create_key8_lru etc.

Signed-off-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
---
 lib/librte_table/rte_table_hash_key16.c | 4 ++--
 lib/librte_table/rte_table_hash_key32.c | 4 ++--
 lib/librte_table/rte_table_hash_key8.c  | 2 +-
 3 files changed, 5 insertions(+), 5 deletions(-)
  

Comments

Cristian Dumitrescu Aug. 31, 2016, 5:29 p.m. UTC | #1
> -----Original Message-----
> From: Gowrishankar Muthukrishnan
> [mailto:gowrishankar.m@linux.vnet.ibm.com]
> Sent: Tuesday, August 16, 2016 11:28 AM
> To: dev@dpdk.org
> Cc: Chao Zhu <chaozhu@linux.vnet.ibm.com>; Richardson, Bruce
> <bruce.richardson@intel.com>; Ananyev, Konstantin
> <konstantin.ananyev@intel.com>; Thomas Monjalon
> <thomas.monjalon@6wind.com>; Dumitrescu, Cristian
> <cristian.dumitrescu@intel.com>; Pradeep <pradeep@us.ibm.com>
> Subject: [PATCH v6 9/9] table: align rte table hash structs for cache line size
> 
> rte table hash structs rte_bucket_4_8, rte_bucket_4_16 and
> rte_bucket_4_32 have
> to be cache aligned as required by their corresponding hash create functions
> rte_table_hash_create_key8_lru etc.
> 
> Signed-off-by: Gowrishankar Muthukrishnan
> <gowrishankar.m@linux.vnet.ibm.com>
> ---
>  lib/librte_table/rte_table_hash_key16.c | 4 ++--
>  lib/librte_table/rte_table_hash_key32.c | 4 ++--
>  lib/librte_table/rte_table_hash_key8.c  | 2 +-
>  3 files changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/lib/librte_table/rte_table_hash_key16.c
> b/lib/librte_table/rte_table_hash_key16.c
> index b7e000f..2102326 100644
> --- a/lib/librte_table/rte_table_hash_key16.c
> +++ b/lib/librte_table/rte_table_hash_key16.c
> @@ -68,10 +68,10 @@ struct rte_bucket_4_16 {
>  	uint64_t next_valid;
> 
>  	/* Cache line 1 */
> -	uint64_t key[4][2];
> +	uint64_t key[4][2] __rte_cache_aligned;
> 
>  	/* Cache line 2 */
> -	uint8_t data[0];
> +	uint8_t data[0] __rte_cache_aligned;
>  };
> 
>  struct rte_table_hash {
> diff --git a/lib/librte_table/rte_table_hash_key32.c
> b/lib/librte_table/rte_table_hash_key32.c
> index a7aba49..619f63a 100644
> --- a/lib/librte_table/rte_table_hash_key32.c
> +++ b/lib/librte_table/rte_table_hash_key32.c
> @@ -68,10 +68,10 @@ struct rte_bucket_4_32 {
>  	uint64_t next_valid;
> 
>  	/* Cache lines 1 and 2 */
> -	uint64_t key[4][4];
> +	uint64_t key[4][4] __rte_cache_aligned;
> 
>  	/* Cache line 3 */
> -	uint8_t data[0];
> +	uint8_t data[0] __rte_cache_aligned;
>  };
> 
>  struct rte_table_hash {
> diff --git a/lib/librte_table/rte_table_hash_key8.c
> b/lib/librte_table/rte_table_hash_key8.c
> index e2e2bdc..4d5e0cd 100644
> --- a/lib/librte_table/rte_table_hash_key8.c
> +++ b/lib/librte_table/rte_table_hash_key8.c
> @@ -68,7 +68,7 @@ struct rte_bucket_4_8 {
>  	uint64_t key[4];
> 
>  	/* Cache line 1 */
> -	uint8_t data[0];
> +	uint8_t data[0] __rte_cache_aligned;
>  };
> 
>  struct rte_table_hash {
> --
> 1.9.1

Hi Gowrishankar,

My understanding is you are trying to work around the check invoked by the hash table create functions that verifies the size of the bucket header structure is a multiple of the cache line, right?

Given that the size of this structure is 1x, 2x or 3x times 64 bytes, the check passes on IA CPUs (cache line of 64 bytes; explicit alignment to cache line size is not needed in order to make the size of the structure a multiple of cache line), but not on PPC CPUs (cache line of 128 bytes), correct?

I don't think your proposal provides the best way to fix this issue, since your code leads to a considerable increase in the memory consumption used per bucket in most cases:
	- 100% more memory for 8-byte key hash table
	- 0% more for 16-byte key hash table (code does not fix anything, explicit alignment is not needed)
	- 50% more for 32-byte key hash table

I suggest you simply change the check: instead of validating this data structure is a multiple of cache line size, validate it is a multiple of 64 bytes.

Regards,
Cristian
  
Thomas Monjalon Sept. 8, 2016, 9:36 a.m. UTC | #2
2016-08-31 17:29, Dumitrescu, Cristian:
> From: Gowrishankar Muthukrishnan
> > rte table hash structs rte_bucket_4_8, rte_bucket_4_16 and
> > rte_bucket_4_32 have
> > to be cache aligned as required by their corresponding hash create functions
> > rte_table_hash_create_key8_lru etc.
> 
> Hi Gowrishankar,
> 
> My understanding is you are trying to work around the check invoked by the hash table create functions that verifies the size of the bucket header structure is a multiple of the cache line, right?
> 
> Given that the size of this structure is 1x, 2x or 3x times 64 bytes, the check passes on IA CPUs (cache line of 64 bytes; explicit alignment to cache line size is not needed in order to make the size of the structure a multiple of cache line), but not on PPC CPUs (cache line of 128 bytes), correct?
> 
> I don't think your proposal provides the best way to fix this issue, since your code leads to a considerable increase in the memory consumption used per bucket in most cases:
> 	- 100% more memory for 8-byte key hash table
> 	- 0% more for 16-byte key hash table (code does not fix anything, explicit alignment is not needed)
> 	- 50% more for 32-byte key hash table
> 
> I suggest you simply change the check: instead of validating this data structure is a multiple of cache line size, validate it is a multiple of 64 bytes.

Any news please?
The whole series is blocked for this patch.
Should we expect a v7?
  
Cristian Dumitrescu Sept. 8, 2016, 9:40 a.m. UTC | #3
> -----Original Message-----
> From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com]
> Sent: Thursday, September 8, 2016 10:36 AM
> To: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
> Cc: dev@dpdk.org; Dumitrescu, Cristian <cristian.dumitrescu@intel.com>;
> Chao Zhu <chaozhu@linux.vnet.ibm.com>; Richardson, Bruce
> <bruce.richardson@intel.com>; Ananyev, Konstantin
> <konstantin.ananyev@intel.com>; Pradeep <pradeep@us.ibm.com>
> Subject: Re: [dpdk-dev] [PATCH v6 9/9] table: align rte table hash structs for
> cache line size
> 
> 2016-08-31 17:29, Dumitrescu, Cristian:
> > From: Gowrishankar Muthukrishnan
> > > rte table hash structs rte_bucket_4_8, rte_bucket_4_16 and
> > > rte_bucket_4_32 have
> > > to be cache aligned as required by their corresponding hash create
> functions
> > > rte_table_hash_create_key8_lru etc.
> >
> > Hi Gowrishankar,
> >
> > My understanding is you are trying to work around the check invoked by
> the hash table create functions that verifies the size of the bucket header
> structure is a multiple of the cache line, right?
> >
> > Given that the size of this structure is 1x, 2x or 3x times 64 bytes, the check
> passes on IA CPUs (cache line of 64 bytes; explicit alignment to cache line size
> is not needed in order to make the size of the structure a multiple of cache
> line), but not on PPC CPUs (cache line of 128 bytes), correct?
> >
> > I don't think your proposal provides the best way to fix this issue, since
> your code leads to a considerable increase in the memory consumption used
> per bucket in most cases:
> > 	- 100% more memory for 8-byte key hash table
> > 	- 0% more for 16-byte key hash table (code does not fix anything,
> explicit alignment is not needed)
> > 	- 50% more for 32-byte key hash table
> >
> > I suggest you simply change the check: instead of validating this data
> structure is a multiple of cache line size, validate it is a multiple of 64 bytes.
> 
> Any news please?
> The whole series is blocked for this patch.
> Should we expect a v7?

Yes, I think we should. Small fix for a considerable benefit.
  
Gowrishankar Sept. 8, 2016, 4:52 p.m. UTC | #4
Thanks Cristian and Thomas for your feedback. I have taken your suggestions
and sent out v7. Please check if the new patch is fine.

Thanks,
Gowrishankar

On Thursday 08 September 2016 03:10 PM, Dumitrescu, Cristian wrote:
>
>> -----Original Message-----
>> From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com]
>> Sent: Thursday, September 8, 2016 10:36 AM
>> To: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
>> Cc: dev@dpdk.org; Dumitrescu, Cristian <cristian.dumitrescu@intel.com>;
>> Chao Zhu <chaozhu@linux.vnet.ibm.com>; Richardson, Bruce
>> <bruce.richardson@intel.com>; Ananyev, Konstantin
>> <konstantin.ananyev@intel.com>; Pradeep <pradeep@us.ibm.com>
>> Subject: Re: [dpdk-dev] [PATCH v6 9/9] table: align rte table hash structs for
>> cache line size
>>
>> 2016-08-31 17:29, Dumitrescu, Cristian:
>>> From: Gowrishankar Muthukrishnan
>>>> rte table hash structs rte_bucket_4_8, rte_bucket_4_16 and
>>>> rte_bucket_4_32 have
>>>> to be cache aligned as required by their corresponding hash create
>> functions
>>>> rte_table_hash_create_key8_lru etc.
>>> Hi Gowrishankar,
>>>
>>> My understanding is you are trying to work around the check invoked by
>> the hash table create functions that verifies the size of the bucket header
>> structure is a multiple of the cache line, right?
>>> Given that the size of this structure is 1x, 2x or 3x times 64 bytes, the check
>> passes on IA CPUs (cache line of 64 bytes; explicit alignment to cache line size
>> is not needed in order to make the size of the structure a multiple of cache
>> line), but not on PPC CPUs (cache line of 128 bytes), correct?
>>> I don't think your proposal provides the best way to fix this issue, since
>> your code leads to a considerable increase in the memory consumption used
>> per bucket in most cases:
>>> 	- 100% more memory for 8-byte key hash table
>>> 	- 0% more for 16-byte key hash table (code does not fix anything,
>> explicit alignment is not needed)
>>> 	- 50% more for 32-byte key hash table
>>>
>>> I suggest you simply change the check: instead of validating this data
>> structure is a multiple of cache line size, validate it is a multiple of 64 bytes.
>>
>> Any news please?
>> The whole series is blocked for this patch.
>> Should we expect a v7?
> Yes, I think we should. Small fix for a considerable benefit.
>
>
  

Patch

diff --git a/lib/librte_table/rte_table_hash_key16.c b/lib/librte_table/rte_table_hash_key16.c
index b7e000f..2102326 100644
--- a/lib/librte_table/rte_table_hash_key16.c
+++ b/lib/librte_table/rte_table_hash_key16.c
@@ -68,10 +68,10 @@  struct rte_bucket_4_16 {
 	uint64_t next_valid;
 
 	/* Cache line 1 */
-	uint64_t key[4][2];
+	uint64_t key[4][2] __rte_cache_aligned;
 
 	/* Cache line 2 */
-	uint8_t data[0];
+	uint8_t data[0] __rte_cache_aligned;
 };
 
 struct rte_table_hash {
diff --git a/lib/librte_table/rte_table_hash_key32.c b/lib/librte_table/rte_table_hash_key32.c
index a7aba49..619f63a 100644
--- a/lib/librte_table/rte_table_hash_key32.c
+++ b/lib/librte_table/rte_table_hash_key32.c
@@ -68,10 +68,10 @@  struct rte_bucket_4_32 {
 	uint64_t next_valid;
 
 	/* Cache lines 1 and 2 */
-	uint64_t key[4][4];
+	uint64_t key[4][4] __rte_cache_aligned;
 
 	/* Cache line 3 */
-	uint8_t data[0];
+	uint8_t data[0] __rte_cache_aligned;
 };
 
 struct rte_table_hash {
diff --git a/lib/librte_table/rte_table_hash_key8.c b/lib/librte_table/rte_table_hash_key8.c
index e2e2bdc..4d5e0cd 100644
--- a/lib/librte_table/rte_table_hash_key8.c
+++ b/lib/librte_table/rte_table_hash_key8.c
@@ -68,7 +68,7 @@  struct rte_bucket_4_8 {
 	uint64_t key[4];
 
 	/* Cache line 1 */
-	uint8_t data[0];
+	uint8_t data[0] __rte_cache_aligned;
 };
 
 struct rte_table_hash {