Message ID | 1468936391-138371-1-git-send-email-jianfeng.tan@intel.com (mailing list archive) |
---|---|
State | Superseded, archived |
Headers |
From: Jianfeng Tan <jianfeng.tan@intel.com> To: dev@dpdk.org Cc: yuanhan.liu@linux.intel.com, zhihong.wang@intel.com, Jianfeng Tan <jianfeng.tan@intel.com> Date: Tue, 19 Jul 2016 13:53:11 +0000 Message-Id: <1468936391-138371-1-git-send-email-jianfeng.tan@intel.com> X-Mailer: git-send-email 2.7.4 Subject: [dpdk-dev] [PATCH] examples/vhost: fix perf regression |
Commit Message
Jianfeng Tan
July 19, 2016, 1:53 p.m. UTC
We found a significant performance drop introduced by the commit below
when the vhost example is started with --mergeable 0 and, inside the VM,
the kernel virtio-net driver is used to do IP-based forwarding.

The root cause is that the commit below adds support for
VIRTIO_NET_F_GUEST_TSO4 and VIRTIO_NET_F_GUEST_TSO6, and when
mergeable is disabled, this triggers the big_packets path of the
virtio-net driver. In this path, the virtio driver uses 19 desc with
18 4K-sized pages to receive each packet, so that it can receive a big
packet of up to 64K. But QEMU only creates 256 desc entries for each
vq, so only 13 packets can be received at a time. The VM kernel
quickly handles those packets and goes to sleep (HLT).

As QEMU has no option to set the number of desc entries of a vq, here
we disable VIRTIO_NET_F_GUEST_TSO4 and VIRTIO_NET_F_GUEST_TSO6
together with VIRTIO_NET_F_HOST_TSO4 and VIRTIO_NET_F_HOST_TSO6 when
tso is disabled in the vhost example, to keep the VM kernel virtio
driver from going into the big_packets path.

Fixes: 859b480d5afd ("vhost: add guest offload setting")
Reported-by: Qian Xu <qian.q.xu@intel.com>
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
---
examples/vhost/main.c | 2 ++
1 file changed, 2 insertions(+)
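
The 13-packet figure in the commit message follows from integer division of the ring size by the per-packet descriptor count. Below is a minimal C sketch of that arithmetic, using only the numbers quoted above (256 desc entries per vq, 19 desc and 18 4K-sized pages per packet). The macro and function names are illustrative assumptions and do not come from DPDK, QEMU, or the kernel.

```c
#include <assert.h>
#include <stdint.h>

/* Values quoted in the commit message; the names are hypothetical. */
#define VQ_DESC_ENTRIES 256u  /* desc entries QEMU creates per vq */
#define BIG_PKT_DESC    19u   /* desc used per packet on the big_packets path */
#define BIG_PKT_PAGES   18u   /* pages used per packet on that path */
#define PAGE_BYTES      4096u /* 4K-sized pages */

/* Whole receive buffers that fit in a ring of ring_entries descriptors. */
static inline uint32_t
rx_bufs_per_ring(uint32_t ring_entries, uint32_t desc_per_buf)
{
	assert(desc_per_buf > 0); /* guard against division by zero */
	return ring_entries / desc_per_buf;
}

/* Bytes of page memory backing one big_packets receive buffer. */
static inline uint32_t
big_pkt_buf_bytes(void)
{
	return BIG_PKT_PAGES * PAGE_BYTES;
}
```

With these values, rx_bufs_per_ring(VQ_DESC_ENTRIES, BIG_PKT_DESC) evaluates to 13 (13 * 19 = 247 descriptors, leaving 9 unused), and big_pkt_buf_bytes() evaluates to 73728, which is enough to hold a 64K packet.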
Comments
On Tue, Jul 19, 2016 at 01:53:11PM +0000, Jianfeng Tan wrote:
> [...]
> Fixes: 859b480d5afd ("vhost: add guest offload setting")
>
> Reported-by: Qian Xu <qian.q.xu@intel.com>
> Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>

We could apply this patch, but I don't think it actually fixes anything:

- it doesn't fix other vhost applications, say OVS, which is for sure
  way more widely used than vhost-example.

- it doesn't even fix it when tso is enabled and mergeable-rx is disabled
  with this vhost-example.

Thanks for the good root-cause, btw!

	--yliu

Hi Yuanhan,

On 7/20/2016 9:44 AM, Yuanhan Liu wrote:
> On Tue, Jul 19, 2016 at 01:53:11PM +0000, Jianfeng Tan wrote:
>> [...]
> We could apply this patch, but I don't think it actually fixes anything:
>
> - it doesn't fix other vhost applications, say OVS, which is for sure
>   way more widely used than vhost-example.

If I remember correctly, OVS enables mergeable.

> - it doesn't even fix it when tso is enabled and mergeable-rx is disabled
>   with this vhost-example.

But we'd better keep users from suspecting that performance drops
because of that commit in the tso=off,mergeable=off case, right?

Thanks,
Jianfeng

> Thanks for the good root-cause, btw!
>
> --yliu

My comments below.

-----Original Message-----
From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Tan, Jianfeng
Sent: Wednesday, July 20, 2016 10:44 AM
To: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Cc: dev@dpdk.org; Wang, Zhihong <zhihong.wang@intel.com>
Subject: Re: [dpdk-dev] [PATCH] examples/vhost: fix perf regression

Hi Yuanhan,

On 7/20/2016 9:44 AM, Yuanhan Liu wrote:
> On Tue, Jul 19, 2016 at 01:53:11PM +0000, Jianfeng Tan wrote:
>> [...]
> We could apply this patch, but I don't think it actually fixes anything:
>
> - it doesn't fix other vhost applications, say OVS, which is for sure
>   way more widely used than vhost-example.

If I remember correctly, OVS enables mergeable.

> - it doesn't even fix it when tso is enabled and mergeable-rx is disabled
>   with this vhost-example.

But we'd better keep users from suspecting that performance drops
because of that commit in the tso=off,mergeable=off case, right?

Normally, when people enable TSO, they should also turn on mergeable.
If they don't turn on mergeable, they should not expect high
performance, so this is not a problem: they may simply get low
performance due to improper settings.

As for a complete fix for the issue, we may need to go back to the TSO
feature design for vhost. Currently, the feature negotiation code is in
the application, but it would be better handled in the vhost/virtio
library so that the application doesn't need to check/set the feature.
It's too late for a complete fix now, so the workaround is OK for this
release from my point of view.

Thanks,
Jianfeng

> Thanks for the good root-cause, btw!
>
> --yliu

On Wed, Jul 20, 2016 at 10:44:13AM +0800, Tan, Jianfeng wrote:
> Hi Yuanhan,
>
> On 7/20/2016 9:44 AM, Yuanhan Liu wrote:
> >On Tue, Jul 19, 2016 at 01:53:11PM +0000, Jianfeng Tan wrote:
> >> [...]
> >We could apply this patch, but I don't think it actually fixes anything:
> >
> >- it doesn't fix other vhost applications, say OVS, which is for sure
> >  way more widely used than vhost-example.
>
> If I remember correctly, OVS enables mergeable.

Yes, and actually, vhost-example should also have enabled it by default.
Meanwhile, all features can be enabled/disabled by the user.

> >- it doesn't even fix it when tso is enabled and mergeable-rx is disabled
> >  with this vhost-example.
>
> But we'd better keep users from suspecting that performance drops
> because of that commit in the tso=off,mergeable=off case, right?

I doubt people would actually use vhost-example (besides developers like
us), meaning they can NOT see the benefit of this patch; it also
means that users __do__ still wonder why performance drops in the
tso=off,mergeable=off case.

Actually, it looks wrong to me to fiddle with those flags in the
vhost-example. If you want to disable tso, you should disable it
on the qemu side, with something like:

    csum=off,gso=off,guest_tso4=off,guest_tso6=off,...

	--yliu

On Tue, Jul 19, 2016 at 01:53:11PM +0000, Jianfeng Tan wrote:
> [...]
> Fixes: 859b480d5afd ("vhost: add guest offload setting")

And here you are patching the vhost example to try to fix an "issue"
in the vhost lib; this is __logically__ wrong.

	--yliu

> Reported-by: Qian Xu <qian.q.xu@intel.com>
> Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
> ---
> examples/vhost/main.c | 2 ++
> 1 file changed, 2 insertions(+)
> [...]

On 7/20/2016 12:38 PM, Yuanhan Liu wrote:
> On Tue, Jul 19, 2016 at 01:53:11PM +0000, Jianfeng Tan wrote:
>> [...]
>> Fixes: 859b480d5afd ("vhost: add guest offload setting")
> And here you are patching the vhost example to try to fix an "issue"
> in the vhost lib; this is __logically__ wrong.
>
> --yliu

This is not an issue from the vhost lib's perspective; the vhost lib
should provide all the features it supports by default. Applications can
enable/disable features according to their own requirements. And the
vhost example after this commit just triggers a slow path of the virtio
driver. So this fix just makes sure the vhost example does not go into
the slow path by default.

By the way, should a fix patch only involve the commits that it changes?

Thanks,
Jianfeng

On Wed, Jul 20, 2016 at 01:50:34PM +0800, Tan, Jianfeng wrote:
> On 7/20/2016 12:38 PM, Yuanhan Liu wrote:
> >On Tue, Jul 19, 2016 at 01:53:11PM +0000, Jianfeng Tan wrote:
> >> [...]
> >>Fixes: 859b480d5afd ("vhost: add guest offload setting")
> >And here you are patching the vhost example to try to fix an "issue"
> >in the vhost lib; this is __logically__ wrong.
> >
> >	--yliu
>
> This is not an issue from the vhost lib's perspective; the vhost lib
> should provide all the features it supports by default.

Bingo.., that's why "Fixes: 859b480d5afd ... " is wrong to me.

> Applications can enable/disable
> features according to their own requirements.

Yes, an application can, but applications normally don't do that. And
as stated in my earlier reply, qemu is the place you should go for
enabling/disabling all those options, not vhost (and not vhost-example).

I think it's sometimes handier to be able to do that through
vhost-example options, and I guess that's why those options exist.

In other words, there is nothing wrong with commit 859b480d5afd;
if you want to "fix" anything here, the following commit is the one
that needs fixing:

    Fixes: 9fd72e3cbd29 ("examples/vhost: add virtio offload")

Because that commit only partially disables the TSO-related features,
letting the virtio-net driver go into the slow path.

> And the vhost example after
> this commit just triggers a slow path of the virtio driver. So this fix
> just makes sure the vhost example does not go into the slow path by default.

As I stated the first time, I do not object to having this patch
at all.

Meanwhile, the right "fix" is to disable all TSO-related features
from QEMU; that way we should see no such issue in any vhost
application, not only in this one, the one we mostly use internally.

As you can see, it's more about the usage.

> By the way, should a fix patch only involve the commits that it changes?

IMO, logically, yes.

	--yliu

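
To illustrate why partially disabling the TSO-related features sends the guest driver down the slow path, here is a simplified model of the receive-path choice described in this thread. It is a sketch under assumptions, not the kernel's code: the enum, the helper names, and the exact decision order are illustrative, and only the two guest offload bits named in the thread are modelled (the real driver may take more bits into account). The bit numbers are those assigned by the virtio specification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Feature bit numbers as assigned by the virtio specification. */
#define VIRTIO_NET_F_GUEST_TSO4 7
#define VIRTIO_NET_F_GUEST_TSO6 8
#define VIRTIO_NET_F_MRG_RXBUF  15

/* Hypothetical labels for the three receive-buffer strategies. */
enum rx_path { RX_PATH_SMALL, RX_PATH_BIG, RX_PATH_MERGEABLE };

/* True if the given feature bit is set in the negotiated mask. */
static inline bool
has_feature(uint64_t features, unsigned int bit)
{
	assert(bit < 64); /* shifting a 64-bit value by >= 64 is undefined */
	return ((features >> bit) & 1ULL) != 0;
}

/*
 * Simplified model: mergeable rx buffers win when negotiated; otherwise
 * a negotiated guest TSO bit selects the big_packets path (19 desc per
 * packet, per the commit message); otherwise small buffers are used.
 */
static inline enum rx_path
select_rx_path(uint64_t features)
{
	if (has_feature(features, VIRTIO_NET_F_MRG_RXBUF))
		return RX_PATH_MERGEABLE;
	if (has_feature(features, VIRTIO_NET_F_GUEST_TSO4) ||
	    has_feature(features, VIRTIO_NET_F_GUEST_TSO6))
		return RX_PATH_BIG;
	return RX_PATH_SMALL;
}
```

In this model, a mask that still carries VIRTIO_NET_F_GUEST_TSO4 but not VIRTIO_NET_F_MRG_RXBUF yields RX_PATH_BIG, which corresponds to the tso=off,mergeable=off case discussed above; clearing the guest TSO bits as well yields RX_PATH_SMALL.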
On 7/20/2016 2:13 PM, Yuanhan Liu wrote:
> On Wed, Jul 20, 2016 at 01:50:34PM +0800, Tan, Jianfeng wrote:
>> On 7/20/2016 12:38 PM, Yuanhan Liu wrote:
>>> On Tue, Jul 19, 2016 at 01:53:11PM +0000, Jianfeng Tan wrote:
>>>> [...]
>>>> Fixes: 859b480d5afd ("vhost: add guest offload setting")
>>> And here you are patching the vhost example to try to fix an "issue"
>>> in the vhost lib; this is __logically__ wrong.
>>>
>>> --yliu
>> This is not an issue from the vhost lib's perspective; the vhost lib
>> should provide all the features it supports by default.
> Bingo.., that's why "Fixes: 859b480d5afd ... " is wrong to me.
>
>> Applications can enable/disable
>> features according to their own requirements.
> Yes, an application can, but applications normally don't do that. And
> as stated in my earlier reply, qemu is the place you should go for
> enabling/disabling all those options, not vhost (and not vhost-example).
>
> I think it's sometimes handier to be able to do that through
> vhost-example options, and I guess that's why those options exist.
>
> In other words, there is nothing wrong with commit 859b480d5afd;
> if you want to "fix" anything here, the following commit is the one
> that needs fixing:
>
> Fixes: 9fd72e3cbd29 ("examples/vhost: add virtio offload")
>
> Because that commit only partially disables the TSO-related features,
> letting the virtio-net driver go into the slow path.

Great, I see. And thanks for the detailed clarification. I'll send v2.

>> And the vhost example after
>> this commit just triggers a slow path of the virtio driver. So this fix
>> just makes sure the vhost example does not go into the slow path by default.
> As I stated the first time, I do not object to having this patch
> at all.
>
> Meanwhile, the right "fix" is to disable all TSO-related features
> from QEMU; that way we should see no such issue in any vhost
> application, not only in this one, the one we mostly use internally.
>
> As you can see, it's more about the usage.

Yes, I agree this is the BKM we should adopt and recommend to users.

Thanks,
Jianfeng

>> By the way, should a fix patch only involve the commits that it changes?
> IMO, logically, yes.
>
> --yliu

diff --git a/examples/vhost/main.c b/examples/vhost/main.c
index 3b98f42..92a9823 100644
--- a/examples/vhost/main.c
+++ b/examples/vhost/main.c
@@ -327,6 +327,8 @@ port_init(uint8_t port)
 	if (enable_tso == 0) {
 		rte_vhost_feature_disable(1ULL << VIRTIO_NET_F_HOST_TSO4);
 		rte_vhost_feature_disable(1ULL << VIRTIO_NET_F_HOST_TSO6);
+		rte_vhost_feature_disable(1ULL << VIRTIO_NET_F_GUEST_TSO4);
+		rte_vhost_feature_disable(1ULL << VIRTIO_NET_F_GUEST_TSO6);
 	}
 
 	rx_rings = (uint16_t)dev_info.max_rx_queues;
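
The effect of the two added lines can be shown on a plain feature bitmask. This is a sketch only: feature_disable() below is a hypothetical local stand-in that clears bits in a value supplied by the caller. It is not rte_vhost_feature_disable(), and the tso_off_before_patch() and tso_off_after_patch() names are made up for illustration. The bit numbers are those assigned by the virtio specification.

```c
#include <stdint.h>

/* Feature bit numbers as assigned by the virtio specification. */
#define VIRTIO_NET_F_GUEST_TSO4 7
#define VIRTIO_NET_F_GUEST_TSO6 8
#define VIRTIO_NET_F_HOST_TSO4  11
#define VIRTIO_NET_F_HOST_TSO6  12

/* Hypothetical stand-in: return features with the bits in mask cleared. */
static inline uint64_t
feature_disable(uint64_t features, uint64_t mask)
{
	return features & ~mask;
}

/* enable_tso == 0 before the patch: only the host TSO bits are cleared. */
static inline uint64_t
tso_off_before_patch(uint64_t features)
{
	features = feature_disable(features, 1ULL << VIRTIO_NET_F_HOST_TSO4);
	features = feature_disable(features, 1ULL << VIRTIO_NET_F_HOST_TSO6);
	return features;
}

/* enable_tso == 0 after the patch: the guest TSO bits are cleared too. */
static inline uint64_t
tso_off_after_patch(uint64_t features)
{
	features = tso_off_before_patch(features);
	features = feature_disable(features, 1ULL << VIRTIO_NET_F_GUEST_TSO4);
	features = feature_disable(features, 1ULL << VIRTIO_NET_F_GUEST_TSO6);
	return features;
}
```

Starting from a mask with all four TSO bits set, the pre-patch helper leaves the two guest TSO bits in place, which is the partial disabling discussed in the thread, while the post-patch helper clears all four.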