From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Sat, 10 Aug 2024 10:02:09 +0200
From: Lorenzo Bianconi
To: Toke Høiland-Jørgensen
References: <20220628194812.1453059-1-alexandr.lobakin@intel.com>
 <20220628194812.1453059-33-alexandr.lobakin@intel.com>
 <54aab7ec-80e9-44fd-8249-fe0cabda0393@intel.com>
 <308fd4f1-83a9-4b74-a482-216c8211a028@app.fastmail.com>
 <99662019-7e9b-410d-99fe-a85d04af215c@intel.com>
 <875xs9q2z6.fsf@toke.dk>
 <22333deb-21f8-43a9-b32f-bc3e60892661@intel.com>
 <8734ndq0cd.fsf@toke.dk>
In-Reply-To: <8734ndq0cd.fsf@toke.dk>
X-MailFrom: lorenzo.bianconi@redhat.com
CC: Alexander Lobakin, Daniel Xu, Alexander Lobakin, Alexei Starovoitov,
 Daniel Borkmann, Andrii Nakryiko, Larysa Zaremba, Michal Swiatkowski,
 Jesper Dangaard Brouer, Björn Töpel, Magnus Karlsson, Maciej Fijalkowski,
 Jonathan Lemon, Lorenzo Bianconi, David Miller, Eric Dumazet,
 Jakub Kicinski, Paolo Abeni, John Fastabend, Yajun Deng, Willem de Bruijn,
 "bpf@vger.kernel.org", netdev@vger.kernel.org,
 linux-kernel@vger.kernel.org, xdp-hints@xdp-project.net
Subject: [xdp-hints] Re: [PATCH RFC bpf-next 32/52] bpf, cpumap: switch to GRO from netif_receive_skb_list()
List-Id: XDP hardware hints design discussion

On Aug 09, Toke wrote:
> Alexander Lobakin writes:
>
> > From: Toke Høiland-Jørgensen
> > Date: Fri, 09 Aug 2024 14:45:33 +0200
> >
> >> Alexander Lobakin writes:
> >>
> >>> From: Daniel Xu
> >>> Date: Thu, 08 Aug 2024 16:52:51 -0400
> >>>
> >>>> Hi,
> >>>>
> >>>> On Thu, Aug 8, 2024, at 7:57 AM, Alexander Lobakin wrote:
> >>>>> From: Lorenzo Bianconi
> >>>>> Date: Thu, 8 Aug 2024 06:54:06 +0200
> >>>>>
> >>>>>>> Hi Alexander,
> >>>>>>>
> >>>>>>> On Tue, Jun 28, 2022, at 12:47 PM, Alexander Lobakin wrote:
> >>>>>>>> cpumap has its own BH context based on kthread. It has a sane batch
> >>>>>>>> size of 8 frames per one cycle.
> >>>>>>>> GRO can be used on its own, adjust cpumap calls to the
> >>>>>>>> upper stack to use GRO API instead of netif_receive_skb_list() which
> >>>>>>>> processes skbs by batches, but doesn't involve GRO layer at all.
> >>>>>>>> It is most beneficial when a NIC which frame come from is XDP
> >>>>>>>> generic metadata-enabled, but in plenty of tests GRO performs better
> >>>>>>>> than listed receiving even given that it has to calculate full frame
> >>>>>>>> checksums on CPU.
> >>>>>>>> As GRO passes the skbs to the upper stack in the batches of
> >>>>>>>> @gro_normal_batch, i.e. 8 by default, and @skb->dev point to the
> >>>>>>>> device where the frame comes from, it is enough to disable GRO
> >>>>>>>> netdev feature on it to completely restore the original behaviour:
> >>>>>>>> untouched frames will be being bulked and passed to the upper stack
> >>>>>>>> by 8, as it was with netif_receive_skb_list().
> >>>>>>>>
> >>>>>>>> Signed-off-by: Alexander Lobakin
> >>>>>>>> ---
> >>>>>>>> kernel/bpf/cpumap.c | 43 ++++++++++++++++++++++++++++++++++++++-----
> >>>>>>>> 1 file changed, 38 insertions(+), 5 deletions(-)
> >>>>>>>>
> >>>>>>>
> >>>>>>> AFAICT the cpumap + GRO is a good standalone improvement. I think
> >>>>>>> cpumap is still missing this.
> >>>>>
> >>>>> The only concern for having GRO in cpumap without metadata from the NIC
> >>>>> descriptor was that when the checksum status is missing, GRO calculates
> >>>>> the checksum on CPU, which is not really fast.
> >>>>> But I remember sometimes GRO was faster despite that.
> >>>>
> >>>> Good to know, thanks. IIUC some kind of XDP hint support landed already?
> >>>>
> >>>> My use case could also use HW RSS hash to avoid a rehash in XDP prog.
> >>>
> >>> Unfortunately, for now it's impossible to get HW metadata such as RSS
> >>> hash and checksum status in cpumap. They're implemented via kfuncs
> >>> specific to a particular netdevice and this info is available only when
> >>> running XDP prog.
> >>>
> >>> But I think one solution could be:
> >>>
> >>> 1. We create some generic structure for cpumap, like
> >>>
> >>> struct cpumap_meta {
> >>> 	u32 magic;
> >>> 	u32 hash;
> >>> };
> >>>
> >>> 2. We add such check in the cpumap code
> >>>
> >>> 	if (xdpf->metalen == sizeof(struct cpumap_meta) &&
> >>> 	    )
> >>> 		skb->hash = meta->hash;
> >>>
> >>> 3. In XDP prog, you call Rx hints kfuncs when they're available, obtain
> >>> RSS hash and then put it in the struct cpumap_meta as XDP frame metadata.
> >>
> >> Yes, except don't make this cpumap-specific, make it generic for kernel
> >> consumption of the metadata. That way it doesn't even have to be stored
> >> in the xdp metadata area, it can be anywhere we want (and hence not
> >> subject to ABI issues), and we can use it for skb creation after
> >> redirect in other places than cpumap as well (say, on veth devices).
> >>
> >> So it'll be:
> >>
> >> struct kernel_meta {
> >> 	u32 hash;
> >> 	u32 timestamp;
> >> 	...etc
> >> };
> >>
> >> and a kfunc:
> >>
> >> void store_xdp_kernel_meta(struct kernel_meta *meta);
> >>
> >> which the XDP program can call to populate the metadata area.
> >
> > Hmm, nice!
> >
> > But where to store this info in case of cpumap if not in xdp->data_meta?
> > When you convert XDP frames to skbs in the cpumap code, you only have
> > &xdp_frame and that's it. XDP prog was already run earlier from the
> > driver code at that point.
>
> Well, we could put it in skb_shared_info? IIRC, some of the metadata
> (timestamps?) end up there when building an skb anyway, so we won't even
> have to copy it around...

Before my vacation I started looking into this a bit; I will resume the
work in a week or so.

Regards,
Lorenzo

>
> -Toke
>
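
For illustration only, steps 1 and 2 of the cpumap_meta proposal quoted above could be sketched in plain userspace C roughly as follows. This is not kernel code and not what anyone in the thread has implemented. The condition after "&&" is elided in the quoted snippet, so the magic comparison here is an assumption. CPUMAP_META_MAGIC and its value, struct fake_xdp_frame, struct fake_skb and cpumap_apply_meta() are hypothetical names made up for the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical magic value; the thread does not define one. */
#define CPUMAP_META_MAGIC 0x43504d31u

/* Layout proposed in the thread (u32 spelled as uint32_t in userspace). */
struct cpumap_meta {
	uint32_t magic;
	uint32_t hash;
};

/* Userspace stand-in for the kernel's xdp_frame (assumption). */
struct fake_xdp_frame {
	const void *meta;	/* start of the metadata area */
	uint32_t metalen;	/* size of the metadata area in bytes */
};

/* Userspace stand-in for the kernel's sk_buff (assumption). */
struct fake_skb {
	uint32_t hash;
	int hash_valid;
};

/*
 * Copy the hash into the skb only when the metadata area has exactly the
 * expected size and carries the expected magic (the magic comparison is
 * assumed, see the lead-in).
 * Returns 1 if the hash was applied, 0 otherwise.
 */
int cpumap_apply_meta(const struct fake_xdp_frame *xdpf, struct fake_skb *skb)
{
	struct cpumap_meta meta;

	if (xdpf->meta == NULL || xdpf->metalen != sizeof(meta))
		return 0;

	/* memcpy avoids alignment assumptions about the metadata area. */
	memcpy(&meta, xdpf->meta, sizeof(meta));
	if (meta.magic != CPUMAP_META_MAGIC)
		return 0;

	skb->hash = meta.hash;
	skb->hash_valid = 1;
	return 1;
}
```

A caller would fill a struct cpumap_meta with the magic and the RSS hash, point a fake_xdp_frame at it with metalen set to sizeof(struct cpumap_meta), and call cpumap_apply_meta(). A frame with any other metadata size or magic leaves the skb untouched, which is the point of the size-plus-magic check: unrelated metadata written by an XDP program is not misread as a hash.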