From: Willem de Bruijn
Date: Wed, 12 Jul 2023 11:16:04 -0400
To: Stanislav Fomichev
CC: Alexei Starovoitov, bpf, Alexei Starovoitov, Daniel Borkmann,
 Andrii Nakryiko, Martin KaFai Lau, Song Liu, Yonghong Song,
 John Fastabend, KP Singh, Hao Luo, Jiri Olsa, Jakub Kicinski,
 Toke Høiland-Jørgensen, Willem de Bruijn, David Ahern,
 "Karlsson, Magnus", Björn Töpel, "Fijalkowski, Maciej",
 Jesper Dangaard Brouer, Network Development, xdp-hints@xdp-project.net
Subject: [xdp-hints] Re: [RFC bpf-next v3 09/14] net/mlx5e: Implement devtx kfuncs
List-Id: XDP hardware hints design discussion
References: <20230707193006.1309662-1-sdf@google.com>
 <20230707193006.1309662-10-sdf@google.com>
 <20230711225657.kuvkil776fajonl5@MacBook-Pro-8.local>

On Wed, Jul 12, 2023 at 1:36 AM Stanislav Fomichev wrote:
>
> On Tue, Jul 11, 2023 at 9:59 PM Alexei Starovoitov wrote:
> >
> > On Tue, Jul 11, 2023 at 8:29 PM Stanislav Fomichev wrote:
> > >
> > > This will slow things down,
> > > but not to the point where it's on par
> > > with doing sw checksum. At least in theory.
> > > We can't stay at skb when using AF_XDP. AF_XDP would benefit from having
> > > the offloads.
> >
> > To clarify: yes, AF_XDP needs generalized HW offloads.
>
> Great! To reiterate, I'm mostly interested in af_xdp wrt tx
> timestamps. So if the consensus is not to mix xdp-tx and af_xdp-tx,
> I'm fine with switching to adding some fixed af_xdp descriptor format
> to enable offloads on tx.
>
> > I just don't see how xdp tx offloads are moving a needle in that direction.
>
> Let me try to explain how both might be similar, maybe I wasn't clear
> enough on that.
> For af_xdp tx packet, the userspace puts something in the af_xdp frame
> metadata area (headroom) which then gets executed/interpreted by the
> bpf program at devtx (which calls kfuncs to enable particular
> offloads).
> IOW, instead of defining some fixed layout for the tx offloads, the
> userspace and bpf program have some agreement on the layout (and bpf
> program "applies" the offloads by calling the kfuncs).
> Also (in theory) the same hooks can be used for xdp-tx.
> Does it make sense? But, again, happy to scratch that whole idea if
> we're fine with a fixed layout for af_xdp.

Checksum offload is an important demonstrator too. It is admittedly a
non-trivial one. Checksum offload has often been discussed as a pain
point ("protocol ossification").

In general, drivers can accept every CHECKSUM_PARTIAL skb that
matches their advertised feature NETIF_F_[HW|IP|IPV6]_CSUM. I don't
see why this would be different for kfuncs for packets coming from
userspace.

The problematic drivers are the ones that do not implement
CHECKSUM_PARTIAL as intended, but ignore this simple
protocol-independent hint in favor of parsing from scratch, possibly
zeroing the field, computing multiple layers, etc.

All of which is unnecessary with LCO (local checksum offload).
An AF_XDP user can be expected to apply LCO and only request checksum
insertion for the innermost checksum.

The biggest problem with these devices that parse in hardware (and
possibly also in the driver, to identify and fix up hardware
limitations) is that they will fail when encountering an unknown
protocol. Which brings us to advertising limited typed support:
NETIF_F_HW_CSUM vs NETIF_F_IP_CSUM.

The fact that some devices that deviate from industry best practices
cannot support more advanced packet formats is unfortunate, but not a
reason to hold others back. No different from the current kernel path.
The BPF program can fall back to software checksumming on these
devices, like the kernel path.

Perhaps we do need to pass along with csum_start and csum_off a
csum_type that matches the existing NETIF_F_[HW|IP|IPV6]_CSUM, to let
drivers return with -EOPNOTSUPP quickly if they do not support the
generic case.

For implementation, in essence it is just reordering driver code that
already exists for the skb case. I think the ice patch series to
support rx timestamping is a good indication of what it takes to
support XDP kfuncs: not so much new code, but reordering the driver
logic.

Which also indicates to me that the driver *is* the right place to
implement this logic, rather than reimplement it in a BPF library. It
avoids both code duplication and dependency hell, if the library ships
independently of the driver.
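
To make the csum_type idea concrete, here is a rough userspace-only
sketch in plain C. Everything in it is hypothetical: the enum values,
request_tx_csum(), sw_csum_insert() and tx_csum() are made-up names
for illustration, not an existing kernel, driver or kfunc API. It
assumes the usual csum_start/csum_off contract: the checksum field
already holds the pseudo-header sum, and the checksum covers
everything from csum_start to the end of the packet.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for NETIF_F_{IP,IPV6,HW}_CSUM (not the kernel's values). */
enum csum_type {
	CSUM_TYPE_IP   = 1 << 0,	/* TCP/UDP over IPv4 only */
	CSUM_TYPE_IPV6 = 1 << 1,	/* TCP/UDP over IPv6 only */
	CSUM_TYPE_HW   = 1 << 2,	/* protocol independent */
};

/* Hypothetical driver-side check: reject early what the device cannot do. */
static int request_tx_csum(unsigned int dev_caps, enum csum_type type,
			   uint16_t csum_start, uint16_t csum_off, size_t len)
{
	if ((size_t)csum_start + csum_off + 2 > len)
		return -EINVAL;
	/* A device with the generic capability covers every type. */
	if ((dev_caps & CSUM_TYPE_HW) || (dev_caps & type))
		return 0;
	return -EOPNOTSUPP;
}

/* 16-bit ones' complement sum (RFC 1071 style) over len bytes. */
static uint16_t csum_sum16(const uint8_t *data, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum += (uint32_t)data[i] << 8 | data[i + 1];
	if (i < len)			/* odd trailing byte, zero padded */
		sum += (uint32_t)data[i] << 8;
	while (sum >> 16)		/* fold carries back in */
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/*
 * Software fallback: sum from csum_start to the end of the packet
 * (including whatever the checksum field currently holds) and store
 * the complement at csum_start + csum_off, in network byte order.
 */
static void sw_csum_insert(uint8_t *pkt, size_t len,
			   uint16_t csum_start, uint16_t csum_off)
{
	uint16_t csum;

	assert((size_t)csum_start + csum_off + 2 <= len);
	csum = (uint16_t)~csum_sum16(pkt + csum_start, len - csum_start);
	pkt[csum_start + csum_off] = csum >> 8;
	pkt[csum_start + csum_off + 1] = csum & 0xff;
}

/* Returns 0 if left to the device, 1 if done in software, <0 on error. */
static int tx_csum(unsigned int dev_caps, enum csum_type type, uint8_t *pkt,
		   size_t len, uint16_t csum_start, uint16_t csum_off)
{
	int err = request_tx_csum(dev_caps, type, csum_start, csum_off, len);

	if (err != -EOPNOTSUPP)
		return err;
	sw_csum_insert(pkt, len, csum_start, csum_off);
	return 1;
}
```

In this sketch a device that only advertises the IPv4 type rejects a
protocol-independent request with -EOPNOTSUPP before touching the
packet, and the caller then fills in the checksum itself. In the
design discussed above that fallback would live in the BPF program;
plain C is used here only so the logic can be compiled and tried
outside the kernel.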