From: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Date: Tue, 11 Jul 2023 19:37:23 -0700
To: Jakub Kicinski
CC: Stanislav Fomichev, bpf, Alexei Starovoitov, Daniel Borkmann,
    Andrii Nakryiko, Martin KaFai Lau, Song Liu, Yonghong Song,
    John Fastabend, KP Singh, Hao Luo, Jiri Olsa,
    Toke Høiland-Jørgensen, Willem de Bruijn, David Ahern,
    "Karlsson, Magnus", Björn Töpel, "Fijalkowski, Maciej",
    Jesper Dangaard Brouer, Network Development,
    xdp-hints@xdp-project.net
Subject: [xdp-hints] Re: [RFC bpf-next v3 09/14] net/mlx5e: Implement devtx kfuncs
References: <20230707193006.1309662-1-sdf@google.com>
    <20230707193006.1309662-10-sdf@google.com>
    <20230711225657.kuvkil776fajonl5@MacBook-Pro-8.local>
    <20230711173226.7e9cca4a@kernel.org>
In-Reply-To: <20230711173226.7e9cca4a@kernel.org>
List-Id: XDP hardware hints design discussion

On Tue, Jul 11, 2023 at 5:32 PM Jakub Kicinski wrote:
>
> On Tue, 11 Jul 2023 15:56:57 -0700 Alexei Starovoitov wrote:
> > I think this proves my point: csum is not generalizable even across veth and mlx5.
> > Above is a square peg that tries to fit csum_start/offset api (that makes sense from SW pov)
> > into HW that has different ideas about csum-ing.
> >
> > Here is what mlx5 does:
> > mlx5e_txwqe_build_eseg_csum(struct mlx5e_txqsq *sq, struct sk_buff *skb,
> >                             struct mlx5e_accel_tx_state *accel,
> >                             struct mlx5_wqe_eth_seg *eseg)
> > {
> >         if (unlikely(mlx5e_ipsec_txwqe_build_eseg_csum(sq, skb, eseg)))
> >                 return;
> >
> >         if (likely(skb->ip_summed == CHECKSUM_PARTIAL)) {
> >                 eseg->cs_flags = MLX5_ETH_WQE_L3_CSUM;
> >                 if (skb->encapsulation) {
>
> This should be irrelevant today, as LCO exists?

Hmm. Maybe. But LCO is an example of something that prog devs have to
be aware of and use properly. Meaning, for certain protocols, compute
the outer csum the LCO way and let the inner one go through HW csuming.
In this case I have no idea what these mlx5 flags do.
I hope this part of the code was tested with udp tunnels.

> >                         eseg->cs_flags |= MLX5_ETH_WQE_L3_INNER_CSUM |
> >                                           MLX5_ETH_WQE_L4_INNER_CSUM;
> >                         sq->stats->csum_partial_inner++;
> >                 } else {
> >                         eseg->cs_flags |= MLX5_ETH_WQE_L4_CSUM;
> >                         sq->stats->csum_partial++;
> >                 }
> >
> > How would you generalize that into csum api that will work across NICs?
> >
> > My answer stands: you cannot.
> >
> > My proposal again:
> > add driver specific kfuncs and hooks for things like csum.
> >
> > Kuba,
> > since you nacked driver specific stuff please suggest a way to unblock this stalemate.
>
> I hope I'm not misremembering but I think I suggested at the beginning
> to create a structure describing packet geometry and requested offloads,
> and for the prog fill that in.

Hmm, but that's what skb is for.
skb == packet geometry == layout of headers, payload, inner vs outer,
csum partial, gso params.

bpf at the tc layer is supposed to interact with that correctly.
If the packet is modified, the skb geometry should be adjusted accordingly.
Like the BPF_F_RECOMPUTE_CSUM flag in bpf_skb_store_bytes().
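To make the csum_start/offset contract discussed above concrete, here is a
minimal userspace sketch in C. It is an illustration only, not kernel or mlx5
code. The names pkt_csum_req, csum_fold_buf, sw_complete_csum, req_to_hw_flags
and the HW_CSUM_* flags are all hypothetical. The first half shows what a
"sum from csum_start to the end, store at csum_start + csum_offset" request
means when completed in software. The second half shows why such a request
only maps onto layer-style hardware flags when the offsets happen to match a
layout the device knows, assumed here to be a TCP or UDP header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical request, loosely modelled on the csum_start/offset pair. */
struct pkt_csum_req {
	uint16_t csum_start;  /* byte offset where summing begins */
	uint16_t csum_offset; /* checksum field position, relative to csum_start */
};

/* Ones' complement sum (RFC 1071 style) over buf[0..len), folded to 16 bits. */
uint16_t csum_fold_buf(const uint8_t *buf, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)buf[i] << 8) | buf[i + 1];
	if (len & 1)
		sum += (uint32_t)buf[len - 1] << 8; /* pad odd trailing byte */
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/*
 * Software completion: sum everything from csum_start to the end of the
 * packet (including whatever seed already sits in the checksum field),
 * complement it, and store it big-endian in the checksum field.
 */
void sw_complete_csum(uint8_t *pkt, size_t len, struct pkt_csum_req req)
{
	size_t field = (size_t)req.csum_start + req.csum_offset;
	uint16_t csum;

	assert(field + 2 <= len);
	csum = (uint16_t)~csum_fold_buf(pkt + req.csum_start,
					len - req.csum_start);
	pkt[field] = (uint8_t)(csum >> 8);
	pkt[field + 1] = (uint8_t)(csum & 0xff);
}

/* Hypothetical layer-style flags: the device wants "which layer", not offsets. */
enum hw_csum_flags { HW_CSUM_NONE = 0, HW_CSUM_L4 = 1 << 0 };

/*
 * Returns 0 only when the request starts at the L4 header and points at the
 * TCP (16) or UDP (6) checksum field. Any other request returns -1 and would
 * need a software fallback such as sw_complete_csum().
 */
int req_to_hw_flags(struct pkt_csum_req req, uint16_t l4_off, unsigned *flags)
{
	if (req.csum_start == l4_off &&
	    (req.csum_offset == 16 || req.csum_offset == 6)) {
		*flags = HW_CSUM_L4;
		return 0;
	}
	*flags = HW_CSUM_NONE;
	return -1;
}
```

With a UDP-like header at offset 4 and csum_offset 6, req_to_hw_flags()
succeeds. Change csum_offset to any other value and the translation fails,
even though sw_complete_csum() still handles the request. That gap is the
"square peg" in miniature.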
>
> All operating systems I know end up doing that, we'll end up doing
> that as well. The question is whether we're willing to learn from
> experience or prefer to go on a wild ride first...

I don't follow. This thread was aimed at adding xdp layer knobs.
To me, XDP is driver level.
'struct xdp_md' along with BPF_F_XDP_HAS_FRAGS is the best abstraction
we could do generalizing the dma-buffers (page and multi-page) that
drivers operate on.
Everything else at driver level is too unique to generalize.
The skb layer is already doing its job.

In that sense "generic XDP" is a feature for testing only.
Trying to make "generic XDP" fast is missing the point of XDP.

AF_XDP is a different concept. Exposing timestamp, csum, TSO to AF_XDP
users is a different design challenge.
I'm all for doing that, but trying to combine "timestamp in xdp tx" and
"timestamp in AF_XDP" will lead to bad trade-offs for both.
Which I think this patchset demonstrates.