XDP hardware hints discussion mail archive
From: Alexei Starovoitov <alexei.starovoitov@gmail.com>
To: Jakub Kicinski <kuba@kernel.org>
Cc: "Stanislav Fomichev" <sdf@google.com>, bpf <bpf@vger.kernel.org>,
	"Alexei Starovoitov" <ast@kernel.org>,
	"Daniel Borkmann" <daniel@iogearbox.net>,
	"Andrii Nakryiko" <andrii@kernel.org>,
	"Martin KaFai Lau" <martin.lau@linux.dev>,
	"Song Liu" <song@kernel.org>, "Yonghong Song" <yhs@fb.com>,
	"John Fastabend" <john.fastabend@gmail.com>,
	"KP Singh" <kpsingh@kernel.org>, "Hao Luo" <haoluo@google.com>,
	"Jiri Olsa" <jolsa@kernel.org>,
	"Toke Høiland-Jørgensen" <toke@kernel.org>,
	"Willem de Bruijn" <willemb@google.com>,
	"David Ahern" <dsahern@kernel.org>,
	"Karlsson, Magnus" <magnus.karlsson@intel.com>,
	"Björn Töpel" <bjorn@kernel.org>,
	"Fijalkowski, Maciej" <maciej.fijalkowski@intel.com>,
	"Jesper Dangaard Brouer" <hawk@kernel.org>,
	"Network Development" <netdev@vger.kernel.org>,
	xdp-hints@xdp-project.net
Subject: [xdp-hints] Re: [RFC bpf-next v3 09/14] net/mlx5e: Implement devtx kfuncs
Date: Tue, 11 Jul 2023 19:37:23 -0700
Message-ID: <CAADnVQJ3iyoZaxaALWd4zTsDT3Z=czU4g7qpmBFWPUs5ucqCMg@mail.gmail.com>
In-Reply-To: <20230711173226.7e9cca4a@kernel.org>

On Tue, Jul 11, 2023 at 5:32 PM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Tue, 11 Jul 2023 15:56:57 -0700 Alexei Starovoitov wrote:
> > I think this proves my point: csum is not generalizable even across veth and mlx5.
> > Above is a square peg that tries to fit csum_start/offset api (that makes sense from SW pov)
> > into HW that has different ideas about csum-ing.
> >
> > Here is what mlx5 does:
> > mlx5e_txwqe_build_eseg_csum(struct mlx5e_txqsq *sq, struct sk_buff *skb,
> >                             struct mlx5e_accel_tx_state *accel,
> >                             struct mlx5_wqe_eth_seg *eseg)
> > {
> >         if (unlikely(mlx5e_ipsec_txwqe_build_eseg_csum(sq, skb, eseg)))
> >                 return;
> >
> >         if (likely(skb->ip_summed == CHECKSUM_PARTIAL)) {
> >                 eseg->cs_flags = MLX5_ETH_WQE_L3_CSUM;
> >                 if (skb->encapsulation) {
>
> This should be irrelevant today, as LCO exists?

Hmm. Maybe. But LCO is an example of something prog devs have to be
aware of and use properly.
Meaning that for certain protocols they compute the outer csum the LCO
way and let the inner one go through HW csumming.
In this case I have no idea what these mlx5 flags do.
I hope this part of the code was tested with udp tunnels.
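
For readers unfamiliar with LCO (Local Checksum Offload), here is a
minimal userspace sketch of the arithmetic, not the kernel code. It
relies on one fact: once the inner checksum is correct, the inner L4
segment sums to the complement of its pseudo-header sum, so the outer
checksum needs only the bytes before the inner csum_start. All names,
the packet bytes and the pseudo-header constants are hypothetical, and
the inner Ethernet/IP headers are left out for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Add a byte range to a running one's-complement sum (big-endian words). */
static uint32_t sum_bytes(const uint8_t *p, size_t len, uint32_t sum)
{
    size_t i;

    for (i = 0; i + 1 < len; i += 2)
        sum += (uint32_t)p[i] << 8 | p[i + 1];
    if (len & 1)
        sum += (uint32_t)p[len - 1] << 8;
    return sum;
}

/* Fold a 32-bit sum down to 16 bits. */
static uint16_t fold(uint32_t sum)
{
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)sum;
}

/* LCO: outer csum from the bytes before csum_start plus the complement of
 * the pseudo-header sum seeded in the inner checksum field. */
static uint16_t lco_outer_csum(const uint8_t *l4, size_t csum_start,
                               size_t csum_offset, uint32_t outer_pseudo)
{
    const uint8_t *f = l4 + csum_start + csum_offset;
    uint16_t seeded = (uint16_t)(f[0] << 8 | f[1]);
    uint32_t sum = sum_bytes(l4, csum_start, outer_pseudo);

    sum += (uint16_t)~seeded;
    return (uint16_t)~fold(sum);
}

/* Stand-in for the NIC: finish the inner checksum from csum_start on. */
static void hw_fill_inner_csum(uint8_t *l4, size_t len, size_t csum_start,
                               size_t csum_offset)
{
    uint16_t c = (uint16_t)~fold(sum_bytes(l4 + csum_start,
                                           len - csum_start, 0));

    l4[csum_start + csum_offset] = (uint8_t)(c >> 8);
    l4[csum_start + csum_offset + 1] = (uint8_t)(c & 0xff);
}

/* Returns 1 if the LCO value equals a full recompute over the final bytes. */
static int lco_demo(void)
{
    uint8_t l4[28] = {
        0x12, 0x34, 0x12, 0xb5, 0x00, 0x1c, 0x00, 0x00, /* outer UDP, check 0 */
        0x08, 0x00, 0x00, 0x00, 0x00, 0x00, 0x2a, 0x00, /* tunnel header */
        0xc0, 0x00, 0x00, 0x35, 0x00, 0x0c, 0x4c, 0x1d, /* inner UDP, seeded */
        'p', 'i', 'n', 'g'
    };
    const uint32_t outer_pseudo = 0x1a2b3; /* hypothetical pseudo-hdr sum */
    const uint16_t inner_pseudo = 0x4c1d;  /* hypothetical, seeded above */
    uint16_t lco, full;

    lco = lco_outer_csum(l4, 16, 6, outer_pseudo); /* before HW runs */
    hw_fill_inner_csum(l4, sizeof(l4), 16, 6);
    full = (uint16_t)~fold(sum_bytes(l4, sizeof(l4), outer_pseudo));

    /* The inner segment now verifies against its pseudo-header sum. */
    assert(fold(sum_bytes(l4 + 16, sizeof(l4) - 16, inner_pseudo)) == 0xffff);
    assert(lco == full);
    return lco == full;
}
```

Calling lco_demo() from a small main() exercises both paths. The point
for this thread is that the program has to know which checksum it
computes itself and which one it leaves to the hardware.
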

> >                         eseg->cs_flags |= MLX5_ETH_WQE_L3_INNER_CSUM |
> >                                           MLX5_ETH_WQE_L4_INNER_CSUM;
> >                         sq->stats->csum_partial_inner++;
> >                 } else {
> >                         eseg->cs_flags |= MLX5_ETH_WQE_L4_CSUM;
> >                         sq->stats->csum_partial++;
> >                 }
> >
> > How would you generalize that into csum api that will work across NICs ?
> >
> > My answer stands: you cannot.
> >
> > My proposal again:
> > add driver specific kfuncs and hooks for things like csum.
> >
> > Kuba,
> > since you nacked driver specific stuff please suggest a way to unblock this stalemate.
>
> I hope I'm not misremembering but I think I suggested at the beginning
> to create a structure describing packet geometry and requested offloads,
> and for the prog fill that in.

Hmm, but that's what skb is for. skb == packet geometry ==
layout of headers, payload, inner vs outer, csum partial, gso params.

bpf at the tc layer is supposed to interact with that correctly.
If the packet is modified, the skb geometry should be adjusted accordingly,
like the BPF_F_RECOMPUTE_CSUM flag does in bpf_skb_store_bytes().
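
To illustrate what "adjusted accordingly" means for a checksum, here is
a userspace sketch of the arithmetic only. In the kernel, the flag makes
bpf_skb_store_bytes() update skb->csum around the store for
CHECKSUM_COMPLETE packets; the code below imitates that on a plain
buffer by subtracting the old bytes from a running sum and adding the
new ones. Function names and data are hypothetical, and the store
offset is assumed to be even.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Add a byte range to a running one's-complement sum (big-endian words). */
static uint32_t sum_bytes(const uint8_t *p, size_t len, uint32_t sum)
{
    size_t i;

    for (i = 0; i + 1 < len; i += 2)
        sum += (uint32_t)p[i] << 8 | p[i + 1];
    if (len & 1)
        sum += (uint32_t)p[len - 1] << 8;
    return sum;
}

/* Fold a 32-bit sum down to 16 bits. */
static uint16_t fold(uint32_t sum)
{
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)sum;
}

/* Overwrite len bytes at an even offset and patch the running sum:
 * subtract the old bytes, copy, then add the new bytes. */
static uint16_t store_and_patch(uint8_t *buf, size_t off, const uint8_t *from,
                                size_t len, uint16_t running)
{
    uint32_t sum = running;
    size_t i;

    /* Adding the complement subtracts in one's-complement arithmetic. */
    sum += (uint16_t)~fold(sum_bytes(buf + off, len, 0));
    for (i = 0; i < len; i++)
        buf[off + i] = from[i];
    sum = sum_bytes(buf + off, len, sum);
    return fold(sum);
}

/* Returns 1 if the patched sum equals a full recompute after the store. */
static int rcsum_demo(void)
{
    uint8_t pkt[12] = { 0xde, 0xad, 0xbe, 0xef, 0x01, 0x02,
                        0x03, 0x04, 0x05, 0x06, 0x07, 0x08 };
    const uint8_t repl[3] = { 0xaa, 0xbb, 0xcc };
    uint16_t running = fold(sum_bytes(pkt, sizeof(pkt), 0));
    uint16_t patched, recomputed;

    patched = store_and_patch(pkt, 4, repl, sizeof(repl), running);
    recomputed = fold(sum_bytes(pkt, sizeof(pkt), 0));

    assert(patched == recomputed);
    return patched == recomputed;
}
```

A store that skips the patch step leaves the running sum describing
bytes that are no longer in the packet, which is the inconsistency the
flag exists to prevent.
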

>
> All operating systems I know end up doing that, we'll end up doing
> that as well. The question is whether we're willing to learn from
> experience or prefer to go on a wild ride first...

I don't follow. This thread was aimed at adding xdp layer knobs.
To me XDP is a driver-level thing. 'struct xdp_md' along with
BPF_F_XDP_HAS_FRAGS is the best abstraction we could come up with for
generalizing the dma-buffers (page and multi-page) that drivers operate on.
Everything else at the driver level is too unique to generalize.
The skb layer is already doing its job.

In that sense "generic XDP" is a feature for testing only.
Trying to make "generic XDP" fast is missing the point of XDP.

AF_XDP is a different concept. Exposing timestamp,
csum, TSO to AF_XDP users is a different design challenge.
I'm all for doing that, but trying to combine "timestamp in xdp tx"
and "timestamp in AF_XDP" will lead to bad trade-offs for both,
which I think this patchset demonstrates.

