XDP hardware hints discussion mail archive
 help / color / mirror / Atom feed
From: Stanislav Fomichev <sdf@google.com>
To: Alexander Lobakin <alexandr.lobakin@intel.com>
Cc: Jesper Dangaard Brouer <jbrouer@redhat.com>,
	Martin KaFai Lau <martin.lau@linux.dev>,
	brouer@redhat.com, ast@kernel.org, daniel@iogearbox.net,
	andrii@kernel.org, song@kernel.org, yhs@fb.com,
	john.fastabend@gmail.com, kpsingh@kernel.org, haoluo@google.com,
	jolsa@kernel.org, Jakub Kicinski <kuba@kernel.org>,
	Willem de Bruijn <willemb@google.com>,
	Anatoly Burakov <anatoly.burakov@intel.com>,
	Magnus Karlsson <magnus.karlsson@gmail.com>,
	Maryam Tahhan <mtahhan@redhat.com>,
	xdp-hints@xdp-project.net, netdev@vger.kernel.org,
	bpf@vger.kernel.org
Subject: [xdp-hints] Re: [RFC bpf-next 5/5] selftests/bpf: Test rx_timestamp metadata in xskxceiver
Date: Mon, 31 Oct 2022 10:00:54 -0700	[thread overview]
Message-ID: <CAKH8qBt1qM1n0X5uwxcBph9gLOv3FXR2q11viUoxxn35Z2_=ag@mail.gmail.com> (raw)
In-Reply-To: <20221031142032.164247-1-alexandr.lobakin@intel.com>

On Mon, Oct 31, 2022 at 7:22 AM Alexander Lobakin
<alexandr.lobakin@intel.com> wrote:
>
> From: Stanislav Fomichev <sdf@google.com>
> Date: Fri, 28 Oct 2022 11:46:14 -0700
>
> > On Fri, Oct 28, 2022 at 3:37 AM Jesper Dangaard Brouer
> > <jbrouer@redhat.com> wrote:
> > >
> > >
> > > On 28/10/2022 08.22, Martin KaFai Lau wrote:
> > > > On 10/27/22 1:00 PM, Stanislav Fomichev wrote:
> > > >> Example on how the metadata is prepared from the BPF context
> > > >> and consumed by AF_XDP:
> > > >>
> > > >> - bpf_xdp_metadata_have_rx_timestamp to test whether it's supported;
> > > >>    if not, I'm assuming verifier will remove this "if (0)" branch
> > > >> - bpf_xdp_metadata_rx_timestamp returns a _copy_ of metadata;
> > > >>    the program has to bpf_xdp_adjust_meta+memcpy it;
> > > >>    maybe returning a pointer is better?
> > > >> - af_xdp consumer grabs it from data-<expected_metadata_offset> and
> > > >>    makes sure timestamp is not empty
> > > >> - when loading the program, we pass BPF_F_XDP_HAS_METADATA+prog_ifindex
> > > >>
> > > >> Cc: Martin KaFai Lau <martin.lau@linux.dev>
> > > >> Cc: Jakub Kicinski <kuba@kernel.org>
> > > >> Cc: Willem de Bruijn <willemb@google.com>
> > > >> Cc: Jesper Dangaard Brouer <brouer@redhat.com>
> > > >> Cc: Anatoly Burakov <anatoly.burakov@intel.com>
> > > >> Cc: Alexander Lobakin <alexandr.lobakin@intel.com>
> > > >> Cc: Magnus Karlsson <magnus.karlsson@gmail.com>
> > > >> Cc: Maryam Tahhan <mtahhan@redhat.com>
> > > >> Cc: xdp-hints@xdp-project.net
> > > >> Cc: netdev@vger.kernel.org
> > > >> Signed-off-by: Stanislav Fomichev <sdf@google.com>
> > > >> ---
> > > >>   .../testing/selftests/bpf/progs/xskxceiver.c  | 22 ++++++++++++++++++
> > > >>   tools/testing/selftests/bpf/xskxceiver.c      | 23 ++++++++++++++++++-
> > > >>   2 files changed, 44 insertions(+), 1 deletion(-)
>
> [...]
>
> > > IMHO sizeof() should come from a struct describing data_meta area see:
> > >
> > > https://github.com/xdp-project/bpf-examples/blob/master/AF_XDP-interaction/af_xdp_kern.c#L62
> >
> > I guess I should've used pointers for the return type instead, something like:
> >
> > extern __u64 *bpf_xdp_metadata_rx_timestamp(struct xdp_md *ctx) __ksym;
> >
> > {
> >    ...
> >     __u64 *rx_timestamp = bpf_xdp_metadata_rx_timestamp(ctx);
> >     if (rx_timestamp) {
> >         bpf_xdp_adjust_meta(ctx, -(int)sizeof(*rx_timestamp));
> >         __builtin_memcpy(data_meta, rx_timestamp, sizeof(*rx_timestamp));
> >     }
> > }
> >
> > Does that look better?
>
> I guess it will then be resolved to a direct store, right?
> I mean, to smth like
>
>         if (rx_timestamp)
>                 *(u32 *)data_meta = *rx_timestamp;
>
> , where *rx_timestamp points directly to the Rx descriptor?

Right. I should've used that form from the beginning, that memcpy is
confusing :-(

> >
> > > >> +        if (ret != 0)
> > > >> +            return XDP_DROP;
> > > >> +
> > > >> +        data = (void *)(long)ctx->data;
> > > >> +        data_meta = (void *)(long)ctx->data_meta;
> > > >> +
> > > >> +        if (data_meta + sizeof(__u32) > data)
> > > >> +            return XDP_DROP;
> > > >> +
> > > >> +        rx_timestamp = bpf_xdp_metadata_rx_timestamp(ctx);
> > > >> +        __builtin_memcpy(data_meta, &rx_timestamp, sizeof(__u32));
> > >
> > > So, this approach first stores hints on some other memory location, and
> > > then need to copy over information into data_meta area. That isn't good
> > > from a performance perspective.
> > >
> > > My idea is to store it in the final data_meta destination immediately.
> >
> > This approach doesn't have to store the hints in the other memory
> > location. xdp_buff->priv can point to the real hw descriptor and the
> > kfunc can have a bytecode that extracts the data from the hw
> > descriptors. For this particular RFC, we can think that 'skb' is that
> > hw descriptor for veth driver.
>
> I really do think intermediate stores can be avoided with this
> approach.
> Oh, and BTW, if we plan to use a particular Hint in the BPF program
> only, there's no need to place it in the metadata area at all, is
> there? You only want to get it in your code, so just retrieve it to
> the stack and that's it. data_meta is only for cases when you want
> hints to appear in AF_XDP.

Correct.

> > > Do notice that in my approach, the existing ethtool config setting and
> > > socket options (for timestamps) still apply.  Thus, each individual
> > > hardware hint are already configurable. Thus we already have a config
> > > interface. I do acknowledge, that in-case a feature is disabled it still
> > > takes up space in data_meta areas, but importantly it is NOT stored into
> > > the area (for performance reasons).
> >
> > That should be the case with this rfc as well, isn't it? Worst case
> > scenario, that kfunc bytecode can explicitly check ethtool options and
> > return false if it's disabled?
>
> (to Jesper)
>
> Once again, Ethtool idea doesn't work. Let's say you have roughly
> 50% of frames to be consumed by XDP, other 50% go to skb path and
> the stack. In skb, I want as many HW data as possible: checksums,
> hash and so on. Let's say in XDP prog I want only timestamp. What's
> then? Disable everything but stamp and kill skb path? Enable
> everything and kill XDP path?
> Stanislav's approach allows you to request only fields you need from
> the BPF prog directly, I don't see any reasons for adding one more
> layer "oh no I won't give you checksum because it's disabled via
> Ethtool".
> Maybe I get something wrong, pls explain then :P

Agree, good point.

> > > >> +    }
> > > >
> > > > Thanks for the patches.  I took a quick look at patch 1 and 2 but
> > > > haven't had a chance to look at the implementation details (eg.
> > > > KF_UNROLL...etc), yet.
> > > >
> > >
> > > Yes, thanks for the patches, even-though I don't agree with the
> > > approach, at-least until my concerns/use-case can be resolved.
> > > IMHO the best way to convince people is through code. So, thank you for
> > > the effort.  Hopefully we can use some of these ideas and I can also
> > > change/adjust my XDP-hints ideas to incorporate some of this :-)
> >
> > Thank you for the feedback as well, appreciate it!
> > Definitely, looking forward to a v2 from you with some more clarity on
> > how those btf ids are handled by the bpf/af_xdp side!
> >
> > > > Overall (with the example here) looks promising.  There is a lot of
> > > > flexibility on whether the xdp prog needs any hint at all, which hint it
> > > > needs, and how to store it.
> > > >
> > >
> > > I do see the advantage that XDP prog only populates metadata it needs.
> > > But how can we use/access this in __xdp_build_skb_from_frame() ?
> >
> > I don't think __xdp_build_skb_from_frame is automagically solved by
> > either proposal?
>
> It's solved in my approach[0], so that __xdp_build_skb_from_frame()
> are automatically get skb fields populated with the metadata if
> available. But I always use a fixed generic structure, which can't
> compete with your series in terms of flexibility (but solves stuff
> like inter-vendor redirects and so on).
> So in general I feel like there should be 2 options for metadata for
> users:
>
> 1) I use one particular vendor and I always compile AF_XDP programs
>    from fresh source code. I need to read/write only fields I want
>    to. I'd go with kfunc or kptr here (but I don't think BPF progs
>    should parse descriptor formats on their own, so your unroll NDO
>    approach looks optimal for me for that case);
> 2) I use multiple vendors, pre-compiled AF_XDP programs or just old
>    source code, I use veth and/or cpumap. So it's sorta
>    back-forward-left-right-compatibility path. So here we could just
>    use a fixed structure.

For (2) I really like Toke's suggestion about some extra helper that
prepares the metadata that the kernel path will later on be able to
understand.
The only downside I see is that it has to be called explicitly, but if
we assume that libxdp can also abstract this detail, doesn't sound
like a huge issue to me?

> > For this proposal, there has to be some expected kernel metadata
> > format that bpf programs will prepare and the kernel will understand?
> > Think of it like xdp_hints_common from your proposal; the program will
> > have to put together xdp_hints_skb into xdp metadata with the parts
> > that can be populated into skb by the kernel.
> >
> > For your btf ids proposal, it seems there has to be some extra kernel
> > code to parse all possible driver btf_if formats and copy the
> > metadata?
>
> That's why I define a "generic" struct, so that its consumers
> wouldn't have to if-else through a dozen of possible IDs :P
>
> [...]
>
> Great stuff from my PoV, I'd probably like to have some helpers for
> writing this new NDO, so that small vendors wouldn't be afraid of
> implementing it as Jakub mentioned. But still sorta optimal and
> elegant for me, I'm not sure I want to post a "demo version" of my
> series anymore :D
> I feel like this way + one more "everything-compat-fixed" couple
> would satisfy most of potential users.

Awesome, thanks for the review and the feedback!

  parent reply	other threads:[~2022-10-31 17:01 UTC|newest]

Thread overview: 50+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-10-27 20:00 [xdp-hints] [RFC bpf-next 0/5] xdp: hints via kfuncs Stanislav Fomichev
2022-10-27 20:00 ` [xdp-hints] [RFC bpf-next 1/5] bpf: Support inlined/unrolled kfuncs for xdp metadata Stanislav Fomichev
2022-10-27 20:00 ` [xdp-hints] [RFC bpf-next 2/5] veth: Support rx timestamp metadata for xdp Stanislav Fomichev
2022-10-28  8:40   ` [xdp-hints] " Jesper Dangaard Brouer
2022-10-28 18:46     ` Stanislav Fomichev
2022-10-27 20:00 ` [xdp-hints] [RFC bpf-next 3/5] libbpf: Pass prog_ifindex via bpf_object_open_opts Stanislav Fomichev
2022-10-27 20:05   ` [xdp-hints] " Andrii Nakryiko
2022-10-27 20:10     ` Stanislav Fomichev
2022-10-27 20:00 ` [xdp-hints] [RFC bpf-next 4/5] selftests/bpf: Convert xskxceiver to use custom program Stanislav Fomichev
2022-10-27 20:00 ` [xdp-hints] [RFC bpf-next 5/5] selftests/bpf: Test rx_timestamp metadata in xskxceiver Stanislav Fomichev
2022-10-28  6:22   ` [xdp-hints] " Martin KaFai Lau
2022-10-28 10:37     ` Jesper Dangaard Brouer
2022-10-28 18:46       ` Stanislav Fomichev
2022-10-31 14:20         ` Alexander Lobakin
2022-10-31 14:29           ` Alexander Lobakin
2022-10-31 17:00           ` Stanislav Fomichev [this message]
2022-11-01 13:18             ` Jesper Dangaard Brouer
2022-11-01 20:12               ` Stanislav Fomichev
2022-11-01 22:23               ` Toke Høiland-Jørgensen
2022-10-28 15:58 ` [xdp-hints] Re: [RFC bpf-next 0/5] xdp: hints via kfuncs John Fastabend
2022-10-28 18:04   ` Jakub Kicinski
2022-10-28 18:46     ` Stanislav Fomichev
2022-10-28 23:16       ` John Fastabend
2022-10-29  1:14         ` Jakub Kicinski
2022-10-31 14:10           ` Bezdeka, Florian
2022-10-31 15:28             ` Toke Høiland-Jørgensen
2022-10-31 17:00               ` Stanislav Fomichev
2022-10-31 22:57                 ` Martin KaFai Lau
2022-11-01  1:59                   ` Stanislav Fomichev
2022-11-01 12:52                     ` Toke Høiland-Jørgensen
2022-11-01 13:43                       ` David Ahern
2022-11-01 14:20                         ` Toke Høiland-Jørgensen
2022-11-01 17:05                     ` Martin KaFai Lau
2022-11-01 20:12                       ` Stanislav Fomichev
2022-11-02 14:06                       ` Jesper Dangaard Brouer
2022-11-02 22:01                         ` Toke Høiland-Jørgensen
2022-11-02 23:10                           ` Stanislav Fomichev
2022-11-03  0:09                             ` Toke Høiland-Jørgensen
2022-11-03 12:01                               ` Jesper Dangaard Brouer
2022-11-03 12:48                                 ` Toke Høiland-Jørgensen
2022-11-03 15:25                                   ` Jesper Dangaard Brouer
2022-10-31 19:36               ` Yonghong Song
2022-10-31 22:09                 ` Stanislav Fomichev
2022-10-31 22:38                   ` Yonghong Song
2022-10-31 22:55                     ` Stanislav Fomichev
2022-11-01 14:23                       ` Jesper Dangaard Brouer
2022-11-01 17:31                   ` Martin KaFai Lau
2022-11-01 20:12                     ` Stanislav Fomichev
2022-11-01 21:17                       ` Martin KaFai Lau
2022-10-31 17:01           ` John Fastabend

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://lists.xdp-project.net/postorius/lists/xdp-hints.xdp-project.net/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAKH8qBt1qM1n0X5uwxcBph9gLOv3FXR2q11viUoxxn35Z2_=ag@mail.gmail.com' \
    --to=sdf@google.com \
    --cc=alexandr.lobakin@intel.com \
    --cc=anatoly.burakov@intel.com \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=brouer@redhat.com \
    --cc=daniel@iogearbox.net \
    --cc=haoluo@google.com \
    --cc=jbrouer@redhat.com \
    --cc=john.fastabend@gmail.com \
    --cc=jolsa@kernel.org \
    --cc=kpsingh@kernel.org \
    --cc=kuba@kernel.org \
    --cc=magnus.karlsson@gmail.com \
    --cc=martin.lau@linux.dev \
    --cc=mtahhan@redhat.com \
    --cc=netdev@vger.kernel.org \
    --cc=song@kernel.org \
    --cc=willemb@google.com \
    --cc=xdp-hints@xdp-project.net \
    --cc=yhs@fb.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox