From: John Fastabend <john.fastabend@gmail.com>
To: Stanislav Fomichev <sdf@google.com>, Jakub Kicinski <kuba@kernel.org>
Cc: John Fastabend <john.fastabend@gmail.com>,
bpf@vger.kernel.org, ast@kernel.org, daniel@iogearbox.net,
andrii@kernel.org, martin.lau@linux.dev, song@kernel.org,
yhs@fb.com, kpsingh@kernel.org, haoluo@google.com,
jolsa@kernel.org, Willem de Bruijn <willemb@google.com>,
Jesper Dangaard Brouer <brouer@redhat.com>,
Anatoly Burakov <anatoly.burakov@intel.com>,
Alexander Lobakin <alexandr.lobakin@intel.com>,
Magnus Karlsson <magnus.karlsson@gmail.com>,
Maryam Tahhan <mtahhan@redhat.com>,
xdp-hints@xdp-project.net, netdev@vger.kernel.org
Subject: [xdp-hints] Re: [RFC bpf-next 0/5] xdp: hints via kfuncs
Date: Fri, 28 Oct 2022 16:16:17 -0700
Message-ID: <635c62c12652d_b1ba208d0@john.notmuch>
In-Reply-To: <CAKH8qBshi5dkhqySXA-Rg66sfX0-eTtVYz1ymHfBxSE=Mt2duA@mail.gmail.com>
Stanislav Fomichev wrote:
> On Fri, Oct 28, 2022 at 11:05 AM Jakub Kicinski <kuba@kernel.org> wrote:
> >
> > On Fri, 28 Oct 2022 08:58:18 -0700 John Fastabend wrote:
> > > A bit of extra commentary. By exposing the raw kptr to the rx
> > > descriptor we don't need driver writers to do anything.
> > > And can easily support all the drivers out the gate with simple
> > > one or two line changes. This pushes the interesting parts
> > > into userspace and then BPF writers get to do the work without
> > > bother driver folks and also if its not done today it doesn't
> > > matter because user space can come along and make it work
> > > later. So no scattered kernel dependencies which I really
> > > would like to avoid here. Its actually very painful to have
> > > to support clusters with N kernels and M devices if they
> > > have different features. Doable but annoying and much nicer
> > > if we just say 6.2 has support for kptr rx descriptor reading
> > > and all XDP drivers support it. So timestamp, rxhash work
> > > across the board.
> >
> > IMHO that's a bit of wishful thinking. Driver support is just a small
> > piece, you'll have different HW and FW versions, feature conflicts etc.
> > In the end kernel version is just one variable and there are many others
> > you'll already have to track.
Agree.
> >
> > And it's actually harder to abstract away inter HW generation
> > differences if the user space code has to handle all of it.
I don't see how it's any harder in practice, though?
>
> I've had the same concern:
>
> Until we have some userspace library that abstracts all these details,
> it's not really convenient to use. IIUC, with a kptr, I'd get a blob
> of data and I need to go through the code and see what particular type
> it represents for my particular device and how the data I need is
> represented there. There are also these "if this is device v1 -> use
> v1 descriptor format; if it's a v2->use this another struct; etc"
> complexities that we'll be pushing onto the users. With kfuncs, we put
> this burden on the driver developers, but I agree that the drawback
> here is that we actually have to wait for the implementations to catch
> up.
I agree with everything there: you will get a blob of data and then
will need to know which field you want to read, using BTF. But we
already do this for BPF programs all over the place, so it's not a big
lift for us. All other BPF tracing/observability requires the same
logic. I think XDP/tc are perhaps the only places left where users
can write BPF programs without thinking about BTF and kernel data
structures.
But with the proposed kptr, the complexity lives in userspace and can be
fixed, added to, or updated without having to bother with kernel updates,
etc. From my point of view of supporting Cilium, it's a win and much
preferred to having to deal with driver owners across all cloud vendors,
distributions, and so on.
If a vendor updates firmware with new fields, I get those immediately.
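To make the "blob plus per-device layout" idea concrete, here is a
rough userspace sketch in plain C. It is not BPF and not real driver
code: the two descriptor structs, the demo_rx_layout table, and the
demo_read_* helpers are all hypothetical names made up for
illustration. In an actual BPF program the offsets would presumably
come from BTF/CO-RE relocations at load time rather than offsetof().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical descriptor layouts for two HW/FW generations.
 * These are NOT real driver structs; they only show that the same
 * field can sit at different offsets. */
struct demo_rx_desc_v1 {
	uint32_t status;
	uint32_t rxhash;
	uint64_t timestamp;
};

struct demo_rx_desc_v2 {
	uint64_t timestamp;
	uint32_t status;
	uint32_t flags;
	uint32_t rxhash;
};

/* Field offsets resolved once per device (assumption: this stands in
 * for what BTF relocations would provide). */
struct demo_rx_layout {
	size_t rxhash_off;
	size_t timestamp_off;
	size_t desc_size;
};

static const struct demo_rx_layout demo_layout_v1 = {
	.rxhash_off    = offsetof(struct demo_rx_desc_v1, rxhash),
	.timestamp_off = offsetof(struct demo_rx_desc_v1, timestamp),
	.desc_size     = sizeof(struct demo_rx_desc_v1),
};

static const struct demo_rx_layout demo_layout_v2 = {
	.rxhash_off    = offsetof(struct demo_rx_desc_v2, rxhash),
	.timestamp_off = offsetof(struct demo_rx_desc_v2, timestamp),
	.desc_size     = sizeof(struct demo_rx_desc_v2),
};

/* Copy the rxhash out of a raw descriptor blob; -1 if the blob is
 * shorter than the layout expects. */
static int demo_read_rxhash(const struct demo_rx_layout *l,
			    const void *desc, size_t len, uint32_t *out)
{
	assert(l && desc && out);
	if (len < l->desc_size)
		return -1;
	memcpy(out, (const uint8_t *)desc + l->rxhash_off, sizeof(*out));
	return 0;
}

/* Same idea for the timestamp field. */
static int demo_read_timestamp(const struct demo_rx_layout *l,
			       const void *desc, size_t len, uint64_t *out)
{
	assert(l && desc && out);
	if (len < l->desc_size)
		return -1;
	memcpy(out, (const uint8_t *)desc + l->timestamp_off, sizeof(*out));
	return 0;
}
```

The reader code is identical for both generations; only the layout
table differs, and that table is the part that could be updated from
userspace without touching the kernel.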
>
> Jakub mentions FW and I haven't even thought about that; so yeah, bpf
> programs might have to take a lot of other state into consideration
> when parsing the descriptors; all those details do seem like they
> belong to the driver code.
I would prefer to avoid being stuck on requiring driver writers to
be involved. With just a kptr I can support the device and any
firmware versions without requiring help.
>
> Feel free to send it early with just a handful of drivers implemented;
> I'm more interested about bpf/af_xdp/user api story; if we have some
> nice sample/test case that shows how the metadata can be used, that
> might push us closer to the agreement on the best way to proceed.
I'll try to do an Intel and mlx implementation to get a cross section.
I have a good collection of NICs here, so I should be able to show a
couple of firmware versions. I think it could be fine to have the raw
kptr access and then perhaps also kfuncs for some things.
>
>
>
> > > To find the offset of fields (rxhash, timestamp) you can use
> > > standard BTF relocations we have all this machinery built up
> > > already for all the other structs we read, net_devices, task
> > > structs, inodes, ... so its not a big hurdle at all IMO. We
> > > can add userspace libs if folks really care, but its just a read so
> > > I'm not even sure that is helpful.
> > >
> > > I think its nicer than having kfuncs that need to be written
> > > everywhere. My $.02 although I'll poke around with below
> > > some as well. Feel free to just hang tight until I have some
> > > code at the moment I have intel, mellanox drivers that I
> > > would want to support.
> >
> > I'd prefer if we left the door open for new vendors. Punting descriptor
> > parsing to user space will indeed result in what you just said - major
> > vendors are supported and that's it.
I'm not sure why it would be harder for new vendors? I think the
opposite: it would be easier because I don't need vendor support
at all. Thinking it over, it seems there could be room for both.
Thanks!