XDP hardware hints discussion mail archive
From: Jakub Kicinski <kuba@kernel.org>
To: John Fastabend <john.fastabend@gmail.com>
Cc: Stanislav Fomichev <sdf@google.com>,
	bpf@vger.kernel.org, ast@kernel.org, daniel@iogearbox.net,
	andrii@kernel.org, martin.lau@linux.dev, song@kernel.org,
	yhs@fb.com, kpsingh@kernel.org, haoluo@google.com,
	jolsa@kernel.org, Willem de Bruijn <willemb@google.com>,
	Jesper Dangaard Brouer <brouer@redhat.com>,
	Anatoly Burakov <anatoly.burakov@intel.com>,
	Alexander Lobakin <alexandr.lobakin@intel.com>,
	Magnus Karlsson <magnus.karlsson@gmail.com>,
	Maryam Tahhan <mtahhan@redhat.com>,
	xdp-hints@xdp-project.net, netdev@vger.kernel.org
Subject: [xdp-hints] Re: [RFC bpf-next 0/5] xdp: hints via kfuncs
Date: Fri, 28 Oct 2022 18:14:31 -0700
Message-ID: <20221028181431.05173968@kernel.org>
In-Reply-To: <635c62c12652d_b1ba208d0@john.notmuch>

On Fri, 28 Oct 2022 16:16:17 -0700 John Fastabend wrote:
> > > And it's actually harder to abstract away inter HW generation
> > > differences if the user space code has to handle all of it.  
> 
> I don't see how it's any harder in practice though?

You need to find out what HW/FW/config you're running, right?
And all you have is a pointer to a blob of unknown type.

Take timestamps for example, some NICs support adjusting the PHC 
or doing SW corrections (with different versions of hw/fw/server
platforms being capable of both/one/neither).

Sure you can extract all this info with tracing and careful
inspection via uAPI. But I don't think that's _easier_.
And the vendors can't run the results thru their validation 
(for whatever that's worth).

> > I've had the same concern:
> > 
> > Until we have some userspace library that abstracts all these details,
> > it's not really convenient to use. IIUC, with a kptr, I'd get a blob
> > of data and I need to go through the code and see what particular type
> > it represents for my particular device and how the data I need is
> > represented there. There are also these "if this is device v1 -> use
> > v1 descriptor format; if it's a v2->use this another struct; etc"
> > complexities that we'll be pushing onto the users. With kfuncs, we put
> > this burden on the driver developers, but I agree that the drawback
> > here is that we actually have to wait for the implementations to catch
> > up.  
> 
> I agree with everything there, you will get a blob of data and then
> will need to know what field you want to read using BTF. But, we
> already do this for BPF programs all over the place so it's not a big
> lift for us. All other BPF tracing/observability requires the same
> logic. I think users of BPF in general perhaps XDP/tc are the only
> place left to write BPF programs without thinking about BTF and
> kernel data structures.
> 
> But, with proposed kptr the complexity lives in userspace and can be
> fixed, added, updated without having to bother with kernel updates, etc.
> From my point of view of supporting Cilium it's a win and much preferred
> to having to deal with driver owners on all cloud vendors, distributions,
> and so on.
> 
> If vendor updates firmware with new fields I get those immediately.

Conversely it's a valid concern that those who *do* actually update
their kernel regularly will have more things to worry about.

> > Jakub mentions FW and I haven't even thought about that; so yeah, bpf
> > programs might have to take a lot of other state into consideration
> > when parsing the descriptors; all those details do seem like they
> > belong to the driver code.  
> 
> I would prefer to avoid being stuck on requiring driver writers to
> be involved. With just a kptr I can support the device and any
> firmware versions without requiring help.

1) where are you getting all those HW / FW specs :S
2) maybe *you* can but you're not exactly not an ex-driver developer :S

> > Feel free to send it early with just a handful of drivers implemented;
> > I'm more interested about bpf/af_xdp/user api story; if we have some
> > nice sample/test case that shows how the metadata can be used, that
> > might push us closer to the agreement on the best way to proceed.  
> 
> I'll try to do an Intel and mlx implementation to get a cross section.
> I have a good collection of nics here so should be able to show a
> couple firmware versions. It could be fine I think to have the raw
> kptr access and then also kfuncs for some things perhaps.
> 
> > > I'd prefer if we left the door open for new vendors. Punting descriptor
> > > parsing to user space will indeed result in what you just said - major
> > > vendors are supported and that's it.  
> 
> I'm not sure about why it would make it harder for new vendors? I think
> the opposite, 

TBH I'm only replying to the email because of the above part :)
I thought this would be self evident, but I guess our perspectives 
are different.

Perhaps you look at it from the perspective of SW running on someone
else's cloud, and being able to move to another cloud, without having 
to worry if feature X is available in xdp or just skb.

I look at it from the perspective of maintaining a cloud, with people
writing random XDP applications. If I swap a NIC from an incumbent to a
(superior) startup, and cloud users are messing with raw descriptor -
I'd need to go find every XDP program out there and make sure it
understands the new descriptors.

There is a BPF foundation or whatnot now - what about starting a
certification program for cloud providers and making it clear what
features must be supported to be compatible with XDP 1.0, XDP 2.0 etc?

> it would be easier because I don't need vendor support at all.

Can you support the enfabrica NIC on day 1? :) To an extent, it's just
shifting the responsibility from the HW vendor to the middleware vendor.

> Thinking it over seems there could be room for both.

Are you thinking more or less Stan's proposal but with one of 
the callbacks being "give me the raw thing"? Probably as a ro dynptr?
Possible, but I don't think we need to hold off Stan's work.
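
The trade-off discussed above can be sketched in plain userspace C. This is
only an illustration, not code from the patch series: the descriptor layouts,
the fake_* names, the 4 ns tick size and the ops table are all hypothetical.
It contrasts a consumer that parses a raw descriptor blob itself (and so must
know the hardware generation) with a per-device callback table that hides the
layout and, optionally, also offers a read-only "give me the raw thing"
accessor.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical RX descriptor layouts for two generations of an imaginary NIC. */
struct fake_desc_v1 {
	uint32_t ts_lo;    /* timestamp in ns, low 32 bits */
	uint32_t ts_hi;    /* timestamp in ns, high 32 bits */
};

struct fake_desc_v2 {
	uint16_t flags;    /* FAKE_V2_TS_VALID when ts_ticks is usable */
	uint16_t reserved;
	uint32_t pad;
	uint64_t ts_ticks; /* device ticks; 4 ns per tick is an assumption */
};

enum fake_hw_gen { FAKE_GEN_V1 = 1, FAKE_GEN_V2 = 2 };

#define FAKE_V2_TS_VALID    0x1
#define FAKE_V2_NS_PER_TICK 4

/* Raw-blob approach: the consumer must know which generation it is on. */
static int raw_parse_rx_timestamp(enum fake_hw_gen gen, const void *blob,
				  size_t len, uint64_t *ns)
{
	switch (gen) {
	case FAKE_GEN_V1: {
		struct fake_desc_v1 d;

		if (len < sizeof(d))
			return -1;
		memcpy(&d, blob, sizeof(d));
		*ns = ((uint64_t)d.ts_hi << 32) | d.ts_lo;
		return 0;
	}
	case FAKE_GEN_V2: {
		struct fake_desc_v2 d;

		if (len < sizeof(d))
			return -1;
		memcpy(&d, blob, sizeof(d));
		if (!(d.flags & FAKE_V2_TS_VALID))
			return -1;
		*ns = d.ts_ticks * FAKE_V2_NS_PER_TICK;
		return 0;
	}
	}
	return -1; /* unknown generation */
}

/* Callback approach: the "driver" side owns the layout knowledge. */
struct fake_md_ops {
	int (*rx_timestamp)(const void *blob, size_t len, uint64_t *ns);
	/* Optional read-only view of the raw descriptor. */
	const void *(*raw)(const void *blob, size_t len, size_t *out_len);
};

static int fake_v1_rx_timestamp(const void *blob, size_t len, uint64_t *ns)
{
	return raw_parse_rx_timestamp(FAKE_GEN_V1, blob, len, ns);
}

static int fake_v2_rx_timestamp(const void *blob, size_t len, uint64_t *ns)
{
	return raw_parse_rx_timestamp(FAKE_GEN_V2, blob, len, ns);
}

static const void *fake_raw(const void *blob, size_t len, size_t *out_len)
{
	*out_len = len;
	return blob;
}

static const struct fake_md_ops fake_v1_ops = {
	.rx_timestamp = fake_v1_rx_timestamp,
	.raw = fake_raw,
};

static const struct fake_md_ops fake_v2_ops = {
	.rx_timestamp = fake_v2_rx_timestamp,
	.raw = fake_raw,
};

/* Consumer code: identical for every device, no layout knowledge needed. */
static int consumer_get_timestamp(const struct fake_md_ops *ops,
				  const void *blob, size_t len, uint64_t *ns)
{
	assert(ops != NULL);
	if (!ops->rx_timestamp)
		return -1; /* device does not expose this hint */
	return ops->rx_timestamp(blob, len, ns);
}
```

In this sketch, swapping fake_v1_ops for fake_v2_ops changes nothing in
consumer_get_timestamp(), whereas a caller of raw_parse_rx_timestamp() has to
learn the generation from somewhere and be updated for every new layout.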


Thread overview: 50+ messages
2022-10-27 20:00 [xdp-hints] " Stanislav Fomichev
2022-10-27 20:00 ` [xdp-hints] [RFC bpf-next 1/5] bpf: Support inlined/unrolled kfuncs for xdp metadata Stanislav Fomichev
2022-10-27 20:00 ` [xdp-hints] [RFC bpf-next 2/5] veth: Support rx timestamp metadata for xdp Stanislav Fomichev
2022-10-28  8:40   ` [xdp-hints] " Jesper Dangaard Brouer
2022-10-28 18:46     ` Stanislav Fomichev
2022-10-27 20:00 ` [xdp-hints] [RFC bpf-next 3/5] libbpf: Pass prog_ifindex via bpf_object_open_opts Stanislav Fomichev
2022-10-27 20:05   ` [xdp-hints] " Andrii Nakryiko
2022-10-27 20:10     ` Stanislav Fomichev
2022-10-27 20:00 ` [xdp-hints] [RFC bpf-next 4/5] selftests/bpf: Convert xskxceiver to use custom program Stanislav Fomichev
2022-10-27 20:00 ` [xdp-hints] [RFC bpf-next 5/5] selftests/bpf: Test rx_timestamp metadata in xskxceiver Stanislav Fomichev
2022-10-28  6:22   ` [xdp-hints] " Martin KaFai Lau
2022-10-28 10:37     ` Jesper Dangaard Brouer
2022-10-28 18:46       ` Stanislav Fomichev
2022-10-31 14:20         ` Alexander Lobakin
2022-10-31 14:29           ` Alexander Lobakin
2022-10-31 17:00           ` Stanislav Fomichev
2022-11-01 13:18             ` Jesper Dangaard Brouer
2022-11-01 20:12               ` Stanislav Fomichev
2022-11-01 22:23               ` Toke Høiland-Jørgensen
2022-10-28 15:58 ` [xdp-hints] Re: [RFC bpf-next 0/5] xdp: hints via kfuncs John Fastabend
2022-10-28 18:04   ` Jakub Kicinski
2022-10-28 18:46     ` Stanislav Fomichev
2022-10-28 23:16       ` John Fastabend
2022-10-29  1:14         ` Jakub Kicinski [this message]
2022-10-31 14:10           ` Bezdeka, Florian
2022-10-31 15:28             ` Toke Høiland-Jørgensen
2022-10-31 17:00               ` Stanislav Fomichev
2022-10-31 22:57                 ` Martin KaFai Lau
2022-11-01  1:59                   ` Stanislav Fomichev
2022-11-01 12:52                     ` Toke Høiland-Jørgensen
2022-11-01 13:43                       ` David Ahern
2022-11-01 14:20                         ` Toke Høiland-Jørgensen
2022-11-01 17:05                     ` Martin KaFai Lau
2022-11-01 20:12                       ` Stanislav Fomichev
2022-11-02 14:06                       ` Jesper Dangaard Brouer
2022-11-02 22:01                         ` Toke Høiland-Jørgensen
2022-11-02 23:10                           ` Stanislav Fomichev
2022-11-03  0:09                             ` Toke Høiland-Jørgensen
2022-11-03 12:01                               ` Jesper Dangaard Brouer
2022-11-03 12:48                                 ` Toke Høiland-Jørgensen
2022-11-03 15:25                                   ` Jesper Dangaard Brouer
2022-10-31 19:36               ` Yonghong Song
2022-10-31 22:09                 ` Stanislav Fomichev
2022-10-31 22:38                   ` Yonghong Song
2022-10-31 22:55                     ` Stanislav Fomichev
2022-11-01 14:23                       ` Jesper Dangaard Brouer
2022-11-01 17:31                   ` Martin KaFai Lau
2022-11-01 20:12                     ` Stanislav Fomichev
2022-11-01 21:17                       ` Martin KaFai Lau
2022-10-31 17:01           ` John Fastabend
