[xdp-hints] Re: XDP-hints via local BTF info

XDP hardware hints discussion mail archive

From: "Toke Høiland-Jørgensen" <toke@redhat.com>
To: John Fastabend <john.fastabend@gmail.com>,
	Jesper Dangaard Brouer <jbrouer@redhat.com>,
	"Karlsson, Magnus" <magnus.karlsson@intel.com>,
	"Desouza, Ederson" <ederson.desouza@intel.com>
Cc: brouer@redhat.com,
	"xdp-hints@xdp-project.net" <xdp-hints@xdp-project.net>,
	Eelco Chaudron <echaudro@redhat.com>,
	Andrii Nakryiko <andrii@kernel.org>,
	"Fijalkowski, Maciej" <maciej.fijalkowski@intel.com>,
	"Burakov, Anatoly" <anatoly.burakov@intel.com>
Subject: [xdp-hints] Re: XDP-hints via local BTF info
Date: Fri, 19 Nov 2021 15:53:39 +0100
Message-ID: <871r3cdwng.fsf@toke.dk>
In-Reply-To: <61966ec0722fe_2f3212080@john.notmuch>

Just a few additional comments, as I think y'all mostly covered everything:

>> >> However, the
>> >> *format* for this configuration could very well be BTF-based, so userspace
>> >> can get whatever format it wants, assuming the hardware supports it.
>> >>
>> >> So, say we have this fancy programmable hardware, and we write a program
>> >> with a struct definition like:
>> >>
>> >> struct my_meta_format {
>> >>         __u64 rx_timestamp;
>> >>         __u64 magic_colour_of_packet;
>> >>         __u32 btf_id;
>> >> };
>> >>
>> >> and from userspace we can then do:
>> >>
>> >> dev_metadata_configure(ifindex, BTF_OF_STRUCT(my_meta_format));
>
> I have some doubts/questions about complexity on firmware/driver side
> to consume such sparse info and create such complex reconfig of hw.
> But, maybe some simple pattern matching would be sufficient on hw side
> and useful to get things moving forward.

Just a quick note on this: if we're using BTF as a configuration format,
that's basically just another way of passing in a list of metadata item
names + data types, and their order. So the above would tell the
hardware (or driver) to enable "u64 rx_timestamp" and "u64
magic_colour_of_packet", where the only way the driver could figure out
what that's supposed to mean is by string matching on the names.

We could of course provide some common names in the core that many
drivers could support, but my main point with this is that BTF is "just"
a convenient format to pack this list into via a struct definition, it's
not magic faerie dust that makes sure there's also a *semantic* match. :)
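
To illustrate the string-matching point, here is a minimal sketch of
what a driver-side check could look like. Everything in it is
hypothetical: struct meta_req, struct meta_cap, drv_caps and
drv_match_request() are made-up names, not an existing kernel or libbpf
API, and the request array stands in for whatever the driver would get
from walking the members of the BTF struct passed in from userspace.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical: one requested metadata item, as a driver might see it
 * after walking the members of the BTF struct from userspace. */
struct meta_req {
	const char *name;	/* member name, e.g. "rx_timestamp" */
	unsigned int size;	/* member size in bytes */
};

/* Hypothetical: one hint this driver knows how to produce. */
struct meta_cap {
	const char *name;
	unsigned int size;
};

/* Assumed capability table; the entries are illustrative only. */
static const struct meta_cap drv_caps[] = {
	{ "rx_timestamp", 8 },
	{ "rx_hash", 4 },
	{ "btf_id", 4 },
};

/* Return the index of the first requested item the driver cannot
 * satisfy, or -1 if every item matches a capability by name and size. */
static int drv_match_request(const struct meta_req *req, size_t n)
{
	size_t ncaps = sizeof(drv_caps) / sizeof(drv_caps[0]);

	for (size_t i = 0; i < n; i++) {
		int found = 0;

		for (size_t j = 0; j < ncaps; j++) {
			if (!strcmp(req[i].name, drv_caps[j].name) &&
			    req[i].size == drv_caps[j].size) {
				found = 1;
				break;
			}
		}
		if (!found)
			return (int)i;
	}
	return -1;
}
```

With this table, a request for "rx_timestamp" and "btf_id" is accepted,
while "magic_colour_of_packet" is rejected because nothing in the driver
recognises that name. The match is purely textual: nothing here checks
that the driver's "rx_timestamp" means what the application thinks it
means.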

> Seeing real hardware with support here would be great.

I don't think the BTF support has to go all the way to the hardware, a
driver could support this format just fine today (cf the above).

>> I've also been down the same rabbit hole, wanting userspace to define
>> BTF layout as the config interface that HW will get reconfigured via.
>> I no longer believe in this mode.  One reason is the existing config
>> interfaces that enable/disable NIC HW features.
>>
>> One way we can allow userspace to define the contents of the XDP-hints
>> struct, not the HW config, is to add this new BPF 'hints-hook'.
>> Userspace can query the BPF-prog loaded in the 'hints-hook' and see what
>> BTF structs it provides.
>> This is similar to what I do for AF_XDP in [1], as the XDP BPF-prog
>> defines the layout and AF_XDP userspace queries the BTF available.
>
> What I expected, but it didn't happen yet(?), is that first users would
> go a different route. The way I see it is, hw vendor can configure the
> NIC to put any hints they like in the header via firmware update. The
> user space would understand the layout of the hints because it
> programmed these hints. In general it's not very friendly for
> distributions and their end users, but for a DPDK user running on top
> of AF_XDP this would be all that's needed. Or an embedded end system at
> a telco or POC on IDS would work.

People could still do this, of course, but I view the BTF layout stuff
mostly as a way to make something like this nicer to consume: you'll
be able to have essentially the same workflow, but you have
introspection of the result so you can verify you don't have a
misconfiguration somewhere, etc.
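
As a rough sketch of what that introspection check could look like on
the application side: the program compares the layout it was compiled
against with the layout the driver reports. The struct exported_member
array is a hypothetical, flattened stand-in for whatever would be read
out of the exported BTF, and member_matches(), CHECK_MEMBER() and
layout_is_expected() are illustrative names only. uint64_t/uint32_t are
used in place of __u64/__u32 so the sketch builds outside the kernel
tree.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* The layout the application was compiled against. */
struct my_meta_format {
	uint64_t rx_timestamp;
	uint64_t magic_colour_of_packet;
	uint32_t btf_id;
};

/* Hypothetical flattened view of one member exported by the driver. */
struct exported_member {
	const char *name;
	size_t offset;	/* byte offset within the metadata area */
	size_t size;	/* size in bytes */
};

/* Return 1 if a member called 'name' exists with exactly this offset
 * and size, 0 otherwise. */
static int member_matches(const struct exported_member *m, size_t n,
			  const char *name, size_t offset, size_t size)
{
	for (size_t i = 0; i < n; i++) {
		if (!strcmp(m[i].name, name))
			return m[i].offset == offset && m[i].size == size;
	}
	return 0;
}

/* Compare one field of the local struct against the exported layout. */
#define CHECK_MEMBER(m, n, type, field)				\
	member_matches(m, n, #field, offsetof(type, field),	\
		       sizeof(((type *)0)->field))

/* Return 1 only if every field the application relies on sits where
 * the application expects it. */
static int layout_is_expected(const struct exported_member *m, size_t n)
{
	return CHECK_MEMBER(m, n, struct my_meta_format, rx_timestamp) &&
	       CHECK_MEMBER(m, n, struct my_meta_format, magic_colour_of_packet) &&
	       CHECK_MEMBER(m, n, struct my_meta_format, btf_id);
}
```

An application could run such a check once at startup and refuse to
start on a mismatch, instead of silently reading garbage out of the
metadata area.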

>>> > It would be great if we could know it is fixed, but I do not
>>> > understand how the user can know this, especially since the
>>> > control of this is out-of-band. How would we deal with the
>>> > following scenario?
>> > 
>> > App 1 comes up, opens up an AF_XDP socket and requests metadata_1
>> > App 2 comes up, opens up another AF_XDP socket on the same netdev and requests metadata_2
>> > 
>> > We can provide the apps with two different btf_ids, but is this
>> > something that an existing driver can support and how does this
>> > scale as we add sockets and different usages of metadata? Note that
>> > we have no idea what the destination is until after we have
>> > executed our XDP program and potentially used the metadata area
>> > there. But our population of the metadata field is before the XDP
>> > program. Kind of chicken and egg.
>> > 
>> > The idea of a separate metadata population hook point on the
>> > netdev/queue_id level could potentially solve this. Well, as long
>> > as you are not attaching several sockets to the same netdev and
>> > queue_id, but that is rare.
>
> Interesting, but I would get basic single config working first. If user
> really wants multiple configs then I would guess the NIC might partition
> the hardware into VFs or virtual interfaces of some kind.

Or manually configure the metadata to be the union of what the two
applications require. I don't think that's completely unreasonable,
actually: for instance, a web server still expects the network
interfaces to have IP addresses assigned before it starts up, and I view
this as similar.

So, if App 1 requires metadata X and Y, and App 2 requires Y and Z, the
administrator would enable all three, and the apps would both be able to
find what they need because the BTF exported by the driver tells them
where in the metadata struct they're each located...
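
A small sketch of the X/Y/Z case, under loud assumptions: struct
meta_layout, layout_add() and layout_offset() are hypothetical helpers,
the sizes are made up, and fields are simply packed in the order they
are enabled, with no alignment handling. In practice the lookup would go
through the BTF exported by the driver rather than a hand-built table.

```c
#include <stddef.h>
#include <string.h>

#define MAX_ITEMS 8

/* Hypothetical: one metadata item an application asks for. */
struct meta_item {
	const char *name;
	size_t size;
};

/* Hypothetical: one placed member of the enabled metadata layout. */
struct meta_member {
	const char *name;
	size_t offset;
	size_t size;
};

struct meta_layout {
	struct meta_member m[MAX_ITEMS];
	size_t n;	/* number of enabled items */
	size_t total;	/* bytes used so far */
};

/* Return the byte offset of 'name' in the layout, or -1 if absent. */
static long layout_offset(const struct meta_layout *l, const char *name)
{
	for (size_t i = 0; i < l->n; i++) {
		if (!strcmp(l->m[i].name, name))
			return (long)l->m[i].offset;
	}
	return -1;
}

/* Enable the union of what is already in the layout and 'items'.
 * Items that are already present are not added twice.
 * Returns 0 on success, -1 if the layout is full. */
static int layout_add(struct meta_layout *l,
		      const struct meta_item *items, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (layout_offset(l, items[i].name) >= 0)
			continue;
		if (l->n == MAX_ITEMS)
			return -1;
		l->m[l->n].name = items[i].name;
		l->m[l->n].offset = l->total;
		l->m[l->n].size = items[i].size;
		l->total += items[i].size;
		l->n++;
	}
	return 0;
}
```

If the administrator enables App 1's items (X, Y) and then App 2's
items (Y, Z), Y ends up in the layout only once, and each application
calls layout_offset() for just the names it cares about without needing
to know what the other one asked for.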

-Toke
