From: "Toke Høiland-Jørgensen" <toke@redhat.com>
To: Tom Herbert <tom@sipanda.io>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>,
John Fastabend <john.fastabend@gmail.com>,
Jesper Dangaard Brouer <jbrouer@redhat.com>,
"Karlsson, Magnus" <magnus.karlsson@intel.com>,
"Desouza, Ederson" <ederson.desouza@intel.com>,
brouer@redhat.com,
"xdp-hints@xdp-project.net" <xdp-hints@xdp-project.net>,
Eelco Chaudron <echaudro@redhat.com>,
Andrii Nakryiko <andrii@kernel.org>,
"Fijalkowski, Maciej" <maciej.fijalkowski@intel.com>,
"Burakov, Anatoly" <anatoly.burakov@intel.com>
Subject: [xdp-hints] Re: Basic/Dumb question WAS(Re: Re: XDP-hints via local BTF info
Date: Mon, 22 Nov 2021 19:25:00 +0100
Message-ID: <87fsrocakj.fsf@toke.dk>
In-Reply-To: <CAOuuhY93sc57L8xzkwo66UXjhijitPXtWkzykGuU1BTa+F72pw@mail.gmail.com>

Tom Herbert <tom@sipanda.io> writes:

> On Mon, Nov 22, 2021 at 5:59 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>>
>> Jamal Hadi Salim <jhs@mojatatu.com> writes:
>>
>> > And it goes something like this:
>> >
>> > Why does the metadata have to go in the DMA descriptors?
>> > Our experience with the XDP metadata is: once you start accessing
>> > it, there is a performance penalty (extra cache miss(es)).
>> >
>> > Why is the metadata not encapped as part of the data? We
>> > don't have MTU issues on receive since that is entirely a local matter;
>> > meaning the hardware can expand the packet as much as it wants within
>> > the boundaries of allocated DMA buffer space and XDP and any other
>> > subsystem (TC for example) can take advantage of the metadata.
>>
>> Once the hardware learns how to do that, this is absolutely the
>> direction we want to go in. But for existing stuff, the metadata is in
>> the descriptor and needs to be translated somehow...
>>
> Toke,
>
> By existing stuff I think you mean legacy stuff :-)

Well, whatever term you want to use for "the stuff that's in hardware
today" :)

> I tend to agree with Jamal. Descriptor space is exceedingly limited
> and it's unclear to me that the benefits of getting a few bits of
> *hints* from the device outweigh the costs to develop XDP hints and
> perpetually maintain the facility. That is to say it doesn't seem to
> address the two fundamental limitations of the venerable 50-year-old
> device driver model: 1) the narrow waist of the PCIe bus in expressing
> ancillary information 2) the black box nature of devices that prevents
> the stack from having any visibility into what it's actually doing
> (and hence we can only get hints at best and not real operational data
> for primary protocol processing).
>
> I believe CXL and programmable devices are the emerging architectural
> solution. This will enable split plane acceleration. For instance, if
> we do this right, a NIC would be able to perform all the stateless
> processing of a TCP/IP packet such that when the host gets the packet
> it could immediately jump to the TCP receive processing function a la
> TXDP -- or even more than that, the device could process the whole
> received TCP datapath and the host just puts the received data on the
> socket (necessary to solve the TLS offload OOO problem). All this
> requires a very tight integration between the stack and the device to
> the extent that the lines are blurred and the boundary of the software
> stack is effectively pushed into the device (but definitely not TOE!).
This is certainly something that xdp-hints should be able to address.
And it does: the XDP metadata area is right before the packet data, and
while today that bit of memory is not covered by DMA, there is
absolutely no reason why it couldn't be for devices that support putting
arbitrary stuff in there. In which case the driver can just export a BTF
description of what the hardware writes into that area, and XDP can
access it directly using xdp-hints.
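
A minimal userspace sketch of the layout described above, assuming a
device that writes a fixed-size record directly in front of the packet
data. The names struct hw_rx_meta, struct rx_ctx, read_rx_meta() and
fake_rx() are hypothetical and exist only for illustration; the field
layout is invented, not taken from any driver. In a real XDP program the
same three positions are exposed as ctx->data_meta, ctx->data and
ctx->data_end, and the length check below plays the role of the usual
bounds check on the metadata area.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record a device might write in front of the packet.
 * In the scheme discussed here, the driver would describe it via BTF. */
struct hw_rx_meta {
	uint32_t rx_hash;
	uint16_t vlan_tci;
	uint16_t flags;
};

/* Userspace stand-in for the pointers an XDP program gets to see. */
struct rx_ctx {
	uint8_t *data_meta; /* start of the metadata area */
	uint8_t *data;      /* start of packet data */
	uint8_t *data_end;  /* end of packet data */
};

/* Copy the metadata out if the area in front of the packet is large
 * enough to hold it. Returns 0 on success, -1 otherwise. */
static int read_rx_meta(const struct rx_ctx *ctx, struct hw_rx_meta *out)
{
	size_t meta_len = (size_t)(ctx->data - ctx->data_meta);

	if (meta_len < sizeof(*out))
		return -1;
	memcpy(out, ctx->data_meta, sizeof(*out));
	return 0;
}

/* Simulate the device: the packet starts at buf + headroom and the
 * metadata (if any) is written immediately before it. */
static struct rx_ctx fake_rx(uint8_t *buf, size_t headroom, size_t pkt_len,
			     const struct hw_rx_meta *meta)
{
	struct rx_ctx ctx;

	ctx.data = buf + headroom;
	ctx.data_end = ctx.data + pkt_len;
	ctx.data_meta = ctx.data; /* empty metadata area by default */
	if (meta) {
		assert(headroom >= sizeof(*meta));
		ctx.data_meta = ctx.data - sizeof(*meta);
		memcpy(ctx.data_meta, meta, sizeof(*meta));
	}
	return ctx;
}
```

With no metadata written, data_meta equals data and read_rx_meta()
returns -1, so a consumer can fall back to parsing the packet itself.
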
All the talk about translating from hardware descriptors applies to
existing hardware (or "legacy" if you want to call it that), but if the
hardware can just write the metadata directly, driver translation is
obviously not needed.
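
For comparison, a rough sketch of the translation step that existing
hardware needs, assuming a made-up descriptor format. struct rx_desc,
struct rx_hints, the flag values and desc_to_hints() are all
hypothetical; no real driver or kernel API is being quoted. The point is
only that the driver copies the fields it understands from the
descriptor into the area in front of the packet, which is the work that
disappears once the device writes that area itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical RX descriptor as a device might hand it to the driver. */
struct rx_desc {
	uint32_t rss_hash;
	uint16_t vlan_tag;
	uint16_t status;
};
#define RX_DESC_HASH_VALID 0x1
#define RX_DESC_VLAN_VALID 0x2

/* Hypothetical hints record placed in front of the packet data. */
struct rx_hints {
	uint32_t hash;
	uint16_t vlan_tci;
	uint16_t valid; /* bitmask of HINT_* flags */
};
#define HINT_HASH 0x1
#define HINT_VLAN 0x2

/* Translate one descriptor into a hints record ending exactly at
 * 'data'. Returns the number of metadata bytes written, or 0 if the
 * headroom in front of the packet is too small. */
static size_t desc_to_hints(const struct rx_desc *desc, uint8_t *data,
			    size_t headroom)
{
	struct rx_hints h = {0};

	assert(desc && data);
	if (headroom < sizeof(h))
		return 0;
	if (desc->status & RX_DESC_HASH_VALID) {
		h.hash = desc->rss_hash;
		h.valid |= HINT_HASH;
	}
	if (desc->status & RX_DESC_VLAN_VALID) {
		h.vlan_tci = desc->vlan_tag;
		h.valid |= HINT_VLAN;
	}
	memcpy(data - sizeof(h), &h, sizeof(h));
	return sizeof(h);
}
```
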
-Toke