XDP hardware hints discussion mail archive
* [xdp-hints] XDP Redirect and TX Metadata
@ 2024-02-12  8:27 Florian Kauer
  2024-02-12 13:41 ` [xdp-hints] " Toke Høiland-Jørgensen
  0 siblings, 1 reply; 8+ messages in thread
From: Florian Kauer @ 2024-02-12  8:27 UTC (permalink / raw)
  To: xdp-hints, xdp-newbies

Hi,
I am currently implementing an eBPF program for redirecting from one physical interface to another. Basically, I load the following at enp8s0:

SEC("prog")
int xdp_redirect(struct xdp_md *ctx) {
	/* ... */
	return bpf_redirect(3 /* enp5s0 */, 0);
}

I made three observations that I would like to discuss with you:

1. The redirection only works when I ALSO load some eBPF at the egress interface (enp5s0). It can be just

SEC("prog")
int xdp_pass(struct xdp_md *ctx) {
	return XDP_PASS;
}

but there has to be at least something. Otherwise, only xdp_redirect is called, but xdp_devmap_xmit is not.
It seems somewhat reasonable that the interface where the traffic is redirected to also needs to have the
XDP functionality initialized somehow, but it was unexpected at first. I tested it with an i225-IT (igc driver)
and a 82576 (igb driver). So, is this a bug or a feature?

2. For the RX side, the metadata is documented as "XDP RX Metadata" (https://docs.kernel.org/networking/xdp-rx-metadata.html),
while for TX it is "AF_XDP TX Metadata" (https://www.kernel.org/doc/html/next/networking/xsk-tx-metadata.html).
That seems to imply that TX metadata only works for AF_XDP, but not for direct redirection. Is there a reason for that?

3. At least for the igc, the egress queue is currently selected by using the smp_processor_id.
(https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/net/ethernet/intel/igc/igc_main.c?h=v6.8-rc4#n2453)
For our application, I would like to define the queue on a per-packet basis via the eBPF.
This would allow steering the traffic to the correct queue when using TAPRIO full hardware offload.
Do you see any problem with introducing a new metadata field to define the egress queue?

Thanks,
Florian


* [xdp-hints] Re: XDP Redirect and TX Metadata
  2024-02-12  8:27 [xdp-hints] XDP Redirect and TX Metadata Florian Kauer
@ 2024-02-12 13:41 ` Toke Høiland-Jørgensen
  2024-02-12 14:35   ` Florian Kauer
  0 siblings, 1 reply; 8+ messages in thread
From: Toke Høiland-Jørgensen @ 2024-02-12 13:41 UTC (permalink / raw)
  To: Florian Kauer, xdp-hints, xdp-newbies

Florian Kauer <florian.kauer@linutronix.de> writes:

> Hi,
> I am currently implementing an eBPF for redirecting from one physical interface to another. So basically loading the following at enp8s0:
>
> SEC("prog")
> int xdp_redirect(struct xdp_md *ctx) {
> 	/* ... */
> 	return bpf_redirect(3 /* enp5s0 */, 0);
> }
>
> I made three observations that I would like to discuss with you:
>
> 1. The redirection only works when I ALSO load some eBPF at the egress interface (enp5s0). It can be just
>
> SEC("prog")
> int xdp_pass(struct xdp_md *ctx) {
> 	return XDP_PASS;
> }
>
> but there has to be at least something. Otherwise, only xdp_redirect is called, but xdp_devmap_xmit is not.
> It seems somewhat reasonable that the interface where the traffic is redirected to also needs to have the
> XDP functionality initialized somehow, but it was unexpected at first. I tested it with an i225-IT (igc driver)
> and a 82576 (igb driver). So, is this a bug or a feature?

I personally consider it a bug, but all the Intel drivers work this way,
unfortunately. There was some discussion around making the XDP feature
bits read-write, making it possible to enable XDP via ethtool instead of
having to load a dummy XDP program. But no patches have materialised yet.

> 2. For the RX side, the metadata is documented as "XDP RX Metadata"
> (https://docs.kernel.org/networking/xdp-rx-metadata.html), while for
> TX it is "AF_XDP TX Metadata"
> (https://www.kernel.org/doc/html/next/networking/xsk-tx-metadata.html).
> That seems to imply that TX metadata only works for AF_XDP, but not
> for direct redirection. Is there a reason for that?

Well, IIRC, AF_XDP was the most pressing use case, and no one has gotten
around to extending this to the regular XDP forwarding path yet.

> 3. At least for the igc, the egress queue is currently selected by
> using the smp_processor_id.
> (https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/net/ethernet/intel/igc/igc_main.c?h=v6.8-rc4#n2453)
> For our application, I would like to define the queue on a per-packet
> basis via the eBPF. This would allow to steer the traffic to the
> correct queue when using TAPRIO full hardware offload. Do you see any
> problem with introducing a new metadata field to define the egress
> queue?

Well, a couple :)

1. We'd have to find agreement across drivers for a numbering scheme to
refer to queues.

2. Selecting queues based on CPU index the way it's done now means we
guarantee that the same queue will only be served from one CPU. Which
means we don't have to do any locking, which helps tremendously with
performance. Drivers handle the case where there are more CPUs than
queues a bit differently, but the ones that do generally have a lock
(with associated performance overhead).
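
To make the locking argument concrete, here is a minimal plain-C sketch
(not driver code). It assumes the wrap-around CPU-to-queue mapping that igc
appears to use in the function linked above; xdp_tx_queue_for_cpu and
xdp_tx_needs_lock are hypothetical names chosen for illustration.

```c
#include <stdbool.h>

/* Hypothetical helper: map the CPU running the XDP transmit path to a TX
 * queue index by wrapping around the number of queues.
 * Precondition: num_tx_queues > 0. */
static inline unsigned int xdp_tx_queue_for_cpu(unsigned int cpu,
						unsigned int num_tx_queues)
{
	return cpu % num_tx_queues;
}

/* Under the assumed mapping, two CPUs can only land on the same queue when
 * there are more CPUs than queues; only then is a per-queue lock needed. */
static inline bool xdp_tx_needs_lock(unsigned int num_cpus,
				     unsigned int num_tx_queues)
{
	return num_cpus > num_tx_queues;
}
```

With 4 queues and 4 CPUs every CPU owns one queue and no lock is needed; with
8 CPUs, CPUs 1 and 5 would share queue 1. A per-packet queue choice made by
the eBPF program would break the one-CPU-per-queue guarantee in both cases.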

As a workaround, you can use a cpumap to steer packets to specific CPUs
and perform the egress redirect inside the cpumap instead of directly on
RX. Requires a bit of knowledge of the hardware configuration, but it
may be enough for what you're trying to do.

-Toke


* [xdp-hints] Re: XDP Redirect and TX Metadata
  2024-02-12 13:41 ` [xdp-hints] " Toke Høiland-Jørgensen
@ 2024-02-12 14:35   ` Florian Kauer
  2024-02-13 13:00     ` Toke Høiland-Jørgensen
  2025-01-14 16:47     ` Marcus Wichelmann
  0 siblings, 2 replies; 8+ messages in thread
From: Florian Kauer @ 2024-02-12 14:35 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen, xdp-hints, xdp-newbies

On 12.02.24 14:41, Toke Høiland-Jørgensen wrote:
> Florian Kauer <florian.kauer@linutronix.de> writes:
> 
>> Hi,
>> I am currently implementing an eBPF for redirecting from one physical interface to another. So basically loading the following at enp8s0:
>>
>> SEC("prog")
>> int xdp_redirect(struct xdp_md *ctx) {
>> 	/* ... */
>> 	return bpf_redirect(3 /* enp5s0 */, 0);
>> }
>>
>> I made three observations that I would like to discuss with you:
>>
>> 1. The redirection only works when I ALSO load some eBPF at the egress interface (enp5s0). It can be just
>>
>> SEC("prog")
>> int xdp_pass(struct xdp_md *ctx) {
>> 	return XDP_PASS;
>> }
>>
>> but there has to be at least something. Otherwise, only xdp_redirect is called, but xdp_devmap_xmit is not.
>> It seems somewhat reasonable that the interface where the traffic is redirected to also needs to have the
>> XDP functionality initialized somehow, but it was unexpected at first. I tested it with an i225-IT (igc driver)
>> and a 82576 (igb driver). So, is this a bug or a feature?
> 
> I personally consider it a bug, but all the Intel drivers work this way,
> unfortunately. There was some discussion around making the XDP feature
> bits read-write, making it possible to enable XDP via ethtool instead of
> having to load a dummy XDP program. But no patches have materialised yet.

I see, thanks! So at least it is expected behavior for now.
How do other non-Intel drivers handle this?


>> 2. For the RX side, the metadata is documented as "XDP RX Metadata"
>> (https://docs.kernel.org/networking/xdp-rx-metadata.html), while for
>> TX it is "AF_XDP TX Metadata"
>> (https://www.kernel.org/doc/html/next/networking/xsk-tx-metadata.html).
>> That seems to imply that TX metadata only works for AF_XDP, but not
>> for direct redirection. Is there a reason for that?
> 
> Well, IIRC, AF_XDP was the most pressing use case, and no one has gotten
> around to extending this to the regular XDP forwarding path yet.

Ok, that is fine. I was afraid there might be some fundamental problem
that prevents implementing this.


>> 3. At least for the igc, the egress queue is currently selected by
>> using the smp_processor_id.
>> (https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/net/ethernet/intel/igc/igc_main.c?h=v6.8-rc4#n2453)
>> For our application, I would like to define the queue on a per-packet
>> basis via the eBPF. This would allow to steer the traffic to the
>> correct queue when using TAPRIO full hardware offload. Do you see any
>> problem with introducing a new metadata field to define the egress
>> queue?
> 
> Well, a couple :)
> 
> 1. We'd have to find agreement across drivers for a numbering scheme to
> refer to queues.

Good point! At least we already refer to queues in the MQPRIO qdisc
( queues count1@offset1 count2@offset2 ... ).
There might be different alternatives (like using the traffic class)
for this IF we want to implement this ...
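
For illustration only, a minimal plain-C sketch of how a traffic class could
be resolved to a queue with MQPRIO-style count@offset pairs. The struct and
function names are hypothetical, and no such interface exists for XDP today.

```c
#include <stddef.h>

/* One MQPRIO-style "count@offset" entry: the traffic class owns the queues
 * [offset, offset + count). Hypothetical type for illustration. */
struct tc_queue_range {
	unsigned int count;
	unsigned int offset;
};

/* Return the first queue of traffic class `tc`, or -1 if `tc` is out of
 * range or has no queues assigned. */
static inline int first_queue_for_tc(const struct tc_queue_range *map,
				     size_t num_tc, size_t tc)
{
	if (tc >= num_tc || map[tc].count == 0)
		return -1;
	return (int)map[tc].offset;
}
```

With a configuration like "queues 1@0 1@1 2@2", traffic class 2 would resolve
to queue 2 here.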

> 2. Selecting queues based on CPU index the way it's done now means we
> guarantee that the same queue will only be served from one CPU. Which
> means we don't have to do any locking, which helps tremendously with
> performance. Drivers handle the case where there are more CPUs than
> queues a bit differently, but the ones that do generally have a lock
> (with associated performance overhead).

... but this will likely completely prevent implementing this in the
straightforward way. You are right, we do not want the CPUs to constantly
fight for access to the same queues for every packet.

> As a workaround, you can use a cpumap to steer packets to specific CPUs
> and perform the egress redirect inside the cpumap instead of directly on
> RX. Requires a bit of knowledge of the hardware configuration, but it
> may be enough for what you're trying to do.

So I really like this approach at first glance since it avoids the issue
you describe above.

However, as you write, it is very hardware dependent and also depends on
how exactly the driver handles the CPU -> Queue mapping internally.
I have the feeling that the mapping CPU % Queue Number -> Queue as it is
implemented at the moment might be stable neither over time nor across
different drivers, even if it is the most likely one.

What do you think about exporting an interface (e.g. via ethtool)
to define the mapping of CPU -> Queue?

Thanks,
Florian


* [xdp-hints] Re: XDP Redirect and TX Metadata
  2024-02-12 14:35   ` Florian Kauer
@ 2024-02-13 13:00     ` Toke Høiland-Jørgensen
  2025-01-14 16:47     ` Marcus Wichelmann
  1 sibling, 0 replies; 8+ messages in thread
From: Toke Høiland-Jørgensen @ 2024-02-13 13:00 UTC (permalink / raw)
  To: Florian Kauer, xdp-hints, xdp-newbies

Florian Kauer <florian.kauer@linutronix.de> writes:

> On 12.02.24 14:41, Toke Høiland-Jørgensen wrote:
>> Florian Kauer <florian.kauer@linutronix.de> writes:
>> 
>>> Hi,
>>> I am currently implementing an eBPF for redirecting from one physical interface to another. So basically loading the following at enp8s0:
>>>
>>> SEC("prog")
>>> int xdp_redirect(struct xdp_md *ctx) {
>>> 	/* ... */
>>> 	return bpf_redirect(3 /* enp5s0 */, 0);
>>> }
>>>
>>> I made three observations that I would like to discuss with you:
>>>
>>> 1. The redirection only works when I ALSO load some eBPF at the egress interface (enp5s0). It can be just
>>>
>>> SEC("prog")
>>> int xdp_pass(struct xdp_md *ctx) {
>>> 	return XDP_PASS;
>>> }
>>>
>>> but there has to be at least something. Otherwise, only xdp_redirect is called, but xdp_devmap_xmit is not.
>>> It seems somewhat reasonable that the interface where the traffic is redirected to also needs to have the
>>> XDP functionality initialized somehow, but it was unexpected at first. I tested it with an i225-IT (igc driver)
>>> and a 82576 (igb driver). So, is this a bug or a feature?
>> 
>> I personally consider it a bug, but all the Intel drivers work this way,
>> unfortunately. There was some discussion around making the XDP feature
>> bits read-write, making it possible to enable XDP via ethtool instead of
>> having to load a dummy XDP program. But no patches have materialised yet.
>
> I see, thanks! So at least it is expected behavior for now.
> How do other non-Intel drivers handle this?

I believe Mellanox drivers have some kind of global switch that can
completely disable XDP, but if it's enabled (which it is by default)
everything works including redirect. Other drivers just have XDP
features always enabled.

>>> 2. For the RX side, the metadata is documented as "XDP RX Metadata"
>>> (https://docs.kernel.org/networking/xdp-rx-metadata.html), while for
>>> TX it is "AF_XDP TX Metadata"
>>> (https://www.kernel.org/doc/html/next/networking/xsk-tx-metadata.html).
>>> That seems to imply that TX metadata only works for AF_XDP, but not
>>> for direct redirection. Is there a reason for that?
>> 
>> Well, IIRC, AF_XDP was the most pressing use case, and no one has gotten
>> around to extending this to the regular XDP forwarding path yet.
>
> Ok, that is fine. I had the fear that there is some fundamental problem
> that prevents to implement this.
>
>
>>> 3. At least for the igc, the egress queue is currently selected by
>>> using the smp_processor_id.
>>> (https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/net/ethernet/intel/igc/igc_main.c?h=v6.8-rc4#n2453)
>>> For our application, I would like to define the queue on a per-packet
>>> basis via the eBPF. This would allow to steer the traffic to the
>>> correct queue when using TAPRIO full hardware offload. Do you see any
>>> problem with introducing a new metadata field to define the egress
>>> queue?
>> 
>> Well, a couple :)
>> 
>> 1. We'd have to find agreement across drivers for a numbering scheme to
>> refer to queues.
>
> Good point! At least we already refer to queues in the MQPRIO qdisc
> ( queues count1@offset1 count2@offset2 ... ).
> There might be different alternatives (like using the traffic class)
> for this IF we want to implement this ...

Oh, plenty of options; the tricky bit is agreeing on one, and figuring
out what the right kernel abstraction is. For instance, in the regular
networking stack, the concept of a queue is exposed from the driver into
the core stack, but for XDP it isn't.

>> 2. Selecting queues based on CPU index the way it's done now means we
>> guarantee that the same queue will only be served from one CPU. Which
>> means we don't have to do any locking, which helps tremendously with
>> performance. Drivers handle the case where there are more CPUs than
>> queues a bit differently, but the ones that do generally have a lock
>> (with associated performance overhead).
>
> ... but this will likely completely prevent to implement this in the
> straight forward way. You are right, we do not want the CPUs to constantly
> fight for access to the same queues for every packet.
>
>> As a workaround, you can use a cpumap to steer packets to specific CPUs
>> and perform the egress redirect inside the cpumap instead of directly on
>> RX. Requires a bit of knowledge of the hardware configuration, but it
>> may be enough for what you're trying to do.
>
> So I really like this approach on first glance since it prevents the issue
> you describe above.
>
> However, as you write, it is very hardware dependent and also depends on
> how exactly the driver handles the CPU -> Queue mapping internally.
> I have the feeling that the mapping CPU % Queue Number -> Queue as it is
> implemented in the moment might neither be stable over time nor over
> different drivers, even if it is the most likely one.

No, the application would have to figure that out. FWIW I looked at this
for other reasons at some point and didn't find any drivers that did
something different than using the CPU number (with or without the
modulus operation). So in practice I think using the CPU ID as a proxy
for queue number will work just fine on most hardware...
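
As a minimal sketch of what "figuring that out" could look like in the
application, the following plain-C helper picks a CPU to use as the cpumap
target for a desired TX queue. It assumes the driver maps CPUs to queues as
cpu % num_tx_queues, which may not hold for every driver; cpu_for_tx_queue
is a hypothetical name.

```c
/* Return the lowest CPU id in 0..num_cpus-1 that the assumed mapping
 * (cpu % num_tx_queues) sends to `queue`, or -1 if there is none.
 * The caller would use the result as the cpumap key in the RX-side
 * XDP program. */
static inline int cpu_for_tx_queue(unsigned int queue, unsigned int num_cpus,
				   unsigned int num_tx_queues)
{
	unsigned int cpu;

	if (num_tx_queues == 0 || queue >= num_tx_queues)
		return -1;
	for (cpu = 0; cpu < num_cpus; cpu++) {
		if (cpu % num_tx_queues == queue)
			return (int)cpu;
	}
	return -1;
}
```

On a system with fewer CPUs than queues, some queues are unreachable under
this assumption, which the -1 return value makes visible.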

> What do you think maybe about exporting an interface (e.g. via
> ethtool) to define the mapping of CPU -> Queue?

Well, this would require ethtool to know about those queues, which means
defining a driver<->stack concept of queues for XDP. There was some
attempt at doing this some years ago, but it never went anywhere,
unfortunately. I personally think doing something like this would be
worthwhile, but it's a decidedly non-trivial undertaking :)

-Toke


* [xdp-hints] Re: XDP Redirect and TX Metadata
  2024-02-12 14:35   ` Florian Kauer
  2024-02-13 13:00     ` Toke Høiland-Jørgensen
@ 2025-01-14 16:47     ` Marcus Wichelmann
  2025-01-14 18:07       ` Florian Kauer
  1 sibling, 1 reply; 8+ messages in thread
From: Marcus Wichelmann @ 2025-01-14 16:47 UTC (permalink / raw)
  To: Florian Kauer, Toke Høiland-Jørgensen, xdp-hints, xdp-newbies
  Cc: hawk, sdf

On 12.02.24 at 15:35, Florian Kauer wrote:
> On 12.02.24 14:41, Toke Høiland-Jørgensen wrote:
>> Florian Kauer <florian.kauer@linutronix.de> writes:
>>
>>> 2. For the RX side, the metadata is documented as "XDP RX Metadata"
>>> (https://docs.kernel.org/networking/xdp-rx-metadata.html), while for
>>> TX it is "AF_XDP TX Metadata"
>>> (https://www.kernel.org/doc/html/next/networking/xsk-tx-metadata.html).
>>> That seems to imply that TX metadata only works for AF_XDP, but not
>>> for direct redirection. Is there a reason for that?
>>
>> Well, IIRC, AF_XDP was the most pressing use case, and no one has gotten
>> around to extending this to the regular XDP forwarding path yet.
> 
> Ok, that is fine. I had the fear that there is some fundamental problem
> that prevents to implement this.

Hi,
are there any updates on this? I'm currently looking into this as well.

I'd like to have a way to enable the TX checksum offload when redirecting from
one device to another.
Stanislav Fomichev already implemented [1] the TX offload support for the AF_XDP
use case (thanks for that), but for now, this cannot be used for "regular"
redirects.

I'm currently in a position where I can invest some work into this, but figured
it would make sense to ask you first:

Do you already have concrete plans or ideas in mind for what the API to
trigger the TX offloads should look like?
I have seen the talk [2] from Jesper about this, but I'm not sure if the
proposals in there are still up to date.

I think it could be possible to introduce a program flag, just like
`BPF_F_XDP_HAS_FRAGS`, and if this flag is set, interpret a part of the
metadata area as a `struct xsk_tx_metadata`. Then, the code to apply the
offloads from that struct when xmit-ing the frame could be reused, as it
is already implemented in `mlx5e_xmit_xdp_frame` for example.
But the "xsk" in the struct name may be a bit confusing. :/
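
A rough plain-C sketch of the idea, to show where such a struct could sit
relative to the packet data. Everything here is hypothetical: the sketch_*
names, the flag value, and the assumption that the struct occupies the last
bytes of the metadata area are stand-ins, not the kernel's
`struct xsk_tx_metadata` or any existing API.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins; these are NOT the kernel's definitions. */
#define SKETCH_TXMD_FLAGS_CHECKSUM (1ULL << 1)

struct sketch_tx_meta {
	uint64_t flags;
	uint16_t csum_start;  /* offset from packet start where checksumming begins */
	uint16_t csum_offset; /* offset from csum_start where the result is stored */
};

struct sketch_frame {
	const uint8_t *data;   /* start of packet data; metadata sits right before it */
	uint32_t len;          /* packet length in bytes */
	uint32_t meta_len;     /* bytes of metadata in front of data */
	bool prog_has_tx_meta; /* stands in for the proposed program flag */
};

/* Copy the checksum request out of the metadata area. Returns true only if
 * the program opted in, the metadata area is large enough, the checksum flag
 * is set, and the 16-bit checksum field lies inside the packet. */
static inline bool sketch_get_csum_request(const struct sketch_frame *f,
					   struct sketch_tx_meta *out)
{
	if (!f->prog_has_tx_meta || f->meta_len < sizeof(*out))
		return false;
	/* Assumption: the struct occupies the last bytes before the packet data. */
	memcpy(out, f->data - sizeof(*out), sizeof(*out));
	if (!(out->flags & SKETCH_TXMD_FLAGS_CHECKSUM))
		return false;
	return (uint32_t)out->csum_start + out->csum_offset + 2 <= f->len;
}
```

Without the opt-in flag the metadata bytes are never interpreted, so existing
programs that use the metadata area for their own purposes would be unaffected.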

Do you think this could work or could you guide me into a direction that may
have a chance to be upstreamable? Also, is there any recent work on this that
I should know of?

Thanks!

Marcus Wichelmann
Hetzner Cloud GmbH

[1] https://lore.kernel.org/bpf/20231127190319.1190813-3-sdf@google.com/
[2] https://lpc.events/event/16/contributions/1362/attachments/1056/2017/xdp-hints-lpc2022.pdf


* [xdp-hints] Re: XDP Redirect and TX Metadata
  2025-01-14 16:47     ` Marcus Wichelmann
@ 2025-01-14 18:07       ` Florian Kauer
  2025-01-15 10:23         ` Jesper Dangaard Brouer
       [not found]         ` <173693662447.106245.11283936919402528400@gauss>
  0 siblings, 2 replies; 8+ messages in thread
From: Florian Kauer @ 2025-01-14 18:07 UTC (permalink / raw)
  To: Marcus Wichelmann, Toke Høiland-Jørgensen, xdp-hints,
	xdp-newbies
  Cc: hawk, sdf

Hi Marcus,

On 1/14/25 17:47, Marcus Wichelmann wrote:
> On 12.02.24 at 15:35, Florian Kauer wrote:
>> On 12.02.24 14:41, Toke Høiland-Jørgensen wrote:
>>> Florian Kauer <florian.kauer@linutronix.de> writes:
>>>
>>>> 2. For the RX side, the metadata is documented as "XDP RX Metadata"
>>>> (https://docs.kernel.org/networking/xdp-rx-metadata.html), while for
>>>> TX it is "AF_XDP TX Metadata"
>>>> (https://www.kernel.org/doc/html/next/networking/xsk-tx-metadata.html).
>>>> That seems to imply that TX metadata only works for AF_XDP, but not
>>>> for direct redirection. Is there a reason for that?
>>>
>>> Well, IIRC, AF_XDP was the most pressing use case, and no one has gotten
>>> around to extending this to the regular XDP forwarding path yet.
>>
>> Ok, that is fine. I had the fear that there is some fundamental problem
>> that prevents to implement this.
> 
> Hi,
> are there any updates on this? I'm currently looking into this as well.

I am still interested, but have no implementation planned short- or mid-term.
So, looking forward to your implementation :-)

Greetings,
Florian


* [xdp-hints] Re: XDP Redirect and TX Metadata
  2025-01-14 18:07       ` Florian Kauer
@ 2025-01-15 10:23         ` Jesper Dangaard Brouer
       [not found]         ` <173693662447.106245.11283936919402528400@gauss>
  1 sibling, 0 replies; 8+ messages in thread
From: Jesper Dangaard Brouer @ 2025-01-15 10:23 UTC (permalink / raw)
  To: Florian Kauer, Marcus Wichelmann,
	Toke Høiland-Jørgensen, xdp-hints, xdp-newbies
  Cc: Stanislav Fomichev, Arthur Fabre, Jakub Sitnicki, Netdev, kernel-team



On 14/01/2025 19.07, Florian Kauer wrote:
> Hi Marcus,
> 
> On 1/14/25 17:47, Marcus Wichelmann wrote:
>> On 12.02.24 at 15:35, Florian Kauer wrote:
>>> On 12.02.24 14:41, Toke Høiland-Jørgensen wrote:
>>>> Florian Kauer <florian.kauer@linutronix.de> writes:
>>>>
>>>>> 2. For the RX side, the metadata is documented as "XDP RX Metadata"
>>>>> (https://docs.kernel.org/networking/xdp-rx-metadata.html), while for
>>>>> TX it is "AF_XDP TX Metadata"
>>>>> (https://www.kernel.org/doc/html/next/networking/xsk-tx-metadata.html).
>>>>> That seems to imply that TX metadata only works for AF_XDP, but not
>>>>> for direct redirection. Is there a reason for that?
>>>>
>>>> Well, IIRC, AF_XDP was the most pressing use case, and no one has gotten
>>>> around to extending this to the regular XDP forwarding path yet.
>>>
>>> Ok, that is fine. I had the fear that there is some fundamental problem
>>> that prevents to implement this.
>>
>> Hi,
>> are there any updates on this? I'm currently looking into this as well.
> 
> I am still interested, but have no implementation planned short- or mid-term.
> So, looking forward to your implementation :-)
> 
> Greetings,
> Florian
> 
>>
>> I'd like to have a way to enable the TX checksum offload when redirecting from
>> one device to another.
>> Stanislav Fomichev already implemented [1] the TX offload support for the AF_XDP
>> use case (thanks for that), but for now, this cannot be used for "regular"
>> redirects.
>>
>> I'm currently in a position where I can invest some work into this, but figured
>> it would make sense to ask you first:
>>
>> Do you already have concrete plans or ideas in mind, how the API to trigger the
>> TX offloads should look like?
>> I have seen the talk [2] from Jesper about this, but I'm not sure if the
>> proposals in there are still up to date.

My talk is outdated. My co-workers Arthur and Jakub did a
presentation[3] at LPC2024.  Alexei liked the Compressed Key-Value store
idea from that presentation[3].   So, we are currently working on a
Compressed Key-Value store that Arthur named "traits".  We are almost
done benchmarking this, see traits0N_* documents in [4].

[3] https://lpc.events/event/18/contributions/1935/
[4] https://github.com/xdp-project/xdp-project/blob/main/areas/hints/

Our implementation is primarily focused on the RX side, and on transferring
RX hardware metadata to CPUMAP+veth when doing XDP_REDIRECT.

Your question is about the TX side, right?


* [xdp-hints] Re: XDP Redirect and TX Metadata
       [not found]         ` <173693662447.106245.11283936919402528400@gauss>
@ 2025-01-15 10:30           ` Marcus Wichelmann
  0 siblings, 0 replies; 8+ messages in thread
From: Marcus Wichelmann @ 2025-01-15 10:30 UTC (permalink / raw)
  To: Jesper Dangaard Brouer, Florian Kauer,
	Toke Høiland-Jørgensen, xdp-hints, xdp-newbies
  Cc: Stanislav Fomichev, Arthur Fabre, Jakub Sitnicki, Netdev, kernel-team

Hi!

On 15.01.25 at 11:23, Jesper Dangaard Brouer via xdp-hints wrote:
> 
> On 14/01/2025 19.07, Florian Kauer wrote:
>> Hi Marcus,
>>
>> On 1/14/25 17:47, Marcus Wichelmann wrote:
>>> On 12.02.24 at 15:35, Florian Kauer wrote:
>>>> On 12.02.24 14:41, Toke Høiland-Jørgensen wrote:
>>>>> Florian Kauer <florian.kauer@linutronix.de> writes:
>>>>>
>>>>>> 2. For the RX side, the metadata is documented as "XDP RX Metadata"
>>>>>> (https://docs.kernel.org/networking/xdp-rx-metadata.html), while for
>>>>>> TX it is "AF_XDP TX Metadata"
>>>>>> (https://www.kernel.org/doc/html/next/networking/xsk-tx-metadata.html).
>>>>>> That seems to imply that TX metadata only works for AF_XDP, but not
>>>>>> for direct redirection. Is there a reason for that?
>>>>>
>>>>> Well, IIRC, AF_XDP was the most pressing use case, and no one has gotten
>>>>> around to extending this to the regular XDP forwarding path yet.
>>>>
>>>> Ok, that is fine. I had the fear that there is some fundamental problem
>>>> that prevents to implement this.
>>>
>>> Hi,
>>> are there any updates on this? I'm currently looking into this as well.
>>
>> I am still interested, but have no implementation planned short- or mid-term.
>> So, looking forward to your implementation 🙂
>>
>> Greetings,
>> Florian
>>
>>>
>>> I'd like to have a way to enable the TX checksum offload when redirecting from
>>> one device to another.
>>> Stanislav Fomichev already implemented [1] the TX offload support for the AF_XDP
>>> use case (thanks for that), but for now, this cannot be used for "regular"
>>> redirects.
>>>
>>> I'm currently in a position where I can invest some work into this, but figured
>>> it would make sense to ask you first:
>>>
>>> Do you already have concrete plans or ideas in mind, how the API to trigger the
>>> TX offloads should look like?
>>> I have seen the talk [2] from Jesper about this, but I'm not sure if the
>>> proposals in there are still up to date.
> 
> My talk is outdated. My co-workers Arthur and Jakub did a
> presentation[3] at LPC2024.  Alexei liked the Compressed Key-Value store
> idea from that presentation[3].   So, we are currently working on a
> Compressed Key-Value store that Arthur named "traits".  We are almost
> done benchmarking this, see traits0N_* documents in [4].
> 
> [3] https://lpc.events/event/18/contributions/1935/
> [4] https://github.com/xdp-project/xdp-project/blob/main/areas/hints/

Ah, thanks, I will look into this. I have seen some patches about the traits but had not realized what this is for. Great!

> Our implementation is primarily focused on the RX side, and transferring  RX hardware metadata to CPUMAP+veth when doing XDP_REDIRECT.
> 
> Your question is about the TX side, right?

Yes. Would the traits work for the TX metadata as well?

I should probably wait for it then and implement a temporary solution for our use cases in the meantime.

Marcus

