XDP hardware hints discussion mail archive
 help / color / mirror / Atom feed
* A look into XDP hints for AF_XDP
@ 2021-06-24  0:10 Desouza, Ederson
  2021-06-24 19:54 ` Jesper Dangaard Brouer
  0 siblings, 1 reply; 8+ messages in thread
From: Desouza, Ederson @ 2021-06-24  0:10 UTC (permalink / raw)
  To: xdp-hints, bpf; +Cc: saeed, Lobakin, Alexandr, Swiatkowski, Michal, brouer

Hi,

Following current discussions around XDP hints, it's clear that
currently the focus is on BPF applications. But my interest is in the
AF_XDP side of things - user space applications.

In there, there's not much help from BPF CO-RE - who's going to rewrite
user space structs, after all? So, I decided to give a try at a
possible implementation, using igc driver as I'm more used to it, and
come here ask some questions about it.

For the curious, here's my branch with current work:

https://github.com/edersondisouza/linux/tree/xdp-hints

It's on top of Alexandr Lobakin and Michal Swiatkowski work - but I
decided to incorporate some of the CO-RE related feedback, so I could
have something that also works with BPF applications. Please not that
I'm not trying to jump ahead of them in incorporating the feedback -
probably they have something more robust here - but if you see some
value in my patches, feel free to reuse/incorporate them (if they are
just an example of what not to do, it's still an example =D ).
I also added some XDP ZC patches for igc that are still moving to
mainline.

In there, I basically defined a sample of "generic hints", that is
basically an struct with common hints, such as RX and TX timestamp,
hash, etc. I also included two more members to that struct: field_map
and extension_id. The first, shows which members are actually valid in
the data, the second is an arbitrary id that drivers can use to say
"there's extra data" beyond the generic members, and how to interpret
what's there is driver specific. A BTF is also created to represent
this struct, and registering is done the same way Saeed's patch did.

User space developers that need to get the struct can use something
like to get it from the driver:

  # tools/bpf/bpftool/bpftool net xdp show
  xdp:
  enp6s0(5) md_btf_id(60) md_btf_enabled(1)

And use the btf_id to get the struct:

  # bpftool btf dump file /sys/kernel/btf/igc format c

Currently though, that's bad - as in this case the struct has no types,
only the field names. Why?

With the driver specific struct (or by using the generic one, if no
specific fields are needed), the application can then access the XDP
frame metadata. I've also added some helpers to aid getting the
metadata.

I added some examples on how to use those (they may be too simplistic),
so it's possible to get a feel on how this API might work.

My goals for this email are to check if this approach is valid and what
pitfalls can you see. I didn't send a patch series yet to not jump
ahead Alexandr and Michal work (I can rebase on top of their work
later) and because the igc RX and TX timestamp implementation I'm using
to provide more real looking data is not yet complete.

Another goal is to ensure that AF_XDP side is not forgotten in the XDP
hints discussion =D

Naturally, if someone finds any issue trying those patches, please let
me know!

Thanks!
-- 
Ederson de Souza

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: A look into XDP hints for AF_XDP
  2021-06-24  0:10 A look into XDP hints for AF_XDP Desouza, Ederson
@ 2021-06-24 19:54 ` Jesper Dangaard Brouer
  2021-06-24 21:54   ` Desouza, Ederson
  0 siblings, 1 reply; 8+ messages in thread
From: Jesper Dangaard Brouer @ 2021-06-24 19:54 UTC (permalink / raw)
  To: Desouza, Ederson
  Cc: xdp-hints, bpf, saeed, Lobakin, Alexandr, Swiatkowski, Michal,
	brouer, Andrii Nakryiko, Karlsson, Magnus

On Thu, 24 Jun 2021 00:10:12 +0000
"Desouza, Ederson" <ederson.desouza@intel.com> wrote:

> Following current discussions around XDP hints, it's clear that
> currently the focus is on BPF applications. But my interest is in the
> AF_XDP side of things - user space applications.

I agree, that most of the discussion is focused on BPF-programs being
loaded into the kernel via libbpf.  I actually also care about getting
this working for AF_XDP.

We've discussed this with Magnus (meeting yesterday) and I think we
agree that this is also something we want for AF_XDP.  IIRC the plan is
to use one bit to indicate if a packet is carrying info in metadata
area, as (1) AF_XDP descriptor don't have room for storing the BTF-ID,
and (2) if bit is not set, then we can avoid touching that cache-line.
If the bit is set, then the BTF-ID is stored in metadata area
(preferably as the last member, as ctx->data_meta is a minus offset
from ctx->data, making it accessible via a fixed offset from data).

For the BPF-programs it would make sense to store the BTF-ID in
xdp_buff/xdp_frame and make it accessible via xdp_md (ctx seen from
BPF-prog).  To help AF_XDP the *proposal* is to (also) store it in
metadata area itself.


> In there, there's not much help from BPF CO-RE - who's going to rewrite
> user space structs, after all? 

Well, AFAIK most of the offset relocation happens in user-space by
libbpf.  Which Alexei also indicate in the other thread[1]. To better
understand BTF/CO-RE I've coded up an example here[2]. 

 [1] https://lore.kernel.org/bpf/CAADnVQKv5SLBfnBWnEBFqf0-DQv+NZuixGiCVx1hewfQFhHSKg@mail.gmail.com/
 [2] https://github.com/xdp-project/bpf-examples/blob/master/ktrace-CO-RE/ktrace01_kern.c

I'm trying to understand how libbpf does this.  So, I added a --debug
option that makes libbpf print verbose messages. See commit[3] that
also contains output example.

 [3] https://github.com/xdp-project/bpf-examples/commit/0542d8a7a327b642d105

Some of the --debug output:

 libbpf: loading kernel BTF '/sys/kernel/btf/vmlinux': 0
 [...]
 libbpf: CO-RE relocating [0] struct sk_buff___local: found target candidate [2965] struct sk_buff in [vmlinux]
 libbpf: prog 'udp_send_skb': relo #1: matching candidate #0 [2965] struct sk_buff.hash (0:55 @ offset 148)
 libbpf: prog 'udp_send_skb': relo #1: patched insn #1 (ALU/ALU64) imm 4 -> 148
 libbpf: prog 'udp_send_skb': relo #2: kind <byte_off> (0), spec is [7] struct sk_buff___local.len (0:0 @ offset 0)
 libbpf: prog 'udp_send_skb': relo #2: matching candidate #0 [2965] struct sk_buff.len (0:6 @ offset 112)
 libbpf: prog 'udp_send_skb': relo #2: patched insn #8 (ALU/ALU64) imm 0 -> 112
 libbpf: prog 'udp_send_skb': relo #3: kind <target_type_id> (7), spec is [7] struct sk_buff___local
 libbpf: prog 'udp_send_skb': relo #3: matching candidate #0 [2965] struct sk_buff
 libbpf: prog 'udp_send_skb': relo #3: patched insn #24 (ALU/ALU64) imm 7 -> 2965

As indicated in [1] a BTF matching is being done in userspace. First
libbpf loads kernels BTF from '/sys/kernel/btf/vmlinux'.  Then it have
the BTF from BPF-prog 'sk_buff___local' which finds target 'struct
sk_buff' as btf_id 2965.  Afterwards it patches the relocations in the
byte code.


> So, I decided to give a try at a possible implementation, using igc
> driver as I'm more used to it, and come here ask some questions about
> it.
> 
> For the curious, here's my branch with current work:
> 
> https://github.com/edersondisouza/linux/tree/xdp-hints
> 
> It's on top of Alexandr Lobakin and Michal Swiatkowski work - but I
> decided to incorporate some of the CO-RE related feedback, so I could
> have something that also works with BPF applications. Please not that
> I'm not trying to jump ahead of them in incorporating the feedback -
> probably they have something more robust here - but if you see some
> value in my patches, feel free to reuse/incorporate them (if they are
> just an example of what not to do, it's still an example =D ).
> I also added some XDP ZC patches for igc that are still moving to
> mainline.
> 
> In there, I basically defined a sample of "generic hints", that is
> basically an struct with common hints, such as RX and TX timestamp,
> hash, etc. I also included two more members to that struct: field_map
> and extension_id. The first, shows which members are actually valid in
> the data, the second is an arbitrary id that drivers can use to say
> "there's extra data" beyond the generic members, and how to interpret
> what's there is driver specific. A BTF is also created to represent
> this struct, and registering is done the same way Saeed's patch did.
> 
> User space developers that need to get the struct can use something
> like to get it from the driver:
> 
>   # tools/bpf/bpftool/bpftool net xdp show
>   xdp:
>   enp6s0(5) md_btf_id(60) md_btf_enabled(1)
> 
> And use the btf_id to get the struct:
> 
>   # bpftool btf dump file /sys/kernel/btf/igc format c
> 
> Currently though, that's bad - as in this case the struct has no
> types, only the field names. Why?

I don't follow, what is not working?

> With the driver specific struct (or by using the generic one, if no
> specific fields are needed), the application can then access the XDP
> frame metadata. I've also added some helpers to aid getting the
> metadata.
> 
> I added some examples on how to use those (they may be too simplistic),
> so it's possible to get a feel on how this API might work.
> 
> My goals for this email are to check if this approach is valid and what
> pitfalls can you see. I didn't send a patch series yet to not jump
> ahead Alexandr and Michal work (I can rebase on top of their work
> later) and because the igc RX and TX timestamp implementation I'm using
> to provide more real looking data is not yet complete.
> 
> Another goal is to ensure that AF_XDP side is not forgotten in the XDP
> hints discussion =D

Thanks for pointing that out :-)

> Naturally, if someone finds any issue trying those patches, please let
> me know!

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: A look into XDP hints for AF_XDP
  2021-06-24 19:54 ` Jesper Dangaard Brouer
@ 2021-06-24 21:54   ` Desouza, Ederson
  2021-06-24 22:17     ` Desouza, Ederson
  2021-07-07 22:26     ` Andrii Nakryiko
  0 siblings, 2 replies; 8+ messages in thread
From: Desouza, Ederson @ 2021-06-24 21:54 UTC (permalink / raw)
  To: brouer
  Cc: Swiatkowski, Michal, xdp-hints, Lobakin, Alexandr, Karlsson,
	Magnus, saeed, bpf, andrii.nakryiko

On Thu, 2021-06-24 at 21:54 +0200, Jesper Dangaard Brouer wrote:
> On Thu, 24 Jun 2021 00:10:12 +0000
> "Desouza, Ederson" <ederson.desouza@intel.com> wrote:
> 
> > Following current discussions around XDP hints, it's clear that
> > currently the focus is on BPF applications. But my interest is in the
> > AF_XDP side of things - user space applications.
> 
> I agree, that most of the discussion is focused on BPF-programs being
> loaded into the kernel via libbpf.  I actually also care about getting
> this working for AF_XDP.
> 
> We've discussed this with Magnus (meeting yesterday) and I think we
> agree that this is also something we want for AF_XDP.  IIRC the plan is
> to use one bit to indicate if a packet is carrying info in metadata
> area, as (1) AF_XDP descriptor don't have room for storing the BTF-ID,
> and (2) if bit is not set, then we can avoid touching that cache-line.
> If the bit is set, then the BTF-ID is stored in metadata area
> (preferably as the last member, as ctx->data_meta is a minus offset
> from ctx->data, making it accessible via a fixed offset from data).
> 
> For the BPF-programs it would make sense to store the BTF-ID in
> xdp_buff/xdp_frame and make it accessible via xdp_md (ctx seen from
> BPF-prog).  To help AF_XDP the *proposal* is to (also) store it in
> metadata area itself.
> 
> 
> > In there, there's not much help from BPF CO-RE - who's going to rewrite
> > user space structs, after all? 
> 
> Well, AFAIK most of the offset relocation happens in user-space by
> libbpf.  Which Alexei also indicate in the other thread[1]. To better
> understand BTF/CO-RE I've coded up an example here[2]. 
> 
>  [1] https://lore.kernel.org/bpf/CAADnVQKv5SLBfnBWnEBFqf0-DQv+NZuixGiCVx1hewfQFhHSKg@mail.gmail.com/
>  [2] https://github.com/xdp-project/bpf-examples/blob/master/ktrace-CO-RE/ktrace01_kern.c
> 
> I'm trying to understand how libbpf does this.  So, I added a --debug
> option that makes libbpf print verbose messages. See commit[3] that
> also contains output example.
> 
>  [3] https://github.com/xdp-project/bpf-examples/commit/0542d8a7a327b642d105
> 
> Some of the --debug output:
> 
>  libbpf: loading kernel BTF '/sys/kernel/btf/vmlinux': 0
>  [...]
>  libbpf: CO-RE relocating [0] struct sk_buff___local: found target candidate [2965] struct sk_buff in [vmlinux]
>  libbpf: prog 'udp_send_skb': relo #1: matching candidate #0 [2965] struct sk_buff.hash (0:55 @ offset 148)
>  libbpf: prog 'udp_send_skb': relo #1: patched insn #1 (ALU/ALU64) imm 4 -> 148
>  libbpf: prog 'udp_send_skb': relo #2: kind <byte_off> (0), spec is [7] struct sk_buff___local.len (0:0 @ offset 0)
>  libbpf: prog 'udp_send_skb': relo #2: matching candidate #0 [2965] struct sk_buff.len (0:6 @ offset 112)
>  libbpf: prog 'udp_send_skb': relo #2: patched insn #8 (ALU/ALU64) imm 0 -> 112
>  libbpf: prog 'udp_send_skb': relo #3: kind <target_type_id> (7), spec is [7] struct sk_buff___local
>  libbpf: prog 'udp_send_skb': relo #3: matching candidate #0 [2965] struct sk_buff
>  libbpf: prog 'udp_send_skb': relo #3: patched insn #24 (ALU/ALU64) imm 7 -> 2965
> 
> As indicated in [1] a BTF matching is being done in userspace. First
> libbpf loads kernels BTF from '/sys/kernel/btf/vmlinux'.  Then it have
> the BTF from BPF-prog 'sk_buff___local' which finds target 'struct
> sk_buff' as btf_id 2965.  Afterwards it patches the relocations in the
> byte code.
> 

Hmmm... that's something I definitely want to try =D

> 
> > So, I decided to give a try at a possible implementation, using igc
> > driver as I'm more used to it, and come here ask some questions about
> > it.
> > 
> > For the curious, here's my branch with current work:
> > 
> > https://github.com/edersondisouza/linux/tree/xdp-hints
> > 
> > It's on top of Alexandr Lobakin and Michal Swiatkowski work - but I
> > decided to incorporate some of the CO-RE related feedback, so I could
> > have something that also works with BPF applications. Please not that
> > I'm not trying to jump ahead of them in incorporating the feedback -
> > probably they have something more robust here - but if you see some
> > value in my patches, feel free to reuse/incorporate them (if they are
> > just an example of what not to do, it's still an example =D ).
> > I also added some XDP ZC patches for igc that are still moving to
> > mainline.
> > 
> > In there, I basically defined a sample of "generic hints", that is
> > basically an struct with common hints, such as RX and TX timestamp,
> > hash, etc. I also included two more members to that struct: field_map
> > and extension_id. The first, shows which members are actually valid in
> > the data, the second is an arbitrary id that drivers can use to say
> > "there's extra data" beyond the generic members, and how to interpret
> > what's there is driver specific. A BTF is also created to represent
> > this struct, and registering is done the same way Saeed's patch did.
> > 
> > User space developers that need to get the struct can use something
> > like to get it from the driver:
> > 
> >   # tools/bpf/bpftool/bpftool net xdp show
> >   xdp:
> >   enp6s0(5) md_btf_id(60) md_btf_enabled(1)
> > 
> > And use the btf_id to get the struct:
> > 
> >   # bpftool btf dump file /sys/kernel/btf/igc format c
> > 
> > Currently though, that's bad - as in this case the struct has no
> > types, only the field names. Why?
> 
> I don't follow, what is not working?

I get something like this:

  struct xdp_hints {
         yet_another_timestamp;
         rx_timestamp;
         tx_timestamp;
         hash32;
         extension_id;
         field_map;
  };

Note how there's no type before the fields, one has to figure out if
`rx_timestamp` is u32 or u64.


> 
> > With the driver specific struct (or by using the generic one, if no
> > specific fields are needed), the application can then access the XDP
> > frame metadata. I've also added some helpers to aid getting the
> > metadata.
> > 
> > I added some examples on how to use those (they may be too simplistic),
> > so it's possible to get a feel on how this API might work.
> > 
> > My goals for this email are to check if this approach is valid and what
> > pitfalls can you see. I didn't send a patch series yet to not jump
> > ahead Alexandr and Michal work (I can rebase on top of their work
> > later) and because the igc RX and TX timestamp implementation I'm using
> > to provide more real looking data is not yet complete.
> > 
> > Another goal is to ensure that AF_XDP side is not forgotten in the XDP
> > hints discussion =D
> 
> Thanks for pointing that out :-)
> 
> > Naturally, if someone finds any issue trying those patches, please let
> > me know!
> 


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: A look into XDP hints for AF_XDP
  2021-06-24 21:54   ` Desouza, Ederson
@ 2021-06-24 22:17     ` Desouza, Ederson
  2021-06-24 22:39       ` Alexei Starovoitov
  2021-07-07 22:26     ` Andrii Nakryiko
  1 sibling, 1 reply; 8+ messages in thread
From: Desouza, Ederson @ 2021-06-24 22:17 UTC (permalink / raw)
  To: brouer
  Cc: Swiatkowski, Michal, xdp-hints, Lobakin, Alexandr, Karlsson,
	Magnus, saeed, bpf, andrii.nakryiko

On Thu, 2021-06-24 at 14:54 -0700, Ederson de Souza wrote:
> On Thu, 2021-06-24 at 21:54 +0200, Jesper Dangaard Brouer wrote:
> > On Thu, 24 Jun 2021 00:10:12 +0000
> > "Desouza, Ederson" <ederson.desouza@intel.com> wrote:
> > 
> > > Following current discussions around XDP hints, it's clear that
> > > currently the focus is on BPF applications. But my interest is in the
> > > AF_XDP side of things - user space applications.
> > 
> > I agree, that most of the discussion is focused on BPF-programs being
> > loaded into the kernel via libbpf.  I actually also care about getting
> > this working for AF_XDP.
> > 
> > We've discussed this with Magnus (meeting yesterday) and I think we
> > agree that this is also something we want for AF_XDP.  IIRC the plan is
> > to use one bit to indicate if a packet is carrying info in metadata
> > area, as (1) AF_XDP descriptor don't have room for storing the BTF-ID,
> > and (2) if bit is not set, then we can avoid touching that cache-line.
> > If the bit is set, then the BTF-ID is stored in metadata area
> > (preferably as the last member, as ctx->data_meta is a minus offset
> > from ctx->data, making it accessible via a fixed offset from data).
> > 
> > For the BPF-programs it would make sense to store the BTF-ID in
> > xdp_buff/xdp_frame and make it accessible via xdp_md (ctx seen from
> > BPF-prog).  To help AF_XDP the *proposal* is to (also) store it in
> > metadata area itself.
> > 
> > 
> > > In there, there's not much help from BPF CO-RE - who's going to rewrite
> > > user space structs, after all? 
> > 
> > Well, AFAIK most of the offset relocation happens in user-space by
> > libbpf.  Which Alexei also indicate in the other thread[1]. To better
> > understand BTF/CO-RE I've coded up an example here[2]. 
> > 
> >  [1] https://lore.kernel.org/bpf/CAADnVQKv5SLBfnBWnEBFqf0-DQv+NZuixGiCVx1hewfQFhHSKg@mail.gmail.com/
> >  [2] https://github.com/xdp-project/bpf-examples/blob/master/ktrace-CO-RE/ktrace01_kern.c
> > 
> > I'm trying to understand how libbpf does this.  So, I added a --debug
> > option that makes libbpf print verbose messages. See commit[3] that
> > also contains output example.
> > 
> >  [3] https://github.com/xdp-project/bpf-examples/commit/0542d8a7a327b642d105
> > 
> > Some of the --debug output:
> > 
> >  libbpf: loading kernel BTF '/sys/kernel/btf/vmlinux': 0
> >  [...]
> >  libbpf: CO-RE relocating [0] struct sk_buff___local: found target candidate [2965] struct sk_buff in [vmlinux]
> >  libbpf: prog 'udp_send_skb': relo #1: matching candidate #0 [2965] struct sk_buff.hash (0:55 @ offset 148)
> >  libbpf: prog 'udp_send_skb': relo #1: patched insn #1 (ALU/ALU64) imm 4 -> 148
> >  libbpf: prog 'udp_send_skb': relo #2: kind <byte_off> (0), spec is [7] struct sk_buff___local.len (0:0 @ offset 0)
> >  libbpf: prog 'udp_send_skb': relo #2: matching candidate #0 [2965] struct sk_buff.len (0:6 @ offset 112)
> >  libbpf: prog 'udp_send_skb': relo #2: patched insn #8 (ALU/ALU64) imm 0 -> 112
> >  libbpf: prog 'udp_send_skb': relo #3: kind <target_type_id> (7), spec is [7] struct sk_buff___local
> >  libbpf: prog 'udp_send_skb': relo #3: matching candidate #0 [2965] struct sk_buff
> >  libbpf: prog 'udp_send_skb': relo #3: patched insn #24 (ALU/ALU64) imm 7 -> 2965
> > 
> > As indicated in [1] a BTF matching is being done in userspace. First
> > libbpf loads kernels BTF from '/sys/kernel/btf/vmlinux'.  Then it have
> > the BTF from BPF-prog 'sk_buff___local' which finds target 'struct
> > sk_buff' as btf_id 2965.  Afterwards it patches the relocations in the
> > byte code.
> > 
> 
> Hmmm... that's something I definitely want to try =D

Wait - it may be done in user space by libbpf, but it needs the
instrumented object code. It won't work for pure user space
applications, like those which use AF_XDP. Unless we're going to build
them in a special way, like we do for the kernel side of BPF
applications.

> 
> > 
> > > So, I decided to give a try at a possible implementation, using igc
> > > driver as I'm more used to it, and come here ask some questions about
> > > it.
> > > 
> > > For the curious, here's my branch with current work:
> > > 
> > > https://github.com/edersondisouza/linux/tree/xdp-hints
> > > 
> > > It's on top of Alexandr Lobakin and Michal Swiatkowski work - but I
> > > decided to incorporate some of the CO-RE related feedback, so I could
> > > have something that also works with BPF applications. Please not that
> > > I'm not trying to jump ahead of them in incorporating the feedback -
> > > probably they have something more robust here - but if you see some
> > > value in my patches, feel free to reuse/incorporate them (if they are
> > > just an example of what not to do, it's still an example =D ).
> > > I also added some XDP ZC patches for igc that are still moving to
> > > mainline.
> > > 
> > > In there, I basically defined a sample of "generic hints", that is
> > > basically an struct with common hints, such as RX and TX timestamp,
> > > hash, etc. I also included two more members to that struct: field_map
> > > and extension_id. The first, shows which members are actually valid in
> > > the data, the second is an arbitrary id that drivers can use to say
> > > "there's extra data" beyond the generic members, and how to interpret
> > > what's there is driver specific. A BTF is also created to represent
> > > this struct, and registering is done the same way Saeed's patch did.
> > > 
> > > User space developers that need to get the struct can use something
> > > like to get it from the driver:
> > > 
> > >   # tools/bpf/bpftool/bpftool net xdp show
> > >   xdp:
> > >   enp6s0(5) md_btf_id(60) md_btf_enabled(1)
> > > 
> > > And use the btf_id to get the struct:
> > > 
> > >   # bpftool btf dump file /sys/kernel/btf/igc format c
> > > 
> > > Currently though, that's bad - as in this case the struct has no
> > > types, only the field names. Why?
> > 
> > I don't follow, what is not working?
> 
> I get something like this:
> 
>   struct xdp_hints {
>          yet_another_timestamp;
>          rx_timestamp;
>          tx_timestamp;
>          hash32;
>          extension_id;
>          field_map;
>   };
> 
> Note how there's no type before the fields, one has to figure out if
> `rx_timestamp` is u32 or u64.
> 
> 
> > 
> > > With the driver specific struct (or by using the generic one, if no
> > > specific fields are needed), the application can then access the XDP
> > > frame metadata. I've also added some helpers to aid getting the
> > > metadata.
> > > 
> > > I added some examples on how to use those (they may be too simplistic),
> > > so it's possible to get a feel on how this API might work.
> > > 
> > > My goals for this email are to check if this approach is valid and what
> > > pitfalls can you see. I didn't send a patch series yet to not jump
> > > ahead Alexandr and Michal work (I can rebase on top of their work
> > > later) and because the igc RX and TX timestamp implementation I'm using
> > > to provide more real looking data is not yet complete.
> > > 
> > > Another goal is to ensure that AF_XDP side is not forgotten in the XDP
> > > hints discussion =D
> > 
> > Thanks for pointing that out :-)
> > 
> > > Naturally, if someone finds any issue trying those patches, please let
> > > me know!
> > 
> 


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: A look into XDP hints for AF_XDP
  2021-06-24 22:17     ` Desouza, Ederson
@ 2021-06-24 22:39       ` Alexei Starovoitov
  2021-07-07 16:38         ` Jesper Dangaard Brouer
  0 siblings, 1 reply; 8+ messages in thread
From: Alexei Starovoitov @ 2021-06-24 22:39 UTC (permalink / raw)
  To: Desouza, Ederson
  Cc: brouer, Swiatkowski, Michal, xdp-hints, Lobakin, Alexandr,
	Karlsson, Magnus, saeed, bpf, andrii.nakryiko

On Thu, Jun 24, 2021 at 3:18 PM Desouza, Ederson
<ederson.desouza@intel.com> wrote:
>
> Wait - it may be done in user space by libbpf, but it needs the
> instrumented object code. It won't work for pure user space
> applications, like those which use AF_XDP. Unless we're going to build
> them in a special way, like we do for the kernel side of BPF
> applications.

It can be made to work. See my reply to Magnus.
It's not a lot of code to make that happen.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: A look into XDP hints for AF_XDP
  2021-06-24 22:39       ` Alexei Starovoitov
@ 2021-07-07 16:38         ` Jesper Dangaard Brouer
  0 siblings, 0 replies; 8+ messages in thread
From: Jesper Dangaard Brouer @ 2021-07-07 16:38 UTC (permalink / raw)
  To: Alexei Starovoitov, Desouza, Ederson
  Cc: brouer, Swiatkowski, Michal, xdp-hints, Lobakin, Alexandr,
	Karlsson, Magnus, saeed, bpf, andrii.nakryiko


On 25/06/2021 00.39, Alexei Starovoitov wrote:
> On Thu, Jun 24, 2021 at 3:18 PM Desouza, Ederson
> <ederson.desouza@intel.com> wrote:
>> Wait - it may be done in user space by libbpf, but it needs the
>> instrumented object code. It won't work for pure user space
>> applications, like those which use AF_XDP. Unless we're going to build
>> them in a special way, like we do for the kernel side of BPF
>> applications.
> It can be made to work. See my reply to Magnus.
> It's not a lot of code to make that happen.

I agree with Alexei, it will not be a lot of code to interpret the BTF 
info in userspace.

In userspace AF_XDP code, we could simply decode the offset of e.g. 
member named "rxhash32" and validate that the expected size is 32 bit (4 
bytes). Then we store the offset associated with rxhash32 for a given 
BTF-ID. When AF_XDP program see BTF-ID it can lookup the offset of 
rxhash32 and move those 4-bytes into a variable for rxhash32.


Implementation details (sorry to complicate this slightly): Because 
metadata area grows with a negative offset seen from ctx->data, and 
AF_XDP descriptor don't know the size of metadata area (like XDP does). 
Then the offset we store (e.g. associated with rxhash32) need to be 
converted to a negative offset from packet ctx->data start.


--Jesper



^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: A look into XDP hints for AF_XDP
  2021-06-24 21:54   ` Desouza, Ederson
  2021-06-24 22:17     ` Desouza, Ederson
@ 2021-07-07 22:26     ` Andrii Nakryiko
  2021-07-15 19:34       ` Desouza, Ederson
  1 sibling, 1 reply; 8+ messages in thread
From: Andrii Nakryiko @ 2021-07-07 22:26 UTC (permalink / raw)
  To: Desouza, Ederson
  Cc: brouer, Swiatkowski, Michal, xdp-hints, Lobakin, Alexandr,
	Karlsson, Magnus, saeed, bpf

On Thu, Jun 24, 2021 at 2:54 PM Desouza, Ederson
<ederson.desouza@intel.com> wrote:
>
> On Thu, 2021-06-24 at 21:54 +0200, Jesper Dangaard Brouer wrote:
> > On Thu, 24 Jun 2021 00:10:12 +0000
> > "Desouza, Ederson" <ederson.desouza@intel.com> wrote:
> >
> > > Following current discussions around XDP hints, it's clear that
> > > currently the focus is on BPF applications. But my interest is in the
> > > AF_XDP side of things - user space applications.
> >
> > I agree, that most of the discussion is focused on BPF-programs being
> > loaded into the kernel via libbpf.  I actually also care about getting
> > this working for AF_XDP.
> >
> > We've discussed this with Magnus (meeting yesterday) and I think we
> > agree that this is also something we want for AF_XDP.  IIRC the plan is
> > to use one bit to indicate if a packet is carrying info in metadata
> > area, as (1) AF_XDP descriptor don't have room for storing the BTF-ID,
> > and (2) if bit is not set, then we can avoid touching that cache-line.
> > If the bit is set, then the BTF-ID is stored in metadata area
> > (preferably as the last member, as ctx->data_meta is a minus offset
> > from ctx->data, making it accessible via a fixed offset from data).
> >
> > For the BPF-programs it would make sense to store the BTF-ID in
> > xdp_buff/xdp_frame and make it accessible via xdp_md (ctx seen from
> > BPF-prog).  To help AF_XDP the *proposal* is to (also) store it in
> > metadata area itself.
> >
> >
> > > In there, there's not much help from BPF CO-RE - who's going to rewrite
> > > user space structs, after all?
> >
> > Well, AFAIK most of the offset relocation happens in user-space by
> > libbpf.  Which Alexei also indicate in the other thread[1]. To better
> > understand BTF/CO-RE I've coded up an example here[2].
> >
> >  [1] https://lore.kernel.org/bpf/CAADnVQKv5SLBfnBWnEBFqf0-DQv+NZuixGiCVx1hewfQFhHSKg@mail.gmail.com/
> >  [2] https://github.com/xdp-project/bpf-examples/blob/master/ktrace-CO-RE/ktrace01_kern.c
> >
> > I'm trying to understand how libbpf does this.  So, I added a --debug
> > option that makes libbpf print verbose messages. See commit[3] that
> > also contains output example.
> >
> >  [3] https://github.com/xdp-project/bpf-examples/commit/0542d8a7a327b642d105
> >
> > Some of the --debug output:
> >
> >  libbpf: loading kernel BTF '/sys/kernel/btf/vmlinux': 0
> >  [...]
> >  libbpf: CO-RE relocating [0] struct sk_buff___local: found target candidate [2965] struct sk_buff in [vmlinux]
> >  libbpf: prog 'udp_send_skb': relo #1: matching candidate #0 [2965] struct sk_buff.hash (0:55 @ offset 148)
> >  libbpf: prog 'udp_send_skb': relo #1: patched insn #1 (ALU/ALU64) imm 4 -> 148
> >  libbpf: prog 'udp_send_skb': relo #2: kind <byte_off> (0), spec is [7] struct sk_buff___local.len (0:0 @ offset 0)
> >  libbpf: prog 'udp_send_skb': relo #2: matching candidate #0 [2965] struct sk_buff.len (0:6 @ offset 112)
> >  libbpf: prog 'udp_send_skb': relo #2: patched insn #8 (ALU/ALU64) imm 0 -> 112
> >  libbpf: prog 'udp_send_skb': relo #3: kind <target_type_id> (7), spec is [7] struct sk_buff___local
> >  libbpf: prog 'udp_send_skb': relo #3: matching candidate #0 [2965] struct sk_buff
> >  libbpf: prog 'udp_send_skb': relo #3: patched insn #24 (ALU/ALU64) imm 7 -> 2965
> >
> > As indicated in [1] a BTF matching is being done in userspace. First
> > libbpf loads kernels BTF from '/sys/kernel/btf/vmlinux'.  Then it have
> > the BTF from BPF-prog 'sk_buff___local' which finds target 'struct
> > sk_buff' as btf_id 2965.  Afterwards it patches the relocations in the
> > byte code.
> >
>
> Hmmm... that's something I definitely want to try =D
>
> >
> > > So, I decided to give a try at a possible implementation, using igc
> > > driver as I'm more used to it, and come here ask some questions about
> > > it.
> > >
> > > For the curious, here's my branch with current work:
> > >
> > > https://github.com/edersondisouza/linux/tree/xdp-hints
> > >
> > > It's on top of Alexandr Lobakin and Michal Swiatkowski work - but I
> > > decided to incorporate some of the CO-RE related feedback, so I could
> > > have something that also works with BPF applications. Please not that
> > > I'm not trying to jump ahead of them in incorporating the feedback -
> > > probably they have something more robust here - but if you see some
> > > value in my patches, feel free to reuse/incorporate them (if they are
> > > just an example of what not to do, it's still an example =D ).
> > > I also added some XDP ZC patches for igc that are still moving to
> > > mainline.
> > >
> > > In there, I basically defined a sample of "generic hints", that is
> > > basically an struct with common hints, such as RX and TX timestamp,
> > > hash, etc. I also included two more members to that struct: field_map
> > > and extension_id. The first, shows which members are actually valid in
> > > the data, the second is an arbitrary id that drivers can use to say
> > > "there's extra data" beyond the generic members, and how to interpret
> > > what's there is driver specific. A BTF is also created to represent
> > > this struct, and registering is done the same way Saeed's patch did.
> > >
> > > User space developers that need to get the struct can use something
> > > like to get it from the driver:
> > >
> > >   # tools/bpf/bpftool/bpftool net xdp show
> > >   xdp:
> > >   enp6s0(5) md_btf_id(60) md_btf_enabled(1)
> > >
> > > And use the btf_id to get the struct:
> > >
> > >   # bpftool btf dump file /sys/kernel/btf/igc format c
> > >
> > > Currently though, that's bad - as in this case the struct has no
> > > types, only the field names. Why?
> >
> > I don't follow, what is not working?
>
> I get something like this:
>
>   struct xdp_hints {
>          yet_another_timestamp;
>          rx_timestamp;
>          tx_timestamp;
>          hash32;
>          extension_id;
>          field_map;
>   };

it could be due to corrupted BTF. Can you show output of

bpftool btf dump file /sys/kernel/btf/igc

(note no "format c").

>
> Note how there's no type before the fields, one has to figure out if
> `rx_timestamp` is u32 or u64.
>
>
> >
> > > With the driver specific struct (or by using the generic one, if no
> > > specific fields are needed), the application can then access the XDP
> > > frame metadata. I've also added some helpers to aid getting the
> > > metadata.
> > >
> > > I added some examples on how to use those (they may be too simplistic),
> > > so it's possible to get a feel on how this API might work.
> > >
> > > My goals for this email are to check if this approach is valid and what
> > > pitfalls can you see. I didn't send a patch series yet to not jump
> > > ahead Alexandr and Michal work (I can rebase on top of their work
> > > later) and because the igc RX and TX timestamp implementation I'm using
> > > to provide more real looking data is not yet complete.
> > >
> > > Another goal is to ensure that AF_XDP side is not forgotten in the XDP
> > > hints discussion =D
> >
> > Thanks for pointing that out :-)
> >
> > > Naturally, if someone finds any issue trying those patches, please let
> > > me know!
> >
>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: A look into XDP hints for AF_XDP
  2021-07-07 22:26     ` Andrii Nakryiko
@ 2021-07-15 19:34       ` Desouza, Ederson
  0 siblings, 0 replies; 8+ messages in thread
From: Desouza, Ederson @ 2021-07-15 19:34 UTC (permalink / raw)
  To: andrii.nakryiko
  Cc: Karlsson, Magnus, bpf, xdp-hints, Lobakin, Alexandr, Swiatkowski,
	Michal, brouer, saeed

On Wed, 2021-07-07 at 15:26 -0700, Andrii Nakryiko wrote:
[snip]
> > > I don't follow, what is not working?
> > 
> > I get something like this:
> > 
> >   struct xdp_hints {
> >          yet_another_timestamp;
> >          rx_timestamp;
> >          tx_timestamp;
> >          hash32;
> >          extension_id;
> >          field_map;
> >   };
> 
> it could be due to corrupted BTF. Can you show output of
> 
> bpftool btf dump file /sys/kernel/btf/igc
> 
> (note no "format c").
> 

Errr... found out the issue on my side - this is a BTF described by
hand, and I didn't name the types - only defined their size and
bits_offset. Sorry for the noise!

> > 
> > Note how there's no type before the fields, one has to figure out if
> > `rx_timestamp` is u32 or u64.
> > 
> > 
> > > 
> > > > With the driver specific struct (or by using the generic one, if no
> > > > specific fields are needed), the application can then access the XDP
> > > > frame metadata. I've also added some helpers to aid getting the
> > > > metadata.
> > > > 
> > > > I added some examples on how to use those (they may be too simplistic),
> > > > so it's possible to get a feel on how this API might work.
> > > > 
> > > > My goals for this email are to check if this approach is valid and what
> > > > pitfalls can you see. I didn't send a patch series yet to not jump
> > > > ahead Alexandr and Michal work (I can rebase on top of their work
> > > > later) and because the igc RX and TX timestamp implementation I'm using
> > > > to provide more real looking data is not yet complete.
> > > > 
> > > > Another goal is to ensure that AF_XDP side is not forgotten in the XDP
> > > > hints discussion =D
> > > 
> > > Thanks for pointing that out :-)
> > > 
> > > > Naturally, if someone finds any issue trying those patches, please let
> > > > me know!
> > > 
> > 


^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2021-07-15 19:34 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-06-24  0:10 A look into XDP hints for AF_XDP Desouza, Ederson
2021-06-24 19:54 ` Jesper Dangaard Brouer
2021-06-24 21:54   ` Desouza, Ederson
2021-06-24 22:17     ` Desouza, Ederson
2021-06-24 22:39       ` Alexei Starovoitov
2021-07-07 16:38         ` Jesper Dangaard Brouer
2021-07-07 22:26     ` Andrii Nakryiko
2021-07-15 19:34       ` Desouza, Ederson

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox