XDP hardware hints discussion mail archive
From: Jesper Dangaard Brouer <brouer@redhat.com>
To: "Desouza, Ederson" <ederson.desouza@intel.com>
Cc: "xdp-hints@xdp-project.net" <xdp-hints@xdp-project.net>,
	"bpf@vger.kernel.org" <bpf@vger.kernel.org>,
	"saeed@kernel.org" <saeed@kernel.org>,
	"Lobakin, Alexandr" <alexandr.lobakin@intel.com>,
	"Swiatkowski, Michal" <michal.swiatkowski@intel.com>,
	brouer@redhat.com, Andrii Nakryiko <andrii.nakryiko@gmail.com>,
	"Karlsson, Magnus" <magnus.karlsson@intel.com>
Subject: Re: A look into XDP hints for AF_XDP
Date: Thu, 24 Jun 2021 21:54:11 +0200
Message-ID: <20210624215411.79324c9d@carbon>
In-Reply-To: <be4583429b45d618e592585c35eed5f1c113ed68.camel@intel.com>

On Thu, 24 Jun 2021 00:10:12 +0000
"Desouza, Ederson" <ederson.desouza@intel.com> wrote:

> Following current discussions around XDP hints, it's clear that
> currently the focus is on BPF applications. But my interest is in the
> AF_XDP side of things - user space applications.

I agree that most of the discussion is focused on BPF-programs being
loaded into the kernel via libbpf.  I also care about getting this
working for AF_XDP.

We've discussed this with Magnus (meeting yesterday) and I think we
agree that this is also something we want for AF_XDP.  IIRC the plan is
to use one bit to indicate whether a packet is carrying info in the
metadata area, as (1) the AF_XDP descriptor doesn't have room for
storing the BTF-ID, and (2) if the bit is not set, we can avoid
touching that cache-line.  If the bit is set, then the BTF-ID is stored
in the metadata area (preferably as the last member, as ctx->data_meta
is a negative offset from ctx->data, which makes the BTF-ID accessible
via a fixed offset from data).

For the BPF-programs it would make sense to store the BTF-ID in
xdp_buff/xdp_frame and make it accessible via xdp_md (the ctx seen from
the BPF-prog).  To help AF_XDP the *proposal* is to (also) store it in
the metadata area itself.
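
To make the "last member" idea concrete, here is a minimal user-space
sketch of how an AF_XDP application could read the BTF-ID at a fixed
negative offset from the packet data.  Everything named example_* or
EXAMPLE_* is made up for illustration (the struct layout, the flag bit,
and the use of 0 as "no BTF-ID"); none of it is an existing AF_XDP or
kernel API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical metadata layout.  The only property taken from the
 * discussion is that the BTF-ID is the LAST member, so it ends exactly
 * where the packet data begins. */
struct example_xdp_hints {
	uint64_t rx_timestamp;
	uint32_t rx_hash;
	uint32_t btf_id;	/* last member: sits right before data */
};

/* The fixed-offset trick only works if nothing follows btf_id. */
static_assert(offsetof(struct example_xdp_hints, btf_id) + sizeof(uint32_t)
	      == sizeof(struct example_xdp_hints),
	      "btf_id must be the last member without tail padding");

/* Hypothetical flag bit meaning "the metadata area is populated". */
#define EXAMPLE_DESC_F_HAS_META (1u << 0)

/* Return the BTF-ID stored just before 'data', or 0 (assumed here to
 * mean "none") when the flag bit is not set. */
static uint32_t example_meta_btf_id(const void *data, uint32_t desc_flags)
{
	uint32_t btf_id;

	if (!(desc_flags & EXAMPLE_DESC_F_HAS_META))
		return 0;	/* avoid touching the metadata cache-line */

	memcpy(&btf_id, (const uint8_t *)data - sizeof(btf_id),
	       sizeof(btf_id));
	return btf_id;
}

/* Once the BTF-ID has identified the layout, copy out the whole struct,
 * which ends at 'data'.  memcpy avoids alignment assumptions. */
static void example_meta_read(const void *data,
			      struct example_xdp_hints *out)
{
	memcpy(out, (const uint8_t *)data - sizeof(*out), sizeof(*out));
}
```

The application would first call example_meta_btf_id() and only call
example_meta_read() when the returned ID matches a layout it knows.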


> In there, there's not much help from BPF CO-RE - who's going to rewrite
> user space structs, after all? 

Well, AFAIK most of the offset relocation happens in user-space, in
libbpf, as Alexei also indicates in the other thread[1].  To better
understand BTF/CO-RE I've coded up an example here[2].

 [1] https://lore.kernel.org/bpf/CAADnVQKv5SLBfnBWnEBFqf0-DQv+NZuixGiCVx1hewfQFhHSKg@mail.gmail.com/
 [2] https://github.com/xdp-project/bpf-examples/blob/master/ktrace-CO-RE/ktrace01_kern.c

I'm trying to understand how libbpf does this.  So, I added a --debug
option that makes libbpf print verbose messages.  See commit[3], which
also contains example output.

 [3] https://github.com/xdp-project/bpf-examples/commit/0542d8a7a327b642d105

Some of the --debug output:

 libbpf: loading kernel BTF '/sys/kernel/btf/vmlinux': 0
 [...]
 libbpf: CO-RE relocating [0] struct sk_buff___local: found target candidate [2965] struct sk_buff in [vmlinux]
 libbpf: prog 'udp_send_skb': relo #1: matching candidate #0 [2965] struct sk_buff.hash (0:55 @ offset 148)
 libbpf: prog 'udp_send_skb': relo #1: patched insn #1 (ALU/ALU64) imm 4 -> 148
 libbpf: prog 'udp_send_skb': relo #2: kind <byte_off> (0), spec is [7] struct sk_buff___local.len (0:0 @ offset 0)
 libbpf: prog 'udp_send_skb': relo #2: matching candidate #0 [2965] struct sk_buff.len (0:6 @ offset 112)
 libbpf: prog 'udp_send_skb': relo #2: patched insn #8 (ALU/ALU64) imm 0 -> 112
 libbpf: prog 'udp_send_skb': relo #3: kind <target_type_id> (7), spec is [7] struct sk_buff___local
 libbpf: prog 'udp_send_skb': relo #3: matching candidate #0 [2965] struct sk_buff
 libbpf: prog 'udp_send_skb': relo #3: patched insn #24 (ALU/ALU64) imm 7 -> 2965

As indicated in [1], the BTF matching is done in userspace.  First
libbpf loads the kernel's BTF from '/sys/kernel/btf/vmlinux'.  Then it
takes the BTF from the BPF-prog, where 'struct sk_buff___local' finds
its target 'struct sk_buff' as btf_id 2965.  Afterwards it patches the
relocations in the byte code.
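
The patching step can be illustrated with a toy model.  This is a
simplified sketch, not libbpf's code: the toy_* types and functions are
invented here, matching is by member name only, and the offsets are the
ones from the debug output above (hash: 4 -> 148, len: 0 -> 112).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy stand-ins for illustration; these are NOT libbpf's structures. */
struct toy_field {
	const char *name;	/* member name, e.g. "hash" */
	uint32_t offset;	/* byte offset within the struct */
};

struct toy_insn {
	uint32_t imm;		/* immediate holding a field offset */
};

struct toy_relo {
	size_t insn_idx;	/* instruction to patch */
	const char *field;	/* member accessed by that instruction */
	uint32_t local_off;	/* offset compiled in from the local struct */
};

/* Look up a member by name in the target (kernel) layout. */
static int toy_find_offset(const struct toy_field *fields, size_t n_fields,
			   const char *name, uint32_t *off)
{
	for (size_t i = 0; i < n_fields; i++) {
		if (strcmp(fields[i].name, name) == 0) {
			*off = fields[i].offset;
			return 0;
		}
	}
	return -1;
}

/* Patch every relocated instruction; return the number of patched
 * instructions, or -1 on the first error. */
static int toy_apply_relos(struct toy_insn *insns, size_t n_insns,
			   const struct toy_relo *relos, size_t n_relos,
			   const struct toy_field *target, size_t n_target)
{
	int patched = 0;

	assert(insns && relos && target);
	for (size_t i = 0; i < n_relos; i++) {
		const struct toy_relo *r = &relos[i];
		uint32_t new_off;

		if (r->insn_idx >= n_insns)
			return -1;
		/* The instruction must still carry the local offset. */
		if (insns[r->insn_idx].imm != r->local_off)
			return -1;
		if (toy_find_offset(target, n_target, r->field, &new_off))
			return -1;
		insns[r->insn_idx].imm = new_off;	/* e.g. imm 4 -> 148 */
		patched++;
	}
	return patched;
}
```

The real relocation logic handles much more (nested accessors, several
relocation kinds such as target_type_id, multiple candidates), so treat
this only as a picture of "look up the target offset, then rewrite the
immediate".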


> So, I decided to give a try at a possible implementation, using igc
> driver as I'm more used to it, and come here ask some questions about
> it.
> 
> For the curious, here's my branch with current work:
> 
> https://github.com/edersondisouza/linux/tree/xdp-hints
> 
> It's on top of Alexandr Lobakin and Michal Swiatkowski's work - but I
> decided to incorporate some of the CO-RE related feedback, so I could
> have something that also works with BPF applications. Please note that
> I'm not trying to jump ahead of them in incorporating the feedback -
> probably they have something more robust here - but if you see some
> value in my patches, feel free to reuse/incorporate them (if they are
> just an example of what not to do, it's still an example =D ).
> I also added some XDP ZC patches for igc that are still moving to
> mainline.
> 
> In there, I basically defined a sample of "generic hints", that is
> basically a struct with common hints, such as RX and TX timestamp,
> hash, etc. I also included two more members to that struct: field_map
> and extension_id. The first shows which members are actually valid in
> the data; the second is an arbitrary id that drivers can use to say
> "there's extra data" beyond the generic members, and how to interpret
> what's there is driver specific. A BTF is also created to represent
> this struct, and registering is done the same way Saeed's patch did.
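
For readers following along, here is a minimal sketch of how a
field_map could gate access to the generic members.  All names, the bit
assignments, and the convention that extension_id 0 means "no driver
extension" are assumptions made for this illustration; they are not
taken from the branch above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical validity bits, one per generic member. */
enum example_hint_bit {
	EXAMPLE_HINT_RX_TSTAMP = 1u << 0,
	EXAMPLE_HINT_TX_TSTAMP = 1u << 1,
	EXAMPLE_HINT_HASH      = 1u << 2,
};

/* Hypothetical "generic hints" layout with the two extra members. */
struct example_generic_hints {
	uint64_t rx_timestamp;
	uint64_t tx_timestamp;
	uint32_t hash;
	uint32_t field_map;	/* which members above are valid */
	uint32_t extension_id;	/* assumed: 0 = no driver-specific data */
};

/* Copy out the hash only when the driver marked it valid. */
static bool example_hints_get_hash(const struct example_generic_hints *h,
				   uint32_t *hash)
{
	assert(h && hash);
	if (!(h->field_map & EXAMPLE_HINT_HASH))
		return false;
	*hash = h->hash;
	return true;
}

/* Tell whether driver-specific data follows the generic members. */
static bool example_hints_has_extension(const struct example_generic_hints *h)
{
	assert(h);
	return h->extension_id != 0;
}
```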
> 
> User space developers that need to get the struct can use something
> like this to get it from the driver:
> 
>   # tools/bpf/bpftool/bpftool net xdp show
>   xdp:
>   enp6s0(5) md_btf_id(60) md_btf_enabled(1)
> 
> And use the btf_id to get the struct:
> 
>   # bpftool btf dump file /sys/kernel/btf/igc format c
> 
> Currently though, that's bad - as in this case the struct has no
> types, only the field names. Why?

I don't follow, what is not working?

> With the driver specific struct (or by using the generic one, if no
> specific fields are needed), the application can then access the XDP
> frame metadata. I've also added some helpers to aid getting the
> metadata.
> 
> I added some examples on how to use those (they may be too simplistic),
> so it's possible to get a feel on how this API might work.
> 
> My goals for this email are to check if this approach is valid and what
> pitfalls can you see. I didn't send a patch series yet to not jump
> ahead Alexandr and Michal work (I can rebase on top of their work
> later) and because the igc RX and TX timestamp implementation I'm using
> to provide more real looking data is not yet complete.
> 
> Another goal is to ensure that AF_XDP side is not forgotten in the XDP
> hints discussion =D

Thanks for pointing that out :-)

> Naturally, if someone finds any issue trying those patches, please let
> me know!

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer


Thread overview: 8+ messages
2021-06-24  0:10 Desouza, Ederson
2021-06-24 19:54 ` Jesper Dangaard Brouer [this message]
2021-06-24 21:54   ` Desouza, Ederson
2021-06-24 22:17     ` Desouza, Ederson
2021-06-24 22:39       ` Alexei Starovoitov
2021-07-07 16:38         ` Jesper Dangaard Brouer
2021-07-07 22:26     ` Andrii Nakryiko
2021-07-15 19:34       ` Desouza, Ederson
