[xdp-hints] Re: [PATCH RFC bpf-next 5/9] xdp: controlling XDP-hints from BPF-prog via helper

XDP hardware hints discussion mail archive
 help / color / mirror / Atom feed

* [xdp-hints] Re: [PATCH RFC bpf-next 5/9] xdp: controlling XDP-hints from BPF-prog via helper
@ 2022-07-04 11:00 Zaremba, Larysa
  2022-07-04 18:26 ` Jesper Dangaard Brouer
  0 siblings, 1 reply; 7+ messages in thread
From: Zaremba, Larysa @ 2022-07-04 11:00 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen, Jesper Dangaard Brouer, bpf
  Cc: xdp-hints, Lobakin, Alexandr

Toke Høiland-Jørgensen <toke@redhat.com> writes:
> 
> Jesper Dangaard Brouer <jbrouer@redhat.com> writes:
> 
> > On 29/06/2022 16.20, Toke Høiland-Jørgensen wrote:
> >> Jesper Dangaard Brouer <brouer@redhat.com> writes:
> >>
> >>> XDP BPF-prog's need a way to interact with the XDP-hints. This 
> >>> patch introduces a BPF-helper function, that allow XDP BPF-prog's 
> >>> to interact with the XDP-hints.
> >>>
> >>> BPF-prog can query if any XDP-hints have been setup and if this is 
> >>> compatible with the xdp_hints_common struct. If XDP-hints are 
> >>> available the BPF "origin" is returned (see enum
> >>> xdp_hints_btf_origin) as BTF can come from different sources or 
> >>> origins
> e.g. vmlinux, module or local.
> >>
> >> I'm not sure I quite understand what this origin is supposed to be 
> >> good for?
> >
> > Some background info on BTF is needed here: BTF_ID numbers are not 
> > globally unique identifiers, thus we need to know where it originate 
> > from, to make it unique (as we store this BTF_ID in XDP-hints).
> >
> > There is a connection between origin "vmlinux" and "module", which 
> > is that vmlinux will start at ID=1 and end at a max ID number.  
> > Modules refer to ID's in "vmlinux", and for this to work, they will 
> > shift their own numbering to start after ID=max-vmlinux-id.
> >
> > Origin "local" is for BTF information stored in the BPF-ELF object file.
> > Their numbering starts at ID=1.  The use-case is that a BPF-prog 
> > want to extend the kernel drivers BTF-layout, and e.g. add a 
> > RX-timestamp like [1].  Then BPF-prog can check if it knows module's 
> > BTF_ID and then extend via bpf_xdp_adjust_meta, and update BTF_ID in 
> > XDP-hints and call the helper (I introduced) marking this as origin 
> > "local" for kernel to know this is no-longer origin "module".
> 
> Right, I realise that :)
> 
> My point was that just knowing "this is a BTF ID coming from a module"
> is not terribly useful; you could already figure that out by just 
> looking at the ID and seeing if it's larger than the maximum ID in vmlinux BTF.
> 
> Rather, what we need is a way to identify *which* module the BTF ID 
> comes from; and luckily, the kernel assigns a unique ID to every BTF
> *object* as well as to each type ID within that object. These can be 
> dumped by bpftool:
> 
> # bpftool btf
> bpftool btf
> [sudo] password for alrua:
> 1: name [vmlinux]  size 4800187B
> 2: name [serio]  size 2588B
> 3: name [i8042]  size 11786B
> 4: name [rng_core]  size 8184B
> [...]
> 2062: name <anon>  size 36965B
> 	pids bpftool(547298)
> 
> IDs 2-4 are module BTF objects, and that last one is the ID of a BTF 
> object loaded along with a BPF program by bpftool itself... So we *do* 
> in fact have a unique ID, by combining the BTF object ID with the type 
> ID; this is what Alexander is proposing to put into the xdp-hints 
> struct as well (combining the two IDs into a single u64).

That's correct, concept was previously discussed [1]. The ID of BTF object wasn't
exposed in CO-RE allocations though, we've changed it in the first 4 patches.
The main logic is in "libbpf: factor out BTF loading from load_module_btfs()"
and "libbpf: patch module BTF ID into BPF insns".

We have a sample that wasn't included eventually, but can possibly
give a general understanding of our approach [2].

[1] https://lore.kernel.org/all/CAEf4BzZO=7MKWfx2OCwEc+sKkfPZYzaELuobi4q5p1bOKk4AQQ@mail.gmail.com/
[2] https://github.com/alobakin/linux/pull/16/files#diff-c5983904cbe0c280453d59e8a1eefb56c67018c38d5da0c1122abc86225fc7c9

> >> What is a BPF (or AF_XDP) program supposed to do with the 
> >> information "this XDP hints struct came from a module?" without 
> >> knowing which module that was?
> >
> > For AF_XDP my claim is the userspace program will already know that 
> > driver it is are talking to because it need to "bind" to a specific 
> > interface (and attach to XDP-prog to ifindex). See sample code[2] 
> > for get_driver_name from ifindex.
> > Thus, part of using XDP-hints already involves (resolving and) 
> > opening /sys/kernel/btf/driver_name.  So the origin "module" is 
> > enough for the API end-user to make the BTF_ID unique.
> 
> This will probably work in the most common cases, but offers no way to 
> verify that this "offline" resolution of module ID is actually correct.
> Explicitly encoding the full unique ID will be more robust.
> 
> > Runtime the BPF-prog and kernel can find out what net_device the 
> > origin "module" refers to via xdp_buff->rxq->dev.  When an 
> > end-user/program attach XDP they also need to know the ifindex, 
> > again giving them knowledge that make origin "module" BTF_ID's 
> > unique for them,
> 
> Right, but then the BPF program needs to keep its own lookup table 
> from ifindex to BTF ID? If we just encode the full ID in the packet, 
> it's a simple check, and we can likely create a "magic" CO-RE macro 
> that turns a struct definition into the right ID check at load time...
> 
> >> Ultimately, the origin is useful for a consumer to check that the 
> >> metadata is in the format that it's expecting it to be in (so it 
> >> can just load the data from the appropriate offsets). But to answer 
> >> this, we really need a unique identifier; so I think the approach 
> >> in Alexander's series of encoding the ID of the BTF structure 
> >> itself into the next 32 bits is better? That way we'll have a unique "pointer"
> >> to the actual struct that's in the metadata area and can act on this.
> >
> > I would really like an explanation from Alexander, how his approach 
> > creates unique identifier across all kernel modules.  I don't get it 
> > from reading the code.  To me it looks like some extra BTF "type"
> > information about the BTF_ID.
> >
> > E.g. how do BTF "local" BPF-ELF object's get a unique identifier, 
> > that doesn't overlap with e.g. kernel modules?
> 
> See above: the kernel generates a unique (until the next reboot) ID 
> for every BTF object when it's loaded into the kernel.
> 
> >>> RFC/TODO: Improve patch: Can verifier validate provided BTF on
> "update"
> >>> and detect if compatible with common struct???
> >>
> >> If we have the unique ID as mentioned above, I think the kernel 
> >> probably could resolve this automatically: whenever a module is 
> >> loaded, the kernel could walk the BTF information from that module 
> >> an simply inspect all the metadata structs and see if they contain 
> >> the embedded xdp_hints_common struct. The IDs of any metadata 
> >> structs that do contain the common struct can then be kept in a 
> >> central lookup table and the consumption code can then simply 
> >> compare the BTF ID to this table when building an SKB?
> >
> > I'm not against the idea for the kernel to keep track of these structs.
> > I just don't like the idea of checking this runtime, especially as 
> > this approach for walking all other modules BTF struct's doesn't scale.
> 
> Yeah, we should optimise this. See below...
> 
> >> As for the validation on the BPF side:n
> >>
> >>> +	if (flags & HINTS_BTF_UPDATE) {
> >>> +		is_compat_common = !!(flags &
> HINTS_BTF_COMPAT_COMMON);
> >>> +	/* TODO: Can kernel validate if hints are BTF compat with common?
> */
> >>> +	/* TODO: Could BPF prog provide BTF as ARG_PTR_TO_BTF_ID to
> prove
> >>> +compat_common ? */
> >>
> >> If we use the "global ID + lookup table" approach above, we don't 
> >> really need to validate anything here: if the program says it's 
> >> writing metadata with a format given by a specific ID, that implies 
> >> compatibility (or not) as given by the ID. We could sanity-check 
> >> the metadata area size, but the consumption code has to do that 
> >> anyway, so I'm not sure it's worth the runtime overhead to have an 
> >> additional check here?
> >
> > As you know I hate "runtime checks", and try hard to push checks to 
> > "setup time".  Maybe we could have verifier (or libbpf) do the check 
> > at setup/load time, by identifying the helper call and check if 
> > provided BTF do match COMPAT_COMMON claim.
> >
> > For this to work, the verifier need to be able to resolve origin 
> > "module", which happens at BPF load-time, so we would need to set 
> > the ifindex (curr used for XDP-hardware-offload) at BPF load-time.
> 
> If we make the UAPI on the BPF side just accept a full BTF object+type 
> ID, and also require that the value being passed to the helper is a 
> compile-time constant (so it is visible to the verifier at 
> verification time), it is straight- forward for the verifier to just 
> lookup the BTF type, check if it contains the "hints_common" struct 
> and if it does, rewrite the helper call to set the right value of the "compat_common"
> flag without exposing the flag itself as UAPI.
> 
> The driver code would probably still have to set this flag "manually", 
> but that's internal kernel API, so that's probably fine...
> 
> >> As for safety of the metadata content itself, I don't really think 
> >> we can do anything to guarantee this: in any case the BPF program 
> >> can pass a valid BTF ID and still write garbage values into the 
> >> actual fields, so the consumption code has to do enough validation 
> >> that this won't crash the kernel anyway. But this is no different 
> >> from the packet
> data itself:
> >> XDP is basically in a position to be a MITM attacker of the network 
> >> stack itself, which is why loading XDP programs is a privileged 
> >> operation...
> >
> > I agree, that we cannot stop the end-user from screwing up their 
> > BPF-prog to provide garbage in the fields, as long as it doesn't 
> > crash the kernel.  I do think it would improve usability for 
> > end-users if we can detect and report that their BPF-prog have 
> > gotten out of sync with the running kernel and their claim that 
> > their BTF layout are COMPAT_COMMON isn't actually true.  But I guess 
> > it is shouldn't block the code, as it's only an extra usability help.
> 
> Yeah, I agree this could be error prone; which is why I think not 
> exposing the flag itself as UAPI is a better solution ;)
> 
> -Toke

Please CC me in the hints discussions.
- Larysa

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [xdp-hints] Re: [PATCH RFC bpf-next 5/9] xdp: controlling XDP-hints from BPF-prog via helper
  2022-07-04 11:00 [xdp-hints] Re: [PATCH RFC bpf-next 5/9] xdp: controlling XDP-hints from BPF-prog via helper Zaremba, Larysa
@ 2022-07-04 18:26 ` Jesper Dangaard Brouer
  2022-07-05 17:07   ` Larysa Zaremba
  0 siblings, 1 reply; 7+ messages in thread
From: Jesper Dangaard Brouer @ 2022-07-04 18:26 UTC (permalink / raw)
  To: Zaremba, Larysa, Toke Høiland-Jørgensen,
	Jesper Dangaard Brouer, bpf, Andrii Nakryiko, Netdev
  Cc: brouer, xdp-hints, Lobakin, Alexandr



On 04/07/2022 13.00, Zaremba, Larysa wrote:
> Toke Høiland-Jørgensen <toke@redhat.com> writes:
>>
>> Jesper Dangaard Brouer <jbrouer@redhat.com> writes:
>>
>>> On 29/06/2022 16.20, Toke Høiland-Jørgensen wrote:
>>>> Jesper Dangaard Brouer <brouer@redhat.com> writes:
>>>>
>>>>> XDP BPF-prog's need a way to interact with the XDP-hints. This
>>>>> patch introduces a BPF-helper function, that allow XDP BPF-prog's
>>>>> to interact with the XDP-hints.
>>>>>
>>>>> BPF-prog can query if any XDP-hints have been setup and if this is
>>>>> compatible with the xdp_hints_common struct. If XDP-hints are
>>>>> available the BPF "origin" is returned (see enum
>>>>> xdp_hints_btf_origin) as BTF can come from different sources or
>>>>> origins e.g. vmlinux, module or local.
>>>>
>>>> I'm not sure I quite understand what this origin is supposed to be
>>>> good for?
>>>
>>> Some background info on BTF is needed here: BTF_ID numbers are not
>>> globally unique identifiers, thus we need to know where it originate
>>> from, to make it unique (as we store this BTF_ID in XDP-hints).
>>>
>>> There is a connection between origin "vmlinux" and "module", which
>>> is that vmlinux will start at ID=1 and end at a max ID number.
>>> Modules refer to ID's in "vmlinux", and for this to work, they will
>>> shift their own numbering to start after ID=max-vmlinux-id.
>>>
>>> Origin "local" is for BTF information stored in the BPF-ELF object file.
>>> Their numbering starts at ID=1.  The use-case is that a BPF-prog
>>> want to extend the kernel drivers BTF-layout, and e.g. add a
>>> RX-timestamp like [1].  Then BPF-prog can check if it knows module's
>>> BTF_ID and then extend via bpf_xdp_adjust_meta, and update BTF_ID in
>>> XDP-hints and call the helper (I introduced) marking this as origin
>>> "local" for kernel to know this is no-longer origin "module".
>>
>> Right, I realise that :)
>>
>> My point was that just knowing "this is a BTF ID coming from a module"
>> is not terribly useful; you could already figure that out by just
>> looking at the ID and seeing if it's larger than the maximum ID in vmlinux BTF.
>>
>> Rather, what we need is a way to identify *which* module the BTF ID
>> comes from; and luckily, the kernel assigns a unique ID to every BTF
>> *object* as well as to each type ID within that object. These can be
>> dumped by bpftool:
>>
>> # bpftool btf
>> bpftool btf
>> [sudo] password for alrua:
>> 1: name [vmlinux]  size 4800187B
>> 2: name [serio]  size 2588B
>> 3: name [i8042]  size 11786B
>> 4: name [rng_core]  size 8184B
>> [...]
>> 2062: name <anon>  size 36965B
>> 	pids bpftool(547298)
>>
>> IDs 2-4 are module BTF objects, and that last one is the ID of a BTF
>> object loaded along with a BPF program by bpftool itself... So we *do*
>> in fact have a unique ID, by combining the BTF object ID with the type
>> ID; this is what Alexander is proposing to put into the xdp-hints
>> struct as well (combining the two IDs into a single u64).

Thanks for the explanation. I think I understand it now, and I agree
that we should extend/combining the two IDs into a single u64.

To Andrii, what is the right terminology when talking about these two
different BTF-ID's:

- BTF object ID and BTF type ID?

- Where BTF *object* ID are the IDs we see above from 'bpftool btf',
   where vmlinux=1 and module's IDs will start after 1.

- Where BTF *type* ID are the IDs the individual data "types" within a
   BTF "object" (e.g. struct xdp_hints_common that BPF-prog's can get
   via calling bpf_core_type_id_kernel()).


> That's correct, concept was previously discussed [1]. The ID of BTF object wasn't
> exposed in CO-RE allocations though, we've changed it in the first 4 patches.
> The main logic is in "libbpf: factor out BTF loading from load_module_btfs()"
> and "libbpf: patch module BTF ID into BPF insns".
> 
> We have a sample that wasn't included eventually, but can possibly
> give a general understanding of our approach [2].
> 
> [1] https://lore.kernel.org/all/CAEf4BzZO=7MKWfx2OCwEc+sKkfPZYzaELuobi4q5p1bOKk4AQQ@mail.gmail.com/
> [2] https://github.com/alobakin/linux/pull/16/files#diff-c5983904cbe0c280453d59e8a1eefb56c67018c38d5da0c1122abc86225fc7c9
> 
(appreciate the links)

I wonder how these BTF object IDs gets resolved for my "local" category?
(Origin "local" is for BTF information stored in the BPF-ELF object file)

Note: For "local" BTF type IDs BPF-prog resolve these via
bpf_core_type_id_local() (why I choose the term "local").

--Jesper

p.s. For unknown reasons lore.kernel.org did match Larysa's reply with 
the patchset thread here[3].

  [3] 
https://lore.kernel.org/bpf/165643378969.449467.13237011812569188299.stgit@firesoul/#r



^ permalink raw reply	[flat|nested] 7+ messages in thread

* [xdp-hints] Re: [PATCH RFC bpf-next 5/9] xdp: controlling XDP-hints from BPF-prog via helper
  2022-07-04 18:26 ` Jesper Dangaard Brouer
@ 2022-07-05 17:07   ` Larysa Zaremba
  2022-07-06 13:29     ` Jesper Dangaard Brouer
  0 siblings, 1 reply; 7+ messages in thread
From: Larysa Zaremba @ 2022-07-05 17:07 UTC (permalink / raw)
  To: Jesper Dangaard Brouer, bpf
  Cc: Toke Hoiland-Jorgensen, Andrii Nakryiko, brouer, xdp-hints,
	Lobakin, Alexandr

On Mon, Jul 04, 2022 at 08:26:15PM +0200, Jesper Dangaard Brouer wrote:
> 
> 
> On 04/07/2022 13.00, Zaremba, Larysa wrote:
> > Toke Høiland-Jørgensen <toke@redhat.com> writes:
> > > 
> > > Jesper Dangaard Brouer <jbrouer@redhat.com> writes:
> > > 
> > > > On 29/06/2022 16.20, Toke Høiland-Jørgensen wrote:
> > > > > Jesper Dangaard Brouer <brouer@redhat.com> writes:
> > > > > 
> > > > > > XDP BPF-prog's need a way to interact with the XDP-hints. This
> > > > > > patch introduces a BPF-helper function, that allow XDP BPF-prog's
> > > > > > to interact with the XDP-hints.
> > > > > > 
> > > > > > BPF-prog can query if any XDP-hints have been setup and if this is
> > > > > > compatible with the xdp_hints_common struct. If XDP-hints are
> > > > > > available the BPF "origin" is returned (see enum
> > > > > > xdp_hints_btf_origin) as BTF can come from different sources or
> > > > > > origins e.g. vmlinux, module or local.
> > > > > 
> > > > > I'm not sure I quite understand what this origin is supposed to be
> > > > > good for?
> > > > 
> > > > Some background info on BTF is needed here: BTF_ID numbers are not
> > > > globally unique identifiers, thus we need to know where it originate
> > > > from, to make it unique (as we store this BTF_ID in XDP-hints).
> > > > 
> > > > There is a connection between origin "vmlinux" and "module", which
> > > > is that vmlinux will start at ID=1 and end at a max ID number.
> > > > Modules refer to ID's in "vmlinux", and for this to work, they will
> > > > shift their own numbering to start after ID=max-vmlinux-id.
> > > > 
> > > > Origin "local" is for BTF information stored in the BPF-ELF object file.
> > > > Their numbering starts at ID=1.  The use-case is that a BPF-prog
> > > > want to extend the kernel drivers BTF-layout, and e.g. add a
> > > > RX-timestamp like [1].  Then BPF-prog can check if it knows module's
> > > > BTF_ID and then extend via bpf_xdp_adjust_meta, and update BTF_ID in
> > > > XDP-hints and call the helper (I introduced) marking this as origin
> > > > "local" for kernel to know this is no-longer origin "module".
> > > 
> > > Right, I realise that :)
> > > 
> > > My point was that just knowing "this is a BTF ID coming from a module"
> > > is not terribly useful; you could already figure that out by just
> > > looking at the ID and seeing if it's larger than the maximum ID in vmlinux BTF.
> > > 
> > > Rather, what we need is a way to identify *which* module the BTF ID
> > > comes from; and luckily, the kernel assigns a unique ID to every BTF
> > > *object* as well as to each type ID within that object. These can be
> > > dumped by bpftool:
> > > 
> > > # bpftool btf
> > > bpftool btf
> > > [sudo] password for alrua:
> > > 1: name [vmlinux]  size 4800187B
> > > 2: name [serio]  size 2588B
> > > 3: name [i8042]  size 11786B
> > > 4: name [rng_core]  size 8184B
> > > [...]
> > > 2062: name <anon>  size 36965B
> > > 	pids bpftool(547298)
> > > 
> > > IDs 2-4 are module BTF objects, and that last one is the ID of a BTF
> > > object loaded along with a BPF program by bpftool itself... So we *do*
> > > in fact have a unique ID, by combining the BTF object ID with the type
> > > ID; this is what Alexander is proposing to put into the xdp-hints
> > > struct as well (combining the two IDs into a single u64).
> 
> Thanks for the explanation. I think I understand it now, and I agree
> that we should extend/combining the two IDs into a single u64.
> 
> To Andrii, what is the right terminology when talking about these two
> different BTF-ID's:
> 
> - BTF object ID and BTF type ID?
> 
> - Where BTF *object* ID are the IDs we see above from 'bpftool btf',
>   where vmlinux=1 and module's IDs will start after 1.
> 
> - Where BTF *type* ID are the IDs the individual data "types" within a
>   BTF "object" (e.g. struct xdp_hints_common that BPF-prog's can get
>   via calling bpf_core_type_id_kernel()).
> 

AFAIK, that's the most correct way of distinguish one from another in 
conversation.

Would be still great, if Andrii could confirm that.

I should mention that out patch makes bpf_core_type_id_kernel() return 
u64 (BTF obj ID + BTF type ID), but your statement is true for current 
libbpf version.

> 
> > That's correct, concept was previously discussed [1]. The ID of BTF object wasn't
> > exposed in CO-RE allocations though, we've changed it in the first 4 patches.
> > The main logic is in "libbpf: factor out BTF loading from load_module_btfs()"
> > and "libbpf: patch module BTF ID into BPF insns".
> > 
> > We have a sample that wasn't included eventually, but can possibly
> > give a general understanding of our approach [2].
> > 
> > [1] https://lore.kernel.org/all/CAEf4BzZO=7MKWfx2OCwEc+sKkfPZYzaELuobi4q5p1bOKk4AQQ@mail.gmail.com/
> > [2] https://github.com/alobakin/linux/pull/16/files#diff-c5983904cbe0c280453d59e8a1eefb56c67018c38d5da0c1122abc86225fc7c9
> > 
> (appreciate the links)
> 
> I wonder how these BTF object IDs gets resolved for my "local" category?
> (Origin "local" is for BTF information stored in the BPF-ELF object file)
> 
> Note: For "local" BTF type IDs BPF-prog resolve these via
> bpf_core_type_id_local() (why I choose the term "local").
> 

Every program during CO-RE relocs sees a single local BTF obj, in which 
BTF type IDs start from 1 and correspond to all data types used in 
program. So local BTF obj and type IDs inside are valid only in single 
program, therefore u32 type ID returned by bpf_core_type_id_local() is 
enough.

Local IDs are not resolved, they are just assigned during compilation. 
After program load with CO-RE each local type gets a resolved
vmlinux/module BTF obj pointer and an ID of a type inside this BTF obj 
that is similar enough.

Both local and target type IDs are mainly needed just for comfortable 
iteration inside libbpf, so they are just a side product that is only 
patched in, if we use bpf_core_type_id_local/target() inside a program 
for testing purposes.

> --Jesper
> 
> p.s. For unknown reasons lore.kernel.org did match Larysa's reply with the
> patchset thread here[3].
> 
>  [3] https://lore.kernel.org/bpf/165643378969.449467.13237011812569188299.stgit@firesoul/#r
> 
> 

- Larysa

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [xdp-hints] Re: [PATCH RFC bpf-next 5/9] xdp: controlling XDP-hints from BPF-prog via helper
  2022-07-05 17:07   ` Larysa Zaremba
@ 2022-07-06 13:29     ` Jesper Dangaard Brouer
  0 siblings, 0 replies; 7+ messages in thread
From: Jesper Dangaard Brouer @ 2022-07-06 13:29 UTC (permalink / raw)
  To: Larysa Zaremba, Jesper Dangaard Brouer, bpf
  Cc: brouer, Toke Hoiland-Jorgensen, Andrii Nakryiko, xdp-hints,
	Lobakin, Alexandr, Netdev



On 05/07/2022 19.07, Larysa Zaremba wrote:
> On Mon, Jul 04, 2022 at 08:26:15PM +0200, Jesper Dangaard Brouer wrote:
>>
>>
>> On 04/07/2022 13.00, Zaremba, Larysa wrote:
>>> Toke Høiland-Jørgensen <toke@redhat.com> writes:
>>>>
>>>> Jesper Dangaard Brouer <jbrouer@redhat.com> writes:
>>>>
>>>>> On 29/06/2022 16.20, Toke Høiland-Jørgensen wrote:
>>>>>> Jesper Dangaard Brouer <brouer@redhat.com> writes:
>>>>>>
>>>>>>> XDP BPF-prog's need a way to interact with the XDP-hints. This
>>>>>>> patch introduces a BPF-helper function, that allow XDP BPF-prog's
>>>>>>> to interact with the XDP-hints.
>>>>>>>
>>>>>>> BPF-prog can query if any XDP-hints have been setup and if this is
>>>>>>> compatible with the xdp_hints_common struct. If XDP-hints are
>>>>>>> available the BPF "origin" is returned (see enum
>>>>>>> xdp_hints_btf_origin) as BTF can come from different sources or
>>>>>>> origins e.g. vmlinux, module or local.
>>>>>>
>>>>>> I'm not sure I quite understand what this origin is supposed to be
>>>>>> good for?
>>>>>
>>>>> Some background info on BTF is needed here: BTF_ID numbers are not
>>>>> globally unique identifiers, thus we need to know where it originate
>>>>> from, to make it unique (as we store this BTF_ID in XDP-hints).
>>>>>
>>>>> There is a connection between origin "vmlinux" and "module", which
>>>>> is that vmlinux will start at ID=1 and end at a max ID number.
>>>>> Modules refer to ID's in "vmlinux", and for this to work, they will
>>>>> shift their own numbering to start after ID=max-vmlinux-id.
>>>>>
>>>>> Origin "local" is for BTF information stored in the BPF-ELF object file.
>>>>> Their numbering starts at ID=1.  The use-case is that a BPF-prog
>>>>> want to extend the kernel drivers BTF-layout, and e.g. add a
>>>>> RX-timestamp like [1].  Then BPF-prog can check if it knows module's
>>>>> BTF_ID and then extend via bpf_xdp_adjust_meta, and update BTF_ID in
>>>>> XDP-hints and call the helper (I introduced) marking this as origin
>>>>> "local" for kernel to know this is no-longer origin "module".
>>>>
>>>> Right, I realise that :)
>>>>
>>>> My point was that just knowing "this is a BTF ID coming from a module"
>>>> is not terribly useful; you could already figure that out by just
>>>> looking at the ID and seeing if it's larger than the maximum ID in vmlinux BTF.
>>>>
>>>> Rather, what we need is a way to identify *which* module the BTF ID
>>>> comes from; and luckily, the kernel assigns a unique ID to every BTF
>>>> *object* as well as to each type ID within that object. These can be
>>>> dumped by bpftool:
>>>>
>>>> # bpftool btf
>>>> bpftool btf
>>>> [sudo] password for alrua:
>>>> 1: name [vmlinux]  size 4800187B
>>>> 2: name [serio]  size 2588B
>>>> 3: name [i8042]  size 11786B
>>>> 4: name [rng_core]  size 8184B
>>>> [...]
>>>> 2062: name <anon>  size 36965B
>>>> 	pids bpftool(547298)
>>>>
>>>> IDs 2-4 are module BTF objects, and that last one is the ID of a BTF
>>>> object loaded along with a BPF program by bpftool itself... So we *do*
>>>> in fact have a unique ID, by combining the BTF object ID with the type
>>>> ID; this is what Alexander is proposing to put into the xdp-hints
>>>> struct as well (combining the two IDs into a single u64).
>>
>> Thanks for the explanation. I think I understand it now, and I agree
>> that we should extend/combining the two IDs into a single u64.
>>
>> To Andrii, what is the right terminology when talking about these two
>> different BTF-ID's:
>>
>> - BTF object ID and BTF type ID?
>>
>> - Where BTF *object* ID are the IDs we see above from 'bpftool btf',
>>    where vmlinux=1 and module's IDs will start after 1.
>>
>> - Where BTF *type* ID are the IDs the individual data "types" within a
>>    BTF "object" (e.g. struct xdp_hints_common that BPF-prog's can get
>>    via calling bpf_core_type_id_kernel()).
>>
> 
> AFAIK, that's the most correct way of distinguish one from another in
> conversation.

Good to get confirmed that you agree with these terms.

> 
> Would be still great, if Andrii could confirm that.

Yes, it would :-)

> I should mention that out patch makes bpf_core_type_id_kernel() return
> u64 (BTF obj ID + BTF type ID), but your statement is true for current
> libbpf version.

It sounds useful that your patched bpf_core_type_id_kernel() returns u64
(BTF obj ID + BTF type ID).

I wonder if/how we need to deal with libbpf versions that only returns
the u32 BTF type ID ?

> 
>>
>>> That's correct, concept was previously discussed [1]. The ID of BTF object wasn't
>>> exposed in CO-RE allocations though, we've changed it in the first 4 patches.
>>> The main logic is in "libbpf: factor out BTF loading from load_module_btfs()"
>>> and "libbpf: patch module BTF ID into BPF insns".
>>>
>>> We have a sample that wasn't included eventually, but can possibly
>>> give a general understanding of our approach [2].
>>>
>>> [1] https://lore.kernel.org/all/CAEf4BzZO=7MKWfx2OCwEc+sKkfPZYzaELuobi4q5p1bOKk4AQQ@mail.gmail.com/
>>> [2] https://github.com/alobakin/linux/pull/16/files#diff-c5983904cbe0c280453d59e8a1eefb56c67018c38d5da0c1122abc86225fc7c9
>>>
>> (appreciate the links)
>>
>> I wonder how these BTF object IDs gets resolved for my "local" category?
>> (Origin "local" is for BTF information stored in the BPF-ELF object file)
>>
>> Note: For "local" BTF type IDs BPF-prog resolve these via
>> bpf_core_type_id_local() (why I choose the term "local").
>>
> 
> Every program during CO-RE relocs sees a single local BTF obj, in which
> BTF type IDs start from 1 and correspond to all data types used in
> program. So local BTF obj and type IDs inside are valid only in single
> program, therefore u32 type ID returned by bpf_core_type_id_local() is
> enough.

Sure it makes sense if only a single XDP-prog is running.

For the use-case of multiple XDP-progs (e.g. via libxdp) are running,
where they send info to each-other via metadata area. There it would be
valuable to get a BTF *object* ID associated with these "local" types.

Note that I believe that a TC ingress BPF-prog can also read the
metadata area.

> Local IDs are not resolved, they are just assigned during compilation.
> After program load with CO-RE each local type gets a resolved
> vmlinux/module BTF obj pointer and an ID of a type inside this BTF obj
> that is similar enough.

Yes, but only if __attribute__((preserve_access_index)) is defined on
the "local" BPF-prog struct will libbpf do this matching to kernel
structs, see[1].

  [1] 
https://github.com/xdp-project/bpf-examples/blob/18908873a7f48483ed8bab2d949e8760cff30810/AF_XDP-interaction/af_xdp_kern.c#L37-L40
  [2] 
https://github.com/xdp-project/bpf-examples/tree/master/AF_XDP-interaction


> 
> Both local and target type IDs are mainly needed just for comfortable
> iteration inside libbpf, so they are just a side product that is only
> patched in, if we use bpf_core_type_id_local/target() inside a program
> for testing purposes.

In my use-case[2] I want to extract the "local" BTF ID and update the
BTF ID in metadata area, such that my AF_XDP program can see it.

The use-case is that I have a BPF-prog that want to extend the
kernel-module provided XDP-hints, with an XDP-software RX timestamp, but
only for packets containing HW timestamps. It will (load/setup time)
know BTF obj+type ID via
bpf_core_type_id_kernel(xdp_hints_i40e_timestamp) to match on, and then
extend metadata area, type-cast to "local" struct and record
bpf_ktime_get_ns().  It now need to update BTF ID in XDP-hints metadata
area, to tell AF_XDP userspace prog (or chained XDP/TC BPF-prog) that
layout format have changed.

- My question is: What BTF *object* ID should I use for my "local" u32
BTF type ID ? (returned by bpf_core_type_id_local())

--Jesper

>>
>> p.s. For unknown reasons lore.kernel.org did match Larysa's reply with the
>> patchset thread here[3].
>>
>>   [3] https://lore.kernel.org/bpf/165643378969.449467.13237011812569188299.stgit@firesoul/#r


^ permalink raw reply	[flat|nested] 7+ messages in thread

* [xdp-hints] Re: [PATCH RFC bpf-next 5/9] xdp: controlling XDP-hints from BPF-prog via helper
  2022-07-01  9:10     ` Jesper Dangaard Brouer
@ 2022-07-01 12:09       ` Toke Høiland-Jørgensen
  0 siblings, 0 replies; 7+ messages in thread
From: Toke Høiland-Jørgensen @ 2022-07-01 12:09 UTC (permalink / raw)
  To: Jesper Dangaard Brouer, bpf
  Cc: xdp-hints, Alexander Lobakin, Alexander Lobakin

Jesper Dangaard Brouer <jbrouer@redhat.com> writes:

> On 29/06/2022 16.20, Toke Høiland-Jørgensen wrote:
>> Jesper Dangaard Brouer <brouer@redhat.com> writes:
>> 
>>> XDP BPF-prog's need a way to interact with the XDP-hints. This patch
>>> introduces a BPF-helper function, that allow XDP BPF-prog's to interact
>>> with the XDP-hints.
>>>
>>> BPF-prog can query if any XDP-hints have been setup and if this is
>>> compatible with the xdp_hints_common struct. If XDP-hints are available
>>> the BPF "origin" is returned (see enum xdp_hints_btf_origin) as BTF can
>>> come from different sources or origins e.g. vmlinux, module or local.
>> 
>> I'm not sure I quite understand what this origin is supposed to be good
>> for? 
>
> Some background info on BTF is needed here: BTF_ID numbers are not
> globally unique identifiers, thus we need to know where it originate
> from, to make it unique (as we store this BTF_ID in XDP-hints).
>
> There is a connection between origin "vmlinux" and "module", which is
> that vmlinux will start at ID=1 and end at a max ID number.  Modules
> refer to ID's in "vmlinux", and for this to work, they will shift their
> own numbering to start after ID=max-vmlinux-id.
>
> Origin "local" is for BTF information stored in the BPF-ELF object file.
> Their numbering starts at ID=1.  The use-case is that a BPF-prog want to
> extend the kernel drivers BTF-layout, and e.g. add a RX-timestamp like
> [1].  Then BPF-prog can check if it knows module's BTF_ID and then
> extend via bpf_xdp_adjust_meta, and update BTF_ID in XDP-hints and call
> the helper (I introduced) marking this as origin "local" for kernel to
> know this is no-longer origin "module".

Right, I realise that :)

My point was that just knowing "this is a BTF ID coming from a module"
is not terribly useful; you could already figure that out by just
looking at the ID and seeing if it's larger than the maximum ID in
vmlinux BTF.

Rather, what we need is a way to identify *which* module the BTF ID
comes from; and luckily, the kernel assigns a unique ID to every BTF
*object* as well as to each type ID within that object. These can be
dumped by bpftool:

# bpftool btf
bpftool btf      
[sudo] password for alrua: 
1: name [vmlinux]  size 4800187B
2: name [serio]  size 2588B
3: name [i8042]  size 11786B
4: name [rng_core]  size 8184B
[...]
2062: name <anon>  size 36965B
	pids bpftool(547298)

IDs 2-4 are module BTF objects, and that last one is the ID of a BTF
object loaded along with a BPF program by bpftool itself... So we *do*
in fact have a unique ID, by combining the BTF object ID with the type
ID; this is what Alexander is proposing to put into the xdp-hints struct
as well (combining the two IDs into a single u64).

>> What is a BPF (or AF_XDP) program supposed to do with the
>> information "this XDP hints struct came from a module?" without knowing
>> which module that was? 
>
> For AF_XDP my claim is the userspace program will already know that
> driver it is are talking to because it need to "bind" to a specific
> interface (and attach to XDP-prog to ifindex). See sample code[2] for
> get_driver_name from ifindex.
> Thus, part of using XDP-hints already involves (resolving and) opening
> /sys/kernel/btf/driver_name.  So the origin "module" is enough for the
> API end-user to make the BTF_ID unique.

This will probably work in the most common cases, but offers no way to
verify that this "offline" resolution of module ID is actually correct.
Explicitly encoding the full unique ID will be more robust.

> Runtime the BPF-prog and kernel can find out what net_device the origin
> "module" refers to via xdp_buff->rxq->dev.  When an end-user/program
> attach XDP they also need to know the ifindex, again giving them
> knowledge that make origin "module" BTF_ID's unique for them,

Right, but then the BPF program needs to keep its own lookup table from
ifindex to BTF ID? If we just encode the full ID in the packet, it's a
simple check, and we can likely create a "magic" CO-RE macro that turns
a struct definition into the right ID check at load time...

>> Ultimately, the origin is useful for a consumer
>> to check that the metadata is in the format that it's expecting it to be
>> in (so it can just load the data from the appropriate offsets). But to
>> answer this, we really need a unique identifier; so I think the approach
>> in Alexander's series of encoding the ID of the BTF structure itself
>> into the next 32 bits is better? That way we'll have a unique "pointer"
>> to the actual struct that's in the metadata area and can act on this.
>
> I would really like an explanation from Alexander, how his approach
> creates unique identifier across all kernel modules.  I don't get it
> from reading the code.  To me it looks like some extra BTF "type"
> information about the BTF_ID.
>
> E.g. how do BTF "local" BPF-ELF object's get a unique identifier, that
> doesn't overlap with e.g. kernel modules?

See above: the kernel generates a unique (until the next reboot) ID for
every BTF object when it's loaded into the kernel.

>>> RFC/TODO: Improve patch: Can verifier validate provided BTF on "update"
>>> and detect if compatible with common struct???
>> 
>> If we have the unique ID as mentioned above, I think the kernel probably
>> could resolve this automatically: whenever a module is loaded, the
>> kernel could walk the BTF information from that module an simply inspect
>> all the metadata structs and see if they contain the embedded
>> xdp_hints_common struct. The IDs of any metadata structs that do contain
>> the common struct can then be kept in a central lookup table and the
>> consumption code can then simply compare the BTF ID to this table when
>> building an SKB?
>
> I'm not against the idea for the kernel to keep track of these structs.
> I just don't like the idea of checking this runtime, especially as this
> approach for walking all other modules BTF struct's doesn't scale.

Yeah, we should optimise this. See below...

>> As for the validation on the BPF side:n
>> 
>>> +	if (flags & HINTS_BTF_UPDATE) {
>>> +		is_compat_common = !!(flags & HINTS_BTF_COMPAT_COMMON);
>>> +	/* TODO: Can kernel validate if hints are BTF compat with common? */
>>> +	/* TODO: Could BPF prog provide BTF as ARG_PTR_TO_BTF_ID to prove compat_common ? */
>> 
>> If we use the "global ID + lookup table" approach above, we don't really
>> need to validate anything here: if the program says it's writing
>> metadata with a format given by a specific ID, that implies
>> compatibility (or not) as given by the ID. We could sanity-check the
>> metadata area size, but the consumption code has to do that anyway, so
>> I'm not sure it's worth the runtime overhead to have an additional check
>> here?
>
> As you know I hate "runtime checks", and try hard to push checks to
> "setup time".  Maybe we could have verifier (or libbpf) do the check at
> setup/load time, by identifying the helper call and check if provided
> BTF do match COMPAT_COMMON claim.
>
> For this to work, the verifier need to be able to resolve origin
> "module", which happens at BPF load-time, so we would need to set the
> ifindex (curr used for XDP-hardware-offload) at BPF load-time.

If we make the UAPI on the BPF side just accept a full BTF object+type
ID, and also require that the value being passed to the helper is a
compile-time constant (so it is visible to the verifier at verification
time), it is straight-forward for the verifier to just lookup the BTF
type, check if it contains the "hints_common" struct and if it does,
rewrite the helper call to set the right value of the "compat_common"
flag without exposing the flag itself as UAPI.

The driver code would probably still have to set this flag "manually",
but that's internal kernel API, so that's probably fine...

>> As for safety of the metadata content itself, I don't really think we
>> can do anything to guarantee this: in any case the BPF program can pass
>> a valid BTF ID and still write garbage values into the actual fields, so
>> the consumption code has to do enough validation that this won't crash
>> the kernel anyway. But this is no different from the packet data itself:
>> XDP is basically in a position to be a MITM attacker of the network
>> stack itself, which is why loading XDP programs is a privileged
>> operation...
>
> I agree, that we cannot stop the end-user from screwing up their
> BPF-prog to provide garbage in the fields, as long as it doesn't crash
> the kernel.  I do think it would improve usability for end-users if we
> can detect and report that their BPF-prog have gotten out of sync with
> the running kernel and their claim that their BTF layout are
> COMPAT_COMMON isn't actually true.  But I guess it is shouldn't block
> the code, as it's only an extra usability help.

Yeah, I agree this could be error prone; which is why I think not
exposing the flag itself as UAPI is a better solution ;)

-Toke


^ permalink raw reply	[flat|nested] 7+ messages in thread

* [xdp-hints] Re: [PATCH RFC bpf-next 5/9] xdp: controlling XDP-hints from BPF-prog via helper
  2022-06-29 14:20   ` [xdp-hints] " Toke Høiland-Jørgensen
@ 2022-07-01  9:10     ` Jesper Dangaard Brouer
  2022-07-01 12:09       ` Toke Høiland-Jørgensen
  0 siblings, 1 reply; 7+ messages in thread
From: Jesper Dangaard Brouer @ 2022-07-01  9:10 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen, bpf
  Cc: xdp-hints, Alexander Lobakin, Alexander Lobakin

On 29/06/2022 16.20, Toke Høiland-Jørgensen wrote:
> Jesper Dangaard Brouer <brouer@redhat.com> writes:
> 
>> XDP BPF-prog's need a way to interact with the XDP-hints. This patch
>> introduces a BPF-helper function, that allow XDP BPF-prog's to interact
>> with the XDP-hints.
>>
>> BPF-prog can query if any XDP-hints have been setup and if this is
>> compatible with the xdp_hints_common struct. If XDP-hints are available
>> the BPF "origin" is returned (see enum xdp_hints_btf_origin) as BTF can
>> come from different sources or origins e.g. vmlinux, module or local.
> 
> I'm not sure I quite understand what this origin is supposed to be good
> for? 

Some background info on BTF is needed here: BTF_ID numbers are not
globally unique identifiers, thus we need to know where it originate
from, to make it unique (as we store this BTF_ID in XDP-hints).

There is a connection between origin "vmlinux" and "module", which is
that vmlinux will start at ID=1 and end at a max ID number.  Modules
refer to ID's in "vmlinux", and for this to work, they will shift their
own numbering to start after ID=max-vmlinux-id.

Origin "local" is for BTF information stored in the BPF-ELF object file.
Their numbering starts at ID=1.  The use-case is that a BPF-prog want to
extend the kernel drivers BTF-layout, and e.g. add a RX-timestamp like
[1].  Then BPF-prog can check if it knows module's BTF_ID and then
extend via bpf_xdp_adjust_meta, and update BTF_ID in XDP-hints and call
the helper (I introduced) marking this as origin "local" for kernel to
know this is no-longer origin "module".

  [1] 
https://github.com/xdp-project/bpf-examples/tree/master/AF_XDP-interaction/

> What is a BPF (or AF_XDP) program supposed to do with the
> information "this XDP hints struct came from a module?" without knowing
> which module that was? 

For AF_XDP my claim is the userspace program will already know that
driver it is are talking to because it need to "bind" to a specific
interface (and attach to XDP-prog to ifindex). See sample code[2] for
get_driver_name from ifindex.
Thus, part of using XDP-hints already involves (resolving and) opening
/sys/kernel/btf/driver_name.  So the origin "module" is enough for the
API end-user to make the BTF_ID unique.

Runtime the BPF-prog and kernel can find out what net_device the origin
"module" refers to via xdp_buff->rxq->dev.  When an end-user/program
attach XDP they also need to know the ifindex, again giving them
knowledge that make origin "module" BTF_ID's unique for them,

Same way the origin "local" have meaning to the BPF-prog and user
loading it.

  [2] 
https://github.com/torvalds/linux/blob/v5.18/samples/bpf/xdp_sample_user.c#L1602

> Ultimately, the origin is useful for a consumer
> to check that the metadata is in the format that it's expecting it to be
> in (so it can just load the data from the appropriate offsets). But to
> answer this, we really need a unique identifier; so I think the approach
> in Alexander's series of encoding the ID of the BTF structure itself
> into the next 32 bits is better? That way we'll have a unique "pointer"
> to the actual struct that's in the metadata area and can act on this.

I would really like an explanation from Alexander, how his approach
creates unique identifier across all kernel modules.  I don't get it
from reading the code.  To me it looks like some extra BTF "type"
information about the BTF_ID.

E.g. how do BTF "local" BPF-ELF object's get a unique identifier, that
doesn't overlap with e.g. kernel modules?

>> RFC/TODO: Improve patch: Can verifier validate provided BTF on "update"
>> and detect if compatible with common struct???
> 
> If we have the unique ID as mentioned above, I think the kernel probably
> could resolve this automatically: whenever a module is loaded, the
> kernel could walk the BTF information from that module an simply inspect
> all the metadata structs and see if they contain the embedded
> xdp_hints_common struct. The IDs of any metadata structs that do contain
> the common struct can then be kept in a central lookup table and the
> consumption code can then simply compare the BTF ID to this table when
> building an SKB?

I'm not against the idea for the kernel to keep track of these structs.
I just don't like the idea of checking this runtime, especially as this
approach for walking all other modules BTF struct's doesn't scale.

> As for the validation on the BPF side:n
> 
>> +	if (flags & HINTS_BTF_UPDATE) {
>> +		is_compat_common = !!(flags & HINTS_BTF_COMPAT_COMMON);
>> +	/* TODO: Can kernel validate if hints are BTF compat with common? */
>> +	/* TODO: Could BPF prog provide BTF as ARG_PTR_TO_BTF_ID to prove compat_common ? */
> 
> If we use the "global ID + lookup table" approach above, we don't really
> need to validate anything here: if the program says it's writing
> metadata with a format given by a specific ID, that implies
> compatibility (or not) as given by the ID. We could sanity-check the
> metadata area size, but the consumption code has to do that anyway, so
> I'm not sure it's worth the runtime overhead to have an additional check
> here?

As you know I hate "runtime checks", and try hard to push checks to
"setup time".  Maybe we could have verifier (or libbpf) do the check at
setup/load time, by identifying the helper call and check if provided
BTF do match COMPAT_COMMON claim.

For this to work, the verifier need to be able to resolve origin
"module", which happens at BPF load-time, so we would need to set the
ifindex (curr used for XDP-hardware-offload) at BPF load-time.

> As for safety of the metadata content itself, I don't really think we
> can do anything to guarantee this: in any case the BPF program can pass
> a valid BTF ID and still write garbage values into the actual fields, so
> the consumption code has to do enough validation that this won't crash
> the kernel anyway. But this is no different from the packet data itself:
> XDP is basically in a position to be a MITM attacker of the network
> stack itself, which is why loading XDP programs is a privileged
> operation...

I agree, that we cannot stop the end-user from screwing up their
BPF-prog to provide garbage in the fields, as long as it doesn't crash
the kernel.  I do think it would improve usability for end-users if we
can detect and report that their BPF-prog have gotten out of sync with
the running kernel and their claim that their BTF layout are
COMPAT_COMMON isn't actually true.  But I guess it is shouldn't block
the code, as it's only an extra usability help.

--Jesper

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [xdp-hints] Re: [PATCH RFC bpf-next 5/9] xdp: controlling XDP-hints from BPF-prog via helper
  2022-06-28 16:30 ` [xdp-hints] [PATCH RFC bpf-next 5/9] xdp: controlling XDP-hints from BPF-prog via helper Jesper Dangaard Brouer
@ 2022-06-29 14:20   ` Toke Høiland-Jørgensen
  2022-07-01  9:10     ` Jesper Dangaard Brouer
  0 siblings, 1 reply; 7+ messages in thread
From: Toke Høiland-Jørgensen @ 2022-06-29 14:20 UTC (permalink / raw)
  To: Jesper Dangaard Brouer, bpf; +Cc: xdp-hints

Jesper Dangaard Brouer <brouer@redhat.com> writes:

> XDP BPF-prog's need a way to interact with the XDP-hints. This patch
> introduces a BPF-helper function, that allow XDP BPF-prog's to interact
> with the XDP-hints.
>
> BPF-prog can query if any XDP-hints have been setup and if this is
> compatible with the xdp_hints_common struct. If XDP-hints are available
> the BPF "origin" is returned (see enum xdp_hints_btf_origin) as BTF can
> come from different sources or origins e.g. vmlinux, module or local.

I'm not sure I quite understand what this origin is supposed to be good
for? What is a BPF (or AF_XDP) program supposed to do with the
information "this XDP hints struct came from a module?" without knowing
which module that was? Ultimately, the origin is useful for a consumer
to check that the metadata is in the format that it's expecting it to be
in (so it can just load the data from the appropriate offsets). But to
answer this, we really need a unique identifier; so I think the approach
in Alexander's series of encoding the ID of the BTF structure itself
into the next 32 bits is better? That way we'll have a unique "pointer"
to the actual struct that's in the metadata area and can act on this.

> RFC/TODO: Improve patch: Can verifier validate provided BTF on "update"
> and detect if compatible with common struct???

If we have the unique ID as mentioned above, I think the kernel probably
could resolve this automatically: whenever a module is loaded, the
kernel could walk the BTF information from that module an simply inspect
all the metadata structs and see if they contain the embedded
xdp_hints_common struct. The IDs of any metadata structs that do contain
the common struct can then be kept in a central lookup table and the
consumption code can then simply compare the BTF ID to this table when
building an SKB?

As for the validation on the BPF side:n

> +	if (flags & HINTS_BTF_UPDATE) {
> +		is_compat_common = !!(flags & HINTS_BTF_COMPAT_COMMON);
> +	/* TODO: Can kernel validate if hints are BTF compat with common? */
> +	/* TODO: Could BPF prog provide BTF as ARG_PTR_TO_BTF_ID to prove compat_common ? */

If we use the "global ID + lookup table" approach above, we don't really
need to validate anything here: if the program says it's writing
metadata with a format given by a specific ID, that implies
compatibility (or not) as given by the ID. We could sanity-check the
metadata area size, but the consumption code has to do that anyway, so
I'm not sure it's worth the runtime overhead to have an additional check
here?

As for safety of the metadata content itself, I don't really think we
can do anything to guarantee this: in any case the BPF program can pass
a valid BTF ID and still write garbage values into the actual fields, so
the consumption code has to do enough validation that this won't crash
the kernel anyway. But this is no different from the packet data itself:
XDP is basically in a position to be a MITM attacker of the network
stack itself, which is why loading XDP programs is a privileged
operation...

-Toke

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2022-07-06 13:29 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-07-04 11:00 [xdp-hints] Re: [PATCH RFC bpf-next 5/9] xdp: controlling XDP-hints from BPF-prog via helper Zaremba, Larysa
2022-07-04 18:26 ` Jesper Dangaard Brouer
2022-07-05 17:07   ` Larysa Zaremba
2022-07-06 13:29     ` Jesper Dangaard Brouer
  -- strict thread matches above, loose matches on Subject: below --
2022-06-28 16:30 [xdp-hints] [PATCH RFC bpf-next 0/9] Introduce XDP-hints via BTF Jesper Dangaard Brouer
2022-06-28 16:30 ` [xdp-hints] [PATCH RFC bpf-next 5/9] xdp: controlling XDP-hints from BPF-prog via helper Jesper Dangaard Brouer
2022-06-29 14:20   ` [xdp-hints] " Toke Høiland-Jørgensen
2022-07-01  9:10     ` Jesper Dangaard Brouer
2022-07-01 12:09       ` Toke Høiland-Jørgensen

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox